Datasets in Visual Studio Overview
Datasets are objects that contain data tables where you can temporarily store the data for use in your application. If your application requires working with data, you can load the data into a dataset, which provides your application with a local in-memory cache of the data to work with. You can work with the data in a dataset even if your application becomes disconnected from the database. The dataset maintains information about changes to its data so updates can be tracked and sent back to the database when your application becomes reconnected.
The structure of a DataSet is similar to that of a relational database; it exposes a hierarchical object model of tables, rows, columns, constraints, and relationships.
Datasets can be typed or untyped. (For more information, see the section below titled "Typed Versus Untyped Datasets.") Typed datasets derive their schema (table and column structure) from .xsd files and are easier to program against. You can use either typed or untyped datasets in your applications. However, Visual Studio has more tool support for typed datasets, and programming with them is easier and less error-prone.
You create typed datasets by running the Data Source Configuration Wizard, or by adding a DataSet item through the Add New Item command on the Project menu. For more information, see How to: Create a Typed Dataset.
After creating datasets, edit them in the Dataset Designer.
You create and work with typed and untyped datasets using the following portions of the .NET Framework namespaces.
The objects of a dataset are exposed to you through standard programming constructs such as properties and collections. For example:
The DataTable class includes the DataRowCollection collection of table rows, the DataColumnCollection collection of data columns, and the ChildRelations and ParentRelations collections of data relations.
The DataRow class includes the RowState property, whose values indicate whether and how the row has been changed since the data table was first loaded from the database. Possible values for the RowState property include Deleted, Modified, Added, and Unchanged.
A dataset contains no actual data by default. Filling a dataset with data actually refers to loading data into the individual DataTable objects that make up the dataset. You fill the data tables by executing TableAdapter queries, or executing data adapter (for example, SqlDataAdapter) commands. When you fill a dataset with data, various events are raised, constraints are checked, and so on. For more information about loading data into a dataset, see Fetching Data into Your Application.
The code to fill a dataset is automatically added to the form load event handler when you drag items from the Data Sources Window onto a form in your Windows application. For more information, complete the following walkthrough: Walkthrough: Displaying Data on a Form in a Windows Application.
An example of filling a dataset with a TableAdapter:
You can populate a dataset in a variety of ways:
If you created the dataset using design-time tools such as one of the data wizards, call the Fill method of a TableAdapter. (TableAdapters are created with a default Fill method, but you are given the opportunity to change the name, so the actual method name may be different.) For more information, see the "Filling a Dataset using a TableAdapter" section of How to: Fill a Dataset with Data.
Manually populate tables in the dataset by creating DataRow objects and adding them to the table's DataRowCollection collection. (You can only do this at run time; you cannot set the DataRowCollection collection at design time.) For more information, see Adding Data to a DataTable (ADO.NET).
Merge (copy) the contents of one dataset with another. This scenario can be useful if your application gets datasets from different sources (different XML Web services, for example), but needs to consolidate them into a single dataset. For more information, see the Merging DataSet Contents (ADO.NET).
Merge (copy) the contents of one DataTable with another.
When changes are made to records in the dataset, the changes have to be written back to the database. To write changes from the dataset to the database, you call the Update method of the TableAdapter or DataAdapter that communicates between the dataset and its corresponding database.
When using the data design tools in Visual Studio send data back to a database by calling the TableAdapter's Update method and passing in the data table you want to save. For example:
For finer control of the update process call one of the TableAdapter DBDirect methods where you can pass in individual values for each data row. For more information, see How to: Update Data by Using a TableAdapter, and Walkthrough: Saving Data with the TableAdapter DBDirect Methods.
The DataRow class used to manipulate individual records includes the RowState property, whose values indicate whether and how the row has been changed since the data table was first loaded from the database. Possible values include Deleted, Modified, Added, and Unchanged. The Update methods of the TableAdapter and DataAdapter examine the value of the RowState property to determine which records need to be written to the database and what specific database command (InsertCommand, UpdateCommand, and DeleteCommand) should be invoked.
For more information about updating data, see Saving Data.
Because a dataset is a fully disconnected container for data, datasets (unlike ADO recordsets) do not support the concept of a current record. Instead, all records in the dataset are available at any time.
Because there is no current record, there is no specific property that points to a current record and there are no methods or properties for moving from one record to another (see the note below). You can access individual tables in the dataset as objects; each table exposes a collection of rows. You can treat this like any collection, accessing rows by means of the collection's index or using collection-specific statements in your programming language.
For example, you can get the fourth row of the Customers table with the following code:
A dataset is a relational view of data that can be represented in XML. This relationship between datasets and XML enables you to take advantage of the following features of datasets:
The structure of a dataset — its tables, columns, relationships, and constraints — can be defined in an XML schema. Datasets can read and write schemas that store structured information using the ReadXmlSchema and WriteXmlSchema methods. If no schema is available, the dataset can infer one (through its InferXmlSchema method) from data in an XML document that is structured in a relational way. For more information about XML schemas, see Building XML Schemas.
You can generate a dataset class that incorporates schema information to define its data structure. This is known as a typed dataset. For information on creating a typed dataset, see How to: Create a Typed Dataset.
You can read an XML document or stream into a dataset using the dataset's ReadXml method and write a dataset out as XML using the dataset's WriteXml method. Because XML is a standard interchange format for data between different applications, this means that you can load a dataset with XML-formatted information sent by other applications. Similarly, a dataset can write out its data as an XML stream or document, to be shared with other applications or simply stored in a standard format.
You can create an XML view (an XmlDataDocument object) of the contents of a dataset or data table, and then view and manipulate the data using either relational methods (by means of the dataset) or XML methods. The two views are automatically synchronized as they are changed.
A typed dataset is a dataset that is first derived from the base DataSet class and then uses information from the Dataset Designer, which is stored in an .xsd file, to generate a new strongly-typed dataset class. Information from the schema (tables, columns, and so on) is generated and compiled into this new dataset class as a set of first-class objects and properties. Because a typed dataset inherits from the base DataSet class, the typed class assumes all of the functionality of the DataSet class and can be used with methods that take an instance of a DataSet class as a parameter
An untyped dataset, in contrast, has no corresponding built-in schema. As in a typed dataset, an untyped dataset contains tables, columns, and so on — but those are exposed only as collections. (However, after manually creating the tables and other data elements in an untyped dataset, you can export the dataset's structure as a schema using the dataset's WriteXmlSchema method.)
Contrasting Data Access in Typed and Untyped Datasets
The class for a typed dataset has an object model in which its properties take on the actual names of the tables and columns. For example, if you are working with a typed dataset, you can reference a column using code such as the following:
In contrast, if you are working with an untyped dataset, the equivalent code is:
Typed access is not only easier to read, but is fully supported by IntelliSense in the Visual Studio Code Editor. In addition to being easier to work with, the syntax for the typed dataset provides type checking at compile time, greatly reducing the possibility of errors in assigning values to dataset members. If you change the name of a column in your DataSet and then compile your application, you receive a build error. By double-clicking the build error in the Task List, you can go directly to the line or lines of code that reference the old column name. Access to tables and columns in a typed dataset is also slightly faster at run time because access is determined at compile time, not through collections at run time.
Even though typed datasets have many advantages, there are a variety of circumstances under which an untyped dataset is useful. The most obvious scenario is when no schema is available for the dataset. This might occur, for example, if your application is interacting with a component that returns a dataset, but you do not know in advance what its structure is. Similarly, there are times when you are working with data that does not have a static, predictable structure; in that case, it is impractical to use a typed dataset, because you would have to regenerate the typed dataset class with each change in the data structure.
More generally, there are many times when you might create a dataset dynamically without having a schema available. In that case, the dataset is simply a convenient structure in which you can keep information, as long as the data can be represented in a relational way. At the same time, you can take advantage of the dataset's capabilities, such as the ability to serialize the information to pass to another process, or to write out an XML file.
Within a dataset, table and column names are by default case-insensitive — that is, a table in a dataset called "Customers" can also be referred to as "customers." This matches the naming conventions in many databases, including the default behavior of SQL Server, where the names of data elements cannot be distinguished only by case.
Unlike datasets, XML documents are case-sensitive, so the names of data elements defined in schemas are case-sensitive. For example, schema protocol allows the schema to define a table called "Customers" and a different table called "customers." This can result in name collisions when a schema that contains elements that differ only by case is used to generate a dataset class.
However, case sensitivity can be a factor in how data is interpreted within the dataset. For example, if you filter data in a dataset table, the search criteria might return different results depending on whether the comparison is case-sensitive or not. You can control the case sensitivity of filtering, searching, and sorting by setting the dataset's CaseSensitive property. All the tables in the dataset inherit the value of this property by default. (You can override this property for each individual table by setting the table's CaseSensitive property.)
If you have multiple tables in a dataset, the information in the tables might be related. A dataset has no inherent knowledge of such relationships; to work with data in related tables, therefore, you can create DataRelation objects that describe the relations between the tables in the dataset. For more information, see How to: Access Records in Related DataTables. DataRelation objects can be used to programmatically fetch related child records for a parent record and a parent record from a child record. For more information, see Relationships in Datasets. If your database contains relationships between two or more tables, the design tools will automatically create the DataRelation objects for you.
For example, imagine customer and order data such as that in the Northwind database. The Customers table might contain records such as the following:
CustomerID CompanyName City ALFKI Alfreds Futterkiste Berlin ANTON Antonio Moreno Taquerias Mexico D.F. AROUT Around the Horn London
The dataset might also contain another table with order information. The Orders table contains a customer ID as a foreign key column. Selecting only some of the columns in the Orders table, it might look like the following:
OrderId CustomerID OrderDate 10692 ALFKI 10/03/1997 10702 ALFKI 10/13/1997 10365 ANTON 11/27/1996 10507 ANTON 4/15/1997
Because each customer can have more than one order, there is a one-to-many relationship between customers and orders. For example, in the table above, the customer ALFKI has two orders.
You can use a DataRelation object to get related records from a child or parent table. For example, when working with the record describing the customer ANTON, you can get the collection of records describing the orders for that customer. For more information, see GetChildRows. Similarly, when working with the record describing the OrderId 10507, you can navigate up the relation object to get the record for its parent, ANTON. For more information, see GetParentRow.
As in most databases, datasets support constraints as a way to ensure the integrity of data. Constraints are rules that are applied when rows are inserted, updated, or deleted in a table. You can define two types of constraints:
A unique constraint that checks that the new values in a column are unique in the table.
A foreign-key constraint that defines rules for how related child records should be updated when a record in a master table is updated or deleted. For example, a foreign-key constraint verifies that there is a parent record before allowing the creation of any child records.
In a dataset, constraints are associated with individual tables (foreign-key constraints) or columns (a unique constraint, one that guarantees that column values are unique). Constraints are implemented as objects of type UniqueConstraint or ForeignKeyConstraint. They are then added to the Constraints collection of a DataTable. A unique constraint can alternatively be specified by simply setting the Unique property of a DataColumn to true.
The dataset itself supports a Boolean EnforceConstraints property that specifies whether constraints will be enforced or not. By default, this property is set to true. However, there are times when it is useful to temporarily turn constraints off. Most often, this is when you are changing a record in such a way that it will temporarily cause an invalid state. After completing the change (and thereby returning to a valid state), you can re-enable constraints.
In Visual Studio, you create constraints implicitly when you define a dataset. By adding a primary key to a dataset, you implicitly create a unique constraint for the primary-key column. You can specify a unique constraint for other columns by setting their Unique property to true.
You create foreign-key constraints by creating a DataRelation object in a dataset. In addition to allowing you to programmatically get information about related records, a DataRelation object allows you to define foreign-key constraint rules.
Extended properties provide name mappings when naming conflicts are encountered during the process of generating the dataset from a .xsd file. When an identifier in the .xsd file is different from the computed name created by the dataset generator, an extended property is added to the dataset in the msprop namespace. The following table displays the possible extended properties that can be generated: