Introduction to Datasets
Datasets store data in a disconnected cache. The structure of a dataset is similar to that of a relational database; it exposes a hierarchical object model of tables, rows, and columns. In addition, it contains constraints and relationships defined for the dataset.
Note You use datasets if you want to work with a set of tables and rows while disconnected from the data source. Using a dataset is not always an optimal solution for designing data access. For more information, see Recommendations for Data Access Strategies.
You can create and manipulate datasets using the following portions of the .NET Framework namespaces.
The fundamental parts of a dataset are exposed to you through standard programming constructs such as properties and collections. For example:
- The DataSet class includes the Tables collection of data tables and the Relations collection of DataRelation objects.
- The DataTable class includes the Rows collection of table rows, the Columns collection of data columns, and the ChildRelations and ParentRelations collections of data relations.
- The DataRow class includes the RowState property, whose values indicate whether and how the row has been changed since the data table was first loaded from the database. Possible values for the RowState property include Deleted, Modified, New, and Unchanged.
Datasets, Schemas, and XML
An ADO.NET dataset is one view — a relational view — of data that can be represented in XML. In Visual Studio and the .NET Framework, XML is the format for storing and transmitting data of all kinds. As such, datasets have a close affinity with XML. This relationship between datasets and XML enables you to take advantage of the following features of datasets:
- The structure of a dataset — its tables, columns, relationships, and constraints — can be defined in an XML Schema. XML Schemas are a standards-based format of the W3C (World Wide Web Consortium) for defining the structure of XML data. Datasets can read and write schemas that store structured information using the ReadXmlSchema and WriteXmlSchema methods. If no schema is available, the dataset can infer one (via its InferXmlSchema method) from data in an XML document that is structured in a relational way. For more information about datasets and schemas, see Introduction to XML Schemas.
- You can generate a dataset class that incorporates schema information to define its data structures (such as tables and columns) as class members. (See "Typed versus Untyped Datasets" below).
- You can read an XML document or stream into a dataset using the dataset's ReadXML method and write a dataset out as XML using the dataset's WriteXML method. Because XML is a standard interchange format for data between different applications, this means that you can load a dataset with XML-formatted information sent by other applications. Similarly, a dataset can write out its data as an XML stream or document, to be shared with other applications or simply stored in a standard format.
- You can create an XML view (an XMLDataDocument object) of the contents of a dataset, and then view and manipulate the data using either relational methods (via the dataset) or XML methods. The two views are automatically synchronized as they are changed.
Typed versus Untyped Datasets
Datasets can be typed or untyped. A typed dataset is a dataset that is first derived from the base DataSet class and then uses information in an XML Schema file (an .xsd file) to generate a new class. Information from the schema (tables, columns, and so on) is generated and compiled into this new dataset class as a set of first-class objects and properties.
Note For more information about dataset schemas, see XML Schemas and Data.
Because a typed DataSet class inherits from the base DataSet class, the typed class assumes all of the functionality of the DataSet class and can be used with methods that take an instance of a DataSet class as a parameter
An untyped dataset, in contrast, has no corresponding built-in schema. As in a typed dataset, an untyped dataset contains tables, columns, and so on — but those are exposed only as collections. (However, after manually creating the tables and other data elements in an untyped dataset, you can export the dataset's structure as a schema using the dataset's WriteXmlSchema method.)
You can use either type of dataset in your applications. However, Visual Studio has more tool support for typed datasets, and they make programming with the dataset easier and less error-prone.
Contrasting Data Access in Typed and Untyped Datasets
The class for a typed dataset has an object model in which its tables and columns become first-class objects in the object model. For example, if you are working with a typed dataset, you can reference a column using code such as the following:
' Visual Basic ' This accesses the CustomerID column in the first row of ' the Customers table. Dim s As String s = dsCustomersOrders1.Customers(0).CustomerID // C# // This accesses the CustomerID column in the first row of // the Customers table. string s; s = dsCustomersOrders1.Customers.CustomerID;
In contrast, if you are working with an untyped dataset, the equivalent code is:
' Visual Basic Dim s As String s = CType(dsCustomersOrders1.Tables("Customers").Rows(0).Item("CustomerID"), String) // C# string s = (string) dsCustomersOrders1.Tables["Customers"].Rows["CustomerID"];
Typed access is not only easier to read, but is fully supported by IntelliSense in the Visual Studio Code Editor. In addition to being easier to work with, the syntax for the typed dataset provides type checking at compile time, greatly reducing the possibility of errors in assigning values to dataset members. Access to tables and columns in a typed dataset is also slightly faster at run time because access is determined at compile time, not through collections at run time.
Even though typed datasets have many advantages, there are a variety of circumstances under which an untyped dataset is useful. The most obvious scenario is that no schema is available for the dataset. This might occur, for example, if your application is interacting with a component that returns a dataset, but you do not know in advance what its structure is. Similarly, there are times when you are working with data that does not have a static, predictable structure; in that case, it is impractical to use a typed dataset, because you would have to regenerate the typed dataset class with each change in the data structure.
More generally, there are many times when you might create a dataset dynamically without having a schema available. In that case, the dataset is simply a convenient structure in which you can keep information, as long as the data can be represented in a relational way. At the same time, you can take advantage of the dataset's capabilities, such as the ability to serialize the information to pass to another process, or to write out an XML file.
Dataset Case Sensitivity
Within a dataset, table and column names are by default case-insensitive — that is, a table in a dataset called "Customers" can also be referred to as "customers." This matches the naming conventions in many databases, including SQL Server, where the names of data elements cannot be distinguished only by case.
Note Unlike datasets, XML documents are case-sensitive, so the names of data elements defined in schemas are case-sensitive. For example, schema protocol allows the schema to contain define a table called "Customers" and a different table called "customers." This can result in name collisions when a schema is used to generate a dataset class. For more information, see XML Elements, Attributes, and Types.
However, case sensitivity can be a factor in how data is interpreted within the dataset. For example, if you filter data in a dataset table, the search criteria might return different results depending on whether the comparison is case-sensitive or not. You can control the case sensitivity of filtering, searching, and sorting by setting the dataset's CaseSensitive property. All the tables in the dataset inherit the value of this property by default. (You can override this property for each individual table.)
A dataset is a container; therefore, you need to fill it with data. When you populate a dataset, various events are raised, constraint checking applies, and so on. For more information about updating a dataset and the issues surrounding updates, see Dataset Updates in Visual Studio .NET.
You can populate a dataset in a variety of ways:
- Call the Fill method of a data adapter. This causes the adapter to execute an SQL statement or stored procedure and fill the results into a table in the dataset. If the dataset contains multiple tables, you probably have separate data adapters for each table, and must therefore call each adapter's Fill method separately.
For more information about how datasets are filled, see Introduction to Data Adapters and Creating Data Adapters. For more information on how to use a data adapter to populate a dataset, see Populating a DataSet from a DataAdapter.
- Manually populate tables in the dataset by creating DataRow objects and adding them to the table's Rows collection. (You can only do this at run time; you cannot set the Rows collection at design time.) For more information, see Adding Data to a Table.
- Read an XML document or stream into the dataset. For more information, see the ReadXml method.
- Merge (copy) the contents of another dataset. This scenario can be useful if your application gets datasets from different sources (different XML Web services, for example), but needs to consolidate them into a single dataset. For more information, see the DataSet.Merge method.
Record Position and Navigation in Datasets
Because a dataset is a fully disconnected container for data, datasets (unlike ADO recordsets) do not need or support the concept of a current record. Instead, all records in the dataset are available.
Because there is no current record, there is no specific property that points to a current record and there are no methods or properties for moving from one record to another. (In contrast, ADO recordsets support an absolute record position and methods to move from one record to the next.) You can access individual tables in the dataset as objects; each table exposes a collection of rows. You can treat this like any collection, accessing rows via the collection's index or using collection-specific statements in your programming language.
Note If you are binding controls in a Windows Forms to a dataset, you can use the form's binding architecture to simplify access to individual records. For more information, see Navigating Data in Windows Forms.
Related Tables and DataRelation Objects
If you have multiple tables in a dataset, the information in the tables might be related. A dataset has no inherent knowledge of such relationships; to work with data in related tables, therefore, you can create DataRelation objects that describe the relations between the tables in the dataset. DataRelation objects can be used to programmatically fetch related child records for a parent record, and a parent record from a child record.
For example, imagine customer and order data such as that in the Northwind database. The Customers table might contain records such as the following:
CustomerID CompanyName City ALFKI Alfreds Futterkiste Berlin ANTON Antonio Moreno Taquerias Mexico D.F. AROUT Around the Horn London
The dataset might also contain another table with order information. The Orders table contains a customer ID as a foreign key column. Selecting only some of the columns in the Orders table, it might look like the following:
OrderId CustomerID OrderDate 10692 ALFKI 10/03/1997 10702 ALFKI 10/13/1997 10365 ANTON 11/27/1996 10507 ANTON 4/15/1997
Because each customer can have more than one order, there is a one-to-many relationship between customers and orders. For example, in the table above, the customer ALFKI has two orders.
You can use a DataRelation object to get related records from a child or parent table. For example, when working with the record describing the customer ANTON, you can get the collection of records describing the orders for that customer. Similarly, if you are working with the record describing order number 10507, you can use a DataRelation object to get the record describing the customer for that order (ANTON).
As in most databases, datasets support constraints as a way to ensure the integrity of data. Constraints are rules that are applied when rows are inserted, updated, or deleted in a table. You can define two types of constraints:
- A unique constraint that checks that the new values in a column are unique in the table.
- A foreign-key constraint that defines rules for how related child records should be updated when a record in a master table is updated or deleted.
In a dataset, constraints are associated with individual tables (foreign-key constraints) or columns (a unique constraint, one that guarantees that column values are unique). Constraints are implemented as objects of type UniqueConstraint or ForeignKeyConstraint. They are then added to a table's Constraints collection. A unique constraint can alternatively be specified by simply setting a data column's Unique property to true.
The dataset itself supports a Boolean EnforceConstraints property that specifies whether constraints will be enforced or not. By default, this property is set to true. However, there are times when it is useful to temporarily turn constraints off. Most often, this is when you are changing a record in such a way that it will temporarily cause an invalid state. After completing the change (and thereby returning to a valid state), you can re-enable constraints.
In Visual Studio, you create constraints implicitly when you define a dataset. By adding a primary key to a dataset, you implicitly create a unique constraint for the primary-key column. You can specify a unique constraint for other columns by setting their Unique property to true.
You create foreign-key constraints by creating a DataRelation object in a dataset. In addition to allowing you to programmatically get information about related records, a DataRelation object allows you to define foreign-key constraint rules.
For more information about using DataRelation objects as foreign-key constraints, see Introduction to DataRelation Objects. For more information about creating constraints programmatically, see Adding Constraints to a Table.
Updating Datasets and Data Stores
When changes are made to records in the dataset, the changes have to be written back to the database. To write changes from the dataset to the database, you call the Update method of the data adapter that communicates between the dataset and its corresponding data source.
The DataRow class used to manipulate individual records includes the RowState property, whose values indicate whether and how the row has been changed since the data table was first loaded from the database. Possible values include Deleted, Modified, New, and Unchanged. The Update method examines the value of the RowState property to determine which records need to be written to the database and what specific database command (add, edit, delete) should be invoked.
For more information about updating data, see Dataset Updates in Visual Studio .NET.
Introduction to Data Access with ADO.NET | Introduction to Data Adapters | Visual Studio Tools for Creating Datasets | Creating Typed Datasets with the Component Designer | Creating XML Schemas and Datasets | Adding Existing Typed Datasets to a Form or Component | Adding Untyped Datasets to a Form or Component | Data Walkthroughs