October 2010

Volume 25 Number 10

Data Points - Entity Framework Preview: code first, ObjectSet and DbContext

By Julie Lerman | October 2010

While the Entity Framework (EF) 4 was still in beta, the development team began work on another way to use it. We had the database-first way of creating a model (by reverse engineering a database into an Entity Data Model) and the new EF 4 model-first feature (define an Entity Data Model, then create the database from it). In addition, the team decided to create something called “code first.”

With code first you begin with code—not the database and not the model. In fact, with code first there’s no visual model and no XML describing that model. You simply create the classes for your application domain and code first will enable them to participate in the EF. You can use a context, write and execute LINQ to Entities queries and take advantage of EF change-tracking. But there’s no need to deal with a model in a designer. When you aren’t building highly architected apps, you can almost forget that the EF is even there, handling all of your database interaction and change-tracking for you.

Domain-driven design (DDD) fans seem to love code first. In fact, it was a number of DDD gurus who helped the EF team understand what code first could do for DDD programmers. Check out Danny Simmons’ blog post about the Data Programmability Advisory Council for more about that (blogs.msdn.com/b/dsimmons/archive/2008/06/03/dp-advisory-council.aspx).

But code first wasn’t ready for inclusion in the Microsoft .NET Framework 4 and Visual Studio 2010. In fact, it’s still evolving and Microsoft is providing developers access to the bits to play with and provide feedback through the Entity Framework Feature CTP (EF Feature CTP). The fourth iteration of this community technology preview (CTP4) was released in mid-July 2010. The changes in this iteration were significant.

In addition to including the code-first assembly in the CTP4, Microsoft has been working on some great new ideas for the next iteration of the EF. Those are also included in CTP4 so that people can play with them and, again, provide feedback. More importantly, code first takes advantage of some of these new features.

You can find a number of detailed walkthroughs on this CTP in blog posts by the EF team (tinyurl.com/297xm3j), Scott Guthrie (tinyurl.com/29r3qkb) and Scott Hanselman (tinyurl.com/253dl2g).

In this column, I want to focus on some of the particular features and improvements in the CTP4 that have generated an incredible amount of buzz around the EF because they simplify developing with the EF—and the .NET Framework in general—so dramatically.

Improvements to Core API

One of the noteworthy improvements that the CTP4 adds to the EF runtime is lightweight versions of two workhorse classes, ObjectContext and ObjectSet. ObjectContext enables querying, change-tracking and saving back to the database. ObjectSet encapsulates sets of like objects.

The new classes, DbContext and DbSet, have the same essential functions, but don’t expose you to the entire feature set of ObjectContext and ObjectSet. Not every coding scenario demands access to all of the features in the more robust classes. DbContext and DbSet are not derived from ObjectContext and ObjectSet, but DbContext provides easy access to the full-featured version through a property: DbContext.ObjectContext.

ObjectSet Basics and the Simplified DbSet

An ObjectSet is a generic collection-like representation of a set of entities of a single type. For example, you can have an ObjectSet<Customer> called Customers that’s a property of a context class. LINQ to Entities queries are written against ObjectSets. Here’s a query against the Customers ObjectSet:

from customer in context.Customers 
where customer.LastName=="Lerman" 
select customer

ObjectSet has an internal constructor and does not have a parameterless constructor. The way to create an ObjectSet is through the ObjectContext.CreateObjectSet method. In a typical context class, the Customers property would be defined as:

public ObjectSet<Customer> Customers {
  get {
    return _customers ?? (_customers =
      CreateObjectSet<Customer>("Customers"));
  }
}
private ObjectSet<Customer> _customers;

Now you can write queries against Customers and manipulate the set, adding, attaching and deleting objects as necessary.

DbSet is much simpler to work with. DbSet isn’t publicly constructible (similar to ObjectSet), but DbContext will automatically create DbSets and assign them to any DbSet properties (with public setters) that you declare on your derived DbContext.
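As a minimal sketch of what that looks like, the following derived context declares a DbSet property with a public setter and leaves its creation to DbContext. The Customer class and the CustomerContext name are illustrative assumptions, and the using directive assumes that DbContext and DbSet live in the System.Data.Entity namespace of the CTP assembly.

```csharp
using System.Data.Entity;

// Hypothetical domain class used only for illustration.
public class Customer
{
    public int Id { get; set; }
    public string LastName { get; set; }
}

// Hypothetical context: no backing field and no CreateObjectSet call.
public class CustomerContext : DbContext
{
    // DbContext is expected to create and assign this set
    // because the property exposes a public setter.
    public DbSet<Customer> Customers { get; set; }
}
```

Compare this with the ObjectSet version of the Customers property shown earlier, which needs a private field and a lazily evaluated getter.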

The ObjectSet collection methods are geared toward the EF terminology—AddObject, Attach and DeleteObject:

context.Customers.AddObject(myCust);

DbSet simply uses Add, Attach and Remove, which is more in line with other collection method names, so you don’t have to spend time figuring out “how to say that in the EF,” like so:

context.Customers.Add(myCust);

The use of the Remove method instead of DeleteObject also more clearly describes what the method is doing. It removes an item from a collection. DeleteObject suggests that the data will be deleted from the database, which has caused confusion for many developers.

By far my favorite feature of DbSet is that it lets you work more easily with derived types. ObjectSet wraps a base type and all of the types that inherit from it. If you have a base entity, Contact, and a Customer entity that derives from it, every time you want to work with customers, you’ll have to start with the Contacts ObjectSet. For example, to query for customers, you write context.Contacts.OfType<Customer>(). That’s not only confusing, it’s also definitely not easily discoverable. DbSet lets you wrap the derived types, so now you can create a property Customers that returns DbSet<Customer> and enables interaction with that directly rather than through the Contacts set.
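The difference might look like this in practice. This is a sketch under assumptions: the AccountNumber property, the ContactContext class and the CustomerQueries helper are hypothetical names invented for the example, and the code assumes the System.Data.Entity namespace from the CTP assembly.

```csharp
using System.Data.Entity;
using System.Linq;

public class Contact
{
    public int Id { get; set; }
    public string LastName { get; set; }
}

public class Customer : Contact
{
    public string AccountNumber { get; set; } // hypothetical property
}

// Hypothetical context exposing both the base set and the derived set.
public class ContactContext : DbContext
{
    public DbSet<Contact> Contacts { get; set; }
    public DbSet<Customer> Customers { get; set; }
}

public static class CustomerQueries
{
    public static IQueryable<Customer> ByLastName(
        ContactContext context, string lastName)
    {
        // With ObjectSet this would start from the base set:
        //   context.Contacts.OfType<Customer>().Where(...)
        return context.Customers.Where(c => c.LastName == lastName);
    }
}
```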

DbContext—a Streamlined Context

DbContext exposes the most commonly used features of the ObjectContext and provides some additional ones that are truly beneficial. My two favorite features so far of DbContext are the generic Set method and the OnModelCreating method.

All of those ObjectSets I’ve referred to so far are explicit properties of an ObjectContext instance. For example, say you have a model called PatientHistory that has three entities in it: Patient, Address and OfficeVisit. You’ll have a class, PatientHistoryEntities, which inherits from ObjectContext. This class contains a Patients property that’s an ObjectSet<Patient> as well as an Addresses property and an OfficeVisits property. If you want to write dynamic code using generics, you must call context.CreateObjectSet<T> where T is one of your entity types. Again, this is just not discoverable.

DbContext has a simpler method called Set that lets you simply call context.Set<T>(), which will return a DbSet<T>. It may only look like 12 fewer letters, but to my coding brain, calling Set feels right, whereas calling a factory method doesn’t. You can also use derived entities with this method.
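To show where the generic call pays off, here is a small sketch of a helper that works for any entity type. The ContextExtensions class and its LoadAll method are hypothetical names introduced for this example. The only DbContext member the sketch relies on is Set<T>, and it assumes the System.Data.Entity namespace from the CTP assembly.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public static class ContextExtensions
{
    // Hypothetical helper: loads every entity of type T from the given context.
    public static List<T> LoadAll<T>(this DbContext context) where T : class
    {
        // Set<T>() returns the DbSet<T> for the requested entity type,
        // which can then be queried like any declared DbSet property.
        return context.Set<T>().ToList();
    }
}
```

With a helper like this in scope, a call such as context.LoadAll<Patient>() needs no dedicated Patients property on the context.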

Another DbContext member is OnModelCreating, which is useful in code first. Any additional configurations you want to apply to your domain model can be defined in OnModelCreating just before the EF builds the in-memory metadata based on the domain classes. This is a big improvement over the previous versions of code first. You’ll see more about this further on.

Code First Gets Smarter and Easier

Code first was first presented to developers as part of the EF Feature CTP1 in June 2009 with the name “code only.” The basic premise behind this variation of using the EF was that developers simply want to define their domain classes and not bother with a physical model. However, the EF runtime depends on that model’s XML to coerce queries against the model into database queries and then the query results from the database back into objects that are described by the model. Without that metadata, the EF can’t do its job. But the metadata does not need to be in a physical file. The EF reads those XML files once during the application process, creates strongly typed metadata objects based on that XML, and then does all of that interaction with the in-memory metadata objects.

Code first creates in-memory metadata objects, too. But instead of creating them by reading XML files, it infers the metadata from the domain classes (see Figure 1). It uses convention to do this and then provides a means by which you can add configurations to further refine the model.

Figure 1 Code First Builds the Entity Data Model Metadata at Run Time

Another important job of code first is to use the metadata to create a database schema and even the database itself. Code first has provided these features since its earliest public version.

Here’s an example of where you’d need a configuration to overcome some invalid assumptions. I have a class called ConferenceTrack with an identity property called TrackId. Code-first convention looks for “Id” or class name + “Id” as an identity property to be used for an entity’s EntityKey and a database table’s primary key. But TrackId doesn’t fit this pattern, so I have to tell the EF that this is my identity key.
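The naming rule is easy to picture with a little reflection. The sketch below is not how code first is implemented. It is a simplified stand-in that only checks for a property named “Id” or class name + “Id” using an exact, case-sensitive match, which is an assumption on my part. The Name property, the Session class shown here and the KeyConvention helper are hypothetical.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class ConferenceTrack
{
    public int TrackId { get; set; }
    public string Name { get; set; } // hypothetical property
}

public class Session
{
    public int SessionId { get; set; } // hypothetical property
}

public static class KeyConvention
{
    // Returns the name of the property that matches the convention,
    // or null when no property is named "Id" or <ClassName> + "Id".
    public static string FindKeyName(Type entityType)
    {
        string[] candidates = { "Id", entityType.Name + "Id" };
        PropertyInfo match = entityType.GetProperties()
            .FirstOrDefault(p => candidates.Contains(p.Name));
        return match == null ? null : match.Name;
    }
}
```

Under this simplified rule, Session resolves to SessionId, while ConferenceTrack yields no match because the rule would look for Id or ConferenceTrackId. That gap is what the configuration has to fill.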

The new code first ModelBuilder class builds the in-memory model based on the classes described earlier. You can further define configurations using ModelBuilder. I’m able to specify that the ConferenceTrack entity should use its TrackId property as its key with the following ModelBuilder configuration:

modelBuilder.Entity<ConferenceTrack>().HasKey(
  ct => ct.TrackId);

ModelBuilder will now take this additional information into account as it’s creating the in-memory model and working out the database schema.

Applying Configurations More Logically

This is where DbContext.OnModelCreating comes in so nicely. You can place the configurations in this method so that they’ll be applied as the model is being created by the DbContext:

protected override void OnModelCreating(
  ModelBuilder modelBuilder) { 
  modelBuilder.Entity<ConferenceTrack>().HasKey(
    ct => ct.TrackId); 
}

Another new feature added in the CTP4 is an alternative way to apply configurations through attributes in the classes. This technique is called data annotations. The same configuration can be achieved by applying the Key annotation directly to the TrackId property in the ConferenceTrack class:

[Key] 
public int TrackId { get; set; }

This is definitely simpler. However, my personal preference is to use the programmatic configurations so that the classes don’t need to have any EF-specific knowledge in them.
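For reference, here is how the annotated property might sit in the full class. This is only a sketch: the Name property is a hypothetical addition, and the code assumes that the Key annotation is the KeyAttribute class in the System.ComponentModel.DataAnnotations namespace of the .NET Framework 4.

```csharp
using System.ComponentModel.DataAnnotations;

public class ConferenceTrack
{
    // Marks TrackId as the key, since it does not match
    // the "Id" or class name + "Id" naming convention.
    [Key]
    public int TrackId { get; set; }

    public string Name { get; set; } // hypothetical property
}
```

The using directive is the trade-off: the domain class now references a data annotations namespace, which is exactly the kind of dependency the programmatic configuration avoids.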

Configuring the model in OnModelCreating also means DbContext will take care of caching the model, so constructing further DbContext instances doesn’t incur the model discovery cost again.

Relationships Get Easier

One of the most noteworthy improvements to code first is that it’s much smarter about making assumptions from the classes. While there are many improvements to these conventions, I find the enhanced relationship conventions to have affected my code the most.

Even though relationships are defined in your classes, in previous CTPs it was necessary to provide configuration information to define these relationships in the model. In addition, the configurations were neither pretty nor logical. Now code first can correctly interpret the intent of most relationships defined in classes. In cases where you need to tweak the model with some configurations, the syntax is much simpler.

My domain classes have a number of relationships defined through properties. ConferenceTrack, for example, has this one-to-many relationship:

public ICollection<Session> Sessions { get; set; }

Session has the converse relationship as well as a many-to-many:

public ConferenceTrack ConferenceTrack { get; set; }
public ICollection<Speaker> Speakers { get; set; }
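Put together, the classes might look like the sketch below. The navigation properties on ConferenceTrack and Session are the ones shown above. Everything else is an assumption made for illustration: the SessionId, Title, SpeakerId and Name properties, and the Sessions collection on Speaker that forms the other end of the many-to-many relationship.

```csharp
using System.Collections.Generic;

public class ConferenceTrack
{
    public int TrackId { get; set; }

    // One track has many sessions.
    public ICollection<Session> Sessions { get; set; }
}

public class Session
{
    public int SessionId { get; set; }   // hypothetical
    public string Title { get; set; }    // hypothetical

    // Each session belongs to one track.
    public ConferenceTrack ConferenceTrack { get; set; }

    // Many-to-many: a session can have several speakers.
    public ICollection<Speaker> Speakers { get; set; }
}

public class Speaker
{
    public int SpeakerId { get; set; }   // hypothetical
    public string Name { get; set; }     // hypothetical

    // Assumed other end of the many-to-many relationship.
    public ICollection<Session> Sessions { get; set; }
}
```

Nothing in these classes refers to the EF. The relationships are expressed only through ordinary reference and collection properties.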

Using my classes and a single configuration (to define TrackId as the key for ConferenceTrack), the model builder created the database with all of the relationships and inheritances shown in Figure 2.

Figure 2 Model Building-Created Database Structure

Notice the Sessions_Speakers table created for the many-to-many relationship. My domain has a Workshop class that inherits from Session. By default, code first assumes Table per Hierarchy (TPH) inheritance, so it created a discriminator column in the Sessions table. I can use a configuration to change that column’s name to IsWorkshop or even to specify that code first should create a Table per Type (TPT) inheritance instead.

Planning Ahead for These New Features

These compelling new features that you have early access to in CTP4 might be a source of frustration for developers who are saying “I want it now!” It’s true that this is just a preview and you can’t put it in production today, but certainly play with these bits if you’re able to and provide your feedback to the EF team to help make it even better. You can begin planning ahead for how you’ll use code first and the other new EF features in your upcoming applications.


Julie Lerman is a Microsoft MVP, .NET mentor and consultant who lives in the hills of Vermont. You can find her presenting on data access and other Microsoft .NET topics at user groups and conferences around the world. Lerman blogs at thedatafarm.com/blog and is the author of the highly acclaimed book, “Programming Entity Framework” (O’Reilly Media, 2009). Follow her on Twitter.com: julielerman.

Thanks to the following technical expert for reviewing this article: Rowan Miller