Skip to main content

Model-First in the Entity Framework 4

Jon Fancey
Published July 2010

Summary: In this paper we'll look at the new Entity Framework 4 that ships with .NET Framework 4 and Visual Studio 2010. I'll discuss how you can approach it's usage from a model-first perspective with the premise that you can drive database design from a  model and build both your database as well as your data access layer declaratively from this model. The model contains the description of your data represented as entities and relationships providing a powerful approach to working with ADO.NET, creating a separation of concerns through an abstraction between a model definition and its implementation.

 

Introducing the Entity Framework

Microsoft's Entity Framework (EF) was released with .NET 3.5 SP1 nearly two years ago after a long gestation period through technical previews and betas. .NET Framework 4 sees the second version which is version aligned with the framework itself, hence EF4. Microsoft has listened to customers asks and feedback and added a large number of new and improved features to push for the goal of making EF a compelling way of modeling and interacting with data.

The Entity Framework is an object-relational mapping framework or O/RM. That is, it has the ability to map objects and their state to a relational data store. I may prefer this as a way to work with databases through an object model I'm familiar with as a developer rather than shifting gears and having to think in both C# and TSQL. People have been doing this for a decade or more and it is generally accepted practice to split an application into layers: presentation, business logic and data access at a minimum. The idea is that this affords isolation between the various concerns and thus enables change to be facilitated more easily, enables testable applications to be created and perhaps most importantly enables complexity management over highly scalable and distributed applications - oftentimes these tiers will physically reside on different servers. The concept of N-tier applications is, of course, not new and it is this experience that Microsoft has brought to bear on the list of improvements requested for EF4. The principle advantage of an O/RM is that it provides synchronization between the data access layer and the data storage provider, most typically a relational database. This synchronization is possible because EF generates both the database schema and the data access layer code to work with it.

However, the EF goes beyond the capabilities of an O/RM by providing a common data model that allows a conceptual model to be defined independently of the concrete data store that may be used. This Entity Data Model enables a great deal of flexibility allowing the EF to support a broader range of scenarios than just an O/RM could.The basic premise of the EF is that it is desirable to separate the representation of a data store from the implementation details of the store itself. In this way it becomes possible to create a model of entities that hold data in an object-oriented way. The decoupling that this provides then enables potentially any data store being targeted as the concrete realization of that model. This affords a number of benefits including the ability to model the data store in an abstract way and even chose different target data storage platforms for the same model.

EF4 significantly improves on one particular feature in EF1, model-first – the ability to start with a conceptual model and create the database from it.

 

Introducing Model-First

It's impossible to talk about the advantages and differences in a model-first approach without considering what else I could do first instead, so we'll look at that too and briefly compare the different approaches.

Model-first is the premise that I start with a clean slate, where I am free in all my expressive power to design whatever I like. EF supports this with the Entity Designer, shown in Figure 1, that presents this blank canvas.

Figure 1. Entity Framework Designer

 

With the Entity Designer, I can define entities which are abstractions representing the objects in my application domain. This is where I create my Entity Data Model (EDM) and is driven by an XML grammar from a model file extension of .EDMX – more on this later. As an example, I might create an Order entity that will represent instances of orders and their details such as who placed the order, on what date and so on. Then I might create related entities to hold additional information such as the items in the order and their cost and quantities. The modeling experience gives me a natural way to build up a description of the data in my application visually, playing around with it and changing it easily until I am happy. The real power of this comes though when I want to turn the model into reality. That's because EF can take the EDM and generate two things from it, the Data Definition Language (DDL) to create a database as the model's concrete representation and data access code to manipulate it.

The EF achieves these capabilities through a layered abstraction model inside the EDM as depicted in Figure 2. The way this works is that each layer feeds off the one above it. The EF designer in Visual Studio 2010 authors the EDM as an EDMX file containing the specification of the model in an XML grammar that EF understands. This part of the EDMX is called the Conceptual Schema Data Layer (CSDL). The designer provides a graphical way of creating entities, defining their content, relationships between them and their mappings, authoring the CSDL for you. At the other end of this stack is the Data Definition Language, or DDL, of the target data store. Note that I'm choosing my words carefully here because although this will most often be a relational database such as SQL Server 2008, technically, it doesn't have to be. For the purposes of the rest of this paper though I'll be assuming SQL Server is used as the data store.

So what this set of abstraction layers provides is the ability to generate data definitions in any supported storage system from the model itself. This is very powerful as it allows developers who may not have deep familiarity with the differences between each individual database such as Oracle DB or DB2 to work with the model abstraction and then target a particular platform.

Figure 2. Entity Data Model

 

As Figure 2, shows there are three layers to this abstraction, all contained within a single EDMX model file. The CSDL is what you see depicted graphically in the designer to create your entity data model. This is where you create entities and specify what data, or properties, they contain and their relationships with other entities. At the bottom of the EDMX stack is the Storage Schema Data Layer or SSDL. It is here that details such as table definitions and column types are expressed. In between the CSDL and SSDL is a Mapping Schema Layer or MSL. The job of this layer is to map the entities in the CSDL to the storage artifacts in the SSDL, thereby specifying the table and column definitions in terms of the entities and relationships in the model. This enables, for example, a single entity to be composed of the data from more than one table and is also what handles the differences in type systems between .NET and ultimately the target data store.

From the SSDL, the DDL is generated. It is at this point that a specific template to do this must be employed, targeted the actual data store to use such as SQL Server 2008. In the same way, a data access layer can be generated to work with this data store from the CSDL. Again, this follows a templated approach to code-gen the object model that you can program against to manipulate the store. We'll look at how all of this is achieved later in much more detail.

The astute amongst you might be wondering why, if the model can generate the DDL/database for you, are the mapping and storage layers required? The answer is actually one of the benefits of model-first - the MSL and SSDL are generated for you - from the model. They only require your attention when starting with a pre-existing database and reverse-engineering it to create model from it. Although all three layers are always required by EF (and therefore will be present in the EDMX), with model-first we only need to concern ourselves with the model and its corresponding CSDL.

 

Why Model-First?

As I've already mentioned, being able to do model-first in the EF was one of the most often requested features. The reason for this is that it is a more natural way to work when starting a new project. You want the ability to be able to create the database and access mechanism from your model. You want the ability to work with the model, creating entities that make sense in your business domain and their relationships, whilst still being able to easily change your mind or add to this during the initial stages of development. The model designer provides all of this and more.

In addition to this desirable way of working, the layered abstraction model means that changes can be more easily isolated between the object model (code) generated and the database to minimize the inevitable "ripple" effect caused by changes in one or the other. This is a benefit of the EF in general, but working from the model first provides less impedance than the other way around. This is especially true when you consider the differences in the type system between columns in SQL Server and .NET types such as Boolean where there is no direct equivalent.

Model-first is not the only approach possible with the EF though. If you have a pre-existing database or perhaps would like to design the database first exactly how you’d like it, you can import the database into EF and the EDMX will be created from it (in fact this was the only option with EF1). This will include all the usual metadata, CSDL, SSDL and the mapping between them just as model-first does. However it’s likely that you will want to change the model once you have imported the database to better map the database tables onto the entities and hierarchical relationships in your model. With database-first it cannot be inferred what the original model’s hierarchy and structure was (as it didn’t exist) and so the generated model can be tweaked accordingly to your taste. The final option is code-first. With this approach, there is no EDM at all and the database is generated from the data access code that you write. With code-first you write the code you want as plain classes, then the EF models are inferred from that at run-time. These models are used to generate the database, if you like, as well as provide the mapping from your hand-written classes. Just because you can’t see the EDM though doesn’t mean it’s not there. The metadata is still created under the covers and the code you write (with conventions that must be followed) is used to create it at runtime. You should note that this approach is only enabled by the downloadable CTP 'add on'– see references..

Before moving on, I should also mention a couple of limitations with model-first currently. The first one is that it's not currently possible to change the change database and update the model to "round trip" the development cycle.  This means that the contents of the database will be generated each time causing data loss as it regenerates everything every time you create the database from the model. Therefore you must also be careful of this as it also means that everything created in the database will be dropped including all your data when you execute the generated scripts. It's worth thinking up-front about how you'll manage this process and repopulate test data after model changes during development. You should also bear in mind that once the database schema is published, it's likely that it will be changed directly in a many that preserves the data. At the point, the model becomes out of date and must be updated from the database in a database-first style from this point on.Model-first Walkthrough

Let's walk through what a typical development cycle would look like when we drive EF from a model-first perspective. Here I've created a new, empty C# project called Reservation and have added an ADO.NET Entity Data Model item. In Figure 3, I've shown the project items available for building Entity Framework models under the Data templates section.

Figure 3. Visual Studio Templates

 

Clicking the Add button will then display the Entity Data Model wizard shown in Figure 4. It is here that I can choose to create a new model from an existing database or start with an empty model. The latter is the option that I will choose in order to demonstrate the steps required. Clicking Next will add a new EDM item to the project called reservation.edmx. The model designer appears deceptively simple, with a blank canvas and only three toolbox items (refer back to Figure 1). However this belies its expressive power. Less is more when working with the model designer.

Figure 4. Entity Data Model Wizard

 

Entities are added by dragging and dropping them to the design surface from the toolbox. In Figure 5 below I've added two entities and then renamed them Flight and Customer respectively as I want to model a particular airline flight and its associated customer reservations.

Figure 5. Adding Entities

 

The next thing I need to do is define how these two entities are related. Prior to EF4, I had no way of modeling foreign keys explicitly and bringing them through from the model to the database. This was problematic because often consumers need to be able to see this relationship in order to construct queries against the resources represented to navigate the data. However, in EF4 I now have the ability to create such an association. I can do this by using the Add Association option that appears when I right-click an entity. Selecting this on the Flight entity will display the dialog shown in Figure 6.

Figure 6. Add Association Dialog

 

As you can see, EF will automatically attempt to define the relationship for me but I'm at liberty to change this as I choose. Notice also the multiplicity setting displayed on each entity. Here I am modeling a one-to-many relationship between a flight entity instance and its customers but I can change the direction of this at either end of the relationship. The final item I'd like to highlight is the option to create a foreign key. This is what enables the relationship defined in the model to be 'pulled' through into the database schema that I'm ultimately going to create from my model. As the defaults are exactly what I want in this case I can just click OK and the association will be added to my model. This now updates to reflect the new association as shown in Figure 7.

Figure 7. One-to-Many Association

 

Not only do I now have a connection between my two entities but navigation property has been added to each as well. It is this that allows me to "walk" the relationships in my data. The Customer entity has also been updated with a foreign key called FlightID from the Flight entity. In Figure 8, I've shown the properties that are set for the association. Here you can see essentially what was specified in the dialog whereby a flight may have any number of customers but each customer is associated only with a single flight. Notice that I have a getter and setter specified for my association as well. This allows me to specify the visibility in my data model to consumers enabling read/write or read only for access.

Figure 8. Navigation Properties

 

Of course, my model still isn't terribly useful as I've not said anything yet about the data items I want each entity to hold. I'll fix that issue now and add this information. For the Flight entity I want to hold the flight number, airline code, date/time of departure together with the origin and destination airports as listed in Table 1.

Table 1. Flight Entity

PropertyType
FlightNumberString
DepartureDateDateTime
AirlineString
OriginString
DestinationString

 

I can do this by right-clicking my entity and selecting Add>Scalar Property as shown in Figure 9. Notice that the types in Table 1 are .NET types not the target database's. One of the benefits of the EF is that it will generate and handle the appropriate mapping between type systems for us. This provides the most natural type system for the developer, .NET, to be used to interact with the data source regardless of its implementation. It is the DDL driven by the SSDL that will contain the actual database column types used to store the properties we added to the entities in the CSDL.

Figure 9. Adding a Property

 

Once I've added each of these my entities now appear as shown in Figure 10.

Figure 10. Entities

 

When a new entity is added, a single Id property is created. It's possible to control how this is generated when new entity instances are created. This is achieved through the StoreGeneratedPattern property (visible in the standard Properties window) which provides the option to set the property to "Identity". This will auto-generate a unique ID for you when creating entity instances and does this by creating the corresponding column in the database with the IDENTITY setting.

I'll now add the properties I need for the Customer entity where I want to represent each customer on the flight. Therefore I'll hold each customer's first and last names, date of birth and their seat assignment. To do this I'll demonstrate another new feature of EF 4, complex types.

 

Complex Types and Inheritance

To assist working with groups of properties you can create a complex type. A complex type is simply a collection of simple types such as strings and integers, or other previously-defined complex types. In this way you can create a set of properties, give it a name and then add a property of this type to your entity. The main advantages of creating complex types are encapsulation and reuse. You can create any number of entity properties all based on the same type. Here I'm going to create a complex type called Person that will consist of the data items in Table 2.

Table 2. Person Complex Type

PropertyType
SalutationString
FirstNameString
MiddleNameString
DateOfBirthDateTime

 

I create a complex type by right-clicking the model design surface and selecting Add > Complex type from the context menu that appears (shown in Figure 11). A complex type does not appear in the model designer, but instead you interact with it through the Model Browser.

Figure 11. Adding a Complex Type

 

The model browser, shown in Figure 12 allows you to add new properties, either scalar or complex, to your type. If you can't see the Model Browser, click on the main View menu, select Other Windows and then click Entity Model Data Browser. Once done, we can add a property called CustomerInfo of the Person complex type to our Customer entity.

Figure 12. Model Browser

 

Finally, I've added a Seat string property. FIgure 13 shows the entites once these tasks are complete. Notice that the CustomerInfo property is denoted as a complex type by the  symbol.

Figure 13. Completed Flight and Customer entities

 

For the Flight entity, the Origin and Destination properties are designed to hold three character city or airport codes such as ORD for Chicago O'Hare or JFK for New York City. I therefore want to ensure that exactly three characters are held in these properties. I can do this by setting the facet properties, Fixed Length to true and the Max Length to 3 as shown in the property settings in Figure 14.

Figure 14. Property settings

 

Generating the Database

Now that we've created our model it's time to turn it into reality. The EF will generate DDL based on the targeted data source platform which is derived from the connection details you provide. The process of generating the MSL, SSDL and DDL from the model is a multi-step workflow. In the Properties Window, notice the Database Generation Workflow property (shown in Figure 15) - if you can't see it make sure you have set the 'Categorized' grouping at the top of the property list (the default is A-Z).

Figure 15. Workflow Properties

 

This property references a XAML file which is actually a Windows Workflow definition and is shown in Figure 16. Here you can see two activities represented. The first, CsdlToSsdlAndMsl, takes the EDMX CSDL schema as input and generates the mapping (MSL) and storage schema (SSDL) as input to the second, SsdlToDdl, activity. This activity's job is to convert the normalized SSDL to DDL. This produces the actual database generation script. EF4 ships with one option - a single database/DDL generation strategy of a table per type, hence the name of the XAML file itself. What this means is that for each entity, the generated SQL will contain an associated table for it. It's possible you may want to employ an alternative strategy for the mapping of entities to tables and are free to do so by adding or changing the activities or even authoring your own workflow. We'll look at one alternative to this strategy later.

Figure 16. Table-per-Type Workflow

 

The Database Generation Workflow works in conjunction with the DDL Generation Template property. This property specifies a T4 template (tt) file. It is this template that does the actual creation of the DDL from the model. T4 is shorthand for the Text Template Transformation Toolkit and is a Microsoft technology explicitly designed to make text file creation such as code generation straightforward using templates. It is a mixture of processing logic and text to emit with the idea that the text can be created from the template by substituting values in it programmatically. This is similar in principle to XSLT but T4 allows the blending of .NET code and text mark-up to be combined seamlessly and it is used extensively in Visual Studio 2010.

 

Generating the DDL

We can run this workflow by right-clicking the model and selecting the Generate Database from Model... option. The first time you deploy the model, the Generate Database Wizard (shown in Figure 17) will appear where you must specify the database connection details. It is this that informs EF of the appropriate DDL to generate for the target database.

Figure 17. Generate Database Wizard

 

Clicking the New Connection button displays the Connection Properties dialog shown Figure 18. This is a standard dialog that you may well have seen before. Here I have specified a SQL Server data source on my local machine and Windows Authentication.

Figure 18. Connection Properties

 

As you can see I've provided a database name of Reservation. This does not currently exist, so the dialog in Figure 19 will appear asking me if I'd like to create it.

Figure 19. Confirm Database Creation

 

At the end of the process a summary of the generated DDL is provided (Figure 20) and clicking Finish will close the wizard.

Figure 20. Generate Database Wizard : Summary

 

In doing this you should notice that the Reservation.edmx.sql has been added to the project. This contains the generated DDL for the database but two other things have also happened. An empty Reservation database has been created in SQL Server and an app.config file has been added to the project with the connection details to the database binding to the model.

Now let's open the DDL in the .sql file and see what has been generated from our model. Firstly you should see from the extract below that two CREATE TABLE statements have been generated, one for each of the entities in the model.  Here I've shown the Customers table which specifies the primary key and columns to create. Notice that the [Id] field uses the IDENTITY keyword. This was automatically added because of the value provided for the StoreGeneratedPattern property of the property in the model which was set to Identity.

-- Creating table 'Customers'
CREATE TABLE [dbo].[Customers] (
    [Id] int IDENTITY(1,1) NOT NULL,
    [FlightId] int  NOT NULL,
    [CustomerInfo_Salutation] nvarchar(max)  NOT NULL,
    [CustomerInfo_FirstName] nvarchar(max)  NOT NULL,
    [CustomerInfo_LastName] nvarchar(max)  NOT NULL,
    [CustomerInfo_DateOfBirth] datetime  NOT NULL,
    [Seat] nvarchar(max)  NOT NULL
);
GO

-- Creating primary key on [Id] in table 'Passengers'
ALTER TABLE [dbo].[ Customers]
ADD CONSTRAINT [PK_Customers]
    PRIMARY KEY CLUSTERED ([Id] ASC);
GO

Notice also that the CustomerInfo complex type has not turned into a table but instead all of its properties have been amalgamated into the Customer table which referenced this type with the name of each column prefixed by the name of the complex type - CustomerInfo_.

The third item to notice is that the association that was added to the model has been turned into a table constraint as shown below:

-- Creating foreign key on [FlightId] in table 'Customers'
ALTER TABLE [dbo].[Customers]
ADD CONSTRAINT [FK_FlightCustomer]
    FOREIGN KEY ([FlightId])
    REFERENCES [dbo].[Flights]
        ([Id])
    ON DELETE NO ACTION ON UPDATE NO ACTION;

Here you can see that the Customers table has a constraint added referring to the Flights table, specifying the FlightId as a foreign key.

 

Influencing the DDL Generation

One of the great new enhancements in EF4 is that all aspects of converting the model to DDL are extensible. If you don't like how something is done, you are free to change the appropriate layer in the stack to generate what you require. A simple but common example of this could be naming standards in your organization that dictate that all table names are prefixed or suffixed with TBL_. Another example is that perhaps we'd like to generate an index for our Customer table on the Seat property. As the Customer entity can potentially represent multiple Customer instances across the same flight as well as different flights it seems reasonable that the database I'm going to create should provide an index on this column.

To add a further requirement, I'd like the ability to specify which columns to generate indexes on in a generic way. EF has always provided the ability to add annotations to the CSDL but this has become considerably more useful with the introduction of T4 integration. Recall that the CSDL is the conceptual schema - it contains the entity definitions that are represented in the model designer. In order to add an annotation we need to open the EDMX file in the XML editor. This can be achieved simply by right-clicking the file in the Solution Explorer and selecting "Open With..." and choosing the XML (Text) Editor from the list of options.

An understanding of the three schemas in the EDMX was essential in the first version of EF and happily that need has mostly disappeared thanks to improved tooling. In the XML snippet below, I've shown the Seat property on the Customer entity in the CSDL portion of the EDMX. I've added an Index element under the Property node as a custom annotation.

<EntityType Name="Customer" BaseType="Reservation.Passenger">
  <NavigationProperty Name="Flight"
                      Relationship="Reservation.FlightCustomer"
                      FromRole="Customer" ToRole="Flight" />
  <Property Type="String"
            Name="Seat"
            Nullable="false"
            FixedLength="false"
            MaxLength="3" >
    <myExtensions:Index indexName="Seat" edmx:CopyToSSDL="true"/>
  </Property>
  <Property Type="String" Name="MealOption" Nullable="false" />
  <Property Type="Int32" Name="FlightId" Nullable="false" />
</EntityType>

Note that this is namespace prefixed. I've declared this namespace on the CSDL schema root node as follows to make sure it's in scope:

<!-- CSDL content -->
<edmx:ConceptualModels>
  <Schema xmlns="http://schemas.microsoft.com/ado/2008/09/edm" xmlns:cg="http://schemas.microsoft.com/ado/2006/04/codegeneration" xmlns:store="http://schemas.microsoft.com/ado/2007/12/edm/EntityStoreSchemaGenerator" Namespace="Reservation" Alias="Self" xmlns:annotation="http://schemas.microsoft.com/ado/2009/02/edm/annotation"
          xmlns:Regex="http://Regex"
          xmlns:myExtensions="http://www.microsoft.com/userExtensions">    <EntityContainer Name="ReservationContainer"  annotation:LazyLoadingEnabled="true">

Because this is my own content, I can add whatever I like. Here I've simply added two attributes. The first one, indexName allows me to specify the name of the index to create on the column on which my custom annotation appears. The CopyToSSDL attribute is defined in the EF schema and is required to ensure that my annotations are passed through from the CSDL to the SSDL. Recall that it is the SSDL that is used by the T4 template to generate the DDL. Without this attribute, the metadata would not be available.

Being able to implement requirements like this is a big improvement in EF 4 and can be done simply by editing the template that is going to generate the DDL for the database definitions.  The T4 template is really quite concise. Templates are located under the install folder of Visual Studio (C:\Program Files by default) at the following relative path:

\Microsoft Visual Studio 10.0\Common7\IDE\Extensions\Microsoft\Entity Framework Tools\DBGen

This is also where the workflow that drives this process is located. You should find one template, the SSDLToSQL10.tt file that contains the following (abridged) definition:

<#
   foreach (EntitySet entitySet in Store.GetAllEntitySets())
   {
       string schemaName = Id(entitySet.GetSchemaName());
       string tableName = Id(entitySet.GetTableName());
#>
   CREATE TABLE [<#=schemaName#>].<#}#>[<#=tableName#>] (
<#
   for (int p = 0; p < entitySet.ElementType.Properties.Count; p++)
   {
      EdmProperty prop = entitySet.ElementType.Properties[p];
#>
   [<#=Id(prop.Name)#>] <#=prop.ToStoreType()#>   
    <#=WriteIdentity(prop, targetVersion)#>   
    <#=WriteNullable(prop.Nullable)#>
    <#=(p < entitySet.ElementType.Properties.Count - 1) ? "," : ""#>
<#
   }
#>

As you can hopefully see from this extract, the template is looping over the entities and generating  CREATE TABLE statements for each of them. Then, within each entity, its properties are used to generate the columns in the table. Recall that the SSDL provides the storage definitions that have been mapped to the conceptual model. However, with model-first both the MSL mapping and SSDL are generated each time you generate the database DDL. On second and subsequent times you will receive the following warning informing you of the destructive action about to occur:

Figure 21. Overwrite Warning

 

Let's look at how we can support our requirement to generate an index for the Customer table. We can do this by making a number of straightforward modifications to the template. This is most easily done by copying and renaming the tt file. I've called it SSDLToSQL10WithCustomIndexers.tt. When placed in the same folder location as before this will appear in the drop down of the model properties allowing you to select it instead of the provided template.

The first modification is to add the following at the top of the template:

<#@ assembly name="System.Xml.Linq" #>

This is required because I'll use LINQ to enable me to gain access to the annotations. At the bottom of the template, I've add the following:

-- Creating index for table based on custom extensions --
<#
   foreach (EntitySet entitySet in Store.GetAllEntitySets())
   {
     string tableName = Id(entitySet.GetTableName());
     string schemaName = Id(entitySet.GetSchemaName());
     EdmProperties props = entitySet.ElementType.Properties;
     foreach (EdmProperty ep in props.Where(p =>
                           p.TypeUsage.EdmType is PrimitiveType))
     {
        MetadataProperty meta = ep.MetadataProperties.FirstOrDefault(mp => mp.Name == "http://www.microsoft.com/userExtensions:Index");
        if (meta != null)
        {
            System.Xml.Linq.XElement e = meta.Value as System.Xml.Linq.XElement;
            System.Xml.Linq.XAttribute attr = e.Attributes().FirstOrDefault(a => a.Name == "indexName");
            string indexName = attr.Value;     
            // create an index for specified column
#>
CREATE INDEX [IX_<#=indexName#>]
ON <#if (!IsSQLCE) {#>[<#=schemaName#>].<#}#>[<#=tableName#>]
([<#=indexName#>]);
<#
         }
     }   
   }
#>

This code iterates over each entity as shown before but this time checks an entity's properties for my custom annotation. Access to annotations is provided by the MetadataProperties property which returns the LINQ to XML XElement type. I simply query the properties to find my annotation:

ep.MetadataProperties.FirstOrDefault(mp => mp.Name ==
                   "http://www.microsoft.com/userExtensions:Index");

Notice that the name must be the fully qualified attribute name with its namespace. Then if the query returned something, I cast the results to an XElement type and retrieve my indexName attribute. Once that's done I emit the CREATE INDEX statement to generate an index with the given name on the column in the specified table.

Right-clicking the designer surface and selecting "Generate Database from Model" again will now generate my index as well. This time, the following SQL is present at the end of the generated file:

-- Creating index for table based on custom extensions --
CREATE INDEX [IX_Seat]
ON [dbo].[Customers]
([Seat]);

Here you can see that an index has been created for the Customers table on the Seat column. I can run this script into the database directly from Visual Studio. This is also a destructive operation; the script will drop the original database objects and recreate them with my changes. If you do this you'll lose all of the data in them too.

This is really scratching the surface of what's possible though. For example I could extend this to provide an annotation for multi-column indexes. Another approach is that I could extend the designer to show the property annotations in the entity properties to save hand-editing the CSDL in the EDMX file.

 

Inheritance in the Entity Framework

One of the benefits I noted earlier was being able to change the model and regenerate everything else under it. One likely reason to want to do this is to refactor the design as it evolves. Figure 22 shows the Person type split out into its own entity, Passenger. I've also introduced a new entity, Crew. This now enables me to model all the people on board a flight, not just the paying customers. You can also see that I have added an Operating Boolean property to the Crew entity. This is to represent whether the crew member is operating, or working, on the flight or not.

Figure 22. Refactoring with Inheritance

 

When I do this I can then inherit from the Passenger entity for both Customer and Crew. The inheritance relationship is shown by an open arrow pointing to the inherited entity. When I generate the DDL from this model the result for the Customer and Crew entities will be three tables in total, not two. This is because a table per entity will be created by default, one for the Customer entity and one for the inherited Passenger entity which is shared with the Crew entity making three. However, from my OM point of view, I work with a single object that gives me access to both the Customer's properties and the Customer's inherited properties.

Later you'll see a way of 'flattening' this structure to create a single table from an inheritance hierarchy such as this.

 

Going Beyond What’s In The Box

As well as being able to extend the DDL generation capabilities of the Entity Framework in the ways I've discussed so far, Microsoft has also released some addition features for the EF in a number of ways. Some of these are simply because they didn't make it in time for the final release whilst others are more nascent.

Here we'll look at the EF Power Pack, the Community Technical Preview (CTP) 3 providing Code-First support and a T4 template to generate POCOs. All of these are made available as separate downloads (see references) for you to use. Before going any further you should bear in mind that these are not supported in the same way as the EF is. The EF is part of .NET Framework 4 and as such is supported by standard product support whereas the Power Pack, CTP and template are provided as-is without warranty or support. With that caveat out of the way, I want to talk about some of the capabilities these additions include.

 

Entity Designer Database Generation Power Pack

The power pack provides a couple of features to make working with the generated DDL and database easier. The first of these is included support for database project types.  This allows a Database-type project to be added to the same solution and provided the project is given the same name as the model edmx file, the generated T-SQL will be added to the database project instead.  

The second useful feature related to model-first the ability to update an existing database and synchronize the model with it. This allows you to make changes to the model that can be deployed to the database without data loss. This is achieved by generating ALTER statements for the schema instead of the usual DROP/CREATE. When the database DDL is generated from the model it is compared to the pre-existing database schema and a new non-destructive DDL script generated from the differences between them. This allows the database to change after initial creation from the generated DDL without dropping everything.

To see this in action, I've added some data to the entities created earlier. This is shown in Figure 23 below where I now have two passengers in the underlying Passenger table.

Figure 23. Passenger Data

 

Now let's suppose I'd like to add a new Bag entity that tracks the luggage checked in by a passenger. Figure 24 shows the new entity and its properties, listed in Table 3. One final change I've made is that I've added a FrequentFlyerID string property to the Passenger entity.

Table 3. Bag Entity Properties

PropertyType
WeightDecimal
ReceiptString

 

Figure 24. Updated Entity Model

 

I want to relate the Bag entity to the Passenger entity and can do this in the usual way by using the Association toolbox item to connect them. I will also make the FrequentFlyerID property nullable in the Passenger entity as shown in Figure 25. You see why this is necessary in a moment.

Figure 25. Nullable FrequentFlyerID Property

 

I can now use the new migration workflows the Power Pack provides to deploy my updated model. To make configuration of the new templates and workflows easier, the Power Pack provides a new UI that appears when you right-click the model and select Generate Database from Model... This UI is shown in Figure 26 and allows you to pick the workflow where the view then shows what it will do. There are two migration options, differing only by the final step of displaying the generated DDL or immediately applying it to the database. Here I've selected the "Generate Migration T-SQL  And Deploy" workflow which will create the new DDL and deploy it without going through the display and manual update steps.

Figure 26. New Workflow Manager Dialog

 

Clicking Next to run the workflow, you'll also notice the new progress bar (Figure 27) which appears showing the actions being taken.

Figure 27. Generate and Deploy Progress Bar

 

Examining the Passenger table after this deployment, from Figure 28 you can see that the table's data as it was prior to the model change and deployment has been preserved. If we look at the generated DDL we can see that instead of the DROP/CREATE statements, the workflow has instead created ALTER statements.

Figure 28. Updated Passenger Table

 

The two important statements in the DDL are shown below. The first is an ALTER generated to add the new property defined in the model whilst the second is created to store the new Bag entity's data. In this way model-first provides a significant benefit over other approaches by preserving the data in an existing database allowing us to version and update over time in a production setting.

ALTER TABLE [dbo].[Passengers]
    ADD [FrequentFlyerID] NVARCHAR (MAX) NULL;
CREATE TABLE [dbo].[Bags] (
    [Id]           INT            IDENTITY (1, 1) NOT NULL,
    [Weight]       DECIMAL (18)   NOT NULL,
    [Receipt]      NVARCHAR (MAX) NOT NULL,
    [Passenger_Id] INT            NOT NULL
);

In Figure 28 you can see the FrequentFlyerID column is NULL. This is why it is important to make the new property nullable because otherwise there would be no way to generate an ALTER statement for the table as the contents of it would immediately violate the constraint if a value for the column was mandatory.

One final point I want to make on this feature is that it presently can't handle renames of entities. Instead a renamed entity will be created as a new table with the original table left untouched. However, renaming properties is supported. In this case a temporary table is created with the new/changed columns and the original table's data is inserted into it. Finally, the original table is dropped and the new table renamed to original to replace it. All of this is done within a transaction to ensure that the whole process is rolled back in the event of failure.

 

Table Per Hierarchy Support

The Power Pack also provides an alternative generation scheme to table-per-entity or type (TPT). This new strategy is called table-per-hierarchy or TPH and allows a single table to be generated from an inheritance hierarchy of entities rather than a table for each of them. Recall that when we looked at inheritance earlier we saw that when the Customer and Crew entities inherited from the Passenger entity, three tables were created in the database. With the TPH strategy we can create a single table from all of them as shown below.

CREATE TABLE [dbo].[Passengers] (
    [Id] int IDENTITY(1,1) NOT NULL,
    [Person_Salutation] nvarchar(max)  NOT NULL,
    [Person_FirstName] nvarchar(max)  NOT NULL,
    [Person_LastName] nvarchar(max)  NOT NULL,
    [Person_DateOfBirth] datetime  NOT NULL,
    [Seat] nvarchar(max)  NOT NULL,
    [MealOption] nvarchar(max)  NOT NULL,
    [FlightId] int  NOT NULL,
    [StaffNumber] nvarchar(max)  NOT NULL,
    [OperatingCrew] bit  NOT NULL,
    [__Disc__] nvarchar(4000)  NOT NULL );

This might not be what I want of course, I might want two tables, one for Customers and one for Crew with both amalgamating the Passenger entity. If this is what I wanted I could clone the supplied template and change it however I like with the advantage that the template can become reusable for models I create in the future.

The Power Pack provides six new Windows Workflows as shown in Figure 29.

Figure 29. Additional Power Pack Workflows

 

These cover the TPT and TPH DDL generation strategies as well as workflows for the synchronization discussed above. There are also workflows for another new feature, DACPAC support. Data Tier Application (DAC) Package (PAC) is a SQL Server 2008 R2 deployment technology allowing you to package a database and its contents for deployment to another environment. See references if you're interested in more details on this.

 

Summary

In this paper we've investigated some of the many new features in EF4. Of the most significant of these are the set of enhanced functionality that enables a rich model-first experience. By providing comprehensive support, the EF team have it possible to move from model to database. EF4 provides full customization and extensibility over this process enabling your own scenarios and requirements to be incorporated in ways the authors didn't anticipate.

 

References

 

About the Author

Jon Fancey is a member of the Pluralsight technical staff, a Microsoft .NET training provider, where Jon teaches BizTalk Server and owns the SharePoint curriculum. Jon is a co-founder of Affinus, a UK-based consultancy specializing in Microsoft technologies with .NET, BizTalk® Server, Host Integration Server and SharePoint. You can reach Jon via his blog at www.pluralsight.com/blogs/jfancey or at jon.fancey@affinus.com.

Microsoft is conducting an online survey to understand your opinion of the MSDN Web site. If you choose to participate, the online survey will be presented to you when you leave the MSDN Web site.

Would you like to participate?