Database Initializers in Entity Framework 4.1

Julie Lerman

https://thedatafarm.com

Published: June 2011

Entity Framework 4.1 Code First lets you use your own classes to describe your model and the database where the data gets persisted. You can map the classes to an existing database, but the default behavior is that Code First creates a database for you on the fly. This article focuses on the default behavior to show you how Code First database initialization works. You’ll also learn how to control the database initialization and even use it to seed your database with initial data.

I've got a small project where I've already defined my domain classes but there is no database yet. Here are two of those classes: Blog and Post.

using System;
using System.Collections.Generic;

namespace HowDoI.DBInitializers.Domain
{
    public class Blog
    {
        public Blog()
        {
            // Start with an empty collection so posts can be added right away.
            Posts = new List<Post>();
        }

        public int Id { get; set; }
        public string Title { get; set; }
        public string BloggerName { get; set; }

        public virtual ICollection<Post> Posts { get; set; }
    }

    public class Post
    {
        public int Id { get; set; }
        public string Title { get; set; }
        public DateTime DateCreated { get; set; }
        public string Content { get; set; }
        public int BlogId { get; set; }

        public Blog Blog { get; set; }
        public ICollection<Comment> Comments { get; set; }
        public ICollection<Tag> Tags { get; set; }
    }
}

There are also classes defined for Comment, Tag and Person.

The BlogContext class manages the domain classes and inherits from DbContext to provide the database interaction and track changes to objects.

using System.Data.Entity;
using HowDoI.DBInitializers.Domain;

namespace HowDoI.DBInitializers.DataAccess
{
    public class BlogContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
        public DbSet<Post> Posts { get; set; }
        public DbSet<Comment> Comments { get; set; }
    }
}

Finally there is a small console application that exercises the context, querying blogs and listing some of their details with the ListBlogs() method.

private static void ListBlogs()
{
    var db = new BlogContext();
    var blogs = db.Blogs.Include("Posts").ToList();
    foreach (var blog in blogs)
    {
        Console.WriteLine("{0} by {1}", blog.Title, blog.BloggerName);
        Console.WriteLine("     # of Posts: {0}", blog.Posts.Count());
    }
    Console.WriteLine("Complete. Press any key to continue...");
    Console.ReadKey();
}

The console application is important to this sample because asking the context to do some work with the database is what triggers the database initialization.

Here’s what will happen in the background when the application is run for the first time.

  • Instantiating the context will cause Code First to work out not only the conceptual model, but additional metadata it needs to do its work between the model and the database, even though there is no database yet.
  • The next line of code, db.Blogs.Include("Posts").ToList(), is the one that actually asks for data to be returned. It will trigger the database initialization because that is the point at which the context seeks out the database.
  • Because there is no database connection string anywhere in the application, Code First will rely on its default behavior, which is to look in a local SQL Server Express instance for a database whose name matches the fully qualified name of the derived DbContext class. In this example, the name is HowDoI.DBInitializers.DataAccess.BlogContext.
  • When the application is run for the first time and no such database exists, Code First will detect that and create the database on the fly. The resulting database is shown in Figure 1.

Figure 1
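
One way to confirm which server and database Code First resolved by convention is to ask the context itself. The following is a minimal sketch, assuming the article's BlogContext and the Database property that DbContext exposes; the DatabaseInfo class and ShowTargetDatabase method are hypothetical names used only for illustration.

```csharp
using System;
using HowDoI.DBInitializers.DataAccess;

internal static class DatabaseInfo
{
    // Hypothetical helper: reports the database that Code First resolved by convention.
    public static void ShowTargetDatabase()
    {
        using (var db = new BlogContext())
        {
            // Connection is the underlying DbConnection; these properties
            // describe the target server and database name.
            Console.WriteLine("Server:   {0}", db.Database.Connection.DataSource);
            Console.WriteLine("Database: {0}", db.Database.Connection.Database);

            // Exists() checks the server without creating the database.
            Console.WriteLine("Exists:   {0}", db.Database.Exists());
        }
    }
}
```

Calling this before ListBlogs() on a fresh machine should report that the database does not exist yet, and calling it afterwards should report that it does.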

Responding to Class Modifications

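If the database already exists, Code First will do a quick comparison between the model that the database was created from and the in-memory model that it derived from your classes. In Entity Framework 4.1 this check relies on a hash of the model that Code First stores in the database's EdmMetadata table. If something is different, it will let you know in a big way.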

I’ve added a new property to the Blog class: DateCreated.

public DateTime DateCreated { get; set; }

When the application is run again, Code First will notice that the model no longer matches the database, which has no DateCreated column in the Blogs table, and throw an exception.

Figure 2

The exception tells you that the model has changed since the database was created and you need to somehow modify the database. It offers suggestions such as manually deleting or updating the database or using Database Initializers.
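
If you would rather detect the mismatch yourself than run into the exception, the Database class exposes a compatibility check. This is a hedged sketch that assumes the article's BlogContext; the ModelCheck class and ReportModelCompatibility method are hypothetical names, and the messages are only illustrative.

```csharp
using System;
using HowDoI.DBInitializers.DataAccess;

internal static class ModelCheck
{
    // Hypothetical helper: reports whether the existing database still matches the model.
    public static void ReportModelCompatibility()
    {
        using (var db = new BlogContext())
        {
            if (!db.Database.Exists())
            {
                Console.WriteLine("No database found.");
                return;
            }

            // Passing false means a database without model metadata is treated
            // as compatible rather than causing an exception.
            bool matches = db.Database.CompatibleWithModel(false);
            Console.WriteLine(matches
                ? "Database matches the current model."
                : "Model has changed since the database was created.");
        }
    }
}
```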

Database Initializers

The Database Initializers offer an automated way for your data layer to respond to these types of changes during development. Database initialization offers three strategies which are self-describing:

  • CreateDatabaseIfNotExists
  • DropCreateDatabaseAlways
  • DropCreateDatabaseIfModelChanges

The first, CreateDatabaseIfNotExists, is the default strategy, and you’ve already seen it in action.

You can explicitly select one of the non-default strategies using the Database.SetInitializer method. You also need to tell the initializer which context it applies to, which you do through its generic type parameter. Here’s code that will set the strategy to drop and recreate the database any time the BlogContext model changes.

Database.SetInitializer(new DropCreateDatabaseIfModelChanges<BlogContext>());

In most applications, you should call SetInitializer at application startup. In the console application, that means the Main method of the Program class:

static void Main(string[] args)
{
    Database.SetInitializer(new DropCreateDatabaseIfModelChanges<BlogContext>());
    ListBlogs();
}

Now when the application is run again and the context attempts to retrieve the blogs from the database, rather than throwing an exception because the model has changed, it will delete the existing database and create a new one based on the new model.

Figure 3 shows the Blogs table with its DateCreated column in the recreated database.

Figure 3

For testing purposes, the DropCreateDatabaseAlways strategy can be a convenient way to create a database with known seed data before running your tests. Generally speaking, though, dropping the database is not the most desirable way to handle model changes. It is simply what is currently available. The Entity Framework team is working on migration solutions.
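
As a sketch of the testing scenario, the following helper sets DropCreateDatabaseAlways and then forces initialization so the database is rebuilt at a known point rather than on the first query. It assumes the article's BlogContext; the TestDatabase class and Reset method are hypothetical names, and how you call it depends on your test framework.

```csharp
using System.Data.Entity;
using HowDoI.DBInitializers.DataAccess;

internal static class TestDatabase
{
    // Hypothetical test-setup helper: rebuilds the database before a test run.
    public static void Reset()
    {
        Database.SetInitializer(new DropCreateDatabaseAlways<BlogContext>());

        using (var db = new BlogContext())
        {
            // force: true runs the initializer even if it has already run
            // for this context type in the current application domain.
            db.Database.Initialize(force: true);
        }
    }
}
```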

Seeding the Database During Initialization

But the database has no data in it, so the console application will still have nothing to display. Each Database Initializer has a protected method, Seed, which it calls whenever it creates a database. The base Seed method has no code, but it’s virtual, which means you can override it and supply your own logic.

The key to overriding Seed is that you have access to the DbContext instance, and in the method you can populate that context with data. When Seed returns, the class that is performing the initialization calls SaveChanges on the same context, pushing that data into the newly created database.

In order to access all of this logic, you must create a class which inherits from the initialization strategy that you want to use. Since my application is using DropCreateDatabaseIfModelChanges, I’ll inherit from that, then override its Seed method.

public class BlogContextInitializer : DropCreateDatabaseIfModelChanges<BlogContext>
{
    protected override void Seed(BlogContext context)
    {
        var blog = new Blog
        {
            BloggerName = "Julie",
            Title = "My Code First Blog",
            DateCreated = System.DateTime.Now
        };
        context.Blogs.Add(blog);
        base.Seed(context);
    }
}

In this simple example, I’ve created a new blog, added it to the context, then told the Seed method to finish doing its job with base.Seed(context).
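
To make the sequence around Seed more concrete, here is a rough sketch of an initializer written directly against the IDatabaseInitializer interface. It only approximates the drop-and-recreate flow described in this article and is not Entity Framework's actual source; the SketchInitializer name is hypothetical.

```csharp
using System.Data.Entity;

// Illustrative only: approximates the flow of a drop-and-recreate initializer.
public class SketchInitializer<TContext> : IDatabaseInitializer<TContext>
    where TContext : DbContext
{
    public void InitializeDatabase(TContext context)
    {
        if (context.Database.Exists())
        {
            // Leave a database that still matches the model untouched.
            if (context.Database.CompatibleWithModel(true))
            {
                return;
            }
            context.Database.Delete();
        }

        context.Database.Create();
        Seed(context);          // an override adds entities to the context
        context.SaveChanges();  // the initializer, not Seed, persists them
    }

    // Empty by default, so derived classes can supply their own data.
    protected virtual void Seed(TContext context)
    {
    }
}
```

In practice, inheriting from one of the built-in strategies is less work. The sketch is only meant to show where Seed and SaveChanges sit in the sequence.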

Now that you see how it works, here’s an example that adds more data. This code, which replaces the body of the Seed method, creates a list of blogs and, for one of those blogs, adds a list of Posts. At the end, it uses the ForEach method of List<T> to iterate through the list of Blogs and add each one to the context. The posts that are attached to the first blog get added along with that blog.

new List<Blog>
{
    new Blog
    {
        BloggerName = "Julie",
        Title = "My Code First Blog",
        DateCreated = System.DateTime.Now,
        Posts = new List<Post>
        {
            new Post
            {
                Title = "ForeignKeyAttribute Annotation",
                DateCreated = new System.DateTime(2011, 3, 15),
                Content = "Mark navigation property with ForeignKey"
            },
            new Post
            {
                Title = "Working with the ChangeTracker",
                DateCreated = System.DateTime.Now,
                // Note the trailing space so the two halves do not run together.
                Content = "You can use db.Entry to get to state for a single entry or " +
                          "db.ChangeTracker.Entries to work with all of the tracked entries."
            }
        }
    },
    new Blog
    {
        BloggerName = "Ingemaar",
        Title = "My Life as a Blog",
        DateCreated = System.DateTime.Now.AddDays(1)
    },
    new Blog
    {
        BloggerName = "Sampson",
        Title = "Tweeting for Dogs",
        DateCreated = System.DateTime.Now.AddDays(2)
    }
}.ForEach(b => context.Blogs.Add(b));

base.Seed(context);

Now that the new initializer has been created, you can tell the Database class to use that instead of one of its own. Here’s the revised Main method.

static void Main(string[] args)
{
    Database.SetInitializer(new BlogContextInitializer());
    ListBlogs();
}

But there is a problem. I’ve provided seed data, but since the model hasn’t changed, the initializer won’t recreate the database and the seed data will not be inserted. I’ll simply delete the database from the server, which will cause Code First to recreate it for me.
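
If you prefer not to delete the database by hand, it can also be removed in code. This is a minimal sketch, assuming the article's BlogContext; the DatabaseReset class and DeleteExistingDatabase method are hypothetical names. It would need to run before the first query, for example as a temporary first line in Main, so that initialization still happens afterwards.

```csharp
using System;
using HowDoI.DBInitializers.DataAccess;

internal static class DatabaseReset
{
    // Hypothetical helper: removes the existing database so that the next
    // query causes Code First to recreate and seed it.
    public static void DeleteExistingDatabase()
    {
        using (var db = new BlogContext())
        {
            // Delete() returns false when there was no database to remove.
            bool deleted = db.Database.Delete();
            Console.WriteLine(deleted ? "Database deleted." : "No database found.");
        }
    }
}
```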

Now the console app finds data when it retrieves the Blogs and Posts, and it will list the following in its console.

My Code First Blog by Julie
     # of Posts: 2
My Life as a Blog by Ingemaar
     # of Posts: 0
Tweeting for Dogs by Sampson
     # of Posts: 0
Complete. Press any key to continue...

Summary

So now you've seen the basics of how Code First is able to automatically create a database for you and what role the Database Initializers play in the Code First workflow. You can choose to let the initializer create a database only when it doesn’t already exist, recreate it every time you run your application, or recreate it only when the model changes. Whichever strategy you use, you can leverage the virtual Seed method to provide initial data for your database. That is very convenient as you work through the development process, and even for applications that need to ship with a starting database.