
.NET Entity Objects: An Open Source O/R Mapping Toolkit

Visual Studio .NET 2003

Pierre Couzy
Microsoft Corporation

May 2006

Applies to:
   .NET Entity Objects
   Visual Studio .NET

Summary: Object–relational (O/R) mapping is fairly common among Java developers, but still in its infancy on the .NET side of the world. This article illustrates why and how you can leverage this technique in your .NET projects. (9 printed pages)


NEO General Design Principles
First Contact with NEO
Very Nice, but Would You Use That in Real-Life Projects?


Object–relational (O/R) mapping is fairly common among Java developers, but still in its infancy on the .NET side of the world. This article looks at why and how you might use that technique in your own projects.

Most of the applications that you write share a very standard pattern: a low-level layer is in charge of data access and gives you raw data to work with, usually exchanging data structures that are roughly equivalent to records in the database. On top of that Data Access Layer (DAL) sits a business logic layer. This layer, of course, applies validation and business rules, but it also creates more elaborate structures that contain all the information you need in the user interface layer (for example, a customer record will be augmented with the status of its last order).

This technique is tedious, and sometimes complex. It is tedious because you have to code the same notions in two different places (the DAL and the business layer). I admit that you don't code exactly the same things, but the coupling is strong, so every time you update something in the lowest layer, it has an impact on the intermediate one. The complexity arises from that coupling, which you have to eliminate or alleviate. You'll find many authors referring to this as the impedance mismatch between the relational world (the DAL) and the object world (the business layer).

O/R mapping tools give you the freedom to work solely on one layer as they automate the management of the other one. You'll very often encounter the term domain model in those tools: it represents the underlying structures that will have to be persisted in relational form and manipulated in object form.

As you can easily imagine, there are three ways to achieve this goal:

  • The database is the domain model, and the mapping tool creates a set of classes that mimic the database structure. This technique is widely used to create Data Access Layers that still have to be augmented by a business layer, but they eliminate a bunch of code, and are well suited to a stored-procedures approach. There is a limitation, though, because the Data Access Layer is often very specific, and very hard to reuse from one entity to another.
  • Classes are the domain model, and the mapping tool creates a database structure. Developers just love that technique, because they are totally in control: the database becomes a mere persistence artefact, and they get the flawed feeling that it will smoothly comply with any fantasy. They often discover late in the development cycle that performance matters, and migrating data from one structure to another is often a nightmare.
  • The domain model is a stand-alone construct, and both the database and classes get pre-generated by the O/R mapping tool. In that case, the developer will have the ability to easily refactor both the database structure and the classes.

NEO (.NET Entity Objects) uses the third paradigm, and—even if it means a bit more work when setting up things for the first time—you'll love the versatility it gives you.

NEO General Design Principles

.NET Entity Objects is an Open Source product maintained by Erik Doernenburg. You'll find it on Codehaus, which also hosts JIRA and Boo, among other nice projects.

NEO is, at its core, a relatively thin layer on top of ADO.NET, and it relies on datasets to load information from the database and track the changes that will have to be persisted back. The developer does not manipulate those datasets; instead, he or she uses generated classes, and has the ability to code any supplemental behaviour in new classes that inherit from the auto-generated ones. The code generation engine is NVelocity, a port of the Apache Jakarta Velocity project. You might want to check it out, although it's not very active at the moment. (See Figure 1.)

Figure 1. NEO general design

First Contact with NEO

Every project relying on NEO contains a special XML file that serves as the domain model. This file uses a semi-standardized format, called Norque, which is heavily inspired by another Apache project called Torque. Figure 2 shows a sample domain model file in the Norque format.

Figure 2. Domain model file

You should have no difficulty reading this file; it describes a part of the aging (and soon to retire) Pubs database. The package attribute represents the namespace where the generated classes will live (the term comes from the Java world), and the javaName attribute of the table tag is the .NET class name that will map to the entities in the table.

Relations are described twice, for navigability purposes: once with a <foreign-key /> tag in the entity consuming the foreign key, and once with an <iforeign-key /> tag (i stands for inverse) in the entity providing the foreign key.
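To make "described twice" more concrete, here is a minimal sketch of what two-way navigation means on the object side. These are hand-written stand-ins, not the code NEO generates; the class names, members, and sample values are assumptions chosen for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the entity PROVIDING the key
// (the side that would carry an <iforeign-key /> declaration).
public class Publisher
{
    public string Name;
    public readonly List<Title> Titles = new List<Title>();

    public Publisher(string name) { Name = name; }
}

// Hypothetical stand-in for the entity CONSUMING the key
// (the side that would carry a <foreign-key /> declaration).
public class Title
{
    public string Name;
    private Publisher publisher;

    public Title(string name) { Name = name; }

    public Publisher Publisher
    {
        get { return publisher; }
        set
        {
            // Keep both directions consistent when the relation changes.
            if (publisher != null) publisher.Titles.Remove(this);
            publisher = value;
            if (publisher != null) publisher.Titles.Add(this);
        }
    }
}

public static class RelationDemo
{
    public static void Main()
    {
        Publisher p = new Publisher("Sample Publisher");
        Title t = new Title("Sample Title");
        t.Publisher = p;

        Console.WriteLine(t.Publisher.Name); // navigate title -> publisher
        Console.WriteLine(p.Titles.Count);   // navigate publisher -> titles
    }
}
```

Declaring only one side would still let you store the relation, but you could then follow it in a single direction only.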

You can write the domain model first (or infer it from a UML model), and then create a database schema; or, you can use a little Open Source project, SQLToNeo, that creates the domain model file from an existing database (see Figure 3).

Figure 3. Using SqlToNEO to auto-generate the domain model file

From your XML domain model, you'll generate the following (see Figure 4):

  • A DDL script to create a database structure.
  • A set of C# classes.

Figure 4. Generating SQL and C#

Creating the SQL Script

This operation is optional; you usually do it once when your domain model is first created, and later on you'll refactor by hand. The script only defines tables, constraints, and relations; you have to create the database that hosts those structures yourself.

Creating the C# Classes

This operation can be performed from the command line, or directly from Visual Studio, thanks to an add-in that regenerates the classes every time you update the domain model file. NEO generates two sets of classes. The first set is regenerated on every run, so you should not edit it by hand; it contains the low-level classes responsible for defining the entities and giving your code access to them. The second set consists of placeholder classes that inherit from the first set; you can add any logic to them, because they won't get regenerated. (See Figure 5.)

Figure 5. The AuthorFactory class is in charge of creating Author and AuthorList instances
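The split between regenerated classes and placeholder classes can be sketched as follows. This is a simplified illustration of the pattern, not actual NEO output: the base class name, the properties, and the FullName helper are assumptions made for the example.

```csharp
using System;

// Stand-in for a generated file: it would be overwritten on every
// regeneration, so nothing here should be edited by hand.
public class AuthorBase
{
    public string FirstName;
    public string LastName;
    public bool Contract;
}

// Stand-in for the placeholder file: created once, then owned by you.
// Custom logic lives here and survives regeneration of AuthorBase.
public class Author : AuthorBase
{
    public string FullName
    {
        get { return FirstName + " " + LastName; }
    }
}

public static class GenerationDemo
{
    public static void Main()
    {
        Author a = new Author();
        a.FirstName = "Abraham";
        a.LastName = "Bennet";
        Console.WriteLine(a.FullName);
    }
}
```

Because the hand-written class only inherits from the generated one, adding a column to the domain model changes the base class without touching your own code.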

Using NEO to Query and Update Data

All you have to do is create a new project, include the classes generated by NEO, and add a reference to neo.dll. You'll then be ready to work with your data, without writing a single line of SQL.

Most of the time, you'll begin by initializing a DataStore object, which represents the link with a physical database. NEO natively supports SQL Server (and MSDE), Oracle, MySQL, and Firebird, and adding a new database engine is fairly simple, provided that the database supports the SQL-92 specification.

You'll always initialize an ObjectContext, which represents the object side of your O/R mapping. Once those two objects are up and running, you can play with the generated classes and forget about the database.

Each entity depends on a specific factory that knows how to query, follow a relationship, and, of course, return one or N entities. The following code retrieves a list of authors, and then a specific author.

       private void button1_Click(object sender, System.EventArgs e)
       {
           // Let's establish a link to the database
           SqlDataStore s = new SqlDataStore(Config.GetConnectionString());
           // We'll need an ObjectContext to use our classes
           ObjectContext o = new ObjectContext(s);

           // This factory knows how to manipulate Authors
           MonNamespace.AuthorFactory af = new MonNamespace.AuthorFactory(o);

           // I first want a list of every author currently under contract
           AuthorList someList = af.Find("contract = 1");
           // Let's refine our search to a single author
           string firstName = "Abraham";
           Author anAuthor = someList.FindUnique("au_fname = {0}", firstName);

           // NULL values are cleanly handled:
           if (anAuthor.city == null)
               MessageBox.Show("City is undefined");

           // Updates are directly done against the objects
           anAuthor.city = null;

           // They won't go back to the DataStore until you explicitly ask for it
       }

Using NEO to Ease Unit Testing Against a Database

Unit tests against a database are always a pain, because the state of the database is hard to reset, and therefore you can't reliably reproduce bugs. The best solution is to create test data and load it into the database before launching the test, but this is time-consuming, cumbersome, and often unrealistic. NEO has a very elegant solution to that problem: it's able to work disconnected from a database. This means that you can, at any given time, capture the current state of your working set and save it, without changing your code. The following is a simple example.

           // This is the normal production code, establishing a link to the database
           SqlDataStore s = new SqlDataStore(Config.GetConnectionString());
           //  We'll need an objectContext to use our classes
           ObjectContext o = new ObjectContext(s);

           ... //some time later, we encounter an exception.

           // Let's keep the data that caused the error : 

Now, you'll want to write the regression test, to make sure that the bug won't happen again.

            // We won't connect to the database, let's comment the DataStore part
           //SqlDataStore s = new SqlDataStore("initial catalog=pubs;uid=sa;pwd=as");
           // We need an ObjectContext, let's create it from the test data instead of the DataStore : 
           //ObjectContext o = new ObjectContext(s);
           ObjectContext o = new ObjectContext();
           DataSet ds = new DataSet();
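
Because NEO keeps its working set in ADO.NET datasets, the capture-and-replay idea rests on a mechanism you can try without NEO at all. The following is a minimal sketch using only standard DataSet XML serialization; the table layout, file name, and helper names are assumptions for illustration, and the way a NEO ObjectContext exposes or accepts its dataset is deliberately left out.

```csharp
using System;
using System.Data;
using System.IO;

public static class SnapshotDemo
{
    // Builds a tiny in-memory working set (hypothetical sample data).
    public static DataSet BuildWorkingSet()
    {
        DataSet ds = new DataSet("pubs");
        DataTable authors = ds.Tables.Add("authors");
        authors.Columns.Add("au_fname", typeof(string));
        authors.Columns.Add("city", typeof(string));
        authors.Rows.Add("Abraham", DBNull.Value);
        return ds;
    }

    // Saves data and schema together so the file is self-describing.
    public static void Save(DataSet ds, string path)
    {
        ds.WriteXml(path, XmlWriteMode.WriteSchema);
    }

    // Reloads the snapshot without any database connection.
    public static DataSet Load(string path)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(path, XmlReadMode.ReadSchema);
        return ds;
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "neo-testdata.xml");
        Save(BuildWorkingSet(), path);

        DataSet restored = Load(path);
        DataRow row = restored.Tables["authors"].Rows[0];
        Console.WriteLine(row["au_fname"]);
        Console.WriteLine(row.IsNull("city"));
    }
}
```

Note that this form of snapshot stores the current values only; pending row states (added, modified, deleted) would require the DiffGram write mode instead.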

Very Nice, but Would You Use That in Real-Life Projects?

Now that we've seen some interesting facts about O/R mapping tools, let's sum things up:

  • Development is faster. A small application involving 20 tables typically implies writing forty to fifty classes, covering data access, entity representation, factories, and the business layer. A mapper automates at least half of this work, so it is definitely a quicker way to code.
  • Code is more robust. This one is interesting as well. All the generated classes share a lot of common ground, and the templates that gave birth to your classes have been tested by hundreds of other developers. Therefore, chances are that the code is of a much better quality than what you would produce in a few hours or days.
  • Code is easier to test and refactor. We saw in the last example that creating test suites is made a lot easier with NEO, and that's good insurance against regression when you refactor something. Also, part of the refactoring will be done by NEO itself when it regenerates your classes from your new settings, and it won't forget small details that would otherwise be a nightmare to chase down.

So, if it works that beautifully, why don't we all use it already? There are currently a fair number of issues with O/R mappers, and you have to understand them before committing to that technique:

  • Performance sucks. That may sound a bit harsh, but never forget that object–relational mapping means "using objects in memory, and forgetting the database side." So, you'll have many collections cluttering your RAM, and many SQL requests going back and forth each time you have to fetch another bit of data. That behaviour is expected, but the magic behind automatic management of those requests means that you'll have a tendency to forget about the price you pay each time you ask for more from the database.
  • Writing relational data is not an object-oriented concept. That one is easy to understand: what happens if you have to do complex updates? O/R mapping is almost always a disconnected approach, and writing back to the database very often implies putting on some locks and transactions. Those two mechanisms are very hard to reconcile, and although NEO provides some help and hints (it nicely handles sequences and auto-incremented fields, for example), you'll have to fall back to more conventional techniques when updates are complex or when they are done under heavy load.