Article
01/25/2019

March 2017

Volume 32 Number 3

[Cutting Edge]

Soft Updates with Temporal Tables

By Dino Esposito | March 2017

Dino Esposito The vast majority of today’s developers use a classic relational database to store data. It’s an approach that worked for decades and still works today, even though alternate and schemaless data stores (collectively known as NoSQL stores) are proving themselves quite effective in a variety of business scenarios. Every time you update an existing table record, you automatically lose track of the state it had before. Overriding existing data hasn’t been a big problem for companies so far, but things are changing fast. Today, data is the most valuable asset of every company and the input of business intelligence processes.

In the May and June 2016 installments of this column (msdn.com/magazine/mt703431 and msdn.com/magazine/mt707524, respectively), I discussed a general-purpose approach to take a create, read, update, delete (CRUD) system to the next level through soft updates and deletes. A soft update is a standard update operation, except that the old state of the record is preserved in some way. In doing so, you have an additional API to retrieve the history of each entity created during the system’s lifetime.

There aren’t many ways in which you can approach the problem of updating and deleting records while keeping track of them. For deletions, that means adding an extra column—typically a Boolean column—to mark the record as deleted. For updates, the most functional approach is to create and maintain a separate history table for each that’s relevant to track. Keeping data and history tables in sync requires additional business and data access logic, and a dedicated API to query the history.

Managing historical data in relational tables has been formalized in the ANSI SQL 2011 standard. In its newest version, SQL Server supports a feature called temporal tables that lets you create and manage a shadow historical table for each data table of your choice. This month, I’ll delve into SQL Server 2016 temporal tables and their use from within Entity Framework (EF).

Structure of Temporal Tables

Conceptually, the main difference between temporal databases and databases is that classic databases only store data that’s true at the current time. Temporal databases, instead, maintain multiple copies of each piece of data. A temporal table appends a few extra time columns to denote when the record reached the given state. Figure 1shows how a temporal table looks like in SQL Server 2016 Management Studio.

Figure 1 A Temporal Table in SQL Server 2016

There are a couple of noticeable things in the figure. One is the child history table, named dbo.BookingsHistory. This table is automatically created by SQL Server every time it processes the T-SQL instruction that would create a temporal table. Developers have only read access to the history table. The other thing to notice in Figure 1 is the lack of the Delete command in the context menu of the selected table. Once a temporal table is created, any further manipulation on it, whether it happens through the interface of SQL Server Management Studio or programmatically, is strictly controlled and in some cases limited. As examples of this, you can’t drop or replicate a temporal table, and limitations also exist on cascading updates and deletions. More information on limitations that affect temporal tables in SQL Server 2016 can be found at bit.ly/2iahP1n.

In SQL Server 2016, you create a temporal table using a special clause at the end of the CREATE TABLE instruction. The status of temporal table boils down to turning on and off the value of the new SYSTEM_VERSIONING setting. In other words, any table can programmatically be turned into a temporal table and then moved back to the original non-temporal status at any time. All aforementioned limitations that affect temporal tables cease to exist the moment in which the SYSTEM_VERSIONING setting is turned off.

Temporal Table and Entity Framework

Many developers today use SQL Server through the services of EF and EF Code First, in particular. At the moment, though, EF provides no special support for temporal tables. Ad hoc support will come in the near future. The good news is that some basic level of support for temporal tables can still be achieved with the current versions of EF 6.x and even more with EF Core. The bad news is that full integration with LINQ-to-Entities is realistically possible to achieve only with low-level changes to the framework, specifically in the way in which the LINQ-to-Entities provider generates the SQL code for the query. If you’re a SQL developer instead, then the newest T-SQL language provides all the syntax tools you need to operate temporal tables.

The state of the art of temporal tables for EF developers can be summarized as follows: First, it’s pretty easy to create temporal tables in Code First. Second, updates and deletions can be operated via EF in the usual way. Third, queries require ad hoc facilities to be created.

The approach that I’ve found the most effective to query temporal tables passes through a small set of ad hoc repository methods based on ADO.NET code. It might sound surprising at first, but at the end of the day, if you need temporal tables, then chances are you mostly need to get the history of one particular entity. Take, for example, all the changes of an order or an invoice.

Therefore, in the end, all you need to have is a dedicated and possibly handy FirstOrDefault-like method directly exposed for an aggregate. And a repository class looks to me like a good place to have it.

A Temporal-Enabled Initializer

In EF Code First, a new database is created whenever the database doesn’t exist and the DbContext class inherits from CreateDatabaseIfNotExists. In doing so, a new table is created for each declared DbSet property. Can you create a temporal table, as well? Today, without ad hoc attributes and syntax features, creating a temporal table is a two-step operation. First, the regular table must be created; this is just the kind of work that Code First usually does. The second step revolves around turning on the SYSTEM_VERSIONING setting. This requires an ad hoc ALTER TABLE statement. Figure 2 shows a possible implementation of the Seed method of the initializer class that first checks the version of the underlying SQL Server and then alters the state of a previously created Booking table.

Figure 2 Creating a Temporal Table Via Code First

protected override void Seed(DbContext context)
{
  // Grab the SQL Server version number
  var data = context
    .Database
    .SqlQuery<string>(@"select
  left(cast(serverproperty('productversion')
       as varchar), 4)")
    .FirstOrDefault();
  if (data != null)
  {
    var version = data.ToInt();+
    if (version < 13)
      throw new Exception("Invalid version of SQL Server");
  }
  // Prepare the SQL to turn system versioning on SQL Server 2016
  var cmd = String.Format(SqlSystemVersionedDbFormat, "Bookings");
  context.Database.ExecuteSqlCommand(cmd);
}

The actual T-SQL command you need to create a temporal table (or a system-versioned table, as it’s referenced in the jargon of SQL Server 2016), is shown in Figure 3.

Figure 3 Creating a Temporal Table

private const string SqlSystemVersionedDbFormat =
  @"ALTER TABLE dbo.{0}
    ADD SysStartTime datetime2(0)
    GENERATED ALWAYS AS ROW START HIDDEN CONSTRAINT
      DF_{0}_SysStart DEFAULT SYSUTCDATETIME(),
    SysEndTime datetime2(0)
    GENERATED ALWAYS AS ROW END HIDDEN CONSTRAINT
      DF_{0}_SysEnd DEFAULT CONVERT(datetime2 (0),
      '9999-12-31 23:59:59'),
    PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime)
  ALTER TABLE dbo.{0}
    SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.{0}History))";

The {0} placeholders you see in the string in Figure 3 refer to the name of the actual table. It’s Bookings, as shown in Figure 1.

The resulting history table is a copy of the main table, plus a couple of datetime2 columns named SysStartTime and SysEndTime. Altogether, the two columns indicate the validity period for that particular state of the record. SysStartTime indicates when the record got a given state and SysEndTime indicates when the validity of that state ceased. Updates and deletes are the database operations that cause the values in SysStartTime and SysEndTime to change.

Updates and Deletes

The logic that keeps that primary table and its history synchronized is backed into the SQL Server 2016 database engine. A new history record is created whenever a record in the primary table is updated, regardless of how you perform the update. In other words, whether you edit the values of a temporal record directly in Management Studio, through stored procedures, ADO.NET commands, or via EF, a new history record is created, as shown in Figure 4.

Figure 4 Updating Records in a Temporal Table

The first query shown in Figure 4 presents the current state of the record with ID=2. The second query, instead, presents the records found in the history table for the same ID. Such an observable state has been determined by two quick updates I made directly in the Management Studio editor. I first changed the column Hour from nine to 13 and then a couple of seconds later I changed the value of the Owner column from Dino to Paul. The first record in the history table says that the originally created record (which I put in the table through EF and a call to SaveChanges) was in a valid state for about five minutes. Then it moved into another state that lasted for a couple of seconds and finally reached the current state. As you can see, the current state isn’t stored in the history table. Figure 5 shows the state of the tables after the record with ID=2 is deleted.

Figure 5 Deleting Records in a Temporal Table

The primary table returns an empty result set when queried for ID=2. The history table, instead, now has a third record whose validity period ranges from the time of the last update to the time of deletion.

Querying for a Specific Entity

Keeping track of all state changes is useful because it doesn’t let you miss a thing of what happens in the system. It gives you a complete (and free) log of all database operations. Better yet, it gives you a full list of state changes, which has a much more relevant business role than a plain log of SQL statements. Put another way, conceptually speaking, temporal tables are very close to Event Sourcing; I dare say that temporal tables are a CRUD-based form of Event Sourcing. Let’s see how you can query past states of a given aggregate.

Even though the History nested table is useful to figure things out, SQL Server 2016 offers a direct syntax to query temporal data about a given record. Here’s the schema of a sample command that retrieve versions of the record with ID=2 in a given time interval:

var sql = @"SELECT * FROM Bookings 
  FOR SYSTEM_TIME BETWEEN '{0}' AND '{1}'
  WHERE ID=2";

A temporal query is a regular query plus the FOR SYSTEM_TIME clause that sets the time interval to consider. The database engine will resolve the query looking at the additional columns in the history table and the content of the primary and nested table. The query is expected to return a list of records. How can you force EF to run a query like this? In EF 6, you can only leverage the SqlQuery method of the DbSet class:

using (var db = new EF6Context())
{
  var current = db.Bookings.Single(b => b.Id == 1);
  var time = DateTime.Now.AddMinutes(-5);
  var old = db.Bookings
    .SqlQuery("SELECT * FROM dbo.Bookings
          FOR SYSTEM_TIME AS OF {0} WHERE Id = 1", time)
    .SingleOrDefault();
}

Note that for EF 6, the column names returned in the query need to match the property names on the class. This is because SqlQuery doesn’t use mappings. If column and property names don’t match, then you’d need to alias the columns in the SELECT list, rather than just SELECT *.

With EF Core things are in some respect easier and better. In EF Core, the method to use is FromSql. First off, FromSql method uses mappings; meaning that you don’t have to worry about aliasing if column and property names don’t match:

using (var db = new EFCoreContext())
{
  var current = db.Bookings.Single(b => b.Id == 1);
  var time = DateTime.Now.AddMinutes(-5);
  var old = db.Bookings
    .FromSql("SELECT * FROM dbo.Bookings
              FOR SYSTEM_TIME AS OF {0}", time)
    .SingleOrDefault(b => b.Id == 1);
}

Second, you can compose on top of the initial select using LINQ. This means that you can use Where, OrderBy, GroupBy or whatever other LINQ operator and those things in general will be translated to a query of the form:

SELECT projection
FROM (SELECT * FROM dbo.Bookings FOR SYSTEM_TIME AS OF {0}) as Bookings
WHERE condition

All this said, if you prefer, you could always use plain ADO.NET and data readers to access data stored in temporal tables.

Wrapping Up

You can definitely work with temporal tables in a data layer largely based on EF and even if you happen to use plain ADO.NET for the queries, you can leverage LINQ-to-Objects to build complex in memory. The Entity Framework team roadmap shows in the coming months some work items to do on temporal tables. So, let’s just wait.

Dino Esposito is the author of “Microsoft .NET: Architecting Applications for the Enterprise” (Microsoft Press, 2014) and “Modern Web Applications with ASP.NET” (Microsoft Press, 2016). A technical evangelist for the .NET and Android platforms at JetBrains, and frequent speaker at industry events worldwide, Esposito shares his vision of software at software2cents.wordpress.com and on Twitter: @despos.

Thanks to the following Microsoft technical expert for reviewing this article: Rowan Miller