April 2014

Volume 29 Number 4


Windows with C++ : Using Databases on Windows Azure

Kenny Kerr | April 2014

Kenny KerrMicrosoft has a long history of confusing developers with a dizzying array of data access technologies. There was a time when it seemed as if every release of Windows, SQL Server or Visual Studio ushered in a new data access API. Somewhere along the line—around 1996, I believe—Microsoft applied its usual enthusiasm to move developers from ODBC to OLE DB.

ODBC stands for Open Database Connectivity and was the old standard for accessing database management systems. OLE DB stands for … wait for it … Object Linking and Embedding Database and was the new and universal data access utopia. But this name is entirely misleading, which I’ll discuss in a moment.

I still remember reading Don Box’s column in the July 1999 issue of MSDN Magazine (then called the Microsoft Systems Journal) where he described the motivation and inspiration leading to OLE DB. I remember thinking at the time that this was a lot more complicated than ODBC but that it sure was a lot more extensible. The reason the name OLE DB is so misleading is that it has nothing to do with OLE and it isn’t specifically for databases. It was truly designed as a universal data access model for all data—relational or otherwise—such as text, XML, databases, search engines, you name it. OLE DB debuted when COM was all the rage on the Windows platform, so its COM-heavy API and natural extensibility appealed to many developers.

Still, as a relational database API, it never quite achieved the raw performance of ODBC. Subsequent data access technologies, such as those from the Microsoft .NET Framework, dropped all but the database access features so the dream of universal data access began to fizzle out. Then in August 2011 the SQL Server team, the biggest proponents of OLE DB, made the stunning announcement that “Microsoft is aligning with ODBC for native relational data access” (bit.ly/1dsYgTD). They declared the marketplace was moving away from OLE DB and toward ODBC. So it’s back to ODBC as you and I figure out how to access SQL Server for the next generation of native C++ applications.

The good news is that ODBC is relatively simple. It’s also extremely fast. It was often claimed that OLE DB outperformed ODBC, but this was rarely the case. The bad news is that ODBC is an old C-style API that few developers remember—or ever learned—how to use. Fortunately, modern C++ comes to the rescue and can make programming ODBC a breeze. If you want to access databases on Windows Azure with C++, then you need to embrace ODBC. Let’s take a look.

Like so many other C-style APIs, ODBC is modeled around a set of handles representing objects. So I’ll again use my trusty unique_handle class template that I’ve written about and used in numerous columns. You can get a copy of handle.h from dx.codeplex.com and follow along. ODBC handles are, however, a little dumb. The ODBC API needs to be told the type of each handle as it’s used, both in creating and freeing a handle (and its underlying object).

A handle type is expressed with a SQLSMALLINT, which is just a short int value. Instead of defining a unique_handle traits class for each type of object that ODBC defines, I’m going to make the traits class itself a template. Figure 1 shows what this might look like.

Figure 1 An ODBC Traits Class

template <SQLSMALLINT T>
struct sql_traits
{
  using pointer = SQLHANDLE;
  static auto invalid() noexcept -> pointer
  {
    return nullptr;
  }
  static auto close(pointer value) noexcept -> void
  {
    VERIFY_(SQL_SUCCESS, SQLFreeHandle(T, value));
  }
};

The traits class close method in Figure 1 is where you can begin to see how you need to tell ODBC the type of each handle when used with some of the generic ODBC functions. Because I’m using the latest preview build of the Visual C++ compiler (the November 2013 CTP as of this writing) I’m able to replace the deprecated throw exception specification with the noexcept specifier, opening the door for the compiler to generate more optimal code in some cases. Unfortunately, although this compiler also provides the ability to deduce the return type for auto functions, it includes a bug that prevents it from doing so for member functions of class templates. Of course, because the traits class is itself a class template, a template alias comes in handy:

template <SQLSMALLINT T>
using sql_handle = unique_handle<sql_traits<T>>;

Now I can define type aliases for the various ODBC objects, such as the environment and statement objects:

using environment = sql_handle<SQL_HANDLE_ENV>;
using statement = sql_handle<SQL_HANDLE_STMT>;

I’ll also define one for connections, although I’ll use a more specific name:

using connection_handle = sql_handle<SQL_HANDLE_DBC>;

The reason for this is that connections require a bit more work to clean up in all cases. While environment and statement objects don’t need much more than this, connection objects really need a connection class to reliably deal with connectivity. Before I can deal with that, I need to create an environment.

The generic SQLAllocHandle function creates various objects. Here, again, you see the separation of the object—or at least the handle—and its type. Rather than duplicating this code all over the place, I’ll again use a function template to bring the type information back together. Here’s a function template for the generic ODBC SQLAllocHandle function:

template <typename T>
auto sql_allocate_handle(SQLSMALLINT const type,
                         SQLHANDLE input)
{
  auto h = T {};
  auto const r = SQLAllocHandle(type,
                                input,
                                h.get_address_of());
  // TODO: check result here ...
  return h;
}

Of course, this is still just as generic as the ODBC function, but it exposes the genericity in a C++-friendly manner. I’ll get back to error handling in a moment. Because this function template will allocate a handle of a given type and return a handle wrapper, I can simply use one of the type aliases I defined earlier. For an environment, I might do this:

auto e = sql_allocate_handle<environment>(SQL_HANDLE_ENV, nullptr);

The input handle or parent handle, the second parameter, provides an optional parent handle for some logical containment. An environment doesn’t have a parent but instead acts as the parent for connection objects. Unfortunately, it takes a bit more effort to create an environment. ODBC requires that I tell it which version of ODBC I’m expecting. I do that by setting an environment attribute with the SQLSetEnvAttr function. Here’s what this might look like when wrapped up in a handy helper function:

auto create_environment()
{
  auto e = 
    sql_allocate_handle<environment>(SQL_HANDLE_ENV, nullptr);
  auto const r = SQLSetEnvAttr(e.get(),
    SQL_ATTR_ODBC_VERSION,
    reinterpret_cast<SQLPOINTER>(SQL_OV_ODBC3_80),
    SQL_INTEGER);
  // TODO: check result here ...
  return e;
}

At this point I’m ready to create a connection, which, fortunately, is quite simple:

auto create_connection(environment const & e)
{
  return sql_allocate_handle<connection_handle>(
    SQL_HANDLE_DBC, e.get());
}

Connections are created in the context of an environment. Here you can see that I use the environment as the parent of the connection. I still need to actually make a connection, and that’s the job of the SQLDriverConnect function, some of whose parameters may be ignored:

auto connect(connection_handle const & c,
             wchar_t const * connection_string)
{
  auto const r = SQLDriverConnect(c.get(), nullptr,
    const_cast<wchar_t *>(connection_string),
    SQL_NTS, nullptr, 0, nullptr,
    SQL_DRIVER_NOPROMPT);
  // TODO: check result here ...
}

Notably, the SQL_NTS constant just tells the function the preceding connection string is null terminated. You could, instead, opt to provide the length explicitly. The final SQL_DRIVER_NOPROMPT constant indicates whether to prompt the user if more information is required to establish a connection. In this case, I’m saying “no” to prompts.

But as I alluded earlier, gracefully closing a connection is a little more involved. The trouble is that while the SQLFreeHandle function is used to free the connection handle, it assumes the connection is closed and won’t automatically close an open connection.

What I need is a connection class that tracks the connection’s connectivity. Something like this:

class connection
{
  connection_handle m_handle;
  bool m_connected { false };
public:
  connection(environment const & e) :
    m_handle { create_connection(e) }
  {}
  connection(connection &&) = default;
  // ...
};

I can now add a connect method to my class using the previously defined non-member connect function and update the connected state accordingly:

auto connect(wchar_t const * connection_string)
{
  ASSERT(!m_connected);
  ::connect(m_handle, connection_string);
  m_connected = true;
}

The connect method asserts the connection is not open to begin with and keeps track of the fact that the connection is open at the end. The connection class destructor can then automatically disconnect as needed:

~connection()
{
  if (m_connected)
  {
    VERIFY_(SQL_SUCCESS, SQLDisconnect(m_handle.get()));
  }
}

This will ensure the connection is disconnected prior to the member handle destructor being called to free the connection handle itself. I can now create an ODBC environment and establish a connection correctly and efficiently with just a few lines of code:

auto main()
{
  auto e = create_environment();
  auto c = connection { e };
  c.connect(L"Driver=SQL Server Native Client 11.0;Server=...");
}

What about statements? The sql_allocate_handle function template again comes in handy, and I’ll just add another method to my connection class:

auto create_statement()
{
  return sql_allocate_handle<statement>(SQL_HANDLE_STMT,
                                        m_handle.get());
}

Statements are created in the context of a connection. Here you can see how the connection is the parent for the statement object. Back in my main function, I can create a statement object quite simply:

auto s = c.create_statement();

ODBC provides a relatively simple function for executing SQL statements, but I’ll again wrap it up for convenience:

auto execute(statement const & s,
             wchar_t const * text)
{
  auto const r = SQLExecDirect(s.get(),
                               const_cast<wchar_t *>(text),
                               SQL_NTS);
  // TODO: check result here ...
}

ODBC is an extremely old C-style API so it doesn’t use const, not even conditionally for C++ compilers. Here, I need to cast away the “const-ness” so the caller is shielded from this const-ignorance. Back in my main function, I can execute SQL statements quite simply:

execute(s, L"create table Hens ( ... )");

But what if I execute a SQL statement that returns a result set? What if I execute something like this:

execute(s, L"select Name from Hens where Id = 123");

In that case, the statement effectively becomes a cursor and I need to fetch the results, if any, one at a time. That’s the role of the SQLFetch function. I might want to know whether a hen with the given Id exists:

if (SQL_SUCCESS == SQLFetch(s.get()))
{
  // ...
}

On the other hand, I might execute a SQL statement that returns multiple rows:

execute(s, L"select Id, Name from Hens order by Id desc");

In that case, I can simply call the SQLFetch function in a loop:

while (SQL_SUCCESS == SQLFetch(s.get()))
{
  // ...
}

Getting the individual column values is what the SQLGetData function is for. This is another generic function, and you need to precisely describe the information you expect as well as the buffer where you expect it to copy the resulting value. Retrieving a fixed-size value is relatively straightforward. Figure 2 shows a simple function to retrieve a SQL int value.

Figure 2 Retrieving a SQL Integer Value

auto get_int(statement const & s,
             short const column)
{
  auto value = int {};
  auto const r = SQLGetData(s.get(),
                            column,
                            SQL_C_SLONG,
                            &value,
                            0,
                            nullptr);
  // TODO: check result here ...
  return value;
}

The first parameter in SQLGetData is the statement handle, the second is the one-based column index, the third is the ODBC type for a SQL int and the fourth is the address of the buffer that will receive the value. The second-to-last parameter is ignored because this is a fixed-size data type. For other data types, this would indicate the size of the buffer on input. The final parameter provides the actual length or size of the data copied into the buffer. Again, this isn’t used for fixed-size data types, but this parameter may also be used to return status information such as whether the value was null. Retrieving a string value is only slightly more complicated. Figure 3 shows a class template that will copy the value into a local array.

Figure 3 Retrieving a SQL String Value

template <unsigned Count>
auto get_string(statement const & s,
                short const column,
                wchar_t (&value)[Count])
{
  auto const r = SQLGetData(s.get(),
                            column,
                            SQL_C_WCHAR,
                            value,
                            Count * sizeof(wchar_t),
                            nullptr);
  sql_check(r, SQL_HANDLE_STMT, s.get());
}

Notice how in this case I need to tell the SQLGetData function the actual size of the buffer to receive the value, and I need to do so in bytes, not characters. If I queried for the name of a particular hen and the Name column holds a maximum of 100 characters, I might use the get_string function, as follows:

if (SQL_SUCCESS == SQLFetch(s.get()))
{
  wchar_t name[101];
  get_string(s, 1, name);
  TRACE(L"Hen’s name is %s\n", name);
}

Finally, while I can reuse a connection object to execute multiple statements, once the statement object represents a cursor, I need to be sure to close the cursor before executing any sub­sequent statements:

VERIFY_(SQL_SUCCESS, SQLCloseCursor(s.get()));

Ironically, this isn’t a resource management issue. Unlike the challenges with open connections, the SQLFreeHandle function doesn’t care if the statement has an open cursor.

I’ve avoided talking about error handling up until now because it’s a complex topic in its own right. ODBC functions return error codes, and it’s your responsibility to check the value of these return codes to determine whether the operation succeeded. Usually the functions will return the SQL_SUCCESS constant indicating success, but they can also return the SQL_SUCCESS_WITH_INFO constant. The latter is equally successful but implies further diagnostic information is available if you wish to retrieve it. Typically, only in debug builds do I retrieve the diagnostic information when the SQL_SUCCESS_WITH_INFO constant is returned. This way I can gather as much information as possible in development and not waste cycles in production. Of course, I always gather this information when an error code is actually returned. Regardless of the cause, the process by which the diagnostic information is retrieved is the same.

ODBC provides diagnostic information as a result set and you can retrieve the rows one at a time with the SQLGetDiagRec function and a one-based row index. Just make sure to call it with the handle of the object that produced the error code in question.

There are three principal bits of information in each row: a native error code specific to the ODBC data source or driver; a short, cryptic, five-character state code that defines the class of error to which this record might refer; and a longer textual description of the diagnostic record. Given the necessary buffers, I can simply call the SQLGetDiagRec function in a loop to retrieve them all, as shown in Figure 4.

Figure 4 Retrieving Diagnostic Error Information

auto native_error = long {};
wchar_t state[6];
wchar_t message[1024];
auto record = short {};
while (SQL_SUCCESS == SQLGetDiagRec(type,
                                    handle,
                                    ++record,
                                    state,
                                    &native_error,
                                    message,
                                    _countof(message),
                                    nullptr))
{
  // ...
}

Windows Azure along with SQL Server provides an amazingly simple way to get started with hosted databases. This is particularly compelling as the SQL Server database engine is the same one C++ developers have known and used for years. While OLE DB has been scrapped, ODBC is more than up to the task and, in fact, is simpler and faster than OLE DB ever was. Of course, it takes a bit of help from C++ to make it all come alive in a coherent way.

Check out my Pluralsight course, “10 Practical Techniques to Power Your Visual C++ Apps” (bit.ly/1fgTifi), for more information about using Visual C++ to access databases on Windows Azure. I provide step-by-step instructions to set up and use database servers and bind columns to simplify the process of fetching rows of data, examples of how to simplify and modernize the error-handling process, and much more.


Kenny Kerr is a computer programmer based in Canada, as well as an author for Pluralsight and a Microsoft MVP. He blogs at kennykerr.ca and you can follow him on Twitter at twitter.com/kennykerr.