All applications use data. Even the simplest "Hello, World!" program displays data. Most applications also need to store data somewhere. In the world of business systems, this data might be a list of the products and services that the business sells, together with information about the customers who have purchased these items, and the details of the various orders that they have placed.

In theory, a business system can store information by using any convenient data storage technology, but since the early 1970s a sizable majority of solutions have based their storage requirements around a relational database. Relational database technology was firmly grounded in the single-site enterprise solutions of the 1970s, 80s, and 90s. However, since the turn of the millennium, many large organizations have looked to build distributed solutions using the Internet as their communications infrastructure, an environment that was never envisaged back in the 1970s. Additionally, advances in computing power and disk storage technology now enable organizations to gather, store, and process much more information than was possible 30 or 40 years ago. In this environment, issues surrounding scalability, throughput, responsiveness, and the flexibility to handle complex, dynamic relationships between data items have become paramount. Many relational database management systems do not sit comfortably in this setting, and struggle in the face of ever-expanding connectivity and potentially massive datasets.

This situation has led several large organizations to forgo relational databases and instead implement their own custom solutions, specifically designed to meet these requirements. A number of these database solutions have been openly documented, and the software for them is available in the public domain; organizations can freely download the software and integrate it into their own systems. Collectively, these solutions are referred to as "NoSQL" databases.

There is an ever-expanding range of NoSQL databases currently available, covering a variety of different data formats and structures. In most cases, a NoSQL database is designed to provide efficient data storage and access to support a specific pattern of use, such as managing highly connected networks of objects with complex interrelationships, storing and accessing documents with many queryable fields, or providing fast access to blobs containing almost anything. As a result, an organization is likely to find that no single NoSQL database meets all of its requirements, and it might be necessary to incorporate more than one such database into its solutions. Organizations seeking to use a NoSQL database are therefore faced with a twofold challenge:

  • Which NoSQL database(s) best meet(s) the needs of the organization?
  • How does an organization integrate a NoSQL database into its solutions?

This guide focuses on the most common types of NoSQL database currently available, describes the situations for which they are most suited, and shows examples of how you might incorporate them into a business application. The guide summarizes the experiences of a fictitious organization named Adventure Works, which implemented a solution that comprised an assortment of different databases. The architecture of the solution is based on a web service that connects to each of the databases. The rationale that the developers at Adventure Works adopted was to select the data storage technology that best met the specific business requirements of each part of the application. The result is a polyglot solution, with various parts of the data held in different databases, but combined by the logic in the web service.
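
As a rough illustration of the polyglot idea, the following Python sketch shows a single service combining data held in three separate stores. It is not taken from the Adventure Works sample code: the class name ShoppingService, the method cart_view, and the use of plain dictionaries in place of real database clients are all assumptions made for this example.

```python
"""Sketch of a polyglot facade: one service, several stores (in-memory stand-ins)."""
from dataclasses import dataclass, field


@dataclass
class ShoppingService:
    # Each dict stands in for a separate database; none of these is a real client.
    carts: dict = field(default_factory=dict)            # key/value: customer id -> product ids
    catalog: dict = field(default_factory=dict)          # document: product id -> product document
    recommendations: dict = field(default_factory=dict)  # graph: product id -> related product ids

    def cart_view(self, customer_id: str) -> dict:
        """Combine data from all three stores into one response."""
        product_ids = self.carts.get(customer_id, [])
        # Look up each product in the cart in the catalog store.
        items = [self.catalog[pid] for pid in product_ids if pid in self.catalog]
        # Collect related products that are not already in the cart.
        in_cart = set(product_ids)
        suggested = []
        for pid in product_ids:
            for related in self.recommendations.get(pid, []):
                if related not in in_cart and related not in suggested:
                    suggested.append(related)
        return {"customer": customer_id, "items": items, "suggested": suggested}


service = ShoppingService(
    carts={"c1": ["p1"]},
    catalog={
        "p1": {"name": "Road bike", "price": 900},
        "p2": {"name": "Helmet", "price": 40},
    },
    recommendations={"p1": ["p2"]},
)
view = service.cart_view("c1")
print(view)
```

The caller sees one response and does not need to know which store supplied each part of it. In a real solution, each dictionary would be replaced by a client for the chosen database.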

Who This Book Is For

This book is intended for architects, developers, and information technology professionals who design, build, or maintain large-scale applications and services that store data in a database, and that have to handle requests from a large number of users. Much of the guidance in this book is intended to be generic and apply to any operating system and NoSQL database implementation. However, many of the examples shown are based on Microsoft technologies. To understand the sample code provided with this book, you should be familiar with the Microsoft .NET Framework, the Microsoft Visual Studio development system, ASP.NET MVC, Microsoft SQL Server, and the Microsoft Visual C# development language. If you wish to run the sample application in the cloud using Windows Azure, familiarity with the Windows Azure Table service is also useful. Additionally, to configure and run the polyglot version of the sample application, you should be familiar with the NoSQL technologies used by this application. For more information, see the section "What You Need to Use the Code" later in this preface.

Why This Book Is Pertinent Now

There is currently no formal definition of what constitutes a NoSQL database, other than that it is nonrelational. Furthermore, there are currently no standard APIs that applications can use to interact with a NoSQL database; each NoSQL database offers its own library (or sometimes a set of libraries). Integrating different NoSQL databases into a single seamless, extensible solution is a challenge facing an increasing number of developers. For example, developers need to understand how to ensure consistency across different databases, and how to maintain the integrity of the relationships between data held in different databases. There are also many occasions when a relational database is a better solution than a NoSQL database. This guide aims to help you understand how to design solutions that take advantage of the most appropriate database technology.
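
To make the consistency challenge concrete, the following Python sketch shows one possible strategy: if a write to a second store fails, the application undoes the write it has already made to the first. This is a simplified illustration and not necessarily the approach described later in this guide. The function place_order, the exception StoreError, and the dictionaries standing in for two separate databases are all hypothetical.

```python
"""Sketch: keep two stores in step by undoing the first write if the second fails."""


class StoreError(Exception):
    """Raised by the stand-in logic below; hypothetical, not from a real library."""


def place_order(orders: dict, stock: dict, order_id: str, product_id: str) -> bool:
    """Record an order in one store and reserve stock in another.

    `orders` and `stock` are plain dicts standing in for two separate databases.
    Returns True if both writes succeeded, False if the order was rolled back.
    """
    orders[order_id] = {"product": product_id}   # write 1: order store
    try:
        if stock.get(product_id, 0) < 1:
            raise StoreError(f"no stock for {product_id}")
        stock[product_id] -= 1                   # write 2: inventory store
    except StoreError:
        del orders[order_id]                     # compensate: undo write 1
        return False
    return True


orders, stock = {}, {"p1": 1}
first = place_order(orders, stock, "o1", "p1")   # succeeds, stock drops to 0
second = place_order(orders, stock, "o2", "p1")  # fails, order "o2" is removed
```

Because the two dictionaries represent independent databases, no single transaction spans both writes, so the application itself has to restore a consistent state. A production system would also need to handle the case where the compensating step fails.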

How This Book Is Structured

This is the road map of the guide.

Guide road map

Chapter 1, "Data Storage for Modern High-Performance Business Applications"

This chapter provides an overview of the common challenges that organizations have encountered with the relational model and relational database technology, and discusses how NoSQL databases can help to address these challenges.

Chapter 2, "The Adventure Works Scenario"

This chapter describes the business requirements of the Adventure Works Shopping application, and summarizes the architecture of the solution that Adventure Works built, based on web services and a combination of SQL and NoSQL databases.

Chapter 3, "Implementing a Relational Database"

This chapter describes how to design a relational database, and summarizes the conflicting requirements that you may need to consider to implement a database that supports efficient transactions and fast queries.

Chapter 4, "Implementing a Key/Value Data Store"

This chapter describes the principles that underpin most large-scale key/value stores, and summarizes the concerns that you should address to use a key/value store for saving and querying data quickly and efficiently.

Chapter 5, "Implementing a Document Database"

This chapter describes the primary features of common document databases, and summarizes how you can design documents that take best advantage of these features to store and retrieve structured information in an optimal manner.

Chapter 6, "Implementing a Column-Family Database"

This chapter provides information on how to design the schema for a column-family database to best meet the needs of applications that perform column-centric queries.

Chapter 7, "Implementing a Graph Database"

This chapter describes how to design a graph database to support the analytical processing performed by an application.

Chapter 8, "Building a Polyglot Solution"

This chapter focuses on the challenges faced by developers building a business solution with the data spanning different types of databases. It summarizes the major concerns of a polyglot solution, and describes some strategies for addressing these concerns.

This guide also includes appendices that describe how the sample application works, and why the developers at Adventure Works selected the various databases that the application uses.

What You Need to Use the Code

These are the system requirements for building and running the sample solution:

  • Microsoft Windows 7 with Service Pack 1, Microsoft Windows 8, Microsoft Windows Server 2008 R2 with Service Pack 1, or Microsoft Windows Server 2012 (32-bit or 64-bit editions)
  • Microsoft Internet Information Services (IIS) 7.0 or later
  • Microsoft Visual Studio 2012 Ultimate, Premium, or Professional edition
  • Visual Studio 2012 Update 2
  • Windows Azure SDK for .NET, version 2.0 or later (includes the Windows Azure Tools for Visual Studio)
  • Microsoft SQL Server 2012, or SQL Server Express 2012
  • (Optional) If you wish to deploy the application to Windows Azure, you will also need a Windows Azure subscription

If you wish to configure the solution to use the Windows Azure Table service in the cloud as a key/value store for holding shopping cart information, you must have a valid Windows Azure subscription.

To install and run the polyglot database solution, you must have the following additional software installed and configured:

  • MongoDB (version 2.2.3 or later), if you wish to store the product catalog in a document database.
  • Neo4j (version 1.8 or later), if you wish to store product recommendations by using a graph database.

You can download the sample code from the Microsoft Download Center at

Who's Who?

This book uses a sample application that illustrates integrating applications with the cloud. A panel of experts comments on the development efforts. The panel includes a cloud specialist, a software architect, a software developer, and a database professional. The delivery of the sample application can be considered from each of these points of view. The following table lists these experts.


Jana is a software architect. She plans the overall structure of an application. Her perspective is both practical and strategic. In other words, she considers the technical approaches that are needed today and the direction a company needs to consider for the future.

“It's not easy to balance the needs of the company, the users, the IT organization, the developers, and the technical platforms we rely on.”


Markus is a senior software developer. He is analytical, detail oriented, and methodical. He's focused on the task at hand, which is building a great cloud-based application. He knows that he's the person who's ultimately responsible for the code.

“For the most part, a lot of what we know about software development can be applied to different environments and technologies. But, there are always special considerations that are very important.”


Poe is a database specialist. He is an expert on designing and deploying databases, whether they are relational or nonrelational. Poe has a keen interest in practical solutions; after all, he's the one who gets paged at 03:00 when there's a problem.

“Implementing databases that are accessed by thousands of users involves some big challenges. I want to make sure our databases perform well, are reliable, and are secure. The reputation of Adventure Works depends on how users perceive the applications that access our databases.”


Bharath is a cloud specialist. He checks that a cloud-based solution will work for a company and provide tangible benefits. He is a cautious person, for good reasons.

“The cloud provides a powerful environment for hosting large-scale, well-connected applications. The challenge is to understand how to use this environment to its best advantage to meet the needs of your business.”

If you have a particular area of interest, look for notes provided by the specialists whose interests align with yours.

Where to Go for More Information

There are a number of resources listed in the text throughout the book. These resources provide additional background, bring you up to speed on various technologies, and so forth. For your convenience, there is a bibliography online that contains all the links so that these resources are just a click away.

You can find the bibliography on MSDN at: