Azure Insider - Migrating Database Workloads to the Cloud
One of the biggest trends in IT today is moving database workloads to the cloud. We frequently get questions about the issues involved when migrating relational database workloads into Windows Azure, so we decided to reach out to various product teams and practitioners in the field to get some concrete answers to these challenges. There’s a spectrum of offerings available that must be carefully considered prior to migration, and a host of issues including cost, compatibility, flexibility, maintenance, compliance and scale, to name a few.
This month’s column is about moving your SQL Server database to the public cloud. This means we won’t discuss the private cloud or other on-premises options. Naturally, beyond just the data, there are a number of issues to consider when moving an entire application to the cloud, such as caching, identity and legacy code.
Many customers are confused about all the options in the cloud, especially when it comes to hosting relational data. After all, the Microsoft relational database story is big and comprehensive. In this month’s column, we want to demystify your options when it comes to SQL Server and data storage with the Windows Azure platform.
Most IT decisions are based on cost and value. A good rule of thumb to remember is as you increase the abstraction from the hardware and start to virtualize your technologies, efficiencies are gained, resulting in lower provisioning and maintenance costs. However, there’s a trade-off when you virtualize hardware—you might lower costs but at the same time decrease the level of compatibility with existing on-premises databases, because you have less control over the configuration of the database with respect to its underlying hardware. One of our goals here is to describe those trade-offs so that you can make better decisions.
Figure 1 highlights some of the options. There are at least four ways you can leverage SQL Server, as the numbered circles indicate. Notice that both cloud-based and on-premises solutions are depicted. The top half of Figure 1 describes the public cloud. If you drill further into the public cloud, you’ll see it consists of two main pillars: Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). The bottom half of the figure illustrates the options with respect to on-premises SQL Server. You can either host your own private cloud or follow the traditional method and run SQL Server on raw, physical hardware, free of any virtualization technology.
Figure 1 SQL Server Hosting Options
The Windows Azure Portal
We’ll leverage three Windows Azure components to deploy our on-premises database to the cloud, illustrating both the IaaS and PaaS models: Virtual Machines, or VMs (IaaS); Windows Azure SQL Database (PaaS); and Storage. They can be accessed through the management portal as shown in Figure 2. The first two are fairly obvious, but the Storage section might be a surprise. Windows Azure Storage is needed to hold the BACPAC file, which contains the database schema and the data stored in the database. You can create this file with SQL Server Management Studio (SSMS); it’s the logical equivalent of a database backup. SSMS will conveniently place this BACPAC file in Windows Azure Storage, which is an ideal location because it can be imported directly from the cloud versions of SQL Server, as seen in options 3 and 4 in Figure 1.
Figure 2 The Windows Azure Management Portal
Referring back to Figure 1, you can see that SQL Server maps to four main use cases. As you move through the four scenarios in order, efficiencies are gained, resulting in lower costs. However, the options correspondingly impact your control over hardware and configuration, affecting the level of compatibility as you deploy on-premises databases to the cloud.
Old School, Raw Iron Scenario No. 1 is about running SQL Server using traditional approaches, which means it isn’t virtualized and can be highly customized by IT personnel and DBAs. Naturally, because this puts IT staff in full control of exactly how everything is configured and how it functions, there’s significantly more work in provisioning, maintaining, and scaling the databases. Preparing the cluster for high availability (HA) and performing backup and restore are just two of the many resource-intensive responsibilities that must be managed by a DBA.
On-Premises, Private Cloud Scenario No. 2 depicts SQL Server deployed to a private cloud, which leverages virtualization to optimize the use of hardware resources. This technology allows you to pool SQL Server databases using Windows Server 2012 Hyper-V for efficient use of compute, network and storage. It lets you support an elastic infrastructure for your databases, so you can scale up and down more effectively. From a management perspective, this option provides built-in self-service capabilities, so you can provision databases using Microsoft System Center 2012. However, you’re still responsible for procuring the hardware and keeping it up-to-date with software patches.
Windows Azure SQL Server Database VM Windows Azure SQL Server for VMs (IaaS) makes it possible for you to host your cluster in a Microsoft datacenter. The key advantage this public cloud-hosted option offers is a high level of compatibility with existing on-premises SQL Server installations. Because this option supports a full-featured version of SQL Server, you’re able to leverage HA, transparent data encryption, auditing, business intelligence (BI) such as analysis services and reporting services, distributed transactions, support for active directory (Active Directory Federation Services, or AD FS 2.0), and more. Moreover, you can build your own VM or leverage a template from the Windows Azure image gallery (described later). Finally, this offering allows you to build your own virtual private networks (VPNs) easily, bridging your cloud resources to your on-premises network and providing a seamless experience that blurs the lines between the massive Microsoft datacenters and your internal servers.
SQL Server for VMs also supports the new Always-On feature, which lets you have up to five complete copies of a database maintained by each of the participating instances. Each SQL Server can host the database on its local storage, or at some other location. This provides a cloud-hosted, enterprise-level alternative to database mirroring.
Another advantage of Windows Azure VMs is that they allow you to leverage the capabilities of Windows Azure Storage, which means that the OS and disk drives are automatically persisted in Windows Azure by default, providing three automatic copies of your data inside the datacenter and allowing the optional use of geo-replication. In short, leveraging Windows Azure SQL Server for VMs makes it possible to migrate your on-premises database applications to Windows Azure, and minimize, if not eliminate, the need to modify the application and the database.
Windows Azure SQL Database To clear up any confusion about terminology, be aware that Windows Azure SQL Database used to be called SQL Azure. Windows Azure SQL Database is both a subset and a superset of the traditional SQL Server, and is hosted in the cloud as a service.
Because Windows Azure SQL Database is indeed a database as a service, there are some real advantages. First, it can be fantastically economical—some companies have reduced their database costs by more than 90 percent by using Windows Azure SQL Database. Second, it provides self-managing capabilities, enabling organizations to provision data services for applications throughout the enterprise without adding to the support burden of the central IT department. Third, the service replicates three copies of your data, and in the case of a hardware failure, it provides automatic failover.
But there are limitations. The maximum database size is 150GB, though this can be increased through sharding (the horizontal partitioning of data). This approach is also subject to more latency issues, due to the fact that it runs on a shared infrastructure in the Microsoft datacenter. Another drawback is that you must expect transient faults and program your code accordingly. Finally, Windows Azure SQL Database represents a subset of SQL Server, meaning that some features, such as XML, system stored procedures, distributed transactions, synonyms, CLR procedures, Full Text Search and the ability to have linked servers are not supported. Because of these limitations, migrating to Windows Azure SQL Database may not work for some scenarios. As you can see, this option isn’t as compatible with on-premises SQL Server instances as the previous option, Windows Azure SQL Server Database VM.
When migrating your database from an on-premises solution to a public cloud, there are several aspects to consider. One important factor is performance, which takes into account CPU utilization, disk I/O, memory constraints, and network throughput and latency. You also need to think about how much disk space is needed as well as the number of VMs. Network topology is a concern if database applications require connectivity to on-premises resources, potentially requiring you to establish a VPN, which is another offering of the Windows Azure platform. Compliance and security are also extremely important, especially when it comes to customer data—specifically the location, transfer and handling of data. There are significant liabilities with respect to medical history, test and laboratory results, and insurance information, to name just a few.
Migrating a SQL Server Database to a Windows Azure VM
Although Microsoft has released a community technology preview (CTP) version of a deployment wizard, we’ll take a more manual approach that gives a better idea of what happens during the deployment process. We’ll essentially create a BACPAC file from our on-premises database and then import it into our Windows Azure SQL Server VM.
Before creating a BACPAC file, we recommend two preliminary steps. First, disconnect all users from the database and perform a traditional backup. This preserves the integrity of the transaction log and ensures the database is deployed correctly and in its entirety. We absolutely discourage the creation of a BACPAC file from a live production database. BACPAC files should be created from a backup image. Next, create a Windows Azure SQL Server VM. You can create your own VM and upload it if you wish, but an easier path is to leverage the Windows Azure image gallery that exists at the Windows Azure Management Portal. It allows you to simply choose from a list of available VM images.
To start, sign in to the Windows Azure Management Portal. On the command bar, click New | Virtual Machine | From Gallery. Figure 3lists the various flavors of SQL Server and Windows Server from which you can choose.
Figure 3 The Image Gallery for Windows Azure Virtual Machines
After you select an image from the gallery, you can choose from a list of hardware options. Figure 4 shows that at the high end you can choose 8 cores and 56GB of RAM.
Figure 4 Available Virtual Machine Configurations
Export a BACPAC of Your On-Premises Database First, keep in mind that this process is specific to SQL Server 2012.
- To create a BACPAC file, start SSMS and connect to the on-premises instance.
- Next, in Object Explorer, expand the node for the instance from which you want to export the database and create the resulting BACPAC.
- Right-click the database name, click Tasks, then select Export Data-tier Application. A wizard will appear and you’ll be asked for the storage destination of the BACPAC file.
The great thing here is that the wizard will let you store the BACPAC file in Windows Azure Storage in a Microsoft datacenter, which can dramatically simplify the import process from the Windows Azure SQL Server VM. If you don’t have a storage account, you’ll need to go to the Windows Azure Management Portal and create one first. Make sure to record the storage account name and management key associated with it, as they’re required by the wizard. Assuming you encounter no errors, you’re now finished creating the BACPAC file. Note that this process can take from several minutes to several hours depending on the size of your database. Moreover, as mentioned previously, you need to be aware that it will disrupt service to existing users who are connected, so you should first do a traditional backup before creating a BACPAC file.
The BACPAC file is just a blob in Windows Azure Storage. Once you’ve successfully created it, you can start Visual Studio and view the file using the Server Explorer (or similar tooling that lets you browse Windows Azure Storage). The URL for your BACPAC file will look something like this: https://[your-storage-account-name].blob.core.windows.net/[your-container-name]/[your-database-name].bacpac.
Remoting into Your SQL Server Database VM You now need to remote into the SQL Server Database VM you created previously. Note that the login credentials will not be your Windows Azure subscription credentials, but rather the credentials you specified when you created the VM. In our case, the login name was “msdn-vm\the-admin” plus the password we entered.
Importing the BACPAC File from Windows Azure Storage Once you’re logged into the VM, you can import the BACPAC file into the cloud-hosted VM.
- Start SSMS, then connect to the local instance of SQL Server.
- In Object Explorer, right-click on Databases and then select the Import Data-tier Application menu item to launch the wizard. The wizard is similar to the one used to create the BACPAC for the on-premises database. Once again, enter the storage account details, such as the name of the storage account and the management key. The wizard will display the container and the name of the BACPAC file.
- Click on Next to complete the import database process. Figure 5 illustrates a successful import of the database into the cloud-hosted VM in a Microsoft datacenter.
Figure 5 Successful Import of MarketIndexData
Migrating a SQL Server Database to a Windows Azure SQL Database
As noted earlier, Microsoft has a PaaS database offering known as Windows Azure SQL Database. To import the BACPAC file (the packaged on-premises database you just created), log in to the Windows Azure Management Portal. Select SQL Databases, then Import from the Command Window. You’ll be asked to enter the URL for the BACPAC file residing in Windows Azure storage as a blob file, as seen in Figure 6. This is the same URL used previously for the BACPAC file. This is one of the great advantages of BACPAC files: they can be used for importing both Windows Azure SQL Database and Windows Azure SQL Server Database VMs.
Figure 6 Importing the BACPAC File
When the import is complete, you’ll be able to open the database in SSMS, just as we did with the Windows Azure SQL Server Database VMs. In fact, SSMS works with all four options previously presented. You’ll just need to get the server name from the Windows Azure Management Portal for the database, and you’ll need to enable specific client IP addresses at the portal. You’ll find more information at the Tools and Utilities Support (Windows Azure SQL Database) page at bit.ly/Vh8jGg.
The Windows Azure platform has been evolving quickly, especially in the area of VMs and SQL Server database technology. Customers are aggressively migrating their databases to the cloud in an effort to reduce costs, as well as to leverage other resources, such as storage, caching, identity, VPNs, messaging and more. While Windows Azure SQL Database can offer tremendous cost advantages, it doesn’t provide the level of compatibility required by many enterprise customers. In future releases of SSMS (2014), deployment wizards are expected to automate the manual process illustrated in this month’s column. When these wizards do arrive, the lessons learned here will ensure you know how they work and what they do.
Bruno Terkaly is a developer evangelist for Microsoft. His depth of knowledge comes from years of experience in the field, writing code using a multitude of platforms, languages, frameworks, SDKs, libraries and APIs. He spends time writing code, blogging and giving live presentations on building cloud-based applications, specifically using the Windows Azure platform. You can read his blog at blogs.msdn.com/b/brunoterkaly.
Ricardo Villalobos is a seasoned software architect with more than 15 years of experience designing and creating applications for companies in the supply chain management industry. Holding different technical certifications, as well as a master’s degree in business administration from the University of Dallas, he works as a platform evangelist in the Globally Engaged Partners DPE group for Microsoft. You can read his blog at blog.ricardovillalobos.com.
Terkaly and Villalobos jointly present at large industry conferences. They encourage readers of Windows Azure Insider to contact them for availability. Reach them at email@example.com or Ricardo.Villalobos@microsoft.com.
Thanks to the following technical expert for reviewing this article: Robin Shahan
Robin Shahan is a Microsoft Windows Azure MVP with over 25 years of experience developing complex, business-critical applications. As the VP of Technology for GoldMail, she migrated their entire infrastructure to Windows Azure in 2 months, reducing their infrastructure costs by 90%. She is now a Windows Azure consultant. You can follow @RobinDotNet on twitter.