April 2009

Volume 24 Number 03

Windows With C++ - The Virtual Disk API in Windows 7

By Kenny Kerr | April 2009

This article is based on a prerelease version of Windows 7. All information herein is subject to change.

Contents

The VHD Format
Creating Virtual Disks
Opening Virtual Disks
Attaching Virtual Disks
Querying Virtual Disks
What's Next

As I write this, the Windows 7 beta has been available for a few days, and I must say there is a lot to like. As usual, I took a look under the hood to see what is new in the Windows SDK. Windows 7 is very much a minor release as far as the SDK is concerned, and that's a good thing. The fundamentals of writing native C++ applications for Windows 7 have not changed much compared to the way they changed for Windows Vista. Having said that, however, Windows 7 has some completely new features that are sure to interest anyone looking to take advantage of the platform.

One of these features is the Virtual Disk API. Although designed with other formats in mind, the Virtual Disk API in the Windows 7 beta is very much geared toward the Microsoft Virtual Hard Disk (VHD) format popularized by virtualization products from Microsoft such as Hyper-V and Virtual PC.

This really takes me back because I've been involved with virtualization for quite a few years. When I first started working with virtualization technology almost ten years ago, VMware was the clear leader. Then in 2003 Microsoft acquired virtual machine technology from Connectix, and everything changed. All of a sudden there were two big players. It was clear the VHD format that Microsoft acquired from Connectix was well suited to the direction that Microsoft wanted to take with virtualization, namely to turn it into a platform. Whereas the VMware virtual disk formats were proprietary and very convoluted, often changing wildly from one release to the next, the VHD format was from the start straightforward and flexible enough to stand the test of time. In the intervening years, the VHD format has proven itself again and again, having been adopted by other products and technologies within Microsoft and by other software companies, big and small.

For this reason I'm happy to see that Windows 7 supports the VHD format natively. This means that users and administrators can easily create and attach virtual disks as if they were additional physical storage devices, without installing any third-party drivers or tools. You can, for example, use the Disk Management MMC (Microsoft Management Console) snap-in or the DISKPART command-line tool to create and attach virtual disks. You can then partition, format, and use them like any other hard disk on your computer.

In addition, you also gain very granular control over the creation and management of virtual disks. At the heart of these capabilities is VirtDisk.dll, which provides a low-level C API, known as the Virtual Disk API, for creating and manipulating virtual disks. This API provides access to a number of kernel-mode drivers that are needed to actually represent the disk and its volumes in the storage subsystem so that file system drivers can be layered on top without requiring any knowledge of the actual source of the storage.

Disk management tools typically interact with virtual disks via the Virtual Disk Service (VDS), which itself relies on the Virtual Disk API for handling VHD-based storage. Of course Hyper-V has its own Windows Management Instrumentation (WMI)–based API for creating and manipulating virtual machines, and it too relies on the Virtual Disk API.

Before I dive in and show how you can use the Virtual Disk API, I'll quickly describe the basics of the VHD format. You'll see that given this new API, you can now throw away much of the code you needed previously for managing virtual disks.

The VHD Format

The VHD format provides three different image types: fixed, dynamic, and differencing disks. All three disk types include a 512-byte VHD footer residing at the end of the disk file. More on this in a minute.

Fixed disks are the simplest type and provide the best performance because the disk file is fully allocated to accommodate the size requested when the disks are created. A 500-MB fixed disk will be exactly 500 x 1024 x 1024 + 512 bytes in size. Because the footer is at the end of the disk, the disk's storage can align with the beginning of the file to ensure the simplest and fastest possible random access.
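The arithmetic above can be captured in a few lines. This is only a sketch: the helper names are hypothetical and are not part of the Virtual Disk API, and the code assumes nothing beyond the 512-byte footer described in the text.

```cpp
#include <cassert>
#include <cstdint>

// Size of the footer that every VHD carries at the end of the file.
constexpr std::uint64_t kVhdFooterSize = 512;

// Expected size of the file backing a fixed disk of the given capacity.
// Hypothetical helper for illustration only.
constexpr std::uint64_t FixedVhdFileSize(std::uint64_t diskSizeInBytes)
{
    return diskSizeInBytes + kVhdFooterSize;
}

// Maps a byte offset within the virtual disk to an offset within the file.
// For a fixed disk the two are the same because the data starts at byte 0
// and the footer sits after it.
inline std::uint64_t FixedVhdFileOffset(std::uint64_t diskOffset,
                                        std::uint64_t diskSizeInBytes)
{
    assert(diskOffset < diskSizeInBytes);
    return diskOffset;
}
```

For the 500-MB example, FixedVhdFileSize(500ull * 1024 * 1024) yields 524,288,512 bytes.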

Dynamic disks are called sparse disks by the Virtual Disk API, and sparse is a good way to think of them. The disk files are initially created with just enough space to store the VHD footer as well as some additional metadata that is needed to manage the dynamic nature of the storage allocation. As data is written to the disk, additional blocks of storage are allocated at the end of the disk file. Dynamic disks are advantageous because they take up far less space if not used at full capacity, which is the case most of the time. The disadvantage is that the additional indirection and metadata management required to read, write, and grow a dynamic disk on demand takes its toll on performance. For this reason, dynamic disks are favored in testing scenarios, whereas fixed disks are preferred in production scenarios because performance is a high priority.
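To make the allocate-on-write behavior concrete, here is a toy in-memory model of a sparse disk. It is a sketch under simplifying assumptions: the class and its members are hypothetical, the table is a plain array rather than the real VHD metadata, and nothing here touches the Virtual Disk API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Marker for a block that has not been allocated yet.
constexpr std::size_t kUnallocated = static_cast<std::size_t>(-1);

// Toy model of a sparse (dynamic) disk. It does not reproduce the real
// on-disk layout; it only shows storage growing as blocks are first written.
class SparseDiskModel
{
public:
    SparseDiskModel(std::size_t blockCount, std::size_t blockSize)
        : m_table(blockCount, kUnallocated), m_blockSize(blockSize)
    {
        assert(blockSize > 0);
    }

    // Writes one byte, allocating the containing block at the end of the
    // backing store if this is the first write to that block.
    void Write(std::size_t offset, std::uint8_t value)
    {
        const std::size_t block = offset / m_blockSize;
        assert(block < m_table.size());
        if (kUnallocated == m_table[block])
        {
            m_table[block] = m_store.size();
            m_store.resize(m_store.size() + m_blockSize, 0);
        }
        m_store[m_table[block] + offset % m_blockSize] = value;
    }

    // Reads one byte; blocks that were never written read back as zero.
    std::uint8_t Read(std::size_t offset) const
    {
        const std::size_t block = offset / m_blockSize;
        assert(block < m_table.size());
        if (kUnallocated == m_table[block]) return 0;
        return m_store[m_table[block] + offset % m_blockSize];
    }

    // Bytes consumed by the blocks allocated so far.
    std::size_t AllocatedBytes() const { return m_store.size(); }

private:
    std::vector<std::size_t> m_table;   // block index -> offset in m_store
    std::vector<std::uint8_t> m_store;  // grows as blocks are allocated
    std::size_t m_blockSize;
};
```

Every read and write goes through the table lookup first, which is the extra indirection the text refers to when it mentions the performance cost.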

Differencing disks are very similar to dynamic disks internally, but other characteristics are very different. As with dynamic disks, differencing disks use dynamically allocated blocks, but those blocks contain only modifications related to a parent disk. This type of disk is therefore dependent on a parent disk to function. The parent of a differencing disk can be either a fixed or a dynamic disk. The parent can, in fact, be another differencing disk, allowing a chain or tree of disks to be created. Disks that represent leaf nodes can be freely written to, but nonleaf nodes should be considered read-only because any descendant differencing disks rely on them to fill in the blanks, so to speak, and if they were to change it would very likely result in corrupt virtual disks.
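The way a read is resolved through a chain of disks can be sketched with a small model. The DiskModel type and ReadBlock function below are hypothetical and deliberately simplified (blocks are strings keyed by index); they are meant to illustrate the parent fallback, not the actual VHD implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model: each disk stores only the blocks written to it and may point
// at a parent. A base (fixed or dynamic) disk has no parent.
struct DiskModel
{
    const DiskModel* parent = nullptr;
    std::map<std::size_t, std::string> blocks;  // block index -> contents
};

// Resolves a block by walking from the leaf toward the root and returning
// the first copy found. An empty string means no disk in the chain holds it.
inline std::string ReadBlock(const DiskModel& leaf, std::size_t index)
{
    for (const DiskModel* disk = &leaf; nullptr != disk; disk = disk->parent)
    {
        assert(disk != disk->parent);  // a disk cannot be its own parent
        const auto found = disk->blocks.find(index);
        if (found != disk->blocks.end()) return found->second;
    }
    return std::string();
}
```

In this model, changing a block in the parent silently changes what every descendant reads for that block unless the descendant has its own copy, which is why nonleaf disks should be treated as read-only.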

As I mentioned earlier, the different types of disks share a common footer. This footer contains information such as the logical size of the disk, disk geometry, the type of disk, the disk's globally unique identifier, and so on. For dynamic and differencing disks the footer also includes an offset value indicating where the dynamic header resides relative to the beginning of a disk. This secondary structure chiefly contains information that might be required to locate and identify a parent disk, if necessary, as well as another offset indicating where the disk's block allocation table (BAT) resides. This table indicates which blocks have been allocated as well as their absolute offsets in the disk file.
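As an illustration of what reading the footer by hand involves, the following sketch pulls a few fields out of a 512-byte footer. The field offsets, the big-endian byte order, the "conectix" cookie, and the disk type values are assumptions taken from the published VHD image format specification rather than from this article, and the function and type names are hypothetical. The sketch does not verify the footer checksum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kFooterSize = 512;

// A few of the footer fields mentioned in the text.
struct VhdFooterSummary
{
    std::uint64_t dataOffset;   // where the dynamic header resides (dynamic and differencing disks)
    std::uint64_t currentSize;  // logical size of the disk in bytes
    std::uint32_t diskType;     // assumed values: 2 = fixed, 3 = dynamic, 4 = differencing
};

// Reads a big-endian unsigned integer made up of `count` bytes.
inline std::uint64_t ReadBigEndian(const unsigned char* bytes, std::size_t count)
{
    assert(count <= 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        value = (value << 8) | bytes[i];
    }
    return value;
}

// Extracts the summary fields. Returns false if the buffer does not begin
// with the expected cookie. Offsets 16, 48, and 60 are assumptions based on
// the VHD specification.
inline bool ParseVhdFooter(const unsigned char (&footer)[kFooterSize],
                           VhdFooterSummary& summary)
{
    if (0 != std::memcmp(footer, "conectix", 8)) return false;
    summary.dataOffset  = ReadBigEndian(footer + 16, 8);
    summary.currentSize = ReadBigEndian(footer + 48, 8);
    summary.diskType    = static_cast<std::uint32_t>(ReadBigEndian(footer + 60, 4));
    return true;
}
```

This is the kind of hand-written parsing code that the Virtual Disk API makes unnecessary for most applications.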

With that introduction out of the way, let's dive in and take a look at the Virtual Disk API.

Keep in mind that the API is designed to allow Microsoft to add support for additional formats in the future. In fact, support for the ISO optical disk image format has been considered but has not been added as of the Windows 7 beta. I hope that the necessary virtual disk support provider will be included in the release build. This would allow users to attach and mount ISO images as read-only hard drives.

Creating Virtual Disks

Virtual disks are represented by opaque handles, much like other system objects such as files, registry keys, and so on. The familiar CloseHandle function is even used to close a virtual disk handle. The Active Template Library's (ATL) CHandle class is a good choice for managing this resource. You can derive a "VirtualDisk" class from CHandle to wrap up some of the boilerplate code necessary to manipulate virtual disks.

Whenever you open or create a virtual disk, you need to specify the disk's storage type. This is accomplished with the VIRTUAL_STORAGE_TYPE structure. Storage types are defined for the ISO and VHD formats. The structure also specifies the vendor that provides the implementation for the particular storage type. Here's how you can identify the VHD storage type:

VIRTUAL_STORAGE_TYPE storageType =
{
    VIRTUAL_STORAGE_TYPE_DEVICE_VHD,
    VIRTUAL_STORAGE_TYPE_VENDOR_MICROSOFT
};

The CreateVirtualDisk function creates all three types of virtual disks. Along with the storage type, you must populate a CREATE_VIRTUAL_DISK_PARAMETERS structure. How this structure is populated depends on the type of virtual disk you would like to create. Many of the Virtual Disk API structures use a versioning scheme to accommodate future updates to the API. The structures begin with a member named Version followed by a union of structures. For example, here's how you can populate the common fields for creating virtual disks:

CREATE_VIRTUAL_DISK_PARAMETERS parameters =
{
    CREATE_VIRTUAL_DISK_VERSION_1
};

parameters.Version1.MaximumSize = size;
parameters.Version1.BlockSizeInBytes = CREATE_VIRTUAL_DISK_PARAMETERS_DEFAULT_BLOCK_SIZE;
parameters.Version1.SectorSizeInBytes = CREATE_VIRTUAL_DISK_PARAMETERS_DEFAULT_SECTOR_SIZE;

The size of the virtual disk is specified in bytes and must be a multiple of 512. The block size and sector size are configurable, but if you need the greatest level of compatibility across implementations, the default values are a good choice. The Version1 structure also provides a UniqueId member, but if you leave this zeroed out, the CreateVirtualDisk function will generate a GUID for you. The SourcePath member may also be specified for all disk types except differencing disks. This instructs CreateVirtualDisk to copy the contents of the source disk into the newly created virtual disk. The two do not need to be of the same type. In fact, the source disk does not even have to be a virtual disk; it can be a physical disk that you would like to copy.

Figure 1 provides the beginnings of a VirtualDisk wrapper class that includes a CreateFixed member function for creating fixed virtual disks. Note that for fixed disks you must specify the CREATE_VIRTUAL_DISK_FLAG_FULL_PHYSICAL_ALLOCATION flag. Creating a dynamic disk is exactly the same as creating a fixed disk except that you must omit this flag and can specify the CREATE_VIRTUAL_DISK_FLAG_NONE flag instead. Creating a differencing disk is also much the same. The difference there is that you must not set the size, as it is inferred from the parent disk, and you must specify the parent with the ParentPath member of the Version1 structure. The SourcePath cannot be set for differencing disks for obvious reasons.

Figure 1 Creating Fixed Disks

class VirtualDisk : public CHandle
{
public:

    DWORD CreateFixed(PCWSTR path,
                      ULONGLONG size,
                      VIRTUAL_DISK_ACCESS_MASK accessMask,
                      __in_opt PCWSTR source,
                      __in_opt PSECURITY_DESCRIPTOR securityDescriptor,
                      __in_opt OVERLAPPED* overlapped)
    {
        ASSERT(0 == m_h);          // no disk may already be open on this object
        ASSERT(0 != path);
        ASSERT(0 == size % 512);   // size must be a multiple of 512

        VIRTUAL_STORAGE_TYPE storageType =
        {
            VIRTUAL_STORAGE_TYPE_DEVICE_VHD,
            VIRTUAL_STORAGE_TYPE_VENDOR_MICROSOFT
        };

        CREATE_VIRTUAL_DISK_PARAMETERS parameters =
        {
            CREATE_VIRTUAL_DISK_VERSION_1
        };

        parameters.Version1.MaximumSize = size;
        parameters.Version1.BlockSizeInBytes = CREATE_VIRTUAL_DISK_PARAMETERS_DEFAULT_BLOCK_SIZE;
        parameters.Version1.SectorSizeInBytes = CREATE_VIRTUAL_DISK_PARAMETERS_DEFAULT_SECTOR_SIZE;
        parameters.Version1.SourcePath = source;

        return ::CreateVirtualDisk(&storageType,
                                   path,
                                   accessMask,
                                   securityDescriptor,
                                   CREATE_VIRTUAL_DISK_FLAG_FULL_PHYSICAL_ALLOCATION,
                                   0, // no provider-specific flags
                                   &parameters,
                                   overlapped,
                                   &m_h);
    }

    // The member functions shown in Figures 2 through 5 are added here.
};
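The rules for the three disk types are easy to mix up, so it can help to see them written as a check. The sketch below encodes only the constraints stated in this section. The DiskKind enumeration, the CreateRequest structure, and the helper functions are hypothetical names introduced for illustration; they are not part of the Virtual Disk API, and the real CreateVirtualDisk function may enforce further rules.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a creation request, for illustration only.
enum class DiskKind { Fixed, Dynamic, Differencing };

struct CreateRequest
{
    DiskKind kind;
    std::uint64_t size;     // MaximumSize in bytes; left at 0 for differencing disks
    const wchar_t* parent;  // ParentPath; differencing disks only
    const wchar_t* source;  // SourcePath; not allowed for differencing disks
};

// Rounds a size up to the next multiple of 512.
inline std::uint64_t RoundUpToSector(std::uint64_t size)
{
    assert(size <= UINT64_MAX - 511);  // guard against overflow
    return (size + 511) / 512 * 512;
}

// Checks a request against the rules described in the text.
inline bool IsValidCreateRequest(const CreateRequest& request)
{
    if (DiskKind::Differencing == request.kind)
    {
        // Size is inferred from the parent; a parent is required; no source.
        return 0 == request.size &&
               nullptr != request.parent &&
               nullptr == request.source;
    }
    // Fixed and dynamic disks: size is a multiple of 512 and there is no parent.
    return 0 == request.size % 512 && nullptr == request.parent;
}

// Only fixed disks use CREATE_VIRTUAL_DISK_FLAG_FULL_PHYSICAL_ALLOCATION.
inline bool NeedsFullPhysicalAllocation(DiskKind kind)
{
    return DiskKind::Fixed == kind;
}
```

A wrapper class could run a check like this before populating CREATE_VIRTUAL_DISK_PARAMETERS, so that mistakes surface as assertion failures rather than as error codes from the API.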

The CreateVirtualDisk function provides a few additional parameters that are worth mentioning. The VIRTUAL_DISK_ACCESS_MASK enumeration provides a set of flags for controlling the access that the API will grant callers through the resulting handle. Although you can specify VIRTUAL_DISK_ACCESS_ALL, this is often undesirable as it prevents you from running certain operations such as querying a virtual disk that is currently attached as a storage device. The other very useful feature is the ability to specify an OVERLAPPED structure. This is supported by a number of the Virtual Disk API functions, and as you would expect has the effect of performing the operation asynchronously. Simply provide a manual reset event and it will be signaled upon completion.

Opening Virtual Disks

The OpenVirtualDisk function can be used to open a virtual disk. As with virtual disk creation, you must provide a VIRTUAL_STORAGE_TYPE structure to identify the storage type. An OPEN_VIRTUAL_DISK_PARAMETERS structure may optionally be provided but is typically only required when manipulating differencing disk relationships.

Figure 2 provides an Open member function to add to the VirtualDisk wrapper class started in Figure 1. Opening virtual disks is typically simpler than creating them, but some of the flags and options must be used in a very particular way to allow certain maintenance operations such as merging and attaching virtual disks.

Figure 2 Opening Virtual Disks

DWORD Open(PCWSTR path,
           VIRTUAL_DISK_ACCESS_MASK accessMask,
           OPEN_VIRTUAL_DISK_FLAG flags, // OPEN_VIRTUAL_DISK_FLAG_NONE
           ULONG readWriteDepth)         // OPEN_VIRTUAL_DISK_RW_DEPTH_DEFAULT
{
    ASSERT(0 == m_h);
    ASSERT(0 != path);

    VIRTUAL_STORAGE_TYPE storageType =
    {
        VIRTUAL_STORAGE_TYPE_DEVICE_VHD,
        VIRTUAL_STORAGE_TYPE_VENDOR_MICROSOFT
    };

    OPEN_VIRTUAL_DISK_PARAMETERS parameters =
    {
        OPEN_VIRTUAL_DISK_VERSION_1
    };

    parameters.Version1.RWDepth = readWriteDepth;

    return ::OpenVirtualDisk(&storageType,
                             path,
                             accessMask,
                             flags,
                             &parameters,
                             &m_h);
}

Attaching Virtual Disks

The Windows 7 beta uses the term surface, or surfacing, to mean attaching the disk as a storage device in the operating system. This has subsequently been changed to the more obvious word attach. The AttachVirtualDisk (called SurfaceVirtualDisk in the beta) function attaches the virtual disk. The virtual disk is identified by a handle previously obtained from a call to either CreateVirtualDisk or OpenVirtualDisk. You must make sure that the handle has the appropriate access defined for it. To attach and detach a virtual disk, you must also have the SE_MANAGE_VOLUME_NAME privilege present in your token. This privilege is stripped from an administrator's token when User Account Control is in use, so you may need to elevate your application to gain access to the unrestricted token that includes this privilege.

Figure 3 provides an Attach member function to add to the VirtualDisk wrapper class. The ATTACH_VIRTUAL_DISK_FLAG (called SURFACE_VIRTUAL_DISK_FLAG in the beta) parameter is how you control the method in which the virtual disk is attached.

Figure 3 Attaching Disks

DWORD Attach(ATTACH_VIRTUAL_DISK_FLAG flags,
             __in_opt PSECURITY_DESCRIPTOR securityDescriptor,
             __in_opt OVERLAPPED* overlapped)
{
    ASSERT(0 != m_h);

    return ::AttachVirtualDisk(m_h,
                               securityDescriptor,
                               flags,
                               0, // no provider-specific flags
                               0, // no parameters
                               overlapped);
}

ATTACH_VIRTUAL_DISK_FLAG_READ_ONLY (called SURFACE_VIRTUAL_DISK_FLAG_READ_ONLY in the beta) can be specified to ensure that the attached disk is write-protected. This cannot be overridden by an attempt to make the disk writable with VDS.

ATTACH_VIRTUAL_DISK_FLAG_NO_DRIVE_LETTER (called SURFACE_VIRTUAL_DISK_FLAG_NO_DRIVE_LETTER in the beta) will prevent Windows from automatically assigning drive letters to any volumes present in the virtual disk. You are free then to mount any volumes programmatically or not at all, depending on your needs. The GetVirtualDiskPhysicalPath function can be used to identify the physical path where the virtual disk was attached.

ATTACH_VIRTUAL_DISK_FLAG_PERMANENT_LIFETIME (called SURFACE_VIRTUAL_DISK_FLAG_PERMANENT_LIFETIME in the beta) ensures that the virtual disk remains attached even after the virtual disk handle is closed. Failure to specify this flag results in the virtual disk being detached automatically when the handle is closed. To detach the virtual disk in this case, you need to call the DetachVirtualDisk function (called UnsurfaceVirtualDisk in the beta).

Querying Virtual Disks

The GetVirtualDiskInformation function allows you to query a virtual disk for different classes of information. The information is communicated using the GET_VIRTUAL_DISK_INFO structure, which uses the same version pattern used by many of the other Virtual Disk API structures. For example, to get the disk's size information you set the structure's version to GET_VIRTUAL_DISK_INFO_SIZE. The corresponding Size union member will then be populated. Figure 4 illustrates this.

Figure 4 Getting Size Information

DWORD GetSize(__out ULONGLONG& virtualSize,
              __out ULONGLONG& physicalSize,
              __out ULONG& blockSize,
              __out ULONG& sectorSize) const
{
    ASSERT(0 != m_h);

    GET_VIRTUAL_DISK_INFO info =
    {
        GET_VIRTUAL_DISK_INFO_SIZE
    };

    ULONG size = sizeof(GET_VIRTUAL_DISK_INFO);

    const DWORD result = ::GetVirtualDiskInformation(m_h,
                                                     &size,
                                                     &info,
                                                     0); // fixed size

    if (ERROR_SUCCESS == result)
    {
        virtualSize = info.Size.VirtualSize;
        physicalSize = info.Size.PhysicalSize;
        blockSize = info.Size.BlockSize;
        sectorSize = info.Size.SectorSize;
    }

    return result;
}

The GetVirtualDiskInformation function operates on a virtual disk handle, so again you need to ensure you have the appropriate access. In this case the VIRTUAL_DISK_ACCESS_GET_INFO permission is required. Since some of the information that can be obtained using GetVirtualDiskInformation is variable in length, it provides two additional parameters to specify both how much storage is initially being provided and how much was ultimately populated. For most of the information you will query the size is known ahead of time and the last parameter can be omitted, as is the case in Figure 4.

The notable exception is when querying a differencing disk for the location, or path, of a parent virtual disk. In this case you need to start by determining the amount of memory that is required with an initial call to GetVirtualDiskInformation. This call will fail with ERROR_INSUFFICIENT_BUFFER but provide you with the size of the buffer that needs to be allocated. You can then call the function a second time to actually get the information. The GET_VIRTUAL_DISK_INFO_PARENT_LOCATION version flag is used to get the parent location. It is, however, a little more complicated still. Since maintaining a reference to a parent is so critical for the operation of a differencing disk, the VHD format provides a degree of redundancy that can be exploited should the link to the parent be broken. Suffice it to say that querying for the parent location involves parsing a sequence of null-terminated strings, terminated by an empty string. This is the same layout as the REG_MULTI_SZ registry value type. Figure 5 shows you how it's done. The example uses ATL's CAtlArray collection class as well as ATL's CString class.

Figure 5 Getting Parent Location

DWORD GetParentLocation(__out bool& resolved,
                        __out CAtlArray<CString>& paths) const
{
    ASSERT(0 != m_h);

    GET_VIRTUAL_DISK_INFO info =
    {
        GET_VIRTUAL_DISK_INFO_PARENT_LOCATION
    };

    ULONG size = sizeof(GET_VIRTUAL_DISK_INFO);

    // The first call is expected to fail and report the required size.
    DWORD result = ::GetVirtualDiskInformation(m_h,
                                               &size,
                                               &info,
                                               0); // not used

    if (ERROR_INSUFFICIENT_BUFFER != result)
    {
        return result;
    }

    CAtlArray<BYTE> buffer;

    if (!buffer.SetCount(size))
    {
        return ERROR_NOT_ENOUGH_MEMORY;
    }

    GET_VIRTUAL_DISK_INFO* pInfo =
        reinterpret_cast<GET_VIRTUAL_DISK_INFO*>(buffer.GetData());

    pInfo->Version = GET_VIRTUAL_DISK_INFO_PARENT_LOCATION;

    // The second call fills the correctly sized buffer.
    result = ::GetVirtualDiskInformation(m_h,
                                         &size,
                                         pInfo,
                                         0); // not used

    if (ERROR_SUCCESS == result)
    {
        resolved = 0 != pInfo->ParentLocation.ParentResolved;
        PCWSTR path = pInfo->ParentLocation.ParentLocationBuffer;

        // Walk the sequence of null-terminated strings until the empty string.
        while (0 != *path)
        {
            paths.Add(path);
            path += paths[paths.GetCount() - 1].GetLength() + 1;
        }
    }

    return result;
}
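The string-walking loop at the end of Figure 5 can be isolated and tried without ATL or a virtual disk. The following is a minimal sketch using the standard library; SplitMultiString is a hypothetical helper name, and the sample paths in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits a sequence of null-terminated strings that is itself terminated by
// an empty string (the REG_MULTI_SZ layout) into individual strings.
inline std::vector<std::wstring> SplitMultiString(const wchar_t* buffer)
{
    assert(nullptr != buffer);
    std::vector<std::wstring> strings;

    while (L'\0' != *buffer)
    {
        strings.emplace_back(buffer);           // copies up to the next terminator
        buffer += strings.back().length() + 1;  // skip the string and its terminator
    }

    return strings;
}
```

Given a buffer containing L"C:\\base.vhd\0.\\base.vhd\0", the function returns two strings, one per candidate path. The caller is responsible for making sure the buffer really does end with an empty string, as the loop has no other way to know where to stop.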

What's Next

The Virtual Disk API provides a few more functions primarily aimed at maintaining or repairing virtual disks. The MergeVirtualDisk function allows you to merge any changes in a differencing disk back into a parent disk. It supports merging with any parent in the chain. Functions are also provided for compacting and expanding virtual disks as well as updating relational metadata in the case of differencing disks.
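Conceptually, a merge copies every block recorded in the differencing disk over the corresponding block in the parent. The sketch below models that idea with blocks stored in a map. It is a simplified illustration with hypothetical names, it does not call MergeVirtualDisk, and it makes no claim about how the real function works internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy representation of a disk's allocated blocks: index -> contents.
using BlockMap = std::map<std::size_t, std::string>;

// Applies every block recorded in the child on top of the parent. Blocks
// the child never touched are left as they were in the parent.
inline void MergeIntoParent(const BlockMap& child, BlockMap& parent)
{
    assert(&child != &parent);
    for (const auto& entry : child)
    {
        parent[entry.first] = entry.second;
    }
}
```

After the merge in this model, the parent on its own holds the same data that the child and parent previously presented together.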

Note that the Windows SDK for the Windows 7 beta omitted the VirtDisk.lib file necessary to link to the Virtual Disk API functions. This will be corrected for the release. Developers using the beta can use the LoadLibrary and GetProcAddress functions to load VirtDisk.dll and resolve the functions at run time, or generate the lib file themselves.

Send your questions and comments to mmwincpp@microsoft.com

Kenny Kerr is a software craftsman specializing in software development for Windows. He has a passion for writing and teaching developers about programming and software design. You can reach Kenny at weblogs.asp.net/kennykerr.