SQL Server
Technical Article
Writers: Kevin Farlee, Pradeep Madhavarapu
Technical Reviewer: Pradeep Madhavarapu, Michael
Warmington
Published: August 2008
Applies to: SQL Server 2008
Summary: This is a specification to be used by
those creating a storage provider plug-in library for the SQL Server 2008
Remote BLOB Store feature.
Introduction
Remote BLOB
Store (RBS) is designed to move the storage of large binary data (BLOBs) from
database servers to commodity storage solutions.
With RBS,
BLOB data is stored in storage solutions such as Content Addressable Stores
(CAS), commodity hardware with data integrity
and fault-tolerance systems, or mega service storage
solutions like MSN Blue. A reference to the BLOB is stored in the database. An
application stores and accesses BLOB data by calling into the RBS client
library. RBS manages the life cycle of the BLOB, for example by performing garbage
collection when needed.
RBS is an
add-on that can be applied to Microsoft SQL Server 2008 and later. It uses
auxiliary tables, stored procedures, and an executable to provide its services.
A reference to the BLOB (provided by the BLOB Store) is stored in RBS auxiliary
tables and an RBS BLOB ID is generated. Applications store this RBS BLOB ID in
a column in application tables. These columns in application tables are called RBS Columns in this specification. The
RBS Column is not a new data type; it is just a simple binary(20).
RBS exposes
three views for interacting with it: application view (through the RBS client
library), administrator view (through stored procedures), and provider view
(through a provider interface). This document discusses the provider view.
RBS Provider Requirements
The
requirements of RBS are covered in Functional Description, later in this paper.
The requirements of RBS providers are listed here.
Goals of an RBS Provider
The main goal
of an RBS provider is to enable the use of a particular type of BLOB store
(called a target BLOB store) to store
RBS BLOB data.
Typically,
target BLOB stores offer large storage space at a low cost, including hardware
costs, maintenance, and expandability. The technical requirements and
recommendations for RBS providers are listed here.
Required
An RBS
provider must:
- Provide
an implementation of the BlobStore
abstract class that uses the target BLOB store to store BLOB data. Honor the
semantics specified by RBS.
- Allow
multiple instances of the provider (pointing to the same or different instances
of the target store, and using the same or different credentials) to be used
simultaneously from one or more client machines.
Recommended
An RBS
provider should optimally:
- Allow
the use of the features of the target BLOB store through RBS interfaces and
configuration options wherever possible. It should also minimize the need for
custom configuration options to exploit features of the target BLOB store.
- Implement
optional optimizations and capabilities if possible. These help improve
performance and provide extra functionality.
- Allow
attaching, detaching, enabling, disabling, configuring, and deploying target
stores and providers without affecting the availability of SQL Server and
client computers.
- Avoid
introducing too much overhead; the performance of an RBS provider should be
close to the performance of native access to the target BLOB store.
Guarantees Provided by RBS Providers
An RBS
provider must guarantee certain implementation features. Others are recommended.
Required
An RBS
provider must guarantee:
- Link-level
consistency. This means that there are no dangling references: if the
provider gives out a StoreBlobId to
represent a newly stored BLOB, the BLOB can be accessed later using the same StoreBlobId as long as it is not
deleted.
- That
the BLOB persists when a Store()
call returns. BLOB data and any metadata that the provider associates with a
BLOB must be persisted by the BLOB store before the call to store the BLOB
returns successfully. This means that if the BLOB store goes down because of a
power outage or other reason, after the successful completion of a Store() operation, the BLOB is
available after the BLOB store comes online.
Recommended
An RBS
provider should optimally guarantee:
- BLOB
data immutability. This means that BLOB data cannot be changed after a BLOB is
stored initially. This guarantees that the data returned on reading a BLOB is
the same as the data that was given to the provider when the BLOB was
stored; no changes are allowed after that.
Deliverables
Each provider
must deliver the following pieces, together known as a Provider Pack:
- Provider
library (set of managed DLLs and dependencies, such as native libraries)
- Documentation
- Sample
configuration files
- Installer
- Optional:
Provider source code if this is a sample provider
Functional Description
Overview and Component Descriptions
An RBS
provider consists of a managed library and, optionally, a set of native
libraries that communicate with the BLOB store. The basic components and their
interactions are as follows:
- Application
– RBS Maintainer or an application that uses RBS, such as Microsoft SharePoint.
- RBS
Client Library – In the case of applications other than RBS Maintainer, the
provider library is called by RBS client library and not the application
directly.
- BLOB
Store – An entity which is used to store BLOB data. This can be a CAS storage
solution (such as EMC Centera or Microsoft SRS), SMB file server, a mega
storage service (such as MSN XStore) or even a SQL Server database.
- Provider
Library – Managed library for implementing the BlobStore abstract class. This is also referred to as the provider. It knows how to use the BLOB
store for storing BLOBs.
- Native
Library for BLOB Store – Any libraries used by the provider library to
communicate with the BLOB store. This is optional.
Figure 1: Provider Architecture
Figure 2: Provider Architecture with
Native Library
Figure 3: Provider Architecture with RBS
Client Library
Sample Control Flow
Following is
a sample control flow for a simple operation.
- The
application calls the provider library to perform an operation.
- The
provider library calls into the native library to perform the operation.
- The
provider (or native) library sends the request to the store.
- The
BLOB store returns a response.
- The
native library returns a response to the provider library.
- The
provider library returns a response to the application.
Figure 4: Sample Control Flow for
Provider Library
Figure 5: Sample Control Flow for
Provider Library with Optional Native Library
Provider Abstract Class
RBS defines
an abstract class named BlobStore
that must be inherited and implemented by provider writers. The reasons to use
an abstract class instead of an interface are as follows:
- It
is easy to extend an abstract class in future versions without breaking
backward compatibility, which is not possible with interfaces. For example, in
an abstract class, new methods (with default implementations) can be added
without breaking compatibility with previous versions.
- The
core function of the provider library is that it is an RBS provider, so it
makes sense to have it inherit an abstract class.
- Some
common code that may be useful to many providers can be included in the
abstract class. The derived providers can choose to either use it or write their
own code.
Overview
Following is
an overview of the steps performed by the application (RBS Maintainer or RBS
client library) on a provider library.
- RBS
loads the provider library managed DLL and uses configuration information to
find the required class within that DLL that is derived from BlobStore.
- RBS
gets information about the provider through configuration information that is
added to the machine-wide CLR configuration file when a provider library is
installed.
- Using
this provider information, it associates zero or more BLOB stores with this
provider class.
- When
a BLOB store associated with this provider class needs to be used, one object
of the class is instantiated.
- The
object is initialized with information about the BLOB store.
- Operations
(such as storing and fetching BLOBs, creating pools, and so on) are performed
using this object. The object may be cached for use later and operations may be
performed again after long pauses.
- Dispose()
is called on the provider object and it is not used after that.
- Multiple
instances of the same class can be used simultaneously to access the same or
different BLOB stores.
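The lifecycle described in the steps above can be sketched as follows. This is a minimal illustration in Python, not the .NET API: the BlobStore stand-in, InMemoryProvider, its store method, and use_store are hypothetical names invented for this sketch. Only the ordering (empty constructor, Initialize once, operations, Dispose) comes from this specification.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Stand-in for the RBS BlobStore abstract class (not the real .NET type)."""

    @abstractmethod
    def initialize(self, common, core, extended, credentials):
        """Called once before any operation."""

    @abstractmethod
    def dispose(self):
        """Called last; the object is not used afterwards."""


class InMemoryProvider(BlobStore):
    """Hypothetical provider that only records the calls made on it."""

    def __init__(self):
        super().__init__()  # the provider must call the base constructor
        self.events = []

    def initialize(self, common, core, extended, credentials):
        self.events.append("initialize")

    def store(self, data):
        # Hypothetical operation standing in for the BLOB and pool operations.
        self.events.append("store")

    def dispose(self):
        self.events.append("dispose")


def use_store(provider_class, common, core, extended=None, credentials=None):
    """Drive one provider object through the lifecycle described above."""
    provider = provider_class()  # instantiated with the empty constructor
    provider.initialize(common, core, extended, credentials)
    try:
        provider.store(b"payload")
    finally:
        provider.dispose()  # the object is never used after this call
    return provider.events
```

Each call to use_store creates its own provider object, which mirrors the requirement that multiple instances of the same class can be active at the same time.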
The next
section lists what must be implemented by the provider class. They are
discussed in groups.
Exceptions
BLOB store
providers are only expected to throw exceptions of type BlobStoreException. A valid exception code must be specified while
throwing an exception. Each operation has a set of expected exception codes.
Throwing any other exceptions or codes indicates a bug in the provider or that
exceptions have occurred outside the provider’s control. The valid exception
codes are:
- AccessDenied.
The caller or application does not have permissions to perform the requested
action.
- NoMoreSpace.
No more storage space is available on the BLOB store or pool.
- PoolNotFound.
Specified pool does not exist on the BLOB store.
- BlobNotFound.
Specified BLOB does not exist on the BLOB store or pool.
- BlobIdAlreadyExists.
A BLOB with the specified StoreBlobId
already exists in the same pool, so a new one cannot be created.
- BlobInUse.
A BLOB is currently being used, so it cannot be deleted or expunged.
- PoolNotEmpty.
Specified pool is not empty, so it cannot be deleted.
- ConfigurationMissing.
Required BLOB store configuration items are missing.
- ConfigurationDoesNotAllowOperation.
Current configuration of the BLOB store does not allow the requested operation.
- OperationFailedAuthoritative
- The
requested operation failed for a reason not included in other codes.
- The
failure is authoritative - no part of the operation was performed.
- OperationFailedMaybe
- The
requested operation may have failed for a reason not included in other codes.
- The
failure is not authoritative: all, some, or no part of the operation may
have been performed.
- NotImplemented.
The requested operation is not
implemented by this BLOB store provider.
Providers are
encouraged to include descriptive messages while throwing any exception.
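As a rough illustration of how the exception codes and the per-operation lists of allowed codes fit together, the following Python sketch models them. The constructor shape of BlobStoreException, the is_expected helper, and the DELETE_POOL_ALLOWED constant are assumptions made for this sketch. The code names and the DeletePool list are taken from this specification.

```python
from enum import Enum

# The twelve exception codes listed in this specification.
BlobStoreExceptionCode = Enum(
    "BlobStoreExceptionCode",
    [
        "AccessDenied", "NoMoreSpace", "PoolNotFound", "BlobNotFound",
        "BlobIdAlreadyExists", "BlobInUse", "PoolNotEmpty",
        "ConfigurationMissing", "ConfigurationDoesNotAllowOperation",
        "OperationFailedAuthoritative", "OperationFailedMaybe",
        "NotImplemented",
    ],
)


class BlobStoreException(Exception):
    """Stand-in exception type; the real constructor may differ."""

    def __init__(self, code, message=""):
        super().__init__(f"{code.name}: {message}")
        self.code = code


# Allowed codes for DeletePool, as listed under Pool Operations.
DELETE_POOL_ALLOWED = {
    BlobStoreExceptionCode.AccessDenied,
    BlobStoreExceptionCode.ConfigurationDoesNotAllowOperation,
    BlobStoreExceptionCode.PoolNotFound,
    BlobStoreExceptionCode.PoolNotEmpty,
    BlobStoreExceptionCode.OperationFailedAuthoritative,
    BlobStoreExceptionCode.OperationFailedMaybe,
}


def is_expected(exc, allowed_codes):
    """True if exc is a BlobStoreException whose code is allowed for the operation."""
    return isinstance(exc, BlobStoreException) and exc.code in allowed_codes
```

Anything for which is_expected returns False corresponds to the case the text describes as a provider bug or a failure outside the provider's control.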
Initialization
Constructor()
After
a provider class is picked for a store registered with RBS, an object of the
provider class is instantiated to use that store. RBS instantiates an object of
the provider class by using the empty constructor. Within this constructor, the
provider must call the base constructor (base()).
void Initialize(ConfigItemList
commonConfiguration, ConfigItemList coreConfiguration, ConfigItemList
extendedConfiguration,
BlobStoreCredentials[] credentials)
RBS
calls this method once on an object of the provider class before using it for
any operations.
Configuration
information is passed in the form of ConfigItemList
objects that contain multiple ConfigItem
objects. ConfigItems are explained
in the RBS Functional Description. They are essentially (key, value) pairs.
There is a pre-defined list of ConfigItems
that RBS client library defines. In addition, providers can define their own ConfigItems that are used for
provider-specific configuration.
CommonConfiguration contains configuration information
that is understood by the RBS client library. ConfigItems present in this are: StoreMajorVersion, StoreMinorVersion,
and StoreLocation. CoreConfiguration and ExtendedConfiguration contain
provider-specific configuration items associated with this BLOB store in the
RBS database. The core configuration consists of configuration information that
is required to access existing BLOBs in the back-end BLOB store. The extended
configuration consists of configuration information that is not needed to
access existing BLOBs, but is needed for other operations, such as create pool,
store BLOB, and so on. Extended configuration information is optional and may
not be present. This is because extended configuration information is not
included in BLOB Locators, which can be used to access a BLOB. BlobStoreCredentials is optional (it
may be null). If specified, the specified credentials should be used to connect
to the store.
Providers
are encouraged to check validity of the passed configuration items and
credentials and build internal structures as part of initialization. They may
optionally connect to the store as well.
Allowed
exception codes are: AccessDenied, ConfigurationMissing.
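The validation that Initialize is encouraged to perform might look like the sketch below. It models each ConfigItemList as a plain Python dictionary of (key, value) pairs. The RootPath key, the SketchProvider class, and the string-valued exception code are hypothetical. Only the split into common, core, and extended configuration, the optional credentials, and the ConfigurationMissing code come from this specification.

```python
class BlobStoreException(Exception):
    """Stand-in exception type carrying an exception code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


# Hypothetical provider-specific core key needed to reach existing BLOBs.
REQUIRED_CORE_KEYS = ("RootPath",)


class SketchProvider:
    def __init__(self):
        self.root_path = None
        self.extended = {}
        self.credentials = None

    def initialize(self, common, core, extended=None, credentials=None):
        # Core configuration is required to access existing BLOBs.
        missing = [key for key in REQUIRED_CORE_KEYS if key not in core]
        if missing:
            raise BlobStoreException(
                "ConfigurationMissing",
                "Missing core configuration items: " + ", ".join(missing),
            )
        self.root_path = core["RootPath"]
        # Extended configuration is optional and may be absent.
        self.extended = dict(extended or {})
        # Credentials may be null; use them only if they were supplied.
        self.credentials = credentials
```

Because extended configuration may be missing, a provider written this way defers any check on extended items to the operations that need them, such as creating a pool or storing a BLOB.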
void Dispose()
This
is the counterpart of Initialize(),
described previously, and is called by RBS to indicate that internal
structures, connections, and so on can be cleaned up. An object will not be used
after Dispose() is called on it.
Pool Operations
Pool
operations are operations performed on pools. None of these operations are
performed by RBS in parallel (on multiple threads) on the same provider object.
For each operation, a list of expected exceptions is specified. If some
exception other than those specified is thrown, it indicates either a bug or
extraordinary circumstances.
byte[] storePoolId
CreatePool(ConfigItemList configuration)
This
creates a new pool on the BLOB store. A byte array representing the StorePoolId for that pool is returned.
If
the OptimizationSpecifiedIds
capability is TRUE, StorePoolId must
be less than or equal to 16 bytes.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, NoMoreSpace, OperationFailedAuthoritative.
void DeletePool(byte[]
StorePoolId)
This
deletes an existing pool.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, PoolNotEmpty, OperationFailedAuthoritative,
OperationFailedMaybe.
object ResumeObject
BeginEnumerateBlobs(byte[] StorePoolId)
This
method is called to start enumerating the list of BLOBs in a particular pool.
Because the number of BLOBs expected in a pool is very high, RBS needs support for
paging: retrieving a few entries at a time. This method is called to set
up any context and internal structures to represent such an enumeration.
The
provider is free to create any type of object to store its enumeration state.
The provider should then return that object from this method. RBS
uses this object in subsequent method calls to enumerate BLOBs.
This
method must be implemented even if OptimizationSortedEnumeration
is TRUE.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, OperationFailedAuthoritative.
object resumeObject
BeginEnumerateBlobs(byte[] storePoolId, byte[] startingStoreBlobId, DateTime
createTimeFilterStart, DateTime createTimeFilterEnd)
This
method is called by RBS to get a sorted enumeration of BLOBs in a pool. The
provider is expected to return a ResumeObject
that can be used to enumerate BLOBs in that pool in sorted order of StoreBlobId. The enumeration should
return BLOBs belonging to that pool with (StoreBlobId
>= StartingStoreBlobId).
Comparison of BLOB IDs is a binary comparison of all the bytes of the ID. In
addition, all BLOBs returned should have a CreateTime
such that (CreateTime >= CreateTimeFilterStart) and (CreateTime <= CreateTimeFilterEnd). If CreateTimeFilterStart
or CreateTimeFilterEnd is set to DateTime.MinValue or DateTime.MaxValue respectively, the clause for that parameter should be
skipped (that clause is assumed to be satisfied). Both times are specified in
UTC.
This
method is equivalent to retrieving BLOB entries from a completely sorted list of BLOBs belonging to the specified pool,
starting at the lowest entry that satisfies (StoreBlobId >= StartingStoreBlobId).
For any two consecutive entries in the returned array B1 and B2, the following
conditions hold:
- B1
< B2
- CreateTimeFilterStart
<= B1 CreateTime <= CreateTimeFilterEnd
- CreateTimeFilterStart
<= B2 CreateTime <= CreateTimeFilterEnd
- There
is no BLOB Bk belonging to the specified pool such that (B1 < Bk < B2)
and (CreateTimeFilterStart <= Bk CreateTime <= CreateTimeFilterEnd)
This
method must be implemented if OptimizationSortedEnumeration
is TRUE.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, OperationFailedAuthoritative, NotImplemented.
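The selection and ordering rules for the sorted enumeration can be sketched in Python as follows. The function name and the (StoreBlobId, CreateTime) tuple layout are assumptions for illustration, and datetime.min and datetime.max stand in for DateTime.MinValue and DateTime.MaxValue. Python compares bytes objects byte by byte, which matches the binary comparison described above for IDs of equal length; how IDs of different lengths compare is not stated in this specification, so the sketch simply inherits Python's behavior.

```python
from datetime import datetime


def sorted_enumeration(blobs, starting_store_blob_id,
                       create_time_filter_start, create_time_filter_end):
    """Return the entries of one pool that a sorted enumeration should yield.

    blobs is an iterable of (store_blob_id: bytes, create_time: datetime)
    pairs, with all times expressed in UTC.
    """
    selected = []
    for blob_id, create_time in blobs:
        # Keep only IDs at or after the starting ID (binary comparison).
        if blob_id < starting_store_blob_id:
            continue
        # A start filter equal to the minimum value means "no lower bound".
        if (create_time_filter_start != datetime.min
                and create_time < create_time_filter_start):
            continue
        # An end filter equal to the maximum value means "no upper bound".
        if (create_time_filter_end != datetime.max
                and create_time > create_time_filter_end):
            continue
        selected.append((blob_id, create_time))
    # Entries are returned in ascending StoreBlobId order.
    selected.sort(key=lambda entry: entry[0])
    return selected
```

A real provider would normally push these conditions down to the BLOB store instead of filtering in memory; the sketch only shows which entries qualify and in what order.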
BlobInformation[]
EnumerateBlobs(object resumeHandle, int maxBlobs)
This
method is called by RBS, specifying a ResumeObject
that was previously returned by the provider. The provider is expected to
return an array of BLOB entries belonging to that pool. maxBlobs is the maximum number of entries to be returned from this
call.
The next
time this method is called, the provider should continue enumerating BLOBs in
the pool at the point where the current call stopped. No BLOBs should be returned
twice and no BLOBs should be missed. Returning fewer than maxBlobs entries indicates that there are no more BLOBs
left to enumerate.
BlobInformation includes the StoreBlobId, the CreateTime
of the BLOB (this should be the same value that was returned when the BLOB was
stored) and Length of the BLOB.
Allowed
exception codes are: OperationFailedAuthoritative.
void EndEnumerateBlobs(object
resumeHandle)
This
method is called to end enumerating BLOBs in a pool. The provider can clean up
any internal state related to this enumeration.
Allowed
exception codes are: None.
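The paging contract across BeginEnumerateBlobs, EnumerateBlobs, and EndEnumerateBlobs can be illustrated with the Python sketch below. EnumerationState, the function names, and the in-memory list of entries are hypothetical; the rule that a short page signals the end of the enumeration is taken from the description of EnumerateBlobs.

```python
class EnumerationState:
    """Hypothetical resume object: a snapshot of entries plus a cursor."""

    def __init__(self, entries):
        self.entries = entries
        self.position = 0


def begin_enumerate_blobs(pool_entries):
    # The provider may return any object that captures its enumeration state.
    return EnumerationState(list(pool_entries))


def enumerate_blobs(resume_object, max_blobs):
    # Continue where the previous call stopped: nothing repeated, nothing skipped.
    start = resume_object.position
    page = resume_object.entries[start:start + max_blobs]
    resume_object.position = start + len(page)
    return page


def end_enumerate_blobs(resume_object):
    # Release whatever state the enumeration was holding.
    resume_object.entries = []


def enumerate_all(pool_entries, max_blobs):
    """Caller-side loop, playing the role of RBS."""
    state = begin_enumerate_blobs(pool_entries)
    collected = []
    try:
        while True:
            page = enumerate_blobs(state, max_blobs)
            collected.extend(page)
            if len(page) < max_blobs:  # a short page means no more BLOBs
                break
    finally:
        end_enumerate_blobs(state)
    return collected
```

When the number of BLOBs is an exact multiple of the page size, the caller makes one extra call that returns an empty page, which is how it learns that the enumeration is complete.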
BLOB Operations
These are
operations that are performed on BLOBs within pools. These operations may be
performed by RBS in parallel (on multiple threads) on the same provider object.
So, they must be thread-safe. For each operation, a list of expected exceptions
is specified. If some exception other than those specified is thrown, it
indicates either a bug or extraordinary circumstances.
BlobStoreWriterStream
CreateNewBlob(byte[] storePoolId)
This
is the “Push” version of storing a BLOB: the provider is expected to
return a writable stream, into which RBS or the application writes data that
must be stored in the BLOB store. The BLOB should be stored in the specified
pool.
BlobStoreWriterStream is inherited from System.IO.Stream and has one additional
method: Commit(). When RBS calls Commit() on this object, the provider
should commit the BLOB on the back-end BLOB store and return the BlobInformation for the stored BLOB.
The stream cannot be used after that.
This
method must be implemented even if the OptimizationSpecifiedIds
capability is TRUE.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, NoMoreSpace, OperationFailedAuthoritative.
BlobInformation
CreateNewBlobFromStream(byte[] storePoolId, Stream inStream)
This
is the “Pull” version of storing a BLOB: a stream containing the data to
be stored is given. The BLOB should be stored in the specified pool.
The
specified stream supports reading (CanRead
is TRUE) and supports querying the Length
property. No other assumptions (including assumptions related to CanSeek) should be made about this
stream object.
This
method must be implemented even if the OptimizationSpecifiedIds
capability is TRUE.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, NoMoreSpace, OperationFailedAuthoritative.
BlobStoreWriterStream
CreateNewBlob(byte[] storePoolId, byte[] storeBlobId)
This
is similar to CreateNewBlob,
described above, with the addition that here RBS specifies the StoreBlobId instead of the provider
generating it.
This
method must be implemented if the OptimizationSpecifiedIds
capability is TRUE.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, NotImplemented, PoolNotFound, NoMoreSpace,
BlobIdAlreadyExists, OperationFailedAuthoritative.
BlobInformation
CreateNewBlobFromStream(byte[] storePoolId, Stream inStream, byte[]
storeBlobId)
This
is similar to CreateNewBlobFromStream,
described above, with the addition that here RBS specifies the StoreBlobId instead of the provider
generating it.
This
method must be implemented if the OptimizationSpecifiedIds
capability is TRUE.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, NotImplemented, PoolNotFound, NoMoreSpace,
BlobIdAlreadyExists, OperationFailedAuthoritative.
Stream ReadBlob(byte[] storePoolId,
byte[] storeBlobId)
This
is the “Pull” version of fetching a BLOB: the provider returns a readable
stream that contains the BLOB data.
The
returned stream object must allow reading and seeking (CanRead and CanSeek are
TRUE) and must support querying the Length property (correct length should be
returned). It must disallow writing (CanWrite
is FALSE).
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, BlobNotFound, OperationFailedAuthoritative.
void ReadBlobIntoStream(byte[]
storePoolId, byte[] storeBlobId, Stream outStream)
This
is the “Push” version of fetching a BLOB: the provider copies BLOB data
into the passed writable stream.
The
passed stream supports writing (CanWrite
is TRUE). No other assumptions (including assumptions related to CanSeek) should be made about this
stream object.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, BlobNotFound, OperationFailedAuthoritative.
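Because the passed stream is only guaranteed to support writing, a provider has to copy the data sequentially, without seeking or querying the length of the target. A minimal Python sketch of such a copy follows. The chunk size, the function name, and the file-like blob_source object are assumptions for this sketch; the real method returns nothing, and the byte count is returned here only to make the sketch easy to check.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical buffer size


def read_blob_into_stream(blob_source, out_stream):
    """Copy BLOB data into out_stream using sequential writes only."""
    total = 0
    while True:
        chunk = blob_source.read(CHUNK_SIZE)
        if not chunk:  # an empty read marks the end of the BLOB data
            break
        out_stream.write(chunk)  # never seek or inspect the target stream
        total += len(chunk)
    return total
```

For example, copying from one in-memory stream to another:

```python
source = io.BytesIO(b"x" * 200000)
target = io.BytesIO()
read_blob_into_stream(source, target)
```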
void DeleteBlob(byte[] storePoolId,
byte[] storeBlobId)
This
should delete the specified BLOB.
Allowed
exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, BlobNotFound, BlobInUse,
OperationFailedAuthoritative, OperationFailedMaybe.
BlobStoreWriterStream Class
BlobStoreWriterStream is inherited from System.IO.Stream and has one additional
method: Commit(). The members are
briefly outlined in the following table. Important methods are described after the
table.
Table 1
void Close()
If
this method is called, the BLOB data should be discarded and the BLOB should
not be stored in the BLOB store.
BlobInformation Commit()
When
this method is called, the provider should ensure the BLOB is stored in the
BLOB store, and return the details of the BLOB (StoreBlobId, CreateTime,
and Length). This method should do
an implicit Close().
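The difference between Close() and Commit() can be sketched as follows. This Python model is not the real BlobStoreWriterStream: SketchWriterStream, the dictionary standing in for the BLOB store, and the dictionary returned in place of BlobInformation are all hypothetical. The behavior shown (Close discards uncommitted data, Commit stores the BLOB and then closes implicitly) follows the two method descriptions above.

```python
from datetime import datetime, timezone


class SketchWriterStream:
    """Buffers written data and stores it only when commit() is called."""

    def __init__(self, store, store_blob_id):
        self._store = store  # dict standing in for the BLOB store
        self._blob_id = store_blob_id
        self._buffer = bytearray()
        self._closed = False

    def write(self, data):
        if self._closed:
            raise ValueError("stream is closed")
        self._buffer.extend(data)
        return len(data)

    def commit(self):
        if self._closed:
            raise ValueError("stream is closed")
        data = bytes(self._buffer)
        self._store[self._blob_id] = data  # the BLOB is persisted here
        info = {
            "StoreBlobId": self._blob_id,
            "CreateTime": datetime.now(timezone.utc),
            "Length": len(data),
        }
        self.close()  # Commit() performs an implicit Close()
        return info

    def close(self):
        # Closing without a prior commit discards the buffered data.
        self._buffer = bytearray()
        self._closed = True
```

A provider that streams data directly to the back-end store instead of buffering it would need to delete the partially written BLOB when Close() is called without Commit().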
Supporting Objects
These objects
are all defined by the RBS client library infrastructure and are used by the
provider library. Details on each of these objects are in the RBS class library
documentation.
BlobInformation
Table 2
BlobStoreCredentials
Table 3
Data Types Used
Table 4
Setup
As part of
setup for the provider library, Setup must register the DLL and class names to
be used by RBS client library. In addition, configuration information about the
provider needs to be registered. This is done through the machine-wide CLR xml
configuration file.
The different
pieces of information needed are described below. Helper classes present in the
RBS client library can be used to set this configuration during setup. Look at
the sample provider in the RBS SDK for examples on how to use these helper classes
to specify the xml elements.
BlobStoreType
Type:
string.
This
is a Unicode string of up to 128 characters. This uniquely identifies the type
of this provider. This is the same string that is used by applications and DBAs
in the BlobStoreType field when
configuring RBS BLOB stores for a database. Examples are “EMC Centera”,
“Microsoft SRS”. Provider writers are encouraged to start the type with the
name of the company so as to avoid collisions with other provider writers.
DllFile
Type: string.
This
specifies the path to locate the assembly in which the provider class is
present.
ClassName
Type:
string.
This
specifies the name of the class implementing the BlobStore abstract class within the specified assembly.
ProviderVersion
Type:
string.
These
fields indicate the version number for this provider class. The provider writer
is free to pick any non-negative values for these fields. It is expected that
these numbers increase over a period of time as new versions are released.
MinSupportedBackendStoreVersion
Type:
string.
These
fields indicate the minimum version number of the backend BLOB store that is
supported by this provider library.
ImplementedCommonBlobStoreSpecificationVersion
Type:
string.
These
fields indicate the version number of the RBS specification (RBS client library
and BlobStore abstract class) that
is implemented by this provider library. This means that the provider
understands and complies with all the requirements of the specified version of
RBS specification.
This
property is not used currently, but may be used in the future. Providers are
required to set this correctly.
ProviderSpecificConfigKey
This
describes ConfigItems that are
specific to this provider. Provider-specific configuration items can be used to
store configuration information about the back-end BLOB store. This
configuration is passed to the provider in the Initialize method.
Multiple
instances of this element are allowed. One such element needs to be specified
for each ConfigItem key that the
provider class understands (only provider-specific keys, not common keys
defined by RBS). It has the following fields:
name
Type:
string
Key
name of the provider-specific configuration item.
format
Type:
string
The
format of this configuration item. It must be one of: Name, Boolean, Number,
Binary, Duration.
Provider/Store Version Picking Algorithm
RBS uses
standard four-part version numbers, that is, w.x.y.z where each of the terms is
progressively decreasing in significance.
The RBS
client library uses the above set of version numbers to determine which
provider libraries to use with which back-end BLOB stores. The algorithm used
is described below.
- Build
a list of provider libraries available for each BlobStoreType.
Current_RbsVersion is the version of this RBS client library. Load all the
provider libraries available and for each provider class:
- Add
this provider class with version {ProviderVersion} to the list of providers
available for type {BlobStoreType}. Maintain the list in sorted
order, descending by {ProviderVersion}.
- For
a BLOB store that is registered as an RBS BLOB store in the database, find a
suitable provider class. BackendStoreVersion is the version of the backend BLOB
store as specified in the database.
- Find
the list of providers available for this {BlobStoreType}. Process each entry in
the list in order (highest version first):
- If (MinSupportedBackendStoreVersion
> BackendStoreVersion) skip to the next entry. The store is too old for this
provider class.
- Else pick this provider class for this
store.
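The picking algorithm above can be sketched in Python as follows. Provider registrations are modeled as dictionaries keyed by the setup field names from this specification; the helper names, the sample values in the usage example, and the choice to return None when no provider qualifies are assumptions made for this sketch.

```python
def parse_version(text):
    """Turn a four-part version string such as '1.2.0.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def build_provider_index(providers):
    """Group provider registrations by BlobStoreType, highest ProviderVersion first."""
    index = {}
    for provider in providers:
        index.setdefault(provider["BlobStoreType"], []).append(provider)
    for candidates in index.values():
        candidates.sort(
            key=lambda p: parse_version(p["ProviderVersion"]), reverse=True
        )
    return index


def pick_provider(index, blob_store_type, backend_store_version):
    """Return the newest provider that supports the given back-end store version."""
    backend = parse_version(backend_store_version)
    for provider in index.get(blob_store_type, []):
        minimum = parse_version(provider["MinSupportedBackendStoreVersion"])
        if minimum > backend:
            continue  # the store is too old for this provider class
        return provider
    return None  # assumption: the text does not say what happens in this case
```

A usage example with two hypothetical registrations of the same store type:

```python
providers = [
    {"BlobStoreType": "Contoso Store", "ClassName": "ContosoProviderV1",
     "ProviderVersion": "1.5.0.0", "MinSupportedBackendStoreVersion": "1.0.0.0"},
    {"BlobStoreType": "Contoso Store", "ClassName": "ContosoProviderV2",
     "ProviderVersion": "2.0.0.0", "MinSupportedBackendStoreVersion": "3.0.0.0"},
]
index = build_provider_index(providers)
chosen = pick_provider(index, "Contoso Store", "2.1.0.0")
```

Here the version 2.0.0.0 provider is skipped because it requires a back-end store of at least 3.0.0.0, so the version 1.5.0.0 provider is picked.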
Conclusion
This
specification should be used to guide the development of provider plug-in
libraries for the Remote BLOB Store feature of SQL Server 2008.
For more information:
http://www.microsoft.com/sqlserver/: SQL Server Web site
http://technet.microsoft.com/en-us/sqlserver/: SQL Server TechCenter
http://msdn.microsoft.com/en-us/sqlserver/: SQL Server DevCenter