Development Considerations for Database Federations (Azure SQL Database)
Updated: April 24, 2014
Developing solutions using federations requires additional considerations above those discussed in Azure SQL Database Development Considerations. Information specific to developing with federations is discussed below.
|The current implementation of Federations will be retired with Web and Business service tiers. Consider deploying custom sharding solutions to maximize scalability, flexibility, and performance. For more information about custom sharding, see Scaling Out Azure SQL Databases.|
Connecting To a Federation
A typical federation is comprised of many individual federation member databases, which are implemented as physical databases in SQL Database. While it is possible to directly connect to one of these physical databases, this method of connectivity is primarily only useful for tools that are not federation aware. For most applications, it is much more convenient to use the USE FEDERATION (Azure SQL Database) to establish a connection to a federation. The USE FEDERATION statement does not require an understanding of the underlying physical databases, and will automatically route the connection to the federation member that contains the requested data.
The USE FEDERATION statement accepts a federation key value, which is used by SQL Database to route the connection to the federation member containing data associated with this value. The For example, if your data is federated on customer_id and you specify a federation key of 100, you will be connected to the federation member that contains federated table rows for customer_id=100.
Beyond simply connecting to the correct federation member, the USE FEDERATION statement enables you to filter the connection to only those records that exactly match the federation key value you specify. In the above example, if the FILTER=ON clause is specified, only rows associated with customer_id=100 will be returned by the connection. This is accomplished by the query processor adding a predicate of (federated_column=value) on the federated column of every query that refers to a federated table.
If the FILTER=OFF clause is specified, the connection is still established to the federation member containing the specified value, but query operations can access the entire range of values stored within the federation member. For example, if the federation member covers the range of 0-200 for customer_id, you will be able to access any of records within that range even though you specified customer_id=100 in the USE FEDERATION statement.
Non-filtering connections are useful when performing operations that affect all rows stored in a member, such as schema modifications, or bulk operations such as bulk inserts.
The distribution of data across multiple physical databases within a federation introduces additional requirements at the application level. Since Microsoft Azure SQL Database doesn’t support distributed queries across multiple databases, you must instead implement the logic to perform these operations within your application. For example, to perform a select * from a federated table, you might do the following:
Connect to the federation using the low value for the range covered by the federation key. For example, USE FEDERATION customerfederation (customer_id=1) WITH RESET, FILTERING=OFF.
Select * to return the federated table rows contained in this federation member, and then store the results in memory.
Query sys.federation_member_distributions (Azure SQL Database) to determine the next federation member’s range_low value.
Connect to the federation using the range_low value. For example, USE FEDERATION customerfederation (customer_id=100) WITH RESET, FILTERING=OFF.
Select * to return the rows contained in this federation member, and then store them with the results of the previous query.
Repeat steps 3-5 until there are no entries in sys.federation_member_distributions with a higher range_low value than the current value being used as the federation key value.
The process of querying each individual member can be performed in parallel, which may reduce the time it takes to aggregate the data and return it to the application user.