Improving Both Availability and Scalability
In many applications, it is crucial to provide both higher availability and higher read scalability; replication can be used as a key part of a solution to provide both. For some applications, the goal might be to improve either availability or scalability through replication. If you only need to address one of these areas, consider one of the following scenarios:
The following diagrams illustrate two applications that benefit from using replication to increase availability and scalability. In both cases, the three databases in the diagrams are peers with each other: they contain identical schema and data. Write activity for these databases must be partitioned: if the database contained a product catalog, you could, for example, direct updates to the first database for product names beginning with A-I, the second database for J-R, and the third database for S-Z. The updates are then replicated to the other databases.
The first diagram illustrates a configuration in which each Web and application server uses data from a particular caching server. Reads and updates for a given user flow to a specific application server and then to a specific caching server. Because the application server updates the cache directly, a central source server is not required. Updates at each cache are propagated to the other caches.
The second diagram shows three geographically dispersed servers with data flowing between all three, allowing each of the servers to support read requests and improve availability.
Adventure Works Cycles is a fictional manufacturing company used to demonstrate database concepts and scenarios. For more information, see AdventureWorks Sample Databases.
Adventure Works Cycles has a number of offices throughout the world, including locations in Los Angeles, London, and Taipei. Customer order information is collected at each location and then replicated to the other locations.
Order information can be read from any of the locations; therefore if the London office is experiencing heavy read activity, internal applications can distribute some of this activity to the other two offices.
If a server is down for maintenance at the London office, for example, orders can still be retrieved from another location and workers at the London office can continue to read and enter data. After the London server is back online, the changes received while it was down will be propagated to the London server, so that it will be up to date.
Applications that use replication for scalability and availability typically have the following requirements, which an appropriate replication solution must address:
The system should allow changes to be made at any server and have the changes replicated to all other servers.
The system must maintain transactional consistency.
The system should have low latency: updates at one server must reach the other servers quickly.
The system should have high throughput: it should handle the replication of a large number of transactions.
Replication processing should require minimal overhead.
Microsoft SQL Server uses a publishing industry metaphor to describe the components of the replication system. The components include the Publisher, Subscribers, publications and articles, and subscriptions.
In the diagrams above, all cache servers are Publishers and Subscribers. All data in the replicated database at each server is included in the publication, with each table of data being an article (articles can also be other database objects, such as stored procedures). Each server subscribes to publications from the other servers, receiving schema and data as a subscription. For more information on the components of the system, see Replication Publishing Model Overview.
SQL Server offers different types of replication for different application requirements: snapshot replication, transactional replication, and merge replication. This scenario is best implemented with peer-to-peer transactional replication, which is well suited to handle the requirements outlined in the previous section. For more information on peer-to-peer transactional replication, see Peer-to-Peer Transactional Replication.
If the application requires modifications to a given row to be made at more than one node at the same time, data conflicts can occur. In this case, use merge replication, which is well suited to handling conflicts. For more information on merge replication, see Merge Replication Overview.
By design, transactional replication addresses the principal requirements for this scenario:
Changes can be made at any server
The peer-to-peer option for transactional replication allows the servers to publish and subscribe to the same data. All nodes in a peer-to-peer topology are peers: each node publishes and subscribes to the same schema and data. Changes (inserts, updates, and deletes) can be made at all nodes and then replicated to all other nodes.
To implement this scenario, you must first create publications and subscriptions, and initialize each subscription. For more information, see:
SQL Server Management Studio: How to: Configure Peer-to-Peer Transactional Replication (SQL Server Management Studio)
Replication Transact-SQL Programming: How to: Configure Peer-to-Peer Transactional Replication (Replication Transact-SQL Programming)
After the subscriptions are initialized and data is flowing between the peers, you might need to consult the following topics for information on common administration and monitoring tasks: