Microsoft SQL Server 7.0 Replication Made Easy
June 8, 1999
Summary: This article gives you a broad general view of the replication feature of Microsoft® SQL Server™ 7.0 and how you can use replication to reduce the administrative burden of keeping multiple copies of information current and consistent at different sites. (5 printed pages)
Have you ever needed to make sure that data your business uses is copied to every branch office and is consistent across your organization? For example, suppose your corporate headquarters is in Chicago and you have regional offices in Seattle, Phoenix, Atlanta, and New York City. You want to keep a master price list for the products you sell at your home office, but you also want each regional office to have its own copy of the price list. Furthermore, each regional office tracks the orders it receives, and you want the corporate headquarters to have a master list of the order history from each regional office. One method you might use to accomplish this is to keep a price spreadsheet and an order history spreadsheet at the Chicago headquarters. You periodically copy the price sheet to each regional office and update the order history sheet at the home office with all the orders from each regional site. Although this method accomplishes your goal of keeping all the information current at each site, it can become difficult to manage. A better way would be to automate these tasks so that the consistency of the information is ensured while minimizing the administrative burden.
A situation such as the one described above is called a distributed environment, in which multiple copies of the same information exist at multiple locations. This environment may have been created deliberately or it may have evolved unintentionally. A distributed environment may become more and more difficult to manage as it grows, sites are added, and the information being retained becomes more complex. At some point, you will probably need to find an easier way to manage data in a distributed environment.
One way to manage data in a distributed environment is to use Microsoft SQL Server replication. In SQL Server, replication refers to copying data in a database from one location to another.
Publishers and Subscribers
SQL Server uses a publisher-subscriber metaphor for its data replication feature. The publisher makes information (data) available. A distributor forwards any changes to the appropriate subscribers. The subscribers receive the information. As an analogy, think of a local newspaper organization. The news organization or company is the publisher, your delivery boy is the distributor, and you are the subscriber. Similarly, in SQL Server replication, the publisher maintains the source data and makes it available for replication. The distributor receives the data and any changes, and forwards the information to the subscribers, who receive the data and hold a copy of it.
The data being replicated is referred to as a publication. The publication contains one or more individual articles. An article is a subset of the data that is being replicated. To continue with the newspaper analogy, the newspaper itself would be a publication, and a story on the President's State of the Union address in the newspaper, for example, would be an article in the publication. As with a newspaper subscription, in SQL Server replication you must subscribe to an entire publication. That is, you cannot subscribe to individual articles of a publication.
Information in databases is stored in tables. When you designate a table to be a publication, you do not need to publish the entire table. For example, if you have an employee table, you could choose to publish information such as the employee's name and address, but not salary information. This is called filtering the data to be published. You can apply either a vertical filter or a horizontal filter (or both) to a table. A vertical filter allows you to choose a subset of the table's columns to publish, while a horizontal filter allows you to select a subset of the table's rows.
In SQL Server replication, there are two types of subscriptions: push subscriptions and pull subscriptions. A push subscription is what we usually think of as a subscription. The publisher pushes the subscription to the distributor, who then sends it to the subscribers. The newspaper delivery model uses a push subscription. A pull subscription may sound like an oxymoron because we tend to think of subscriptions as being push by default. In a pull subscription, the subscriber initiates the subscription. For example, if you go down to the local news stand to get the newspaper, that would be considered a pull subscription.
There are also different replication models. The first model is to have a central publisher/distributor who sends the publication to multiple subscribers (this is the same as the newspaper analogy used above). In this model, one server can be set up as the publisher and distributor. This publisher/distributor publishes and distributes data to any number of subscribers. The following picture illustrates this model:
Figure 1. Publisher/distributor replication model
The next model is the opposite of the first: You have a central subscriber who receives data from multiple publishers. This model is helpful when you need to consolidate data at a central site. The following is diagram illustrates this model:
Figure 2. Central subscriber replication model
To go back to the scenario at the beginning of this article, you now know that an easy way to have each regional site maintain a current copy of your product price list and have your corporate headquarters receive the orders from each regional site is to use SQL Server replication. You can have your headquarters in Chicago set up as a publisher and distributor to send the product price table of your database to the regional offices (which will be the subscribers), using the central publisher model. You can also have the regional offices be the publishers/distributors to the Chicago office, which will subscribe to the order information as described in the central subscriber, multiple publisher model.
When planning to implement replication, you should carefully consider the following questions:
- What information do you want to publish?
Replication will have an impact on your network. To save disk space, processing time, and network bandwidth, you should replicate only the necessary data. If you do not need to replicate all the information in a table, you can use a horizontal or vertical filter (or both) to replicate only a part of the table.
- How up-to-date does your data need to be?
For example, some people like to receive their newspapers daily, while others choose to receive only the Sunday paper. In SQL Server replication, this is referred to as transactional consistency.
- How independent do your subscribers need to be?
When you go out of town for an extended period of time, you probably stop your newspaper subscription temporarily. Because replication happens across your network, you must consider cases where the subscribers are not connected to the network. Decide whether a subscriber will always be connected to the publisher or if it will only be connected some of the time. Also, decide whether or not subscribers will be able to modify the published data after they receive it from the publisher. This is referred to as subscriber autonomy.
- How often do you want the data to be synchronized?
You can schedule replication to occur at predefined intervals, or you can specify that replication will happen on demand.
- What type of replication model do you want to use?
Decide whether you need a central publisher/distributor with multiple subscribers, a central subscriber who receives information from multiple publishers, or a combination.
- Who will initiate the replication?
Decide whether you want the publisher/distributor to automatically send the information to the subscribers (push subscription) or if you want the subscribers to initiate the replication (pull subscription).
There are other considerations to keep in mind as well, but these should help to get you started.
Additional Information and Resources
For additional information on SQL Server replication, see the following:
- The SQL Server 7.0 Replication Frequently Asked Questions (FAQ) list on the SQL Server Support Highlights FAQ page, located at http://support.microsoft.com/support/sql/70faq.asp.
- The Replication for SQL Server 7.0 white paper, available in the MSDN Library.