This topic has not yet been rated - Rate this topic

How Merge Replication Tracks and Enumerates Changes

After a publication or subscription has been initialized, merge replication tracks and enumerates all changes to the data in published tables. Changes are tracked through triggers (which replication creates for each published table) and system tables in the publication and subscription databases. These replication system tables are populated with metadata that indicates which changes should be propagated. When the Merge Agent runs during synchronization, changes are enumerated by the agent and then applied to the Publisher and Subscriber as necessary.

Merge replication uses the following triggers and system tables to track changes for all published tables:

  • MSmerge_ins_<GUID>: insert trigger (the GUID value for this and the other triggers is derived from sysmergearticles)

  • MSmerge_upd_<GUID>: update trigger

  • MSmerge_del_<GUID>: delete trigger

  • MSmerge_contents

  • MSmerge_tombstone

  • MSmerge_genhistory

Merge replication uses the following additional system tables to track changes for filtered tables:

  • MSmerge_partition_groups

  • MSmerge_current_partition_mappings

  • MSmerge_past_partition_mappings

Note Note

The system tables listed are used by all merge publications and subscriptions in a database; for example, if you have more than one publication in a publication database, MSmerge_contents contains rows from articles in all publications.

Change Tracking for Unfiltered Tables

System Tables

The system tables used for unfiltered and filtered tables contain the following metadata:

  • MSmerge_contents contains one row for each row inserted or updated in a published table in the database.

  • MSmerge_tombstone contains one row for each row deleted from a published table in the database.

  • MSmerge_genhistory contains one row for each generation. A generation is a collection of changes that is delivered to a Publisher or Subscriber. Generations are closed each time the Merge Agent runs; subsequent changes in a database are added to one or more open generations.

Change Tracking Process

The following change tracking process is used for all unfiltered tables:

  • When an insert or update occurs on a published table, the MSmerge_ins_<GUID> or MSmerge_upd_<GUID> trigger fires, and a row is inserted into the MSmerge_contents system table. The rowguid column of MSmerge_contents contains the GUID for the inserted or updated row, indicating that the next time synchronization occurs, the corresponding inserted or updated row in the user table should be sent to the Publisher or Subscribers. If subsequent updates occur on a row in a user table, the row in MSmerge_contents is updated to reflect this.

  • When a delete occurs on a published table, the MSmerge_del_<GUID> trigger fires, and a row is inserted into the MSmerge_tombstone system table. The rowguid column of MSmerge_tombstone contains the GUID for the deleted row, indicating that the next time synchronization occurs, a delete should be sent to the Publisher or Subscribers for the corresponding deleted row in the user table. If the deleted row is referenced in MSmerge_contents (because it was inserted or updated since the last synchronization), the row is deleted from MSmerge_contents.

Change Tracking for Filtered Tables

System Tables

In addition to the system tables described in the previous section, three tables in the publication database contain metadata for tracking changes to filtered tables:

  • MSmerge_partition_groups contains one row for each partition that is defined in a publication. Partitions can be:

    • Defined explicitly using sp_addmergepartition or the Data Partitions page of the Publication Properties dialog box.

    • Created automatically when a Subscriber synchronizes if the Subscriber requires a partition that does not yet have an entry in MSmerge_partition_groups.

  • MSmerge_current_partition_mappings contains one row for each unique combination of rows in MSmerge_contents and MSmerge_partition_groups. For example, if a row in a user table belongs to two partitions, and the row is updated, one row is inserted into MSmerge_contents to reflect the update, and two rows are inserted into MSmerge_current_partition_mappings, to indicate that the updated row belongs to the two partitions.

  • MSmerge_past_partition_mappings contains one row for each row that no longer belongs in a given partition. A row moves out of a partition if:

    • The row is deleted. If a row is deleted from a user table, a row is inserted into MSmerge_tombstone and one or more rows are inserted into MSmerge_past_partition_mappings.

    • The value in a column used for filtering has changed. For example, if a parameterized filter is based on the state in which a company is headquartered and the company moves, the row for the company (and related rows in other tables) might move out of one sales person's partition of data into the partition for another sales person. If a row is updated such that it no longer belongs in a partition, a row is inserted or updated in MSmerge_contents and one or more rows are inserted into MSmerge_past_partition_mappings.

Note Note

If nonoverlapping partitions with one subscription per partition (a value of 3 for the @partition_options parameter of sp_addmergearticle) are used, the system tables MSmerge_current_partition_mappings and MSmerge_past_partition_mappings are not used to track the rows' partition mappings, because each row belongs to only one partition and can be changed at only one Subscriber.

Change Tracking Process

The process described above (in the section "Change Tracking for Unfiltered Tables") for unfiltered tables is also used for filtered tables, with the following additions:

  • When an insert occurs on a published table, in addition to data being updated or inserted into MSmerge_contents, a partition mapping is added to MSmerge_current_partition_mappings for each partition that the row belongs to.

  • When an update occurs on a published table, in addition to data being updated or inserted into MSmerge_contents, if a partition mapping does not exist in MSmerge_current_partition_mappings for each partition that the row belongs to, one is added. If the update resulted in a row being moved from one partition to another, a row is updated in MSmerge_current_partition_mappings and one is added to MSmerge_past_partition_mappings.

  • When a delete occurs on a published table, in addition to a row being inserted into MSmerge_tombstone, a row is deleted from MSmerge_current_partition_mappings and one is added to MSmerge_past_partition_mappings.

System Tables and Procedures

When the Merge Agent runs, changes are enumerated using a number of system tables and stored procedures:

  • MSmerge_genhistory contains one row for each generation. A generation is a collection of changes that is delivered to a Publisher or Subscriber. Generations are closed each time the Merge Agent runs; subsequent changes in a database are added to one or more open generations.

  • sysmergesubscriptions contains information on subscriptions, including a record of the last generations of changes a node has sent and received. In the publication database, this table contains a row for the Publisher and one row for each Subscriber. In a subscription database, this table typically contains a row for the Subscriber and a row for the Publisher.

  • MSmerge_generation_partition_mappings is used only for filtered tables, recording whether a given generation contains any changes relevant to a given partition. This table in the publication database contains one row for each unique combination of rows in MSmerge_genhistory and MSmerge_partition_groups.

  • sp_MSmakegeneration closes all open generations at the beginning of the enumeration process.

  • sp_MSenumchanges enumerates changes for tables (a number of related procedures that have names beginning with sp_MSenumchanges are also used in this process).

  • sp_MSgetmetadata determines whether a change from one node should be applied at another node as an insert, update, or delete.

Change Enumeration Process

The following process occurs during change enumeration:

  1. The system procedure sp_MSmakegeneration is called:

    • For unfiltered and filtered tables, this procedure closes all open generations referenced in MSmerge_genhistory (closed generations have a value of 1 or 2 in the column genstatus).

    • For filtered tables, this procedure populates the system table MSmerge_generation_partition_mappings. If a generation contains one or more changes that are relevant for a partition, a row is inserted into the system table. If a generation does not contain any changes relevant for a given partition, a row is not inserted into MSmerge_generation_partition_mappings, and changes are not enumerated for any Subscribers that receive that partition.

  2. The stored procedure sp_MSenumchanges and related procedures are called. These procedures enumerate the changes that have occurred since the last time synchronization occurred:

    1. The procedures first determine the generation at which enumeration starts, based on the columns sentgen (last generation sent) and recgen (last generation received) in the table sysmergesubscriptions.

      For example, when determining which generations' changes must be enumerated for a given Subscriber, the sentgen for the Subscriber (stored in the publication database), and the recgen for the Subscriber (stored in the subscription database) are compared. If the values are the same (which indicates that the last generation sent from the Publisher was successfully received by the Subscriber), changes are enumerated starting with the next generation in MSmerge_genhistory. If the values are not the same, the lower of the two values is used to ensure all required changes are sent.

    2. The procedures then enumerate changes:

      For unfiltered tables, all changes contained in generations after the generation in sentgen or recgen are enumerated: MSmerge_genhistory is joined to MSmerge_contents and MSmerge_tombstone to determine which changes must be sent.

      For filtered tables, MSmerge_generation_partition_mappings is joined to: MSmerge_current_partition_mappings and MSmerge_contents; and MSmerge_past_partition_mappings and MSmerge_tombstone to determine which changes are relevant for the partition that the Subscriber receives.

  3. The stored procedure sp_MSgetmetadata is called to determine whether a change should be applied as an insert, update, or delete. At this point, conflict detection and resolution are performed; for more information, see How Merge Replication Detects and Resolves Conflicts.

Did you find this helpful?
(1500 characters remaining)
Community Content Add
Annotations FAQ
Reducing SQL Server Sync I/O Contention
$0One of the features that makes SQL Server Merge Replication superior to other sync technologies is something called Precomputed Partitions.  SQL Server creates and maintains distinct data partitions/subsets for each unique user or other type of filter value.  Other sync technologies figure out “what’s-changed” on the fly when a Subscriber synchronizes.  This means all change-tracking tables must be evaluated to figure out the data-delta while the user (not so patiently) waits.  Merge Replication with Precomputed Partitions does all this hard work in advance, so the Subscribers can start downloading changes instantly.  Not making your customer wait contributes to a good user experience (UX).$0 $0To pull this off, SQL Server has to use a more advanced set of tracking tables + additional trigger logic than simpler sync systems.  Since it does more work in advance, it goes without saying that the Merge Agents are busily creating extra contention through locking and blocking of all those extra tracking tables.  Luckily, you have some control over how many extra tracking tables are used depending on the partition choices you make.  The fewer tables you have to use, while still getting the functionality you need is one of the keys to reducing SQL Server contention.$0 $0Let’s take a quick look at the tracking tables to give you some context:$0 $0 $0MSmerge_contents (A row is inserted for each user table insert or update)$0 $0MSmerge_tombstone (A row is inserted for user table delete)$0 $0MSmerge_genhistory (A grouping of the above inserts called a generation)$0 $0MSmerge_partition_groups (One row for each distinct partition defined)$0 $0MSmerge_current_ partition_mappings (One row for each unique combination of rows in MSmerge_contents and MSmerge_partition_groups)$0 $0MSmerge_past_partition_mappings (One row for each row that no longer belongs in a given partition)$0 $0 $0Wow, that’s a bunch of tracking tables!$0 $0So imagine a bunch of Merge Agents performing SELECTS, INSERTS, UPDATES, and DELETES against all these tables in order to make the magic happen for the end user.  The more Merge Agents you having running concurrently, the more locking and blocking the system will experience which degrades performance.$0 $0What if I told you that you could reduce contention by eliminating 2 tracking tables from the equation?$0 $0Anyone who has setup Merge Replication might recognize the Add Filter dialog below:$0 $0$0 $0You will see one of these dialog boxes for every filter you create.  For the purposes of this article, I just want you to focus on the 2 radio buttons at the bottom.  You get to choose that a row from a table will sync with multiple subscriptions or just one subscription.  If multiple users share the same distinct partition, you select the top one, because it’s an overlapping partition and it will need to use all 6 of the tracking tables I mentioned previously.  This means multiple users send the exact same HOST_NAME() filter value when they sync.  This might be a scenario where a group of sales professionals share the same region and clients, but nothing specific to themselves.$0 $0On the other hand, if you know that only one user subscribes to a given partition, then you select the bottom radio button because it’s a non-overlapping partition, and only uses the top 4 tracking tables I mentioned above.  This is actually the most common scenario I see with my customers.  Imagine a delivery driver that syncs down the data he needs to tell him where to go, what to do, and then uploads data he’s captured about deliveries.$0 $0So the big takeaway here, is to always select the bottom radio button so that SQL Server will create more efficient, non-overlapping partitions and use fewer tracking tables.$0 $0But wait, there’s more!$0 $0Non-overlapping partitions are less likely to create conflicts that have to be resolved by the built-in conflict resolvers.  If you’re not updating other users rows, then you’re not crashing into each other which leads to faster, smoother operation.$0 $0Back at the beginning of the article, I mentioned that there’s a quite of lot of code executed by triggers and associated stored procedures to make this sophisticated machine work.  Depending on your schema, the amount of data in the tracking tables, and the types of filters you’ve created, a potentially slow 5-way join is executed by sp_MSenumchanges. You can turn that into a 3-way join and speed up the system by sticking to non-overlapping partitions.  That was easy.$0 $0It’s all about UX, and trust me, your users will thank you for caring about this stuff!$0