Integrating Heterogeneous Data
Collapse the table of content
Expand the table of content
The document is archived and information here might be outdated

Integrating Heterogeneous Data

Updated: 14 April 2006

Many businesses and organizations have data stored in databases from multiple vendors. Integrating this data is often a key component in allowing systems to work together in an organization. Replication allows you to integrate heterogeneous data in two ways:

  • Use Oracle as a source for data that can be replicated to Microsoft SQL Server, IBM, and Oracle databases.
  • Use SQL Server as a source for data that can be replicated to IBM and Oracle databases.

The type of replication configuration used for integrating heterogeneous data depends on the source and destination(s) for the data:

  • The following diagram illustrates replicating data from SQL Server to IBM DB2 and Oracle.
    Replicating data to non-SQL Server databases
  • The following diagram illustrates replicating data from an Oracle database to other databases. The data is first replicated to a SQL Server database and can then be replicated to other databases including SQL Server, IBM DB2, and Oracle.
    Replicating data from Oracle

Adventure Works Cycles is a fictional manufacturing company used to demonstrate database concepts and scenarios. For more information, see Samples and Sample Databases.

Adventure Works Cycles acquired the Mexican company Importadores Neptuno recently in an effort to expand their infrastructure to support the expected growth of the company. Importadores Neptuno uses an Oracle database to maintain manufacturing and financial data. However, core elements of that data must be shared in order to maintain accurate scheduling and inventory data in Adventure Works Cycles' Manufacturing Resource Planning (MRP) application.

While there is no current plan to migrate the Importadores Neptuno databases, Adventure Works Cycles does need to transmit and receive data on a daily basis and transfer that information into their existing SQL Server online transaction processing (OLTP) and online analytical processing (OLAP) databases. Adventure Works Cycles will replicate data from the Oracle database to SQL Server databases at the central office.

Applications that involve integrating heterogeneous data typically have the following requirements, which an appropriate replication solution must address:

  • The system must allow data to be replicated between databases from different vendors.
  • The system must maintain transactional consistency.
  • Replication processing should require minimal overhead on the source server.
  • The system should have low latency if the replication of incremental changes is required.
  • The system should have high throughput if the replication of incremental changes is required: it should handle the replication of a large number of transactions.
  • The data required at the destination servers might be a subset of the data available at the source server.

SQL Server uses a publishing industry metaphor to describe the components of the replication system. The components include the Publisher, Distributor, Subscribers, publications and articles, and subscriptions.

  • In the first diagram above, the Oracle database is the Publisher. Some or all of the data at the Oracle database is included in the publication, with each table of data being an article. The data is replicated to the first SQL Server (which is configured as a Distributor) and then distributed to the other SQL Server, and the IBM and Oracle databases. Each of these databases is a Subscriber to the publication, receiving schema and data as a subscription.
  • In the second diagram above, the SQL Server database is the Publisher, and the IBM and Oracle databases are the Subscribers.

For more information on the components of the system, see Replication Publishing Model Overview.

SQL Server offers different types of replication for different application requirements: snapshot replication, transactional replication, and merge replication. This scenario is best implemented with the heterogeneous replication features of snapshot replication and/or transactional replication, which are well suited to handle the requirements outlined in the previous section:

By design, snapshot and transactional replication address the principal requirements for this scenario:

  • Replication between databases from different vendors
  • Transactional consistency
  • Minimal overhead

Transactional replication addresses additional requirements for systems that require incremental updates:

  • Low latency
  • High Throughput

The primary option to consider for this scenario is filtering. Snapshot and transactional replication allow you to filter columns and rows, so the tables at Subscribers contain only the data required by your application. For more information, see Filtering Published Data.

To implement these scenarios, you must first create a publication and subscriptions, and then initialize each subscription. Click the links below for more information.

After the subscription is initialized and data is flowing between the Publisher and Subscribers, you might need to consult the following topics for information on common administration and monitoring tasks:

© 2016 Microsoft