.gif)
SQL Server
Technical Article
Writer: Martin Ellis
Technical Reviewers: Val Fontama (Microsoft), Ritu Kothari
(Microsoft), Chunhui Zhu (Microsoft), Xiaoyan Zhao (Microsoft), Ben Lubetsky
(Attunity), Ramakrishnan Krishnan (Microsoft), Nitin Mehrotra (Microsoft),
Ricardo Mendes (Microsoft)
Published: December 2008
Applies to: SQL Server 2008
Summary: This white paper provides an overview
of the connectivity options that are available for SQL Server 2008 Integration
Services.
>Introduction
The data to which a company has access is key to its future
success, but obtaining meaningful information from data can be far from
straightforward. Companies may need to harvest data from multiple geographical
locations and it is unlikely that all the data will be stored in a single
format. Spreadsheets in the Microsoft Office Excel spreadsheet software,
databases in the Microsoft Office Access database software, XML documents,
databases in the Microsoft SQL Server data management software, Oracle
databases, Teradata data warehouses, and SAP systems are just a few of the data
stores that contemporary organizations use. Other issues, such as data
ownership and compliance with regulatory requirements, can further complicate
matters.
Data consolidation can be time consuming and resource
intensive, and batch windows can be hard to find in an increasingly globalized
environment. Furthermore, the value of data can depreciate in a relatively
short period of time. Consequently, making reliable data available in a timely
and efficient manner is a major challenge for the modern data worker.
SQL Server 2005 introduced SQL Server Integration Services,
an enterprise-level data integration and workflow solutions platform for
performing extract, transform, and load (ETL) operations. Integration Services
provides a set of powerful features that enable the merging and consolidation
of data from heterogeneous sources, and it includes tools for extracting,
cleaning, standardizing, transforming, and loading data. A wide variety of
built-in connectors support these operations, enabling Integration Services to
interact not just with SQL Server databases, but with many other proprietary
and nonproprietary data sources.
The SQL Server 2008 implementation of Integration Services
builds upon the strengths of the previous release, and as a result the new
release is a robust enterprise ETL platform that is even more productive. For
an outline of the enhancements to Integration Services in SQL Server 2008, see
the article “What's New (Integration Services)” on the MSDN® Web site at http://msdn.microsoft.com/en-us/library/bb522534.aspx.
Two key areas of development in SQL Server 2008 Integration
Services are:
- Improved options for connectivity.
- Significant gains in performance.
Integration Services provides a wide range of data source
connectors out of the box, and many add-on connectors are available from
Microsoft and from third-party vendors. As a result, Integration Services is
able to work with a broader range of sources than ever before.
The new connectivity options have also contributed to
improving performance, and SQL Server now has the fastest ETL tool available.
In a benchmark test, over 1 terabyte of data was loaded over a network from
flat files into a SQL Server 2008 database in under 30 minutes, using an OLE DB
connector. That is a world record 15 minutes faster than the next quickest time
for the same amount of data from the nearest competitor! A white paper that
discusses this test in greater detail is expected soon, but an overview of the
test is provided in an article called “ETL World Record!” on the MSDN Blogs Web
site at http://blogs.msdn.com/sqlperf/archive/2008/02/27/etl-world-record.aspx.
There are so many connectivity options available now that it
can be difficult to know where to start. The aims of this white paper are to
assist IT professionals who want to use SQL Server 2008 Integration Services to
perform ETL operations between heterogeneous data sources and destinations, and
to raise awareness of the new connectors for Oracle, Teradata, and SAP BI that
are distributed and supported by Microsoft. The paper begins with an overview
of the many connection options for SQL Server 2008 Integration Services, and
then it discusses the out-of-the-box connectors that are built-in to
Integration Services. It continues with an introduction to add-on connectors
for Tier-1 data sources (Oracle, Teradata, and SAP BI), and then it outlines
the adapters for SAP R/3 and Siebel that are available with the adapter pack
for Microsoft BizTalk® Server. Finally, the paper introduces connectors for
Tier-2 data sources including DB2.
>Overview of Connectivity Options in SQL Server 2008
Integration Services
You can use Integration Services to create packages that
encapsulate a specific business requirement, such as extracting data from an
Oracle database, cleaning the data, and then loading it into a SQL Server
Analysis Services database. You create Integration Services packages by using
Business Intelligence Development Studio, a version of the Microsoft Visual
Studio® development system that has a built-in template for Integration
Services projects. This familiar interface, which uses drag-and-drop functionality
to select and connect tasks, can help to significantly reduce development time.
Packages consist of one or more control flow tasks, where
each task feeds into the next. Some control flow tasks, for example the Bulk
Insert task and the Execute SQL task, reference data sources. Others, such as
the Script task, do not always do so. You can use a special task called the
Data Flow task to configure data transformations. The Data Flow task has its
own toolbox, which contains data flow source items, data flow destination
items, and the various transformation items. Some data flow sources and data
flow destinations can reference data sources directly. For example, when you
configure a Raw File source, you can specify a file location. However, the
majority of data flow sources and destinations reference a connection manager.
A connection manager is a predefined logical representation
of a connection. When you create a connection manager by using Business
Intelligence Development Studio, you specify the information that is required
to build a connection string for your data source. At run time, the connection
string is used to create the connection as required by the elements within the
package. You can use a connection manager more than once within a package.
When you create a connection manager, you must first specify
the type of connection manager that you require in the Add SSIS Connection Manager dialog box (Figure 1). After you have
selected the type of connection manager that you require, you can configure the
connection to target your data source more specifically. For example, if you
select the OLE DB connection manager type, you can specify OLE DB for Oracle,
OLE DB for SQL Server, or other sources. These options are discussed in more
detail in the next section.
.jpg)
Figure 1:Add SSIS Connection Manager dialog box
There are also add-on connectors that offer connectivity to
sources that have no built-in connector, such as Teradata and SAP BI, or that
offer improved performance over existing connectors for sources that are
already supported, such as Oracle. After you have installed these connectors,
they are accessible and configurable through Business Intelligence Development
Studio in the same way as the standard connectors are. These connectors are
discussed in more detail later in this paper.
>The Built-in Connectors
SQL Server Integration Services includes built-in connectors
for many of the most commonly used data sources.
>>OLE DB
OLE DB is a set of Component Object Model (COM) interfaces
that facilitate access to a variety of data stores, including relational
databases. After you add an OLE DB connection manager to your Integration
Services project, you can then specify the nature of the data source that you
want to connect to by selecting a provider in the Connection Manager dialog box (Figure 2). The next few paragraphs
provide an overview of these OLE DB providers.
.jpg)
Figure 2: Connection Manager dialog box showing
OLE DB provider options
>Using OLE DB with Microsoft Office Excel and Access
Organizations frequently need to move data between Office
Excel and SQL Server and between Office Access and SQL Server. Microsoft OLE DB
Provider for Jet version 4.0 enables you to use Office Access databases up to
and including Office Access 2003 and Office Excel workbooks up to and including
Office Excel 2003 as data sources and destinations. Office Access 2007 and
Office Excel 2007 have a different file format, so you must use the Microsoft
OLE DB provider for Office 12.0 Access Database Engine for these files instead.
After you have chosen the correct provider, you must specify the file location
for the database or workbook.
For Office Excel, there is also a dedicated connection
manager, which you can select from the Add
SSIS Connection Manager dialog box. When you set up an Excel connection by
using this option, you must specify the version of Office Excel that you are
referencing. When you do this, the correct data provider (either the OLE DB
provider for Jet 4.0 or the OLE DB provider for Office 12.0 Access Database Engine)
is added automatically.
>Using OLE DB with SQL Server
You can access SQL Server databases by using several OLE DB
providers. The Microsoft OLE DB Provider for SQL Server provides access via the
standard OLE DB driver (SQLOLEDB). There is also the SQL Server Native Client
10.0, which is an updated version of the SQL Server Native Client first
introduced in SQL Server 2005. The SQL Server Native Client provides both OLE
DB connectivity and ODBC connectivity. It also enables you to utilize the
features of SQL Server 2005 that were not present in previous versions of SQL
Server, such as the XML data type and common language runtime (CLR)
user-defined data types. The SQL Server Native Client 10.0 is the SQL Server
2008 equivalent of the SQL Server Native Client, and it enables you to take
advantage of new features in SQL Server 2008, such as sparse columns and the
new date and time data types.
The OLE DB option that you choose for a SQL Server
connection depends in part upon whether you need to take advantage of these new
features. Whichever access method you choose for SQL Server, the configuration
of the connection manager is the same. You must provide the
name of the host server and the source database, in addition to authentication
information.
>
Using OLE DB with Oracle
You can use the Microsoft OLE DB Provider for Oracle
(MSDAORA) to access to Oracle data sources. You configure the provider by
supplying the name of the Oracle server and the authentication details for that
server. This information is then used to build the connection string. You can
use data flow transformations that rely on an OLE DB connection, such as Lookup
and Fuzzy Grouping, with this connector. There are some limitations when you
use this provider. For more information, see the article "INFO:
Limitations of Microsoft Oracle ODBC Driver and OLEDB Provider" on the
Microsoft Help and Support Web site at http://support.microsoft.com/kb/244661. Broadly, Oracle
features that were introduced after version 8i are not supported, which
precludes the use of some data types, for example. (You can use the provider
with later versions of Oracle, but these features will not be available.) There
is a built-in .NET provider for Oracle that provides access to Oracle features
up to and including version 9i. There is also a high-performance connector for
Oracle, the Microsoft connector for Oracle by Attunity 1.0, which will be
discussed later in this paper.
>Using OLE DB with SQL Server Analysis Services
The Microsoft OLE DB Provider for SQL Server Analysis
Services 10.0 provides access to dimensional data and cubes hosted on a server
running SQL Server 2008 Analysis Services. Using this connector, you can
connect to live Analysis Services databases as well as to Analysis Services
projects that are still in development in Business Intelligence Development
Studio. The Analysis Services Execute DDL, Analysis Services Processing, and
Data Mining Query control flow tasks use the OLE DB Provider for Analysis Services
10.0, as do the Data Mining Model Training, Dimension Processing, and Partition
Processing destinations in the data flow. The OLE DB Provider for Analysis
Services 9.0 is a provider for SQL Server 2005 Analysis Services, and the
Microsoft OLE DB Provider for OLAP Services 8.0 enables access to SQL Server
2000 Analysis Services. Additionally, the Microsoft OLE DB Provider for Data
Mining Services enables you to use data mining extensions (DMX) statements. For
more information, see the article "OLE DB for Data Mining Specification
1.0 Final" on the Microsoft Web site at http://www.microsoft.com/downloads/details.aspx?DisplayLang=en&FamilyID=01005f92-dba1-4fa4-8ba0-af6a19d30217.
>Using OLE DB with Other Data Sources
The Microsoft OLE DB Provider for Microsoft Directory
Services is used to access Active Directory® Service Interfaces (ADSI). It
provides read access to the Active Directory directory service and other
lightweight directory access protocol (LDAP) services such as Novell Directory
Services.
The OLE DB Simple Provider is intended to give access to
data sources that require only basic OLE DB support. You can use it to open
hierarchical ActiveX® Data Objects (ADO) Recordsets
over XML documents. For more information, see the article "Microsoft OLE
DB Simple Provider" on the MSDN Web site at http://msdn.microsoft.com/en-us/library/ms675260(VS.85).aspx.
You can also build hierarchical Recordsets
by using the MSDataShape provider.
Finally, the Microsoft OLE DB Provider for Indexing Service
enables you to perform full-text searches on files other than database files.
>Using OLE DB with Other Integration Services Tasks
and Transformations
In addition to being used as data sources and destinations,
OLE DB connection managers can be consumed by certain data flow
transformations, including Lookup, Fuzzy Lookup, Fuzzy Grouping, Slowly
Changing Dimensions, Term Extraction, Term Lookup, and OLE DB Command. The Bulk
Insert and Execute SQL control flow tasks also reference OLE DB connections.
>ADO.NET
SQL Server 2008 Integration Services provides greater
support for ADO.NET connectivity than SQL Server 2005 Integration Services. The
most visible example of this is the ADO.NET source and destination options that
are available in the data flow, which can read and write data directly to and
from databases by using ADO.NET. The ADO.NET source effectively replaces the
DataReader source, but the DataReader destination is still available alongside
the ADO.NET destination. The DataReader destination is useful when you want to
provide data for consumption by other applications. For example, you can use it
as input for a Reporting Services data source.
When you select an ADO.NET connection manager from the Add SSIS Connection Manager dialog box,
you can choose from a list of ADO.NET 2.0 providers, including the SqlClient
Data Provider, the OracleClient Data Provider, the ODBC Data Provider, and the
.NET OLE DB Data Provider. The latter enables access to the same list of OLE DB
providers that were discussed earlier, but with ADO.NET as an extra layer on
top of OLE DB. Because of this extra layer, you will generally achieve better
performance by accessing the OLE DB providers directly.
The SqlClient Data Provider and OracleClient Data Provider
give access to SQL Server databases and Oracle databases respectively, through
the ADO.NET interface. For Oracle and SQL Server connections, you can use
either an ADO.NET or an OLE DB provider. Factors that might influence your
choice of connection manager and provider include:
- Performance. In general, you can expect OLE DB
sources to offer slightly better performance, although you should always test
your packages in your own specific environment rather than relying on this to
always be the case.
- The availability of certain transformation
types. For example, Lookup transformations require an OLE DB connection
manager.
- 32-bit and 64-bit environment. The use of
ADO.NET providers can simplify deployment in mixed 32-bit and 64-bit
environments.
- Functionality. For example, the OracleClient
Data Provider offers support for some data types that were introduced after
Oracle version 8i, whereas the OLE DB Provider for Oracle does not.
For a more in-depth discussion of these issues, see the
article "ADO.NET vs OLEDB" on the Microsoft Integration Services
development team wiki at http://ssis.wik.is/Understanding_Connectivity_Stacks/ADO.NET_vs_OLEDB.
Open Database Connectivity (ODBC) support is provided
through the .NET ODBC Data Provider. This is the only way that you can access
ODBC data sources in SQL Server 2008 Integration Services, because there is no
ODBC connection manager.
>ADO
ADO connection managers enable access to the OLE DB data
sources that were discussed earlier through an ADO interface.
>FLATFILE and MULTIFLATFILE
Flat files can be a useful intermediary if you want to
exchange data between systems that have no way of connecting to one another, or
if there is some other reason that direct communication is not desirable. For
example, a mainframe system for which there is no connector might generate a
.csv file, and the data that it contains can be imported into SQL Server using
a FLATFILE connector. You can use the File System and FTP control flow tasks to
obtain and manipulate files, and the FLATFILE and MULTIFLATFILE connection
managers can then access the data that they contain. Various file formats are
supported, including delimited (.csv) and ragged right. Using the MULTIFLATFILE
connection manager, you can reference multiple files as long as they all have
the same format. When you specify the file locations, you separate each path
with a pipe (|).
>FILE
The FILE connection manager is not used to load data from
flat files. Instead it provides a package with access to a file at run time.
For example, an Execute SQL task that uses a FILE connection can extract and
run SQL statements that are contained within a file. Alternatively, you can use
a FILE connection as the connection manager for an XML task. FILE connection
managers, used with the File System task, can also create a file or folder at
run time.
>FTP
FTP connection managers can access FTP servers to transfer
files. The FTP task can use FTP connection managers to both send and receive
files. The FTP connection manager does not support Windows® authentication,
only basic and anonymous. The passive mode option enables the connection to be
initiated by the client as well as the server. You can then process files that
are transferred by an FTP task using a FLATFILE or FILE connector, as discussed
earlier.
>HTTP
The HTTP connection manager is used to access a Web server.
The required connection details can be seen in Figure 3. As with FTP
connectors, only basic and anonymous connections are supported; Windows
authentication is not. You can specify a client certificate for additional
security, and the option to use a proxy server is also available.
The Web Service task uses an HTTP connection manager. You
can configure a Web Service task to access a Web service method that will
recover data from the Web server and deliver it to a file or a variable for
further processing. For example, you could access product information from a
Web server and load it into your database. The Web Services Description
Language (WSDL) file that describes the methods that the Web service offers
must be available locally when you configure the Web Service task.
.jpg)
Figure 3: HTTP Connection Manager Editor
dialog box
>MSMQ
The Message Queue task uses the MSMQ connection manager to
enable Integration Services packages that are located on multiple systems throughout
your corporate network to communicate and interact with one another. This can
be useful if, for example, you have a central database that needs to receive
data from several distributed sources and then perform some kind of aggregation
on the data. You can create packages on the distributed servers that use
Message Queue tasks to send their data to the central server. A package on the
central server, which also contains Message Queue tasks, can pick up the data
from each server and then, after all of the data has arrived, process it all
together. Using Message Queuing (also known as MSMQ), packages communicate with
one another at run time, so the central package in this example might have to
run for a relatively long time to gather all of the data. Message Queuing is an
asynchronous communication mechanism. If a server cannot deliver its data at
the first attempt, it attempts to deliver it later, which adds a degree of
reliability to the process.
The Message Queue task enables you to send data as a string,
a variable, or a file, and you can configure a task to send or receive. If you
choose the receive option, you can have the message automatically converted to
a variable by using the String message
to variable option in the Message
Queue Task Editor dialog box (Figure 4). This makes further processing of
the data potentially easier.
.jpg)
Figure 4: Message Queue Task Editor
dialog box
>MSOLAP100
Choosing the MSOLAP100 connector opens the same dialog box
as selecting the OLE DB provider for SQL Server Analysis Services 10.0, which
was discussed earlier. It is simply another way to configure this provider.
>SMOSERVER
SQL Server Management Objects (SMO) connection managers
enable Integration Services to perform transfer tasks. Transfer tasks move
databases, logins, jobs, error messages, stored procedures from the master database, and SQL Server objects
such as tables, views, and schemas between instances of SQL Server, including
SQL Server 2000, SQL Server 2005, and SQL Server 2008. These tasks can be
useful when you need to migrate database objects to a newer version of SQL
Server. When you configure an SMO connection manager, you only need to supply a
SQL Server name and authentication details.
>SMTP
The Simple Mail Transfer Protocol (SMTP) connection manager
enables access to an SMTP server. The Send Mail task uses this connection
manager to send e-mail messages, for example, to indicate success or failure of
components within a package. The SMTP connection manager has no reliance on
MAPI so the Microsoft Office Outlook® messaging and collaboration client is not
required for the connector to function.
>SQLMOBILE
The SQLMOBILE connector provides connectivity to databases
on SQL Server Compact. When you configure the connector, you specify the
location of the database file (which has an .sdf extension) and the
authentication details. The connection manager can be used by the Execute SQL
task and also by the SQL Server Compact destination in the data flow to load data
into a SQL Server Compact database.
>WMI
The Windows Management Instrumentation (WMI) connection
manager provides access to the WMI scripting interface and is used by the WMI
Data Reader and WMI Event Watcher tasks. The WMI scripting interface accepts WMI
Query Language (WQL) queries that return information about the Windows
environment. WQL is syntactically similar to SQL. You can use WQL queries in a
WMI Data Reader task to check the amount of available disk space before you
start a data load, for example.
The WMI Event Watcher task can respond to WMI events, such
as a file being placed into a particular folder. This is useful when you are
processing flat files. For example, when a file arrives in a particular folder,
the Event Watcher task can respond by initiating the processing of the file by
an Integration Services package. When you create a WMI connection manager, you
supply the server name, the default WMI namespace, and the authentication
details.
>XML
Integration Services does not include an XML connection
manager, but it does provide extensive XML handling capabilities through the
XML task and the XML source in the data flow.
When you create an XML task, you must specify the XML data
that the task will process in the Source
Type field of the XML Task Editor
dialog box. You can choose a FILE connection manager to reference an XML file,
a variable to pass the XML data, or you can enter the data into the XML task.
You can select the XML modification that you want to perform on the XML data
when the package runs in the Operation
Type field of the XML Task Editor
dialog box(Figure 5). You can
validate the XML against an XML Schema Definition (XSD) document, alter the
format of the XML by using an Extensible Stylesheet Language for
Transformations (XSLT) stylesheet, use an XML Path Language (XPath) query to
obtain just a subsection of the XML data, and merge XML documents to form a
single XML document. You can use the Diff option to compare two XML documents
and output the differences to a separate diffgram file. You can use the Patch
option to create a new XML document by combining a diffgram file created by the
Diff option with another XML document.
.jpg)
Figure 5: XML Task Editor dialog box
The XML data flow
source provides access to XML data in the data flow, in the form of a document
or contained in a variable. You can validate the XML data against an XSD schema
document.
>Add-on Connectors
In addition to the extensive range of built-in connectors,
there are many more that you can install as add-ons. Some of these connectors
are provided by Microsoft and others by third parties. There are two main
reasons that vendors create add-on connectors:
- To facilitate access to a data source that is
not supported by any of the built-in connectors
- To provide an improvement in performance over
existing connectors
Vendors of database management systems such as IBM and
Oracle usually provide generic adapters that can be used with Integration
Services. Figure 6 shows the Select Data
Providers page of the Integration Services Connections Project Wizard,
which displays a list of vendor-based data sources.
.jpg)
Figure 6: Integration Services Connections Project Wizard
You can see from Figure 6 that there is no built in support
for DB2 in Integration Services. A list of Web sites (not visible in Figure 6)
is provided in the Select Data Providers
page of the Integration Services Connections Project Wizard so that you can
download a connector for use in your project. In this case, you can choose from
an OLE DB connector from IBM or an OLE DB connector from Microsoft. For more
information about the OLE DB connector from Microsoft, see “OLE DB” earlier in
this white paper.
A list of connectors for Integration Services is available
in the article “Data Sources” on the Microsoft Integration Services development
team wiki at http://ssis.wik.is/Data_Sources.
>Add-on Connectors for Tier-1 Data Sources
>Microsoft Connectors for Oracle and Teradata by
Attunity (New in SQL Server 2008)
Attunity, a Microsoft OEM partner that has produced several
connectors for Integration Services in the past, has created two Tier-1
connectors for use with Integration Services running on SQL Server 2008
Enterprise Edition:
- The Microsoft Connector for Oracle by Attunity
- The Microsoft Connector for Teradata by Attunity
Attunity developed these connectors in partnership with the
Microsoft SQL Server Integration Services Development Team, and both connectors
are distributed and supported by Microsoft, unlike other Partner provided
connectors. The connectors offer a significant performance advantage over other
connectors because they access internal buffering application programming
interfaces (APIs) directly rather than having to go through a managed
interface. The Microsoft connectors by Attunity are designed to look and
function as if they are built-in Integration Services connectors such as Excel,
or Flat File connection managers. They handle metadata changes in a manner that
is consistent with built-in connectors, and they capture and report errors that
are generated by the source systems to which they connect. After you install
the connectors, you can create connection managers that use them in the usual
way (Figure 7).
.jpg)
Figure 7: Add SSIS Connection Manager dialog box
showing the Attunity connectors
New data flow sources and data flow destinations, which
reference these connection managers, are added to the data flow toolbox in
Business Intelligence Development Studio (Figure 8).
.jpg)
Figure 8: Data Flow Sources in the Toolbox showing the Attunity connectors
Figure 9 shows an Oracle
Source and Oracle Destination in
use as part of the data flow. The Teradata
Source and Teradata Destination
have a very similar appearance.
.jpg)
Figure 9: Data
Flow task in an Integration Services package containing the Oracle source and
destination
Microsoft Connector
for Oracle by Attunity
The Oracle connector is compatible with Oracle database
versions 9.2.0.4 and later, and it supports x86, x64, and Itanium IA-64
platforms. The Oracle data source that it provides can use tables, views, or an
SQL statement to access data. The Oracle data destination can move data into
Oracle databases by using arrayed load mode, which loads data in batches as a
single transaction, and fast load mode, which uses the OCI direct path protocol
to load data more quickly, but with a few restrictions.
Microsoft Connector
for Teradata by Attunity
The Teradata connector is compatible with Teradata database
versions 2R 6.0, 2R 6.1, 2R 6.2, and 12.0. It supports x86 and x64 platforms.
The data source that it provides can extract data from tables and views, or by
using an SQL statement. The data destination can load data into Teradata
databases incrementally by using the TPT Stream operator, or you can implement
fast loading, which uses the TPT Load operator.
Both of these connectors are available from the “Microsoft
Connectors for Oracle and Teradata by Attunity” Web site at http://www.microsoft.com/downloads/details.aspx?FamilyId=D9CB21FE-32E9-4D34-A381-6F9231D84F1E&displaylang=en.
Both connectors are designed to be used with the SQL Server 2008 Enterprise and
SQL Server 2008 Developer.
>Microsoft Connector 1.0 for SAP BI (New in SQL
Server 2008)
The Microsoft Connector 1.0 for SAP BI is included in the
SQL Server 2008 Feature Pack, October 2008 release. It provides connectivity
between SQL Server 2008 Enterprise and SAP NetWeaver BI 7.0 through Integration
Services. It supports both full and delta data extraction modes from SAP BI 7.0
into SQL Server. It also enables you to move data from non-SAP data sources
into SAP BI InfoProviders.
With the Microsoft Connector 1.0 for SAP BI, you can extract
data from SAP InfoProviders like InfoCubes, Data Store Objects (DSO), and
InfoObjects. Data from these objects is fed into downstream SQL Server
databases through a SAP BI Open Hub Destination, which you must create on the
SAP system. The Open Hub Destination renders the multidimensional data as two
dimensional relational data, which is then loaded into SQL Server using an
Integration Services package that references a SAP connection manager. Analysis
Services can then build its cubes from here, or you can process the data in
some other way.
After you install the Microsoft Connector 1.0 for SAP BI,
you can create a SAP BI connection manager (Figure 10) in Business Intelligence
Development Studio. You insert the connection details that will be used to
generate the connection string, including the host and authentication
information. You can also test the connection from here.
.jpg)
Figure 10: SAP BI Connection Manager
dialog box
SAP BI source and SAP BI destination items are available in
the data flow toolbox, and they reference a SAP BI connection manager. You can
configure a SAP BI source to trigger the data extraction process itself (P -
Trigger Process Chain), allow it to be started by SAP (W - Wait for Notify), or
use it for extracting data only (E – Extract Only). The SAP BI destination
requires you to specify an InfoPackage that is configured in SAP. The
InfoPackage specifies where and how to load the data into SAP BI.
The Microsoft Connector 1.0 for SAP BI is currently going
through the SAP certification process. It is anticipated that the connector
will achieve certified status by the end of 2008. Additionally, a white paper
that discusses the use of the Microsoft Connector 1.0 for SAP BI in more detail
will soon be available on the SQL Server Integration Services developer center
Web site at http://msdn.microsoft.com/en-us/sqlserver/cc511477.aspx.
>Adapters for SAP R/3 and Siebel eBusiness
Applications
You can extend Integration Services connectivity to include
SAP R/3 and Siebel eBusiness applications by using adapters available in the
BizTalk Adapter Pack 1.0. You can obtain a free download of the evaluation
version from the Microsoft Web site at http://www.microsoft.com/downloads/details.aspx?familyid=F3EA0659-B0ED-4D5A-891E-53268128A5F0&displaylang=en.
The pack contains the following adapters that you can use
with Integration Services:
- BizTalk Adapter 3.0 for Siebel eBusiness
Applications
- BizTalk Adapter 3.0 for mySAP Business Suite
The adapter for mySAP Business Suite supports the following
versions:
- SAP R/3 4.6c Non-Unicode
- SAP R/3 4.7 Non-Unicode
- SAP R/3 4.7 Unicode
- SAP R/3 5.0 Non-Unicode
- SAP R/3 5.0 Unicode
- SAP R/3 6.0 Unicode
The Microsoft adapter for mySAP Business Suite has achieved
SAP Certification status. For more information, see the article "BizTalk
Adapter for SAP Certification" on the Microsoft Web site at http://www.microsoft.com/biztalk/evaluation/adapter/adapters/sap/sapcertification.mspx.
The adapter for Siebel eBusiness Applications supports the
following versions:
- Siebel 7.5 (7.5.3.15)
- Siebel 7.7 (7.7.2.8)
- Siebel 7.8 (7.8.2.6)
- Siebel 8.0 (8.0.0.1)
Before you install the BizTalk Adapter Pack 1.0, you must
install the Windows Communication Foundation (WCF) Line-of-Business (LOB)
Adapter software development kit (SDK), which is available from the Microsoft
Web site at http://www.microsoft.com/downloads/details.aspx?FamilyID=56278FDE-B708-469C-987E-DED9C6C5E580&displaylang=en.
Despite the name of the adapter pack, you do not need to
install BizTalk Server to use these adapters; you can make direct connections
between Integration Services and SAP or Siebel through an ADO.NET connection
manager that uses the relevant .NET provider (Figure 11).
.jpg)
Figure 11: Connection
Manager dialog box showing .NET providers for Siebel and SAP from the
BizTalk Adapter Pack 1.0
To access
the SAP and Siebel connections in the data flow, you can use the new ADO.NET
source, which was discussed earlier. You can then use transformations and data
flow destinations in the usual manner.
The Siebel
adapter accesses business components and business objects that the Siebel
application exposes. A business component is analogous to a view, in that it is
a logical grouping of columns from one or more tables. A business object is a
set of related business components.
The SAP adapter accesses remote function calls (RFCs),
business application programming interfaces (BAPIs), and intermediate documents
(IDOCs) exposed by a SAP system.
Detailed technical documentation on the use of these
adapters is included in the BizTalk Adapter Pack download.
>Add-on Connectors for Tier-2 Data Sources
>Microsoft OLE DB Provider for DB2
The Microsoft OLE DB Provider for DB2 facilitates
connectivity to IBM DB2 data sources. The connector is available as part of the
Microsoft SQL Server 2008 Feature Pack from the Microsoft Web site at http://www.microsoft.com/downloads/details.aspx?FamilyID=C6C3E9EF-BA29-4A43-8D69-A2BED18FE73C&displaylang=en.
The OLE DB Provider for DB2 provides two-way data flow
between Integration Services and the following versions of DB2:
- IBM DB2 for Linux, UNIX, and Windows versions 8
and 9
- IBM DB2 for z/OS versions 8 and 9
- IBM DB2 for i/Series V5R4 and higher
You can use the connector with x86, x64, and Itanium IA-64
editions of the Windows operating system. All connections must be over TCP/IP
because this version of the connector does not support IBM’s Systems Network
Architecture (SNA). If you require SNA support, you can use Microsoft Host
Integration Server 2006 to achieve this.
To implement a connection to a DB2 source by using Business
Intelligence Development Studio, create an OLE DB connection manager, and then
select Microsoft OLE DB Provider for DB2 for the provider type (Figure 12).
.jpg)
Figure 12: The
Connection Manager dialog box showing the provider for DB2 selected
When you click Data
Links in the Connection Manager
dialog box, the Data Link Properties
dialog box opens and you can configure connection options including the IP
address, authentication method, initial catalog, and default schema (Figure
13).
The Advanced and All tabs of the Data Link Properties dialog box provide further connection options.
For example, you can set the Derive
Parameters option to TRUE, which causes the provider to derive parameter
information specified in the WHERE
clause of an INSERT, UPDATE, DELETE, or SELECT
statement from the data source as requested by the consumer.
.jpg)
Figure 13: Data Link Properties dialog box
After you have created the OLE DB Provider for DB2
connection manager, you can create OLE DB data sources and destinations in the
data flow that reference it just as you can for other OLE DB providers.
>Connectors for Other Data Sources
Microsoft offers connectivity to other data sources through
its deep Partner ecosystem. However, unlike the connectors and adapters that
are discussed in this paper, Partner provided connectors are supported by the
vendor only, and not by Microsoft.
>Conclusion
SQL Server 2008 Integration Services supports an
ever-expanding range of connectivity options for a wide variety of data
sources. The addition of new connectors that have an increased emphasis on
performance, and of ADO.NET data sources and destinations, contributes to
making Integration Services an excellent choice for ETL operations.
For more information:
http://www.microsoft.com/sqlserver/: SQL Server Web site
http://technet.microsoft.com/en-us/sqlserver/: SQL Server TechCenter
http://msdn.microsoft.com/en-us/sqlserver/: SQL Server DevCenter
http://www.microsoft.com/sqlserver/2008/en/us/integration.aspx:
SQL Server Integration Services Web site
http://technet.microsoft.com/en-us/sqlserver/cc510302.aspx:
SQL Server Integration Services TechCenter
http://msdn.microsoft.com/en-us/sqlserver/cc511477.aspx:
SQL Server Integration Services DevCenter
http://ssis.wik.is/: SQL
Server Integration Services Connectivity Wiki
http://www.microsoft.com/downloads/details.aspx?FamilyId=D9CB21FE-32E9-4D34-A381-6F9231D84F1E&displaylang=en:
Download site for Microsoft connectors for Oracle and Teradata by Attunity
http://www.microsoft.com/downloads/details.aspx?FamilyID=C6C3E9EF-BA29-4A43-8D69-A2BED18FE73C&displaylang=en:
Download site for SQL Server 2008 Feature Pack (includes OLE DB Provider for
DB2)
http://www.microsoft.com/downloads/details.aspx?familyid=F3EA0659-B0ED-4D5A-891E-53268128A5F0&displaylang=en:
Download site for BizTalk Adapter Pack
1.0 (includes BizTalk Adapter 3.0 for
Siebel eBusiness Applications and BizTalk Adapter 3.0 for mySAP Business Suite)
http://download.microsoft.com/download/2/7/c/27cd7357-2649-4035-84af-e9c47df4329c/ConnectivitySSIS.doc:
White paper: “Connectivity and SQL
Server 2005 Integration Services”
Did this
paper help you? Please give us your feedback. Tell us on a scale of 1 (poor) to
5 (excellent), how would you rate this paper and why have you given it this
rating? For example:
- Are
you rating it high due to having good examples, excellent screen shots, clear
writing, or another reason?
- Are
you rating it low due to poor examples, fuzzy screen shots, or unclear writing?
This feedback
will help us improve the quality of white papers we release.
Send feedback.