Click to Rate and Give Feedback
MSDN
MSDN Library
BizTalk Server
Technical Articles
 Scaling Out the BizTalk Server Data...

  Switch on low bandwidth view
Scaling Out the BizTalk Server Databases

To provide high availability for the BizTalk Server databases, configure two computers that are running SQL Server 2000 or SQL Server 2005 in a Windows cluster. These computers can run in an active/active, active/passive, or active/active/passive (requires three computers) configuration for redundancy and can store data on a shared drive (such as a RAID 0+1 SCSI disk array) or storage area network (SAN).

If the SQL Server service becomes unavailable for any reason, the database cluster transfers resources from the active computer to the passive computer. During this failover process, the BizTalk Server service instances experience database connection failures and automatically restart to reconnect to the databases. The functioning database computer (previously the passive computer) begins processing the database connections after assuming the resources during failover.

Clustering the BizTalk Server databases is discussed in Clustering the BizTalk Server Databases. This section focuses on scaling out the BizTalk Server databases to provide high availability.

Running Multiple MessageBox Databases

To enhance the scalability of the BizTalk Server databases and to address high CPU utilization on the MessageBox database SQL Server computer, you can configure BizTalk Server to store data across multiple MessageBox databases. You create the first MessageBox database when you run the Configuration Wizard. This MessageBox database is the master MessageBox database. There is a single master MessageBox database in your BizTalk Server deployment. The master MessageBox database contains the master subscription information and routes messages to the appropriate MessageBox database. Typically, you want to dedicate the master MessageBox database to do routing only and let the other MessageBox databases do the processing. To make a MessageBox database to do routing only, select Disable new message publication from the MessageBox properties in BizTalk Administration console.

An example of MessageBox database processing flow is as follows:

  1. When the master MessageBox database receives a new activation message—a brand new instance of a business process or a subscription message—the master MessageBox database distributes the activation message to the next available MessageBox database. For example, if you have one master MessageBox database and two MessageBox databases, the master MessageBox database routes the first activation message to MessageBox database 1, the second activation message to MessageBox database 2, the third activation message to MessageBox database 1, and so on in a round-robin pattern. The master MessageBox database uses built-in logic to load balance, and does not need additional load-balancing mechanisms.
  2. After the master MessageBox database routes the activation message to a particular MessageBox database (for example, MessageBox database 1), the business process goes into memory and runs.
  3. If the business process has to wait for a message, and the wait time is longer than several seconds, the business process is persisted back into MessageBox database 1. The business process is waiting for a correlation message.
  4. When the correlation message arrives at the master MessageBox database, the Message Engine does a lookup operation in the database for the MessageBox database that contains the state for the correlation message (in this example, MessageBox 1). The master MessageBox database delivers the message to the MessageBox database that contains the business process.
  5. The business process is brought back into memory to continue processing until it finishes or until it has to wait for another correlation message.

To configure multiple MessageBox databases, you use the BizTalk Server Administration console to add the computers that are running SQL Server 2000 or SQL Server 2005. From an administration perspective, you only have to add new MessageBox databases. BizTalk Server automatically handles the round-robin distribution of activation messages and sends correlation messages to the correct MessageBox databases.

If you have to set a MessageBox database computer offline for maintenance, you can disable new message publication on that computer through the BizTalk Server Administration console. Messages will continue to go to that MessageBox database until all instances are completed, while any new messages are routed to the other MessageBox databases where new message publication is enabled.

If you configure multiple MessageBox databases in your environment you should create a minimum of three MessageBox databases for your BizTalk Server group and you should disable message publication on the master MessageBox database. This recommendation is made because adding additional MessageBox databases incurs overhead by the master MessageBox database for routing messages between the MessageBox databases. If you only configure two MessageBox databases then most of the benefit gained by the additional MessageBox database is offset by the overhead consumed by the master MessageBox database for message routing.

Important
BizTalk Server stores all the states in the MessageBox databases, and each MessageBox database contains state information for different business processes. For reliability, you must cluster all the MessageBox databases, including the master and secondary MessageBox databases.

Behavior of MSMQT When Disabling New Message Publication

While most adapters follow the behavior described earlier in this topic, the BizTalk Message Queuing Adapter (MSMQT) behaves differently when you disable new message publication on the MessageBox database that the adapter uses. Its behavior for processing inbound and outbound messages is as follows:

  • Inbound. When the BizTalk Message Queuing adapter receives an inbound message, BizTalk Server routes the message to its subscribers. The subscribers can be one or more business process instances or send ports. When BizTalk Server posts an incoming BizTalk Message Queuing message, it routes the message to the subscribers and not to the BizTalk Message Queuing instance that was created upon receipt of the first BizTalk Message Queuing message on a particular queue from a particular client.
    If you disable new message publication on a MessageBox database, BizTalk Server creates new business process instances for the incoming messages on the remaining publishing-enabled MessageBox databases.
    However, BizTalk Message Queuing also posts some internal messages to itself to manage its own state information. BizTalk Server continues to post these messages to the BizTalk Message Queuing instance that resides on the disabled MessageBox database. These internal messages are separate from the actual business message.
  • Outbound. If the message is routed through BizTalk Message Queuing to a remote Windows Message Queuing or BizTalk Message Queuing queue, BizTalk Server first posts the message in a MessageBox database under the identity of a BizTalk Message Queuing instance, and the BizTalk Message Queuing adapter receives these messages from this MessageBox database and sends them to the remote queue.
    For each outgoing queue, BizTalk Server creates one BizTalk Message Queuing instance in the MessageBox database that handles all outgoing messages for that queue. Therefore, after BizTalk Server creates such an instance on a MessageBox database, BizTalk Server continues to post all outgoing messages for that queue on that MessageBox database, even if you disable message publishing on that MessageBox database.
    If there is no traffic to the outgoing queue for 24 hours, BizTalk Message Queuing deletes the instance from the MessageBox database and re-creates it when the next message arrives. Therefore, if you disable MessageBox database publication and there is no activity for 24 hours, BizTalk Server starts to post new messages to the enabled MessageBox databases only.

Depending on the requirements of your particular deployment, you might want to enhance performance for tracking by isolating the BizTalk Tracking database onto a separate SQL Server computer and by creating a separate BizTalk Host dedicated to host tracking. The following figure shows a dedicated tracking host with two host instances and clustered databases.

Scaling out the tracking databases

If your deployment has high throughput and involves tracking lots of data for these messages, the tracking overhead could potentially consume lots of resources on the computer that is running SQL Server. If this situation occurs and a high rate of incoming messages continues, BizTalk Server reaches a point where it cannot process new messages because the resources required to track messages are greater than the resources required to run the other BizTalk Server components (such as receiving messages and persisting them to the MessageBox database).

To improve performance and security, we recommend that you dedicate a host for tracking that does not contain any other items (such as receive locations, orchestrations, or pipelines) and that you disable tracking from the receiving, processing, and sending hosts. To provide high availability for the tracking host, create more than one host instance of the tracking host. For more information about creating a tracking host, see "How to Create a New Host" in BizTalk Server 2006 R2 Help at http://go.microsoft.com/fwlink/?LinkId=100422.

For each MessageBox database, BizTalk Server uses only one tracking host instance to move messages from the MessageBox database to the BizTalk Tracking database (BizTalkDTADb). If additional computers run instances of the tracking host, BizTalk Server automatically scales out the handling of each MessageBox database to a separate instance of the tracking host. If the number of MessageBox databases is larger than the number of tracking host instances, one or more tracking host instances will service more than one MessageBox database.

To provide high availability for the BizTalk Tracking database, use Windows Clustering to configure two database computers that are running SQL Server 2000 or SQL Server 2005 in an active/passive configuration.

Business Activity Monitoring (BAM) provides visibility into business processes independent of the IT implementation or across a heterogeneous IT implementation. The BAM SQL Server databases (BAM Star Schema database, BAM Primary Import database, and BAM Archive database) and the BAM Analysis database store the business activity data that is different from the operational monitoring data. The following diagram shows the BAM database infrastructure.

BAM database infrastructure

To make sure your BAM infrastructure is highly available, do the following:

  • Cluster the BAM Primary Import database and the BAM Analysis database. The BAM Primary Import database is the center of the Business Activity Monitoring system. It is therefore important that you make this database highly available by using Windows Clustering and that you follow the next two recommendations to prevent this database from filling up. The BAM Analysis database is an Analysis Services database that stores the data that business analysts use to build activity aggregations and OLAP cubes, and therefore any downtime of this database affects their productivity. While you do not have to cluster the BAM Archive database, we recommend that you monitor the event log for errors when the Data Transformation Services (DTS) packages run to make sure the data has been transferred successfully, and to monitor the size of the database so that you can replace it before it fills up.
  • Define an online window. To allow for better performance and avoid downtime, BAM partitions the data in the BAM Primary Import database into tables based on the timestamp when the activity was completed. BAM achieves this by regularly swapping the completed table with another empty table of identical format. After BAM does this, the additional completed activities go into the new partition (table), while BAM keeps the old partitions for the time defined in the online window. You must define an online window to make sure the number of partitions in the BAM Primary Import database does not grow too large. For more information about scheduling online windows, see "Archiving Primary Import Database Data" in BizTalk Server 2006 Help.
  • Schedule DTS packages to run periodically. Defining an online window makes sure your BAM Primary Import database does not fill up with old activity partitions. You must also schedule DTS packages to run periodically to create a new partition for the activity data and to move the data from the old partitions in the BAM Primary Import database into the BAM Archive database. For more information about scheduling DTS packages, see "Scheduling the DTS Packages" in BizTalk Server 2006 Help.
  • Carefully choose small sets of data items (checkpoints), and avoid including unnecessary data items when defining an activity.
  • Understand the trade-offs between scheduled and real-time aggregations when you design your aggregations. Real-time aggregations are automatically maintained by SQL Server triggers and have zero latency. They are ideal for some mission-critical low-latency scenarios, but they incur a performance cost whenever the events are being written to the BAM Primary Import database. Scheduled aggregations rely on scheduled cubing DTS packages to update their aggregation data. Their latency is equal to or greater than the DTS schedule interval, but overall they have a smaller performance impact on the BAM Primary Import database.
  • If you choose scheduled aggregations, make sure you schedule the cubing DTS to run more frequently than the archiving DTS. This is because the archiving DTS will not move the activity data that has been processed for scheduled aggregation to the BAM Archive database.
  • Enable the BAM Event Bus service in multiple computers to obtain failover functionality.

To provide high availability for the other BizTalk Server databases, configure two computers that are running SQL Server 2000 or SQL Server 2005 in a Windows cluster. These computers can run in an active/active or active/passive configuration for redundancy and can store data on a shared drive (such as a RAID 0+1 SCSI disk array) or storage area network (SAN).

© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Page view tracker