Identifying Bottlenecks in the BizTalk Tier

The BizTalk tier can be divided into the following functional areas:

  • Receiving

  • Processing

  • Transmitting

  • Tracking

  • Other

For these areas, if the system resources (CPU, memory, and disk) appear to be saturated, upgrade the server by scaling up. If the system resources are not saturated, perform the steps described in this section.

If messages build up at the receive location (for example, the file receive folder grows large), the system is unable to absorb data fast enough to keep up with the incoming load. This is due to internal throttling: BizTalk Server reduces the receiving rate when subscribers cannot process data fast enough, causing a backlog to build up in the database tables. If the bottleneck is caused by hardware limitations, try scaling up. For more information about scaling up, see Scaling Your Solutions.

It is also possible to scale out by adding a host instance (server) to the host mapped to the receive handler. For more information about scaling out, see Scaling Your Solutions. Use Perfmon to monitor resource use on the system. It is important to confirm that the external receive location is not the cause of the bottleneck. For example, confirm that the remote file share is not saturated by high disk I/O, that the server hosting the remote outgoing queue is not saturated, and that the client used to generate HTTP/SOAP load is not starved for threads.

If the Host Queue - Length counter is climbing, the orchestrations are not completing fast enough. For more information, see the Perfmon counter table in this topic. This could be due to memory contention or CPU saturation.

If the orchestration servers are the bottleneck, use Perfmon to identify the source.

If the server is CPU bound, consider the following:

  • If the workflow is complex, consider splitting the orchestration into multiple smaller orchestrations.

    Note
    Splitting an orchestration into multiple workflows can cause additional latency and add complexity.

  • If you use complex maps, consider whether they can be moved to the Receive/Send ports. Be sure to verify which ports have additional bandwidth.

  • Consider scaling up the hardware or scaling out by configuring an additional processing server.

If the transmitting server is saturated on resources (for example, disk, memory, or CPU), consider scaling up the server or scaling out to additional send host servers. The sending tier can become the bottleneck if the destination (external to BizTalk) is unable to receive data fast enough, causing messages to build up in the MessageBox database (Application SendHostQ).

If all the endpoints are within the scope of the topology, isolate the cause at the destination. For example, determine if the HTTP/SOAP location is optimally configured to receive load. If not, consider scaling out. Also determine if the destination is growing due to excessive output messages delivered by BizTalk. If yes, you might need a maintenance plan to archive and purge the destination messages. Large numbers of files in a destination folder can severely impact the ability of the BizTalk service to commit data to the disk drive.
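
A maintenance plan of this kind can be as simple as a scheduled job that sweeps aged files out of the destination folder into an archive. The sketch below is a minimal illustration in Python; the folder paths, the retention window, and the function name are assumptions for illustration, not part of any BizTalk tooling.

```python
import os
import shutil
import time

def archive_destination(folder, archive_folder, max_age_days=7):
    """Move files older than max_age_days from the destination folder
    into an archive folder, so the folder's file count stays small and
    BizTalk can continue committing data to disk quickly."""
    os.makedirs(archive_folder, exist_ok=True)
    cutoff = time.time() - max_age_days * 86400
    moved = 0
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            shutil.move(path, os.path.join(archive_folder, name))
            moved += 1
    return moved
```

Scheduling this with Windows Task Scheduler (or purging instead of archiving, once the files are no longer needed) keeps the destination folder from degrading send performance.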

The Tracking host instance is responsible for moving the Business Activity Monitoring (BAM) and Health and Activity Tracking (HAT) data from the MessageBox database (TrackingData table) to the BizTalk Tracking and/or BAM Primary Import database tables. If multiple MessageBox databases are configured, the tracking host instance uses four threads per MessageBox database.

It is possible that the Tracking host instance is CPU bound. If it is, consider scaling up the server or scaling out by configuring an additional server with Host Tracking enabled. The multiple host instances will automatically balance load across the configured MessageBox databases. For more information about scaling, see Scaling Your Solutions.

If the TrackingData table in the MessageBox database grows large, it is usually because the data maintenance jobs on the BizTalk Tracking and/or BAM Primary Import databases are not running as configured, causing those databases to grow. After these databases grow too large, the Tracking host's ability to move data out of the TrackingData table is negatively impacted, and tracked data backs up in the MessageBox database tables. Growth of the TrackingData table causes throttling to start.

Configure the deployment topology so that different functionality runs in dedicated, isolated host instances. This way, each host instance gets its own set of resources (for example, on a 32-bit system, a 2-GB virtual memory address space, handles, and threads). If the server has sufficient CPU headroom and memory, multiple host instances can run on the same physical computer. If not, consider scaling out by moving the functionality to dedicated servers. Running the same functionality on multiple servers also provides a highly available configuration.

Object        | Instance  | Counter                   | Monitoring Purpose
--------------|-----------|---------------------------|--------------------
Processor     | _Total    | % Processor Time          | Resource Contention
Process       | BTSNTSvc  | Virtual Bytes             | Memory Leak/Bloat
Process       | BTSNTSvc  | Private Bytes             | Memory Leak/Bloat
Process       | BTSNTSvc  | Handle Count              | Resource Contention
Process       | BTSNTSvc  | Thread Count              | Resource Contention
Physical Disk | _Instance | % Idle Time               | Resource Contention
Physical Disk | _Instance | Average Disk Queue Length | Resource Contention
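
These counters can be sampled from a script as well as from the Perfmon UI. The sketch below shells out to the standard Windows typeperf tool to take one sample of each counter and parses its CSV output. The physical-disk instance name ("0 C:") and the helper names are illustrative assumptions; adjust the instance names to match your server.

```python
import csv
import io
import subprocess

# Counter paths from the table above.  BTSNTSvc is the BizTalk host
# service process; "0 C:" is an example PhysicalDisk instance.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Process(BTSNTSvc)\Virtual Bytes",
    r"\Process(BTSNTSvc)\Private Bytes",
    r"\Process(BTSNTSvc)\Handle Count",
    r"\Process(BTSNTSvc)\Thread Count",
    r"\PhysicalDisk(0 C:)\% Idle Time",
    r"\PhysicalDisk(0 C:)\Avg. Disk Queue Length",
]

def sample_counters(counters):
    """Take one sample of each counter with typeperf (-sc 1 = one
    sample) and return the raw CSV output."""
    out = subprocess.run(
        ["typeperf", "-sc", "1"] + list(counters),
        capture_output=True, text=True, check=True)
    return out.stdout

def parse_sample(csv_text, counters):
    """Map counter paths to float values from typeperf CSV output.
    The first CSV column is the timestamp; the remaining columns
    align positionally with the requested counters."""
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    header, data = rows[0], rows[1]
    return dict(zip(counters, (float(v) for v in data[1:])))
```

Logging these values at a regular interval during a load run makes it much easier to correlate a throughput drop with the resource that caused it.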

CPU Contention

If the processor is saturated, you can partition the application by separating receiving, sending, and orchestration functionality. To do this, create separate hosts, map them to specific functionality (receive/send/orchestrations/tracking), and add dedicated servers to these separate hosts. Orchestration functionality is often CPU-intensive; configuring the system so that orchestrations execute on a separate dedicated server can improve overall system throughput.

If multiple orchestrations are deployed, you can enlist them to different dedicated orchestration hosts. Mapping different physical servers to the dedicated orchestration hosts ensures that the different orchestrations are isolated and do not contend for shared resources either in the same physical address space or on the same server.

Memory Starvation

High-throughput scenarios place increased demand on system memory. Because a 32-bit process is limited in the amount of memory it can consume, it is recommended to separate the receive/process/send functionality into separate host instances so that each host gets its own 2-GB address space. In addition, if multiple host instances run on the same physical server, you can upgrade to 4 GB or 8 GB of memory to avoid swapping data from real memory to disk. Long-running orchestrations can hold onto allocated memory longer, which can cause memory bloat and throttling to start. Large messages can also cause high memory consumption.

You can ease the memory bloat problem that occurs when large messages are processed by lowering the Internal Message Queue Size and In-process Messages per CPU values for the specific host.

Disk Contention

If the disks are saturated (for example, with a large number of FILE/MSMQ transports) consider upgrading to multiple spindles and striping the disks with RAID 10. In addition, whenever using the FILE transport it is important to ensure that the receive and send folders do not grow larger than 50,000 files.
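
A quick way to enforce the 50,000-file guideline is a periodic check of the receive and send folders. The Python sketch below (the function name and reporting shape are illustrative) counts the files in each folder and reports any that exceed the limit:

```python
import os

FILE_LIMIT = 50000  # per-folder ceiling suggested above

def check_folder_counts(folders, limit=FILE_LIMIT):
    """Return a map of folder -> file count for every folder whose
    file count exceeds the limit, so an operator can drain or archive
    it before FILE transport performance degrades."""
    over = {}
    for folder in folders:
        count = sum(1 for entry in os.scandir(folder) if entry.is_file())
        if count > limit:
            over[folder] = count
    return over
```

Running this on a schedule against every FILE receive and send folder gives early warning before disk contention becomes visible in the throughput numbers.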

The receive folder can grow large if BizTalk Server throttles incoming data into the system. It is important to move data from the send folder so that growth in this folder does not impact the ability of BizTalk Server to write additional data. For non-transactional MSMQ queues, it is recommended to remotely create the receive queues so that disk contention is reduced on the BizTalk Server.

The remote non-transactional queue configuration also provides high availability as the remote server hosting the queue can be clustered.

Other System Resource Contention

Depending on the type of transport, it may be necessary to configure system resources like IIS for HTTP/SOAP (for example, MaxIOThreads, MaxWorkerThreads).

Downstream Bottlenecks

If the downstream system is unable to receive data fast enough from BizTalk, this output data will back up in the BizTalk databases. This results in bloat, causes throttling to start, shrinks the receive pipe, and impacts the overall throughput of the BizTalk system. A direct indication of this is Spool table growth. For more information about bottlenecks and the Spool table, see Identifying Bottlenecks in the MessageBox Database.

Throttling Impact

Throttling will eventually start to protect the system from reaching an unrecoverable state. Thus, you can use throttling to verify whether the system is functioning normally and discover the source of the problem. After you identify the cause of the bottleneck from the throttling state, analyze the other performance counters to determine the source of the problem.

For example, high contention on the MessageBox database could be due to high CPU use caused by excessive paging to disk under low-memory conditions. High contention on the MessageBox database could also be caused by high lock contention due to saturated disk drives.

Object                              | Instance          | Counter                             | Description
------------------------------------|-------------------|-------------------------------------|--------------------------------------------------
BizTalk:Messaging                   | RxHost            | Documents Received/Sec              | Incoming rate
BizTalk:Messaging                   | TxHost            | Documents Processed/Sec             | Outgoing rate
XLANG/s Orchestrations              | PxHost            | Orchestrations Completed/Sec.       | Processing rate
BizTalk:MessageBox:General Counters | MsgBoxName        | Spool Size                          | Cumulative size of all host queues
BizTalk:MessageBox:General Counters | MsgBoxName        | Tracking Data Size                  | Size of the TrackingData table in the MessageBox
BizTalk:MessageBox:Host Counters    | PxHost:MsgBoxName | Host Queue - Length                 | Number of messages in the specific host queue
BizTalk:MessageBox:Host Counters    | TxHost:MsgBoxName | Host Queue - Length                 | Number of messages in the specific host queue
BizTalk:Message Agent               | RxHost            | Database Size                       | Size of the publishing (PxHost) queue
BizTalk:Message Agent               | PxHost            | Database Size                       | Size of the publishing (TxHost) queue
BizTalk:Message Agent               | HostName          | Message Delivery Throttling State   | Affects XLANG and outbound transports
BizTalk:Message Agent               | HostName          | Message Publishing Throttling State | Affects XLANG and inbound transports

Where Do I Start?

Monitoring the Message Delivery Throttling State and the Message Publishing Throttling State for each host instance is a good place to start. If the value of either counter is not zero, the BizTalk system is throttling, and you can further analyze the cause of the bottleneck. For descriptions of the other performance counters, see Identifying Bottlenecks in the Database Tier.
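
As a sketch of this starting point, the check below takes already-sampled values of the two throttling-state counters for a host and reports which side of the host is being slowed, following the counter table above. The counter names mirror the table; the function and data shapes are illustrative assumptions, not a BizTalk API.

```python
# Counter names from the BizTalk:Message Agent object in the table above.
DELIVERY = "Message Delivery Throttling State"
PUBLISHING = "Message Publishing Throttling State"

def diagnose_host(counters):
    """Given {counter_name: value} samples for one host instance,
    report which side of the host a nonzero throttling state affects,
    per the table above.  A zero state means no throttling."""
    findings = []
    if counters.get(DELIVERY, 0) != 0:
        findings.append("delivery throttled: XLANG/outbound transports affected")
    if counters.get(PUBLISHING, 0) != 0:
        findings.append("publishing throttled: XLANG/inbound transports affected")
    return findings or ["not throttling"]
```

The nonzero state value itself identifies the trigger (for example, a memory threshold or in-flight message count), so log it alongside the finding when alerting.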

For a 1-1 deployment scenario where 1 message received results in 1 message processed and transmitted, if the outgoing rate does not equal the incoming rate, there is a backlog in the system. In this situation, you can monitor the Spool Size.

If the Spool is growing linearly, determine which Application Queue is responsible for the Spool growth.
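
One way to decide whether the Spool is growing linearly is to fit a least-squares slope to periodic samples of the Spool Size counter; a sustained positive slope means a backlog is forming. The sketch below is illustrative Python, not BizTalk tooling, and assumes evenly spaced samples.

```python
def spool_trend(samples):
    """Least-squares slope of evenly spaced Spool Size samples
    (rows per sampling interval).  Positive means the Spool is
    growing; near zero means the system is keeping up."""
    n = len(samples)
    if n < 2:
        return 0.0
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

The same slope check applied per Host Queue - Length counter identifies which application queue is responsible for the growth.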

If none of the Application Queues are growing and the Spool continues to grow, the purge jobs may be unable to keep up. This can occur if the SQL Server Agent is not running or there is other system resource contention on the SQL Server.

If one of the Application Queues is growing, diagnose the cause of this growth. Monitor the system resources on the system that is unable to drain the specific Application Queue (for example, Orchestration Host-Q is growing due to CPU starvation on the server). In addition, verify the values of the throttling counter for the specific host instance.

If the Delivery/Publishing State is not zero, check the value to confirm the reason for throttling (for example, memory threshold exceeded, in-flight message count too high, and so on).

You can use performance counters to detect the location of the bottleneck at a high level. However, once narrowed down, you might need to examine the code more closely to help ease the problem. The F1 Profiler that ships with Visual Studio can be a very helpful tool to help diagnose where the code is spending most of its cycles.

Symbols can help produce a more meaningful stack (especially for unmanaged code). For example, the F1 Profiler can pinpoint the number of invocations and the amount of time an API call takes to return. Drilling further down the stack, it may be possible to detect the underlying cause of the high latency, such as a blocking database query or a wait on an event.

From a hardware perspective, you can gain the biggest benefit from the onboard CPU cache. A larger CPU cache increases the cache hit rate, which reduces how often the system must fetch data and code from main memory.

Performance on 64-bit systems may appear lower than what can be achieved on 32-bit systems. This is possible for a few reasons, the most important one being memory.

Measuring performance on a 32-bit system with 2 GB of memory and comparing the results to a similar 64-bit system with 2 GB of memory is not comparing the same thing. The 64-bit system will appear to be disk-I/O bound (low % Idle Time and high Average Disk Queue Length) and CPU bound (maximum CPU and high context switching). However, this is not because performing file I/O on a 64-bit system is more expensive.

The 64-bit system is more memory intensive (64-bit addressing) which results in the operating system consuming most of the 2 GB available memory. When this happens, most other operations cause paging to disk which stresses the file subsystem. Therefore, the system spends CPU cycles paging in/out of memory both data and code and is impacted by the high disk latency cost. This manifests itself as both higher disk contention and higher CPU consumption.

The way to alleviate this problem is to scale up the server by upgrading the memory; scaling up to 8 GB is ideal. However, adding more memory will not improve throughput unless the source of the problem is CPU starvation due to low-memory conditions.

When low latency is important, you can use BAM to measure the time the system takes to complete each stage within the BizTalk system. Although you can use HAT to debug the state of messages and diagnose the source of problems in routing messages, you can use BAM to track various points through the message flow. By creating a BAM tracking profile that defines an activity with continuations, you can measure latency between different parts of the system to help track the most expensive stages within the workflow process.
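
Once a BAM activity records a timestamp at each milestone, per-stage latency is simply the difference between consecutive milestones. The sketch below (illustrative Python with hypothetical stage names; BAM itself stores this data in the BAM Primary Import database) computes the elapsed time spent in each stage from an ordered list of milestones:

```python
from datetime import datetime

def stage_latencies(milestones):
    """Given an ordered list of (stage_name, timestamp) milestones
    recorded for one BAM activity instance, return the seconds spent
    in each stage (the last milestone ends the final stage)."""
    latencies = {}
    for (stage, start), (_next_stage, end) in zip(milestones, milestones[1:]):
        latencies[stage] = (end - start).total_seconds()
    return latencies
```

Aggregating these per-stage latencies across many activity instances highlights the most expensive stage in the workflow, which is where tuning effort pays off first.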

© 2014 Microsoft