Optimizing Pipeline Performance
This topic describes guidelines for optimizing performance of pipelines in a BizTalk Server solution.
-
Because pipeline components have a significant impact on performance (for example, a pass-through pipeline component performs up to 30 percent better than an XML assembler/disassembler pipeline component), make sure that any custom pipeline components perform optimally before implementing them in your deployment. Minimize the number of pipeline components in your custom pipelines if you want to maximize the overall performance of your BizTalk application.
-
You also can improve overall performance by reducing the message persistence frequency in your pipeline component and by coding your component to minimize redundancy. Every custom assembly and in particular artifacts that could potentially disrupt performance, like custom tracking components, should be tested separately under heavy load condition to observe their behavior when the system is working at full capacity and to find any possible bottlenecks.
-
If you need to read the inbound message inside a pipeline component, avoid loading the entire document into memory using an XmlDocument object. The amount of space required by an instance of the XmlDocument class to load and create an in-memory representation of a XML document is up to 10 times the actual message size. In order to read a message, you should use an XmlTextReader object along with an instance of the following classes:
-
VirtualStream (Microsoft.BizTalk.Streaming.dll) - The source code for this class is located in two locations under the Pipelines SDK as follows: SDK\Samples\Pipelines\ArbitraryXPathPropertyHandler and SDK\Samples\Pipelines\SchemaResolverComponent\SchemaResolverFlatFileDasm.
-
ReadOnlySeekableStream (Microsoft.BizTalk.Streaming.dll).
-
SeekAbleReadOnlyStream - The source code for this class is located in two locations under the Pipelines SDK as follows: SDK\Samples\Pipelines\ArbitraryXPathPropertyHandler and SDK\Samples\Pipelines\SchemaResolverComponent\SchemaResolverFlatFileDasm.
-
VirtualStream (Microsoft.BizTalk.Streaming.dll) - The source code for this class is located in two locations under the Pipelines SDK as follows: SDK\Samples\Pipelines\ArbitraryXPathPropertyHandler and SDK\Samples\Pipelines\SchemaResolverComponent\SchemaResolverFlatFileDasm.
-
Use the PassThruReceive and the PassThruTransmit standard pipelines whenever possible. They do not contain any pipeline component and do not perform any processing of the message. For this reason, they ensure maximum performance in receiving or sending messages. You can use a PassThruReceive pipeline on a receive location if you need to publish a binary document to the BizTalk MessageBox and a PassThruTransmit pipeline on a send port if you need to send out a binary message. You can also use the PassThruTransmit pipeline on a physical send port bound to an orchestration if the message has been formatted and is ready to be transmitted. You will need to use a different approach if you need to accomplish one of the following actions:
-
Promote properties on the context of an inbound XML or Flat File message.
-
Apply a map inside a receive location.
-
Apply a map in an orchestration that subscribes to a message.
-
Apply a map on a send port that subscribes to a message.
-
Promote properties on the context of an inbound XML or Flat File message.
-
Acquire resources as late as possible and release them as early as possible. For example, if you need to access data on a database, open the connection as late as possible and close it as soon as possible. Use the C# using statement to implicitly release disposable objects or the finally block of a try-catch-finally statement to explicitly dispose your objects. Instrument your source code to make your components simple to debug.
-
Eliminate any components from your pipelines that are not strictly required to speed up message processing.
-
Inside a receive pipeline, you should promote items to the message context only if you need them for message routing (Orchestrations, Send Ports) or demotion of message context properties (Send Ports).
-
If you need to include metadata with a message, and you don't use the metadata for routing or demotion purposes, use the IBaseMessageContext.Write method instead of the IBaseMessageContext.Promote method.
-
If you need to extract information from a message using an XPath expression, avoid loading the entire document into memory using an XmlDocument object just to use the SelectNodes or SelectSingleNode methods. Alternatively, use the techniques described in Optimizing Memory Usage with Streaming.
The following techniques describe how to minimize the memory footprint of a message when loading the message into a pipeline.
Use ReadOnlySeekableStream and VirtualStream to process a message from a pipeline component
It is considered a best practice to avoid loading the entire message into memory inside pipeline components. A preferable approach is to wrap the inbound stream with a custom stream implementation, and then as read requests are made, the custom stream implementation reads the underlying, wrapped stream and processes the data as it is read (in a pure streaming manner). This can be very hard to implement and may not be possible, depending on what needs to be done with the stream. In this case, use the ReadOnlySeekableStream and VirtualStream classes exposed by the Microsoft.BizTalk.Streaming.dll. An implementation of these is also provided in Arbitrary XPath Property Handler (BizTalk Server Sample) (http://go.microsoft.com/fwlink/?LinkId=160069) in the BizTalk SDK.ReadOnlySeekableStream ensures that the cursor can be repositioned to the beginning of the stream. The VirtualStream will use a MemoryStream internally, unless the size is over a specified threshold, in which case it will write the stream to the file system. Use of these two streams in combination (using VirtualStream as persistent storage for the ReadOnlySeekableStream) provides both “seekability” and “overflow to file system” capabilities. This accommodates the processing of large messages without loading the entire message into memory. The following code could be used in a pipeline component to implement this functionality.
int bufferSize = 0x280; int thresholdSize = 0x100000; Stream vStream = new VirtualStream(bufferSize, thresholdSize); Stream seekStream = new ReadOnlySeekableStream(inboundStream, vStream, bufferSize);
This code provides an “overflow threshold” by exposing bufferSize and thresholdSize variables on each receive location or send port configuration. This overflow threshold can then be adjusted by developers or administrators for different message types and different configurations (such as 32-bit vs. 64-bit).
Using XPathReader and XPathCollection to extract a given IBaseMessage object from within a custom pipeline component.
If specific values need to be pulled from an XML document, instead of using the SelectNodes and SelectSingleNode methods exposed by the XmlDocument class, use an instance of the XPathReader class provided by the Microsoft.BizTalk.XPathReader.dll assembly as illustrated in the following code example.
Note |
|---|
| For smaller messages especially, using an XmlDocument with SelectNodes or SelectSingleNode may provide better performance than using XPathReader, but XPathReader allows you to keep a flat memory profile for your application. |
This example shows how to use the XPathReader and XPathCollection to extract a given IBaseMessage object from within a custom pipeline component.
public IBaseMessage Execute(IPipelineContext context, IBaseMessage message)
{
try
{
...
IBaseMessageContext messageContext = message.Context;
if (string.IsNullOrEmpty(xPath) && string.IsNullOrEmpty(propertyValue))
{
throw new ArgumentException(...);
}
IBaseMessagePart bodyPart = message.BodyPart;
Stream inboundStream = bodyPart.GetOriginalDataStream();
VirtualStream virtualStream = new VirtualStream(bufferSize, thresholdSize);
ReadOnlySeekableStream readOnlySeekableStream = new ReadOnlySeekableStream(inboundStream, virtualStream, bufferSize);
XmlTextReader xmlTextReader = new XmlTextReader(readOnlySeekableStream);
XPathCollection xPathCollection = new XPathCollection();
XPathReader xPathReader = new XPathReader(xmlTextReader, xPathCollection);
xPathCollection.Add(xPath);
bool ok = false;
while (xPathReader.ReadUntilMatch())
{
if (xPathReader.Match(0) && !ok)
{
propertyValue = xPathReader.ReadString();
messageContext.Promote(propertyName, propertyNamespace, propertyValue);
ok = true;
}
}
readOnlySeekableStream.Position = 0;
bodyPart.Data = readOnlySeekableStream;
}
catch (Exception ex)
{
if (message != null)
{
message.SetErrorInfo(ex);
}
...
throw ex;
}
return message;
}
Another method for implementing a custom pipeline component which uses a streaming approach is to use the .NET XmlReader and XmlWriter classes in conjunction with the XmlTranslatorStream class provided by BizTalk Server. For example, the NamespaceTranslatorStream class contained in the Microsoft.BizTalk.Pipeline.Components assembly inherits from XmlTranslatorStream and can be used to replace an old namespace with a new one in the XML document contained in the stream. In order to use this functionality inside a custom a pipeline component, you can wrap the original data stream of the message body part with a new instance of the NamespaceTranslatorStream class and return the latter. In this way the inbound message is not read or processed inside the pipeline component, but only when the stream is read by a subsequent component in the same pipeline or is finally consumed by the Message Agent before publishing the document to the BizTalk Server MessageBox.
The following example illustrates how to use this functionality.
public IBaseMessage Execute(IPipelineContext context, IBaseMessage message)
{
IBaseMessage outboundMessage = message;
try
{
if (context == null)
{
throw new ArgumentException(Resources.ContextIsNullMessage);
}
if (message == null)
{
throw new ArgumentException(Resources.InboundMessageIsNullMessage);
}
IBaseMessagePart bodyPart = message.BodyPart;
Stream stream = new NamespaceTranslatorStream(context,
bodyPart.GetOriginalDataStream(),
oldNamespace,
newNamespace);
context.ResourceTracker.AddResource(stream);
bodyPart.Data = stream;
}
catch (Exception ex)
{
if (message != null)
{
message.SetErrorInfo(ex);
}
throw ex;
}
return outboundMessage;
}
The following information was taken from Nic Barden’s blog, http://blogs.objectsharp.com/cs/blogs/nbarden/archive/2008/04/14/developing-streaming-pipeline-components-part-1.aspx (http://go.microsoft.com/fwlink/?LinkId=160228). This table provides a summarized comparison of loading messages into pipelines using an in-memory approach and using a streaming approach.
| Comparison of... | Streaming | In memory |
|---|---|---|
|
Memory usage per message |
Low, regardless of message size |
High (varies depending on message size) |
|
Common classes used to process XML data |
Built in and custom derivations of: XmlTranslatorStream, XmlReader and XmlWriter |
XmlDocument, XPathDocument, MemoryStream and VirtualStream |
|
Documentation |
Poor – many un-supported and un-documented BizTalk classes |
Very good - .NET Framework classes |
|
Location of “Processing Logic” code |
|
Directly from the Execute method of the pipeline component. |
|
Data |
Re-created at each wrapping layer as data is read through it. |
Read in, modified and written out at each component prior to next component being called. |
Note