Optimising Pipeline Performance: Do not read the message into a string

Continuing on from my last post, Junior spoke to Bill, who said: “The key is that anything that reads the message into a string, XmlDocument or XDocument is to be avoided; it works great with small messages but does not work with large messages.” I had already told Junior that I was suspicious of reading the message into the dataOut string. Note that the MSDN example also reads the entire message into a string and will suffer the same problems as this pipeline component.

//Create a string
string dataOut = new StreamReader(vStream).ReadToEnd();

Junior discovered that the XmlWriter class has a WriteBase64 method, which encodes the specified binary bytes as Base64 and writes out the resulting text. Could this write directly to the stream, bypassing the string and decreasing the memory usage? The refactored code is shown below.

OPTIMIZED CODE

public Microsoft.BizTalk.Message.Interop.IBaseMessage Execute(Microsoft.BizTalk.Component.Interop.IPipelineContext pc, Microsoft.BizTalk.Message.Interop.IBaseMessage inmsg)
{
var callToken = TraceManager.PipelineComponent.TraceIn("START PIPELINE PROCESSING");
BinaryReader binReader = new BinaryReader(inmsg.BodyPart.Data);
int bufferSize = 1000;
byte[] buffer = new byte[bufferSize];
int readBytes = 0;
VirtualStream vStream = new VirtualStream(VirtualStream.MemoryFlag.AutoOverFlowToDisk);

// Write the output using XmlWriter and VirtualStream.
// This is the AttachedDoc XML message that we are creating:
//<ns0:AttachedDocument xmlns:ns0="http://BT.Schemas.Internal/AttachedDocument">
//    <ns0:FileName>FileName_0</ns0:FileName>
//    <ns0:FilePath>FilePath_0</ns0:FilePath>
//    <ns0:DocumentType>DocumentType_0</ns0:DocumentType>
//    <ns0:StreamArray>GpM7</ns0:StreamArray>
//</ns0:AttachedDocument>
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;

XmlWriter xmlWriter = XmlWriter.Create(vStream, settings);

xmlWriter.WriteStartDocument();
xmlWriter.WriteStartElement("ns0", "AttachedDocument", "http://BT.Schemas.Internal/AttachedDocument");

xmlWriter.WriteStartElement("FileName");
xmlWriter.WriteString("FileName_0");
xmlWriter.WriteEndElement();

xmlWriter.WriteStartElement("FilePath");
xmlWriter.WriteString("FilePath_0");
xmlWriter.WriteEndElement();

xmlWriter.WriteStartElement("DocumentType");
xmlWriter.WriteString("DocumentType_0");
xmlWriter.WriteEndElement();

xmlWriter.WriteStartElement("StreamArray");

//Base64 encode the message stream and write to the output stream directly.
//Read may return fewer bytes than requested before the end of the stream,
//so keep reading until it returns 0.
while ((readBytes = binReader.Read(buffer, 0, bufferSize)) > 0)
{
xmlWriter.WriteBase64(buffer, 0, readBytes);
}
binReader.Close();

xmlWriter.WriteEndDocument();
xmlWriter.Close();

vStream.Position = 0;

IBaseMessage outmsg = pc.GetMessageFactory().CreateMessage();
outmsg.Context = pc.GetMessageFactory().CreateMessageContext();

// Iterate through inbound message context properties and add to the new outbound message
for (int contextCounter = 0; contextCounter < inmsg.Context.CountProperties; contextCounter++)
{
string Name;
string Namespace;

object PropertyValue = inmsg.Context.ReadAt(contextCounter, out Name, out Namespace);

// If the property has been promoted, respect the settings
if (inmsg.Context.IsPromoted(Name, Namespace))
{
outmsg.Context.Promote(Name, Namespace, PropertyValue);
}
else
{
outmsg.Context.Write(Name, Namespace, PropertyValue);
}
}

outmsg.AddPart("Body", pc.GetMessageFactory().CreateMessagePart(), true);

outmsg.BodyPart.Data = vStream;

//pc.ResourceTracker.AddResource(ms);
pc.ResourceTracker.AddResource(vStream);
outmsg.BodyPart.Data.Position = 0;

TraceManager.PipelineComponent.TraceInfo("END PIPELINE PROCESSING", callToken);
TraceManager.PipelineComponent.TraceOut(callToken);

return outmsg;

}
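
The core of the technique can be tried outside BizTalk. The following is a minimal sketch, not the component itself: it assumes a plain console program, uses MemoryStream in place of VirtualStream, and the names Base64Streaming and WriteBase64Element are hypothetical. It copies a stream into a Base64 element in fixed-size chunks, so the copy loop never holds more than one buffer of the input at a time.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration; not part of BizTalk or the original component.
public static class Base64Streaming
{
    // Copies 'input' into an element named 'elementName' as Base64 text, one buffer at a time.
    public static void WriteBase64Element(Stream input, Stream output, string elementName, int bufferSize)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        byte[] buffer = new byte[bufferSize];

        // CloseOutput defaults to false, so 'output' stays open after the writer is disposed.
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement(elementName);

            int read;
            // Stream.Read may return fewer bytes than requested, so loop until it returns 0.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Successive calls continue the same Base64 run; the writer carries
                // any leftover bytes over to the next call.
                writer.WriteBase64(buffer, 0, read);
            }

            writer.WriteEndElement();
        }
    }

    public static void Main()
    {
        byte[] payload = Encoding.UTF8.GetBytes("Hello, pipeline!");
        using (MemoryStream input = new MemoryStream(payload))
        using (MemoryStream output = new MemoryStream())
        {
            // A deliberately tiny buffer forces several WriteBase64 calls.
            WriteBase64Element(input, output, "StreamArray", 5);
            Console.WriteLine(Encoding.UTF8.GetString(output.ToArray()));
        }
    }
}
```

The buffer size does not need to be a multiple of three here, which is why the pipeline code can use an arbitrary value such as 1000.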

The comparison of memory usage against the partially optimized version, shown below, is truly amazing. The memory usage of the new optimized version is virtually nothing. Furthermore, if we submit a 1.2 GB binary file, the partially optimized pipeline crashes with an out-of-memory error. The newly optimized version does not; it just burps, swallows the file and writes it out. AMAZING!
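
Separately from the measured comparison, a rough size estimate suggests why a file of that size is a problem for any approach that holds the encoded message as one string. This is a back-of-envelope sketch under stated assumptions: it takes 1.2 GB to mean 1.2 × 10^9 bytes, it assumes the Base64 text ends up in a single .NET string, and the class name Base64SizeEstimate is hypothetical. It relies only on two general facts: Base64 emits 4 characters for every 3 input bytes, and .NET strings store 2 bytes per character.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class Base64SizeEstimate
{
    // Base64 emits 4 output characters for every 3 input bytes (rounded up).
    public static long Base64Chars(long inputBytes)
    {
        return (inputBytes + 2) / 3 * 4;
    }

    // .NET strings are UTF-16, so each character occupies 2 bytes.
    public static long StringBytes(long chars)
    {
        return chars * 2;
    }

    public static void Main()
    {
        long input = 1200000000L; // assumption: 1.2 GB taken as 1.2 * 10^9 bytes
        long chars = Base64Chars(input);
        Console.WriteLine("Base64 characters: " + chars);
        Console.WriteLine("Bytes as a .NET string: " + StringBytes(chars));
    }
}
```

Under these assumptions the encoded text is about 1.6 billion characters, or roughly 3.2 GB as a string, before counting any intermediate copies. The streaming version never needs that text in one piece.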

In summary, I have shown in this post and the previous one how to make a pipeline scream. And remember: “anything that reads the message into a string, XmlDocument or XDocument is to be avoided”.