Optimising Pipeline Performance: XmlWriter vs. XDocument

Posted: January 9, 2015  |  Categories: BizTalk Uncategorized
Tags: Pipeline

Junior and I looked at the pipeline component listed below. This component reads a document, converts it to a Base64-encoded string, adds the string to a typed XML message and sends it back. We decided to refactor the code because it is less than optimal: it loads the entire document into memory using an XDocument object. We used Steef-Jan’s best practice example code as a starting point.


LESS THAN OPTIMAL CODE

public Microsoft.BizTalk.Message.Interop.IBaseMessage Execute(Microsoft.BizTalk.Component.Interop.IPipelineContext pc, Microsoft.BizTalk.Message.Interop.IBaseMessage inmsg)
{
    var callToken = TraceManager.PipelineComponent.TraceIn("START PIPELINE PROCESSING");

    //Reading in a non-streaming way. REQUIRES OPTIMIZATION
    BinaryReader binReader = new BinaryReader(inmsg.BodyPart.Data);
    byte[] dataOutAsBytes = binReader.ReadBytes((int)inmsg.BodyPart.Data.Length);
    binReader.Close();

    string dataOut = System.Convert.ToBase64String(dataOutAsBytes);

    // This is the AttachedDoc XML message that we are creating
    //<ns0:AttachedDocument xmlns:ns0="http://BT.Schemas.Internal/AttachedDocument">
    //    <ns0:FileName>FileName_0</ns0:FileName>
    //    <ns0:FilePath>FilePath_0</ns0:FilePath>
    //    <ns0:DocumentType>DocumentType_0</ns0:DocumentType>
    //    <ns0:StreamArray>GpM7</ns0:StreamArray>
    //</ns0:AttachedDocument>

    XNamespace nsAttachedDoc = XNamespace.Get(@"http://BT.Schemas.Internal/AttachedDocument");

    //This loads an XML document into DOM. REQUIRES OPTIMIZATION
    XDocument AttachedDocMsg = new XDocument(
        new XElement(nsAttachedDoc + "AttachedDocument",
            new XAttribute(XNamespace.Xmlns + "ns0", nsAttachedDoc.NamespaceName),
            new XElement(nsAttachedDoc + "FileName", "FileName_0"),
            new XElement(nsAttachedDoc + "FilePath", "FilePath_0"),
            new XElement(nsAttachedDoc + "DocumentType", "DocumentType_0"),
            new XElement(nsAttachedDoc + "StreamArray", dataOut)
        )
    );

    dataOut = AttachedDocMsg.ToString();

    //Uses memory to store large messages. REQUIRES OPTIMIZATION
    MemoryStream ms = new System.IO.MemoryStream(System.Text.Encoding.ASCII.GetBytes(dataOut));

    IBaseMessage outmsg = pc.GetMessageFactory().CreateMessage();
    outmsg.Context = pc.GetMessageFactory().CreateMessageContext();

    // Iterate through inbound message context properties and add to the new outbound message
    for (int contextCounter = 0; contextCounter < inmsg.Context.CountProperties; contextCounter++)
    {
        string Name;
        string Namespace;

        object PropertyValue = inmsg.Context.ReadAt(contextCounter, out Name, out Namespace);

        // If the property has been promoted, respect the settings
        if (inmsg.Context.IsPromoted(Name, Namespace))
        {
            outmsg.Context.Promote(Name, Namespace, PropertyValue);
        }
        else
        {
            outmsg.Context.Write(Name, Namespace, PropertyValue);
        }
    }

    outmsg.AddPart("Body", pc.GetMessageFactory().CreateMessagePart(), true);
    outmsg.BodyPart.Data = ms;

    pc.ResourceTracker.AddResource(ms);
    outmsg.BodyPart.Data.Position = 0;

    TraceManager.PipelineComponent.TraceInfo("END PIPELINE PROCESSING", callToken);
    TraceManager.PipelineComponent.TraceOut(callToken);

    return outmsg;
}

[Image: BizTalk send port and receive location configuration]

Our first attempt to optimise the code was to replace the memory stream with a virtual stream. First, we set up a performance lab to measure the difference between the original and optimised components, as follows:

  1. Deployed two pipelines to the BizTalk server runtime: one using the optimised pipeline component and one using the original component.
  2. Configured two send ports with the two pipelines and two dedicated hosts, and bound both send ports to a pass-through receive location as shown above.
  3. Set up Performance Monitor to capture the Private Bytes counter for the two dedicated BizTalk hosts.
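
The measurement in step 3 can also be scripted. The following is a minimal sketch, not what we ran in the lab: it samples Process.PrivateMemorySize64, which as far as we know reports the same value as the Private Bytes counter. The helper name PeakPrivateBytes, the sample count and the interval are assumptions made up for illustration, and the demo reads the current process rather than a BizTalk host instance.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PrivateBytesSampler
{
    // Hypothetical helper: returns the largest private memory size (in bytes)
    // seen across 'samples' readings taken 'intervalMs' milliseconds apart.
    public static long PeakPrivateBytes(Process process, int samples, int intervalMs)
    {
        long peak = 0;
        for (int i = 0; i < samples; i++)
        {
            process.Refresh(); // discard cached values before each reading
            peak = Math.Max(peak, process.PrivateMemorySize64);
            if (i < samples - 1)
            {
                Thread.Sleep(intervalMs);
            }
        }
        return peak;
    }

    public static void Main()
    {
        // Demo on the current process. In a lab like the one described above,
        // the target would be each dedicated BizTalk host process instead.
        using (Process self = Process.GetCurrentProcess())
        {
            long peak = PeakPrivateBytes(self, 5, 200);
            Console.WriteLine("Peak private bytes: {0:N0}", peak);
        }
    }
}
```

Sampling the peak rather than a single reading matters here, because the memory spike only lasts while a message is being processed.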

After the change to use a virtual stream instead of a memory stream, we observed very little difference in memory usage. It was not until we replaced XDocument with XmlWriter that we noticed a change in memory usage, shown below.

[Images: Performance Monitor traces of private bytes for the two hosts]

The difference in memory usage is more pronounced with a larger file. The picture on the left is for a 14.3 MB PDF file, whereas the one on the right is for a 147 MB binary file.

PARTIALLY OPTIMISED CODE

public Microsoft.BizTalk.Message.Interop.IBaseMessage Execute(Microsoft.BizTalk.Component.Interop.IPipelineContext pc, Microsoft.BizTalk.Message.Interop.IBaseMessage inmsg)
{
    var callToken = TraceManager.PipelineComponent.TraceIn("START PIPELINE PROCESSING");

    //Create BinaryReader, VirtualStream and StreamWriter instance https://code.msdn.microsoft.com/BizTalk-2013-Custom-489a1cde
    BinaryReader binReader = new BinaryReader(inmsg.BodyPart.Data);
    VirtualStream vStream = new VirtualStream(VirtualStream.MemoryFlag.AutoOverFlowToDisk);
    StreamWriter sWriter = new StreamWriter(vStream);

    //Write message body to a virtual memory stream
    sWriter.Write(System.Convert.ToBase64String(binReader.ReadBytes((int)inmsg.BodyPart.Data.Length)));
    sWriter.Flush();
    binReader.Close();

    vStream.Seek(0, SeekOrigin.Begin);

    //Create a string
    string dataOut = new StreamReader(vStream).ReadToEnd();
    //Still non-streaming: ReadToEnd builds the whole Base64 string in memory. REQUIRES OPTIMIZATION

    // This is the AttachedDoc XML message that we are creating
    //<ns0:AttachedDocument xmlns:ns0="http://BT.Schemas.Internal/AttachedDocument">
    //    <ns0:FileName>FileName_0</ns0:FileName>
    //    <ns0:FilePath>FilePath_0</ns0:FilePath>
    //    <ns0:DocumentType>DocumentType_0</ns0:DocumentType>
    //    <ns0:StreamArray>GpM7</ns0:StreamArray>
    //</ns0:AttachedDocument>

    //Rewind so the XML overwrites the Base64 text (the XML is longer, so nothing stale is left behind)
    vStream.Position = 0;

    //Write the output using XmlWriter and VirtualStream
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.OmitXmlDeclaration = true;

    XmlWriter xmlWriter = XmlWriter.Create(vStream, settings);

    string nsAttachedDoc = "http://BT.Schemas.Internal/AttachedDocument";

    //Use the (prefix, localName, ns) overload. The two-argument overload is
    //(localName, ns), so it would emit an element named "ns0" instead.
    //Passing the prefix and namespace here also declares xmlns:ns0 on the root.
    xmlWriter.WriteStartDocument();
    xmlWriter.WriteStartElement("ns0", "AttachedDocument", nsAttachedDoc);

    xmlWriter.WriteStartElement("ns0", "FileName", nsAttachedDoc);
    xmlWriter.WriteString("FileName_0");
    xmlWriter.WriteEndElement();

    xmlWriter.WriteStartElement("ns0", "FilePath", nsAttachedDoc);
    xmlWriter.WriteString("FilePath_0");
    xmlWriter.WriteEndElement();

    xmlWriter.WriteStartElement("ns0", "DocumentType", nsAttachedDoc);
    xmlWriter.WriteString("DocumentType_0");
    xmlWriter.WriteEndElement();

    xmlWriter.WriteStartElement("ns0", "StreamArray", nsAttachedDoc);
    xmlWriter.WriteString(dataOut);

    //WriteEndDocument closes any elements that are still open
    xmlWriter.WriteEndDocument();
    xmlWriter.Close();

    vStream.Position = 0;

    IBaseMessage outmsg = pc.GetMessageFactory().CreateMessage();
    outmsg.Context = pc.GetMessageFactory().CreateMessageContext();

    // Iterate through inbound message context properties and add to the new outbound message
    for (int contextCounter = 0; contextCounter < inmsg.Context.CountProperties; contextCounter++)
    {
        string Name;
        string Namespace;

        object PropertyValue = inmsg.Context.ReadAt(contextCounter, out Name, out Namespace);

        // If the property has been promoted, respect the settings
        if (inmsg.Context.IsPromoted(Name, Namespace))
        {
            outmsg.Context.Promote(Name, Namespace, PropertyValue);
        }
        else
        {
            outmsg.Context.Write(Name, Namespace, PropertyValue);
        }
    }

    outmsg.AddPart("Body", pc.GetMessageFactory().CreateMessagePart(), true);

    outmsg.BodyPart.Data = vStream;

    //pc.ResourceTracker.AddResource(ms);
    pc.ResourceTracker.AddResource(vStream);
    outmsg.BodyPart.Data.Position = 0;

    TraceManager.PipelineComponent.TraceInfo("END PIPELINE PROCESSING", callToken);
    TraceManager.PipelineComponent.TraceOut(callToken);

    return outmsg;
}


SUMMARY

This shows how using XmlWriter with a virtual stream consumes less memory than instantiating an XDocument. This is evidence that supports the MSDN recommendations for optimizing pipelines. Junior has learnt to avoid loading an entire XML document into memory by using XmlWriter. Junior has now been set the task of improving this pipeline component further by writing it in a streaming fashion and is reading an article referenced by Steef-Jan. Wish them luck.
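
To give an idea of what "streaming fashion" could mean for this message, here is a minimal sketch that is not taken from the follow-up post. It assumes the payload can be read in small buffers and passed to XmlWriter.WriteBase64, so neither the raw bytes nor the Base64 string is ever held in memory as a whole. The class AttachedDocumentWriter, its Write method and the buffer size are hypothetical names and values chosen for illustration. The demo uses MemoryStream so that it runs without BizTalk; in a pipeline component the output would presumably be a VirtualStream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class AttachedDocumentWriter
{
    // Namespace taken from the sample message in this post.
    private const string Ns = "http://BT.Schemas.Internal/AttachedDocument";

    // Hypothetical helper: wraps 'input' in an AttachedDocument message written
    // to 'output', Base64-encoding the payload one buffer at a time.
    public static void Write(Stream input, Stream output,
                             string fileName, string filePath, string documentType)
    {
        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            Encoding = new UTF8Encoding(false), // UTF-8 without a byte order mark
            CloseOutput = false                 // leave 'output' open for the caller
        };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            // Overloads that take prefix, local name and namespace URI.
            writer.WriteStartElement("ns0", "AttachedDocument", Ns);
            writer.WriteElementString("ns0", "FileName", Ns, fileName);
            writer.WriteElementString("ns0", "FilePath", Ns, filePath);
            writer.WriteElementString("ns0", "DocumentType", Ns, documentType);

            writer.WriteStartElement("ns0", "StreamArray", Ns);
            byte[] buffer = new byte[4096]; // assumed size; any size works
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // The writer carries leftover bytes over to the next call, so
                // chunks do not have to be multiples of three bytes.
                writer.WriteBase64(buffer, 0, read);
            }
            writer.WriteEndElement(); // StreamArray
            writer.WriteEndElement(); // AttachedDocument
        }
    }
}

public static class Program
{
    public static void Main()
    {
        byte[] payload = Encoding.UTF8.GetBytes("hello pipeline");
        using (var input = new MemoryStream(payload))
        using (var output = new MemoryStream())
        {
            AttachedDocumentWriter.Write(input, output, "FileName_0", "FilePath_0", "DocumentType_0");
            Console.WriteLine(Encoding.UTF8.GetString(output.ToArray()));
        }
    }
}
```

With this shape, peak memory is bounded by the buffer size and the writer's internal buffers rather than by the size of the attachment.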

See the next post for the optimized code.
