XmlReader converts carriage return + line feed (0xD0xA) to line feed (0xA)

Posted: January 16, 2015  |  Categories: BizTalk Uncategorized

I had a fight with the XmlReader class this week because every #xD, #xA, and #xD#xA line ending comes out as a single #xA character when you use this class. I am using the .NET 4.5 framework, but this old blog post explains why. How did I get here? It all started with troubleshooting why a flat file contained 0xA end-of-line (EOL) terminators instead of the expected 0xD0xA EOLs. The source of the problem was some code that used the XmlReader class. I was not able to find a way around the problem, but I am going to describe here how I proved it.

I started with this XML file:

<FFWrappedInXMLResponse xmlns="loopbackv2://datacom.adapters.loopback"><FFWrappedInXMLResult>hello
my name is mark
What is your name?&gt;</FFWrappedInXMLResult></FFWrappedInXMLResponse>

If you look at this file in a debug session in Visual Studio, you see:

<FFWrappedInXMLResponse xmlns="loopbackv2://datacom.adapters.loopback"><FFWrappedInXMLResult>hello\nmy name is mark\r\nWhat is your name?&gt;</FFWrappedInXMLResult></FFWrappedInXMLResponse>

Note that I have put a mixture of line feed and carriage return + line feed combinations into the file for testing purposes. Looking at the same file in a hex editor, you see:

[Image: hex editor view of the XML file]

Note the 0a and 0d0a EOLs.

Now, if this XML file is read using an XmlReader as in the code below, what do we see in the output.txt file?

using System;
using System.IO;
using System.Xml;

namespace XMLReaderTestConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            XmlReaderSettings settings = new XmlReaderSettings();
            settings.IgnoreWhitespace = false;

            // Dispose the writer, the stream and the reader when we are done.
            using (StreamWriter sWriter = new StreamWriter("Output.txt"))
            using (FileStream fs = new FileStream(@"testffdata.xml", FileMode.Open,
                FileAccess.Read, FileShare.Read))
            using (XmlReader reader = XmlReader.Create(fs, settings))
            {
                sWriter.AutoFlush = true;

                while (reader.Read())
                {
                    if (reader.IsStartElement("FFWrappedInXMLResult"))
                    {
                        // Read the element text and echo it to the console and to Output.txt.
                        string checkString = reader.ReadElementContentAsString();
                        Console.Write(checkString);
                        sWriter.Write(checkString);
                    }
                }
            }
            Console.ReadKey();
        }
    }
}

The output.txt file appears to have all the text on one line, like so:

hellomy name is markWhat is your name?>

If you look at this in a debug session in Visual Studio, you see only line feeds. All the carriage returns have disappeared:

hello\nmy name is mark\nWhat is your name?>

The hex editor shows only 0xA EOLs.

[Image: hex editor view of output.txt]

I hope this convinces you that the XmlReader converts 0xD0xA to 0xA.
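The same behaviour can be reproduced without any files, which makes it easier to check on another machine. The sketch below feeds the test text to an XmlReader from an in-memory string and prints the result with the control characters made visible. The class name EolDemo, the helper ReadRootText and the short root element name are made up for this illustration. What it shows is consistent with the end-of-line handling described in section 2.11 of the XML 1.0 specification, which requires a parser to pass line endings to the application as a single #xA.

```csharp
using System;
using System.IO;
using System.Xml;

static class EolDemo
{
    // Hypothetical helper: returns the text content of the root element of the given XML.
    public static string ReadRootText(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = false;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();                      // position on the root element
            return reader.ReadElementContentAsString();  // text as the reader reports it
        }
    }

    static void Main()
    {
        // Same mixture of EOLs as the test file: one \n and one \r\n.
        string text = ReadRootText("<r>hello\nmy name is mark\r\nWhat is your name?&gt;</r>");

        // Make the control characters visible before printing.
        Console.WriteLine(text.Replace("\r", "\\r").Replace("\n", "\\n"));
        // prints: hello\nmy name is mark\nWhat is your name?>
    }
}
```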

Why was I troubleshooting this? Some of my custom WCF LOB Adapters and custom BizTalk pipeline components read incoming XML messages using XmlReader under the hood, so they show the same behaviour. For example:

//LOB Adapter code fragment

private Message ExecuteLoopbackXMLFileCopy(Message requestMessage)
{
    XmlReader reader;
    Message message;

    try
    {
        reader = requestMessage.GetReaderAtBodyContents();
        // ...

In the end I have accepted this behaviour and will have to work around it now that I know about this limitation of the XmlReader class. I tried to find a setting or encoding that would avoid the problem, but nothing I tried worked.
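One possible workaround happens after reading rather than in the reader itself: rewrite the line endings of the extracted text before it goes into the flat file. This is only a sketch, and it assumes that every line in the output should end in 0xD0xA. It cannot recover the original mixture of EOLs, because that information is already gone by the time the reader hands over the text. The class name EolRestore and the helper ToCrLf are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class EolRestore
{
    // Hypothetical helper: rewrites every \r\n, lone \r or lone \n as \r\n.
    // "\r\n" comes first in the alternation so an existing CRLF is matched as one unit.
    public static string ToCrLf(string text)
    {
        return Regex.Replace(text, "\r\n|\r|\n", "\r\n");
    }

    static void Main()
    {
        // Text as XmlReader returned it in the test above: line feeds only.
        string fromReader = "hello\nmy name is mark\nWhat is your name?>";

        string restored = ToCrLf(fromReader);

        // Make the control characters visible before printing.
        Console.WriteLine(restored.Replace("\r", "\\r").Replace("\n", "\\n"));
        // prints: hello\r\nmy name is mark\r\nWhat is your name?>
    }
}
```

Calling ToCrLf on text that already uses 0xD0xA leaves it unchanged, so it should be safe to apply just before the text is written out.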
