De-batching within an orchestration using XPath or calling a pipeline

A colleague of mine recently asked me what the best way is to de-batch a message in an orchestration, and I said I thought the best way was to call a pipeline. Another colleague then chipped in and asked why not use XPath to split the message instead. I said that I thought XPath was OK but performed more slowly with large messages. I was surprised to find that when I searched for something to support my theory, I could not find anything. So I created some test orchestrations to prove my point, and I am going to share the results of those tests here.

A long time ago now, Stephen Thomas wrote an interesting blog post examining the performance characteristics of de-batching messages, called Debatching Options and Performance Considerations in BizTalk 2004. At that time it was not possible to call a pipeline within an orchestration, so Stephen did not discuss this option. Later on he described how to call a pipeline within an orchestration, in Calling A Receive Pipeline Inside an Orchestration in BizTalk 2006, but he did not follow it up with any examination of how it performed.

The XPath de-batching orchestration is based on Stephen’s example, Envelope and XPath Debatching in an Orchestration Lab, except that I added namespaces to the messages.
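To make the idea of an XPath split concrete, here is a minimal sketch of the general pattern: count the child records, then select each one by position inside a loop. It is written in Python rather than as a BizTalk expression shape, so it only illustrates the looping technique and says nothing about orchestration behaviour. The namespaces, the element names (Envelope, Order, Id) and the function name are all assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, invented for this illustration only.
NS = {
    "e": "http://example.org/envelope",
    "m": "http://example.org/message",
}

# A tiny stand-in for the batch: one envelope wrapping three messages.
BATCH = """<e:Envelope xmlns:e="http://example.org/envelope"
                       xmlns:m="http://example.org/message">
  <m:Order><m:Id>1</m:Id></m:Order>
  <m:Order><m:Id>2</m:Id></m:Order>
  <m:Order><m:Id>3</m:Id></m:Order>
</e:Envelope>"""


def debatch_by_position(xml_text):
    """Split the batch by counting records, then selecting each by index."""
    root = ET.fromstring(xml_text)
    # Step 1: count the records under the envelope.
    count = len(root.findall("m:Order", NS))
    messages = []
    # Step 2: one positional XPath query per record (XPath positions are 1-based).
    for i in range(1, count + 1):
        record = root.find(f"m:Order[{i}]", NS)
        messages.append(ET.tostring(record, encoding="unicode"))
    return messages


if __name__ == "__main__":
    for message in debatch_by_position(BATCH):
        print(message.strip())
```

Each pass through the loop issues a separate query against the whole batch, which is the part of the pattern that the tests in this post put under the microscope.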

The call pipeline de-batching orchestration is also based on Stephen’s example, Calling a Receive Pipeline Inside an Orchestration, except that I used a receive pipeline with an envelope schema and a message schema.

I created a test message that contained the envelope and 500 messages for my test. The solution I used for the tests can be downloaded from here. All of my tests were run on the same not-too-flash BizTalk 2006 R2 virtual machine on my desktop. The results are shown below.

Type	Input XML size (bytes)	Output XML size (bytes)	Documents processed	Documents processed/sec	Orchestration end - start (sec)
Call receive pipeline	69234	133	500	893	0.56
Call receive pipeline	69234	133	500	1121	0.446
Call receive pipeline	69234	133	500	1064	0.47
Call receive pipeline	69234	133	500	1101	0.454
Call receive pipeline	69234	133	500	1106	0.452
XPath	69234	133	500	59	8.496
XPath	69234	133	500	67	7.503
XPath	69234	133	500	67	7.463
XPath	69234	133	500	66	7.56
XPath	69234	133	500	69	7.242
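As a quick sanity check on the table, the documents-per-second column can be re-derived from the elapsed times, and the two methods can be compared on their average elapsed time. The short Python sketch below does that using only the figures from the table; the function names are just illustrative.

```python
from statistics import mean

DOCS = 500  # documents in the test batch

# Elapsed times in seconds (orchestration end minus start), from the table.
ELAPSED = {
    "Call receive pipeline": [0.56, 0.446, 0.47, 0.454, 0.452],
    "XPath": [8.496, 7.503, 7.463, 7.56, 7.242],
}


def docs_per_sec(seconds, docs=DOCS):
    """Throughput for one run, rounded to whole documents per second."""
    return round(docs / seconds)


def mean_elapsed(method):
    """Average elapsed time in seconds over the five runs of a method."""
    return mean(ELAPSED[method])


def speedup():
    """How many times faster the pipeline call is than XPath, on average."""
    return mean_elapsed("XPath") / mean_elapsed("Call receive pipeline")


if __name__ == "__main__":
    for method, times in ELAPSED.items():
        rates = [docs_per_sec(t) for t in times]
        print(f"{method}: {rates} docs/sec, mean {mean_elapsed(method):.3f} s")
    print(f"Pipeline is about {speedup():.0f}x faster on average")
```

On these numbers the average run takes about 0.48 seconds for the pipeline call and about 7.65 seconds for XPath, a ratio of roughly 16 to 1.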

My theory is correct. In this test, calling a receive pipeline in an orchestration to split a message is an order of magnitude faster than using XPath to split the message: roughly half a second versus seven to eight and a half seconds for 500 messages. So why does calling a receive pipeline perform so much better? I decided to look at the performance counters for the two orchestrations. The XPath split creates over 500 persistence points, whereas calling the receive pipeline creates only one persistence point! It looks as if the reason the XPath split is so slow is that a persistence point is created every time you call the XPath in the loop.

In summary, if you want to de-batch a message within an orchestration, then calling a pipeline to split the message will give much better performance than using XPath.