cXML processing in Azure

Posted: September 20, 2023  |  Categories: Azure
Tags: cXML

This is a story about cXML processing in Azure. For over five years our Azure exchange has processed cXML orders that contain a DTD declaration without any issues. Then on 20/9/2023 all our Logic Apps failed with this error.

Unable to process template language expressions for trigger 'When_a_file_is_added_or_modified_by_Spotless' at line '1' and column '7597': 'The template language function 'xml' parameter is not valid. The provided value cannot be converted to XML: 'For security reasons DTD is prohibited in this XML document. To enable DTD processing set the DtdProcessing property on XmlReaderSettings to Parse and pass the settings into XmlReader.Create method.'.

cXML Processing

First, let me talk about cXML processing in case you are not familiar with this message format. Then I will diagnose the issue.

A typical cXML document has a DTD declaration that starts with “<!DOCTYPE cXML”.

A DOCTYPE declaration presents some problems for an integration developer because many XML parsers reject a document when this is present. Back in my BizTalk days we used a custom pipeline to remove this declaration before parsing the XML. When we migrated our cXML interfaces to Azure I took the same approach, using an Azure helper function to remove it before submitting to a Logic App XML transform action. I talk about this in more detail in one of my old blogs.

Thus, a typical Logic App that processes cXML in our exchange looks like this.

The arrow points out the action that calls a function that removes the DTD and adds a namespace to the incoming XML. The incoming message fails on submission to the Transform XSLT shape without that action.

The problem

To get back to the point, let’s analyse why my cXML processing is failing. As part of our architectural design, we assign a unique tracking id on every trigger. If we are processing orders we use the purchase order number, and for invoices we use the invoice number. This lets us easily query Logic App Management to find the Logic App run for any order or invoice.

For example, on the Logic App run above we get the tracking id using this piece of Logic App Workflow definition language.

@if(empty(triggerBody()),guid(),json(xml(triggerBody()))['cXML']['Request']['OrderRequest']['OrderRequestHeader']['@orderID'])

Firstly, the check for an empty body stops the trigger from failing when it is skipped because there is nothing to pick up.

Secondly, I use the xml() function to convert the incoming object to XML. Thirdly, I convert the XML to JSON using the json() function. Finally, I use a JSON path to get the orderID.
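To make those steps concrete, here is a minimal sketch of the same logic in Python rather than in Workflow Definition Language. It is only an illustration of the idea, not what the Logic App runs: the function name and the sample order number are made up, and Python's standard parser happens to tolerate the DOCTYPE line, so it does not reproduce the failure described in this post.

```python
import uuid
import xml.etree.ElementTree as ET


def client_tracking_id(body: str) -> str:
    """Return the orderID of a cXML OrderRequest, or a new GUID for an empty body."""
    # Mirrors empty(triggerBody()) -> guid(): a skipped trigger has nothing to parse.
    if not body:
        return str(uuid.uuid4())

    # Mirrors xml(triggerBody()): parse the text; the root element is <cXML>.
    root = ET.fromstring(body)

    # Mirrors ['cXML']['Request']['OrderRequest']['OrderRequestHeader']['@orderID'].
    header = root.find("./Request/OrderRequest/OrderRequestHeader")
    if header is None or "orderID" not in header.attrib:
        raise ValueError("cXML body has no OrderRequestHeader/@orderID")
    return header.attrib["orderID"]


# Hypothetical sample order; "PO-12345" is an invented purchase order number.
sample = (
    '<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd">'
    "<cXML><Request><OrderRequest>"
    '<OrderRequestHeader orderID="PO-12345"/>'
    "</OrderRequest></Request></cXML>"
)
print(client_tracking_id(sample))
```

The sketch raises an error when the header is missing, whereas the Logic App expression would simply fail the trigger; either way a body that is not an OrderRequest does not get a tracking id.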

Until now the xml() function accepted cXML with a DTD declaration without complaint. Since yesterday it no longer parses cXML with a DTD declaration. Discovering that this function parsed cXML without complaint was a pleasant surprise to me five years ago. There is no comment either way in the Microsoft documentation, but I thought this was rather cute because it saved me some extra work.

For example, in our cXML invoice logic it saved us an extra action.

The hotfix

Consequently, we had to roll up our sleeves and work around this problem so our business could continue to transact. What we chose to do was:

  • Firstly, remove the tracking id that uses the xml() function on the trigger body and then reprocess all the failed orders. We created a custom tracking id for the purchase order number after the Azure function call.
  • Secondly, add an extra action calling the same Azure function to remove the DTD before processing the invoice response.

Nevertheless, this was a lot of work and caused disruption to our business.

Conclusion

I think this story highlights one of the drawbacks of using Azure Platform as a Service (PaaS) for integration. While we have devolved much of the code maintenance to Microsoft, we still have to be vigilant for any changes they make in the PaaS.

I think that there was a change to the xml() function in the Logic App definition language on 20/9/2023. This would mean that DTDs are no longer supported in this function. But I will probably never know. Does anyone else know about this change?

I think it is a shame that this has happened because it makes it harder to process cXML in Azure now. I think that there should be a new function that removes a DTD declaration and adds a namespace.

We will be reaching out to Microsoft for an explanation. In the meantime, if anyone has a way of changing the Logic App definition code so that I can extract a single XML value from a message that contains a DTD, please let me know.

@if(empty(triggerBody()),guid(),json(xml(triggerBody()))['cXML']['Request']['OrderRequest']['OrderRequestHeader']['@orderID'])

POSTSCRIPT 2023/09/22

Another workaround is to change the tracking id to

                "correlation": {
                    "clientTrackingId": "@if(empty(triggerBody()),guid(),json(xml(replace(string(triggerBody()),'<!DOCTYPE cXML SYSTEM \"http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd\">','')))['cXML']['Request']['OrderRequest']['OrderRequestHeader']['@orderID'])"
                },

This is far from ideal because it breaks if the version of the DTD changes.
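One way around the version dependence, if the clean-up is done in code rather than in the workflow expression, is to match the DOCTYPE by pattern instead of by exact string. The following is a rough Python sketch of that idea only. The function name is hypothetical, version 1.2.999 is invented for the demonstration, and this is not the helper function we actually run in Azure.

```python
import re

# Matches a cXML DOCTYPE declaration up to its closing '>', including an
# optional internal subset in [...]. Assumes the quoted system identifier
# contains no '>' character.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+cXML\b[^>\[]*(?:\[.*?\]\s*)?>", re.DOTALL)


def strip_cxml_doctype(text: str) -> str:
    """Remove the first cXML DOCTYPE declaration, whatever DTD version it names."""
    return DOCTYPE_RE.sub("", text, count=1)


# 1.2.014 is the version from the workaround above; 1.2.999 is a made-up one.
for version in ("1.2.014", "1.2.999"):
    doc = (
        f'<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/{version}/cXML.dtd">'
        "<cXML><Request/></cXML>"
    )
    print(strip_cxml_doctype(doc))
```

Both documents come out with the declaration removed, so a new DTD version would not need a change to the pattern.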

POSTSCRIPT 2023/09/26

UPDATE! Microsoft are reverting the change to the xml() function, thus restoring support for DTDs.
