CDA In The Wild: Basic XML Issues (Installment #2)

 

Move in closer now…do you see that? It’s a wild CDA lying motionless in the grass. Surely it’s waiting for prey to…wait, it’s not moving – something’s wrong…

 

Ah, this one was dead when we got here.

 

A fair number of wild CDAs suffer from basic XML issues. This means they cannot be parsed, which makes them effectively useless from a processing application’s perspective.

 

In the simplest case, an implementer sends a file that’s not even an XML file, such as a Microsoft Word document or PDF. This can usually be chalked up to simple unawareness. In extreme cases, an implementer has heard that CDA supports a “non-XML Body”, and has not yet created a header to reference his or her non-XML content. Fortunately, this is pretty rare.

 

A more common problem occurs when using a print statement to generate XML. Something innocuous, like the following example, can corrupt an XML file completely:

 

System.out.println("<title>History & Physical<title>");

 

The print statement emits an unescaped ampersand, which is a reserved character, and it never closes the title element. XML must be “well-formed”, meaning it follows a few simple rules: elements must have matching start and end tags, and certain characters must be escaped. “<title>” is a start tag, and you must use “</title>” to close it. The characters <, >, and & should be escaped in normal content, and single and double quotes should be escaped when they appear inside attribute values. The simple fix for the statement above is to write something like:

 

System.out.println("<title>History &amp; Physical</title>");
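
To see the difference between the two strings above, here is a minimal sketch of a well-formedness check. It assumes the standard JAXP parser that ships with the JDK; the class name WellFormedCheck and the method isWellFormed are made up for this illustration and are not part of any CDA toolkit. It only tests whether the text parses as XML, not whether it is a valid CDA.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
class WellFormedCheck {

    // Returns true if the string parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Unescaped ampersands and unclosed tags end up here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Could not run the XML parser", e);
        }
    }

    public static void main(String[] args) {
        // The broken string from the first print statement.
        System.out.println(isWellFormed("<title>History & Physical<title>"));
        // The corrected string from the second print statement.
        System.out.println(isWellFormed("<title>History &amp; Physical</title>"));
    }
}
```

Running a check like this on a file before sending it is a cheap way to catch a dead CDA before it leaves the building.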

 

A much better solution, however, is to use an XML API, like the Document Object Model (DOM), as it will ensure elements begin and end properly, special characters are properly escaped, etc. E.g.:

 

// assume you have created a DOM Document variable called doc
Element title = doc.createElement("title");
title.setTextContent("History & Physical");

 

This will ensure the title element starts and ends correctly, and that all special characters are escaped automatically when the document is written out. In short, an API facilitates well-formed XML. There are few excuses for not using one, since an API is available in nearly all programming languages and platforms.
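
For a fuller picture, the following sketch creates the Document, adds the title, and serializes the result to a string so the escaping is visible. It assumes the JAXP DOM and Transformer classes bundled with the JDK; the class name TitleWriter and the method buildTitle are hypothetical, and the lone title element stands in for a complete CDA document, which would need far more than this.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration only.
class TitleWriter {

    // Builds a one-element document and returns it as serialized XML text.
    static String buildTitle(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        // Pass the raw text; no manual escaping is needed here.
        Element title = doc.createElement("title");
        title.setTextContent(text);
        doc.appendChild(title);

        // The serializer closes the element and escapes reserved characters.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildTitle("History & Physical"));
    }
}
```

Note that the caller passes the plain ampersand. Escaping it by hand before handing it to the API would be a mistake, since the serializer would then escape it a second time.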

 

Stay tuned for more on CDA in the Wild!

 

Read Installment #3: Rick takes on XML Schema Validation

 

To see the full series click #CDAinthewild

 
