CDA in the Wild: Validation and Schematron (Installment #4)
It has been weeks since our last episode. Frankly, the sight of that genetically mutated Wild-CDA/North-American-XHTML hybrid left my team so repulsed that many could not continue. But after discussing the importance of this work, and a few crew changes, we are back in the field.
And we are documenting something never seen before: the undisturbed lair of a wild CDA.
The sight, I must say, is both spectacular and grotesque.
At first glance I see everything I would expect given my experience with wild CDAs. The hastily crafted Results scattered across the floor. The incoherent Problems piled against the wall. But it’s in the dusty, filthy corners where you see the true nature of this creature. We have all been taught that The Majestic CDA is at the top of the food chain and feasts only on LOINC, SNOMED, and CPT codes; but look here. Clearly the majority of the waste consists of the bones of undocumented local codes, placeholders, and other unidentifiable rodents. The heretofore assumed diet of the Wild CDA is obviously a fiction manufactured by well-funded academics and politicians. Whatever their interests, they don’t include dealing with the reality of CDAs in the wild…
CDAs exchanged today, for lack of a better word, suck. No one willing to stalk them down can say that extra-enterprise exchanges of clinical information work consistently in the real world today.
In my last post (apologies for the delay, but I do have a life), I asserted that I would be extremely happy if EHR vendors would just validate their documents against the latest CDA XML Schema files at runtime. That’s still true.
EHR vendor CTOs reading this…please read the next two paragraphs very carefully.
If you have not had your developers implement runtime validation against the CDA XML Schema files available via SVN at http://gforge.hl7.org/svn/strucdoc/trunk/CDA_SDTC for your US-based customers, then there is no point in reading any further.
Are you still reading? If your company has not implemented what I said above then stop reading. Seriously, STOP; there’s no point in going any further. XML Schema validation is child’s play for any self-respecting developer who claims basic knowledge of XML. Just email your team and have them do it already.
But if you’ve already implemented runtime XML Schema validation, apologies, my friend; I may have offended you. Feel free to continue reading. I’ll even let you in on a little secret. Runtime Schematron validation is the next biggest bang for your buck. Schematron is the savory gravy on your luscious mashed potatoes. (In case you didn’t guess, runtime XML Schema validation = mashed potatoes, and no one wants gravy alone.)
Among software developers, Schematron is less well known than XML Schema. XML Schema is a popular W3C specification that is widely implemented in all major programming languages. Schematron is not a W3C spec. Rather it is an ISO standard (yeah, that “other” standards organization). Schematron takes some effort to implement.
XML Schema is very valuable, but frankly is not cutting edge. XML Schema used to be wild, fun, headline grabbing, but now is generally well behaved, famous for being famous, has a lot of money from its sponsors, and presents itself well at high fashion software development conferences.
Don’t get me wrong. XML Schema is a very good friend to have. XML Schema makes sure you are well groomed, well dressed, and don’t beep when the bouncer waves you with the metal detecting wand. XML Schema gets you into the exclusive red carpet interoperability party.
But Schematron is different. Schematron got into that party simply by nodding to the bouncer. Schematron fought for its life, earned every dollar it made, and has serious street cred. Schematron is not famous, but rather is notorious…notorious BIG even (Big Interoperability Glue). Even though Schematron is big & bold, confident implementers have no reason to fear it, because Schematron speaks the truth, lays down the law, and shows you the way to well-earned success.
No self-respecting EHR vendor worries about XML Schema glancing over their files at runtime. They all want to present themselves well and not be turned away from the interoperability party.
But only a truly confident implementer wants Schematron scrutinizing them. Schematron cares about who you are below the surface. Schematron wants to know how you handle your business; and every file you send speaks about your business.
In the US, Schematron schemas are tightly bound to CDA template ids at each level in a CDA document. Note: implementers in other countries do not necessarily follow this pattern (Grahame, feel free to comment). So in the US, when a <templateId> element appears at any point in a CDA document, it means the implementer promises to follow the rules of that template for everything nested below it.
Imagine your snippet of XML is a promise someone can follow up on. The template id is the verifiable detail of that promise. XML Schema doesn’t care about that detail; you and XML Schema already got past the door man and are just here to party at the open bar. But you can be sure Schematron cares. Schematron is cold sober and has its crew checking up on every single detail, because money and reputations depend on details.
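To make the “promise” idea concrete, here is a minimal Python sketch (standard library only) that walks a document and lists every templateId it declares, along with the element making the promise. The sample document is a heavily trimmed, hypothetical one written for this illustration, and the function name is my own; treat it as a sketch, not a validator.

```python
import xml.etree.ElementTree as ET

HL7 = "urn:hl7-org:v3"  # the CDA namespace

# Hypothetical, heavily trimmed document used only for illustration.
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <templateId root="2.16.840.1.113883.10.20.22.1.1"/>
  <component><structuredBody><component>
    <section>
      <templateId root="2.16.840.1.113883.10.20.22.2.5.1"/>
      <code code="11450-4" codeSystem="2.16.840.1.113883.6.1"/>
    </section>
  </component></structuredBody></component>
</ClinicalDocument>"""


def template_promises(xml_text):
    """Return (element name, templateId root, extension) for each templateId."""
    root = ET.fromstring(xml_text)
    promises = []
    for parent in root.iter():  # document order, root included
        # Only direct children count: the promise belongs to this element.
        for tid in parent.findall(f"{{{HL7}}}templateId"):
            name = parent.tag.split("}")[-1]  # drop the namespace part
            promises.append((name, tid.get("root"), tid.get("extension")))
    return promises


for name, root_oid, extension in template_promises(SAMPLE):
    print(f"{name} promises to follow template {root_oid} (extension: {extension})")
```

Each line it prints is one promise that a Schematron schema can then hold the document to.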
The validation capabilities of Schematron are immense, with a list that includes but is not limited to the following. (I obscured details and won’t call out names, but I have seen real-world variations of everything listed below. Beware: if you are taking cheap shortcuts with your codes, Schematron will suss you out.)
- Incorrect values in document type and section type codes
- Incorrect OIDs for code systems and identifiers
<code code="34133-9" codeSystem="REPLACE-WITH-LOINC-OID"
- Missing template IDs, or template IDs with the wrong values
- Missing or invalid extensions in template IDs
- Coding typos (see hard to spot space in the code attribute below)
<code code="34133-9 " displayName="Summarization of Episode Note"
That’s a lot of errors passed over by XML Schema, but easily caught by Schematron.
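For a feel of what these checks amount to, here is a hand-coded Python approximation of two of them: the trailing-space typo and the wrong code system OID. This is not Schematron, just plain standard-library Python doing the same kind of assertion. The function name and the sample document are hypothetical, and the rule “if codeSystemName says LOINC, then codeSystem must be the LOINC OID” is my own simplification for the sketch.

```python
import xml.etree.ElementTree as ET

HL7 = "urn:hl7-org:v3"               # the CDA namespace
LOINC_OID = "2.16.840.1.113883.6.1"  # OID of the LOINC code system

# Hypothetical fragment containing two of the errors listed above.
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <code code="34133-9 " codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC"
        displayName="Summarization of Episode Note"/>
  <component><section>
    <code code="11450-4" codeSystem="1.2.3.4" codeSystemName="LOINC"/>
  </section></component>
</ClinicalDocument>"""


def check_codes(xml_text):
    """Return a list of human-readable problems found in <code> elements."""
    problems = []
    root = ET.fromstring(xml_text)
    for code in root.iter(f"{{{HL7}}}code"):
        value = code.get("code", "")
        # XML Schema accepts stray whitespace here; a receiver's lookup will not.
        if value != value.strip():
            problems.append(f"code {value!r} has leading or trailing whitespace")
        # Simplified assumption: a code labeled LOINC must carry the LOINC OID.
        if code.get("codeSystemName") == "LOINC" and code.get("codeSystem") != LOINC_OID:
            problems.append(
                f"code {value!r} claims LOINC but codeSystem is {code.get('codeSystem')!r}"
            )
    return problems


for problem in check_codes(SAMPLE):
    print(problem)
```

Real Schematron schemas express rules like these declaratively and tie them to template ids, which is exactly why they are worth running instead of hand-rolling checks one at a time.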
Now just to be clear, Schematron does not catch everything, which is why I’ll have more entries in this blog series, but rest assured the real world impact of runtime Schematron validation is BIG.
So how can you implement Schematron?
Most online validators, such as NIST’s Test Transport Tool (aka TTT) or the Lantana Validator (shameless plug), check the HL7-supplied Schematron files. These free tools are a great place to start. But online validators are only useful during development and testing. To perform runtime Schematron validation, developers need to go to Schematron.com and apply the knowledge found there to the C-CDA Schematron schemas from the HL7 gForge SVN repository at http://gforge.hl7.org/svn/strucdoc/trunk/C-CDA.
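If your developers have never seen a Schematron file, the core idea is small: a rule picks a context, and each assert states something that must be true there. The Python sketch below is a toy evaluator for a tiny subset of that idea, using only the standard library. It is emphatically not an ISO Schematron implementation: the rule file is a hypothetical one I wrote for this post (not taken from the HL7 repository), each assert test is treated as a simple “this path must exist” check because ElementTree only supports a slice of XPath, the document root is never matched as a context, and every matching rule fires (real Schematron applies only the first matching rule per pattern). A production setup would run the real C-CDA schemas through a full Schematron processor.

```python
import xml.etree.ElementTree as ET

SCH = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron namespace

# Hypothetical rule file written for this sketch.
RULES = """<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <ns prefix="cda" uri="urn:hl7-org:v3"/>
  <pattern>
    <rule context="cda:section">
      <assert test="cda:templateId">A section must declare a templateId.</assert>
      <assert test="cda:code[@codeSystem='2.16.840.1.113883.6.1']">A section code must use the LOINC OID.</assert>
    </rule>
  </pattern>
</schema>"""

# Hypothetical document that breaks both rules.
DOC = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component><section>
    <code code="11450-4" codeSystem="1.2.3.4"/>
  </section></component>
</ClinicalDocument>"""


def run_rules(rules_text, doc_text):
    """Return the message of every failed assert (toy subset, see caveats)."""
    schema = ET.fromstring(rules_text)
    doc = ET.fromstring(doc_text)
    # Prefix-to-URI map declared by the rule file's <ns> elements.
    ns = {n.get("prefix"): n.get("uri") for n in schema.iter(f"{{{SCH}}}ns")}
    failures = []
    for rule in schema.iter(f"{{{SCH}}}rule"):
        # Simplification: contexts are matched against descendants of the root.
        for node in doc.findall(".//" + rule.get("context"), ns):
            for check in rule.findall(f"{{{SCH}}}assert"):
                # Simplification: the test passes if the path selects any node.
                if node.find(check.get("test"), ns) is None:
                    failures.append(" ".join((check.text or "").split()))
    return failures


for message in run_rules(RULES, DOC):
    print("FAILED:", message)
```

The point of the sketch is the shape of the work: load the rules once at startup, evaluate them against every outbound document, and refuse to send (or at least log loudly) when the failure list is not empty.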
Runtime Schematron validation takes some developer time to make it production worthy. Just my opinion, but I think CTOs should budget at least 80 hours for their teams to research, develop, test, fix, and productize runtime Schematron validation. That’s my two cents as the CTO of a small business, and the real cost will depend on your developers’ experience with CDA, XML, XSLT, and dealing with all the other inevitable <expletive> that comes with software development; so don’t hold me to those hours. But I think implementing runtime Schematron validation is time well spent for any company that gives more than lip service to interoperability.
And there is a bonus! (Cue catchy infomercial music.) Act now to implement both runtime XML Schema and runtime C-CDA Schematron validation in your production system in 2016 and I’ll buy you a beer if you see me in person at any HL7 event! That’s right, a FREE BEER! And not on Lantana’s tab, but on my own personal tab!
Seriously, I will buy you a beer, or other drink of your choice if you don’t drink beer. If you doubt my sincerity, ask Keith; he knows I’m good for it (yeah, that Keith).
Read Installment #5: Rick takes on CDA Narrative Issues
To see the full series click #CDAinthewild