CDA in the Wild: Timing is Everything (Installment #7)
This week, we delve back in time to explore the origins of the Wild CDA species.
The roots of this breed trace back to an entire genus known as Veethree, indigenous to the Atlantic RIM area. While this genus showed early promise and seemed likely to spawn many thriving branches, the truth is that its genome was burdened by implementable-oppositism, an unfortunate genetic defect that typically results in an evolutionary dead end.
One by one, most of the fledgling offshoots from this decaying genome passed into oblivion, leaving the Wild CDA as the sole thriving descendant.
Some may challenge my assertion above, pointing to members of the Veethree genus recently spotted in parts of Canada and the Netherlands. This is true, but those specimens barely qualify as alive at all. A recently captured Veethree was found to be infected by a parasite known colloquially as the Suffering Dead Organism (SDO), which infects only the most feeble and sickly of animals, then devours its host from the inside out, turning it into a zombie-like creature.
Sadly, the only way to properly dispose of SDO-infected corpses is to corral them together and set fhir to the whole lot.
But I digress. Let’s proceed to the history of the Wild CDA, and how it survived.
We must go to San Diego, CA. Here we will spelunk for early Wild CDA drawings in the Spinosa cave. This cave experiences periodic tidal flooding, and is so named for the explorer who first postulated that it was the narrative of the Wild CDA that allowed it to thrive where other Veethree offshoots failed.
Spinosa’s writings came to light in the late fall of 1996, when fragments of his journal were first found. According to the dates found in the nearly unreadable fragments, the journal seems to have been written in 1706. Unfortunately, the journal had been waterlogged, dried repeatedly, and pounded in the unrelenting surf over the past few centuries. Finally, after being recovered, it was accidentally filed away in a mold-infested cabinet at UC San Diego until it was recently rediscovered by a grad student.
But in my hand I hold an envelope with not one, but two bombshells to divulge to you tonight. Not only do I have the definitive carbon dating of the Spinosa manuscript, but also the computer-enhanced recovery of the full text of the manuscript, which was nearly unreadable until now. Due to recent advances in narrative reconstruction, the text is now available and fully parsed.
Bombshell one…the carbon date of the manuscript is 1996 (the same year it was discovered). Bombshell two…Spinosa wrote the journal on June 17, 1996. Spinosa was a surfer who found some cave paintings, then scribbled down his trip about the paintings in his notebook using something he called a “markup language”, possibly some form of advanced hieroglyph…
Will time be cruel to CDA? Some think so; we think it is still alive and kicking. The hard part of CDA has always been its HL7 V3 heritage, and it is HL7 V3, not CDA, that needs to join the dinosaurs, buried in shale and crushed into carbon. V3 was fracked from the start. It was an unruly, un-implementable beast, and it was only by the miracle of the Word (i.e. the inclusion of the narrative block and a single, stable schema) that it spawned a single viable offspring. HL7 V3 was the mule of health IT, truly an evolutionary dead end.
Nowhere was this more evident than in the HL7 V3 datatypes, which brings us back to our contemplation of time and the V3 timing datatypes. Timing datatypes in HL7 V3-based standards (like CDA) are notoriously difficult for implementers to get right. The syntax for representing time is not what developers are used to, and the semantics are confusing, unintuitive, and error-prone.
It’s no wonder so many implementers get them wrong. Timing issues are one area where I have the utmost sympathy for implementers. Even if they read the specs, the odds of them getting a GTS implementation correct for transmission over the wire are astronomically low for a first attempt (and most only have budget for a first attempt, because their managers assumed it would be easy and intuitive).
The errors I typically see fall into several categories. The simplest are point-in-time errors. HL7 V3 uses a very specific syntax for timing values. Per the CDA spec:
The value of a point in time is represented using the ISO 8601 compliant form traditionally in use with HL7. This is the form that has no decorating dashes, colons and no “T” between the date and time. In short, the syntax is “YYYYMMDDHHMMSS.UUUU[+|-ZZzz]” where digits can be omitted from the right side to express less precision. Common forms are “YYYYMMDD” and “YYYYMMDDHHMM”, but the ability to truncate on the right side is not limited to these two variants. See the Data Types Abstract Specification for detail.
First, almost no one reads the CDA datatype specs, so I often see:
- <effectiveTime value="2/14/2016"/>
- <effectiveTime value="2016-02-14T09:10:14Z"/>
Next, even when implementers do read the spec, they don’t grok that the value belongs in the effectiveTime/@value attribute. So even if they get the time syntax right, the XML is often invalid because they put the value as plain text inside the effectiveTime element, as follows:
<effectiveTime>20160214091014-0000</effectiveTime>
(Hint) This is how it should look: <effectiveTime value="20160214091014-0000"/>
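For the curious, here is a minimal Python sketch of the point-in-time rules quoted above. The names `TS_PATTERN`, `looks_like_hl7_ts`, and `to_hl7_ts` are my own inventions for illustration, not part of any HL7 library, and the pattern only checks the shape of the string (it will happily accept month 13).

```python
import re
from datetime import datetime, timezone

# Shape of "YYYYMMDDHHMMSS.UUUU[+|-ZZzz]" with digits omitted from the right.
# Hypothetical helper pattern: checks form only, not calendar validity.
TS_PATTERN = re.compile(
    r"^\d{4}"                 # YYYY
    r"(\d{2}"                 # MM
    r"(\d{2}"                 # DD
    r"(\d{2}"                 # HH
    r"(\d{2}"                 # MM (minutes)
    r"(\d{2}(\.\d{1,4})?"     # SS and optional .UUUU
    r")?)?)?)?)?"
    r"([+-]\d{4})?$"          # optional time zone offset
)


def looks_like_hl7_ts(value: str) -> bool:
    """Return True if value has the undecorated HL7 V3 TS shape."""
    return TS_PATTERN.match(value) is not None


def to_hl7_ts(dt: datetime) -> str:
    """Format a timezone-aware datetime as YYYYMMDDHHMMSS+ZZzz."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.strftime("%Y%m%d%H%M%S%z")


if __name__ == "__main__":
    for candidate in ["2/14/2016", "2016-02-14T09:10:14Z", "20160214091014-0000"]:
        print(candidate, looks_like_hl7_ts(candidate))
    print(to_hl7_ts(datetime(2016, 2, 14, 9, 10, 14, tzinfo=timezone.utc)))
```

Both of the malformed values listed earlier fail the check, while the hint value passes. Note that the formatter always emits full second precision, which is only appropriate when the source data really is that precise.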
But points in time are only the beginning. What about date ranges? These are typically represented by an effectiveTime element with high and low sub-elements. Assuming implementers get point-in-time (i.e. TS datatype) syntax right, there is still a whole host of new pitfalls awaiting, such as:
<effectiveTime> <high value="20160213"/> <low value="20160214"/> </effectiveTime>
Here the sender has the bounds backwards: low is for the earliest date, and high is the terminating boundary for the latest date. A high value that precedes the low value makes no sense (i.e. the patient checked into the ER on Feb 14, but checked out before the start of Feb 13…did the ER have a time machine?).
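A receiver can catch the time-machine case with very little code. The sketch below is an assumption-laden illustration: `low_after_high` is a made-up name, it expects a bare effectiveTime fragment without the usual CDA namespace, and it only compares values of equal precision that consist purely of digits. Anything else (time zones, fractions, mixed precision) needs real TS parsing, so the function declines to answer.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def low_after_high(effective_time_xml: str) -> Optional[bool]:
    """Return True if low is later than high, False if it is not,
    and None when this simple check cannot decide."""
    elem = ET.fromstring(effective_time_xml)
    low, high = elem.find("low"), elem.find("high")
    if low is None or high is None:
        return None
    lo, hi = low.get("value"), high.get("value")
    # Equal-length digit strings sort the same way as the times they encode.
    if not (lo and hi and lo.isdigit() and hi.isdigit() and len(lo) == len(hi)):
        return None
    return lo > hi


if __name__ == "__main__":
    backwards = '<effectiveTime><high value="20160213"/><low value="20160214"/></effectiveTime>'
    print(low_after_high(backwards))
```

Returning None instead of guessing is deliberate: a validator that quietly pads or truncates values to make them comparable would be committing the same sins this article complains about.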
Another misconception is that the “high” value is inclusive of the date/time in question, when in fact it represents the terminating boundary of the date range. Take the following example:
<effectiveTime> <low value="20160214"/> <high value="20160215"/> </effectiveTime>
At a glance this looks fine. It looks like the checkin/checkout dates you select when booking a hotel online. You would assume that the patient checked in on Feb 14 and checked out on Feb 15. But HL7 V3 date/times don’t work like that. Feb 15 is the end of the date range, but is not included in the range. It actually means that the patient checked out at the end of the day Feb 14 BEFORE the start of Feb 15. Note, there is an inclusive attribute that one can add to get around this, but I’d have to dig really hard to find a system that produces that attribute, and harder still to find one that would interpret it correctly on receipt.
And even if implementers get date ranges right, there is a whole host of other pitfalls waiting, such as:
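To make that reading concrete, here is a small Python sketch that treats the range as half-open, i.e. low is included and high is not. It encodes only the interpretation described in this article, handles day-precision values only, and ignores the inclusive attribute entirely; the function names are hypothetical, and none of this is a substitute for the Data Types specification.

```python
from datetime import date, datetime, timedelta


def parse_day(ts: str) -> date:
    """Parse a day-precision value of the form YYYYMMDD."""
    return datetime.strptime(ts, "%Y%m%d").date()


def covers_day(low: str, high: str, day: str) -> bool:
    """Half-open membership test: low <= day < high."""
    return parse_day(low) <= parse_day(day) < parse_day(high)


def last_covered_day(high: str) -> date:
    """Under the half-open reading, the last full day before the high boundary."""
    return parse_day(high) - timedelta(days=1)


if __name__ == "__main__":
    print(covers_day("20160214", "20160215", "20160214"))
    print(covers_day("20160214", "20160215", "20160215"))
    print(last_covered_day("20160215"))
```

Under this reading Feb 14 falls inside the range and Feb 15 does not, which is exactly the opposite of what the hotel-booking intuition suggests.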
- Mismatched organizer dates and component observation dates (e.g. a lab panel says it was completed on July 15, 2016, but the tests that make up the panel were completed on July 20, 2016).
- False precision (padding dates with zeros, effectively implying that an event occurred at exactly midnight)
- Missing precision (birth date with only a year, i.e. partially mapped dates)
…and so on.
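False precision, at least, leaves fingerprints. The following Python sketch flags values whose time portion is nothing but zeros. It is a heuristic of my own devising (the name `looks_zero_padded` is hypothetical), and it can only raise suspicion: some events really do happen at midnight.

```python
import re


def looks_zero_padded(ts: str) -> bool:
    """Return True if ts carries a time portion made up entirely of zeros."""
    digits = re.split(r"[+-]", ts, maxsplit=1)[0]  # drop any time zone offset
    digits = digits.split(".", 1)[0]               # drop fractional seconds
    time_part = digits[8:]                         # everything after YYYYMMDD
    return len(time_part) > 0 and set(time_part) == {"0"}


if __name__ == "__main__":
    for value in ["20160214", "20160214000000-0500", "20160214091014-0000"]:
        print(value, looks_zero_padded(value))
```

A recipient might use a flag like this to downgrade a suspect timestamp to day precision for display, or simply to log how often a given sender does it.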
In short, HL7 V3 timing datatypes are hard to get right, and implementers need to understand that. Senders need to study the datatype specs and the XML examples available from the CDA Example Task Force in detail, and recipients need to understand that most of the dates they get should be treated with caution.