nVenio Analytics has been building solutions to integrate data from multiple health care billing systems for the last 20 years. These systems all have different database schemas and data models, but they have one thing in common: they all contain real-world identifiers for things like physicians, patients, subscribers to insurance plans, claims and payments.
For example, in the HIPAA ASC X12N 837 Health Care Claim specification, the field CLM01 Claim Submitter’s Identifier is described as:
“The number that the submitter transmits in this position is echoed back to the submitter in the 835 and other transactions. This permits the submitter to use the value in this field as a key in the submitter’s system to match the claim to the payment information returned in the 835 transaction. The two recommended identifiers are either the Patient Account Number or the Claim Number in the billing submitter’s patient management system.”
To integrate data on the same health insurance claim coming from both a medical practice billing system and an insurance company claim adjudication system, we apply a technique called fingerprinting, or hashing, to this claim identifier. Because the identifier is created by the medical practice at the time of billing and must be maintained in all transactions involving a given claim, we can be sure that it will be constant regardless of which source system we acquire the claim data from.
The data for a Claim Submitter’s Identifier might look like the character string “KC2783”, and when customer service representatives respond to inquiries from the medical practice, they will look the claim up by its claim number, KC2783.
To take advantage of the database technologies we use to reconcile claims and payments, our entire data architecture is based on creating a fingerprint for each business key, such as the Claim Submitter’s Identifier, using a cryptographic hash function known as the MD5 Message Digest Algorithm. This algorithm produces exactly the same 32-character hexadecimal fingerprint every time it is supplied with the same input data. The claim number KC2783 always results in an MD5 fingerprint of 4876F36EC46338E0ED69DE00C16D3553, regardless of what system or source we obtained the data from or what database management system computed the fingerprint.
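As a rough illustration of the idea, here is a minimal Python sketch of fingerprinting a single business key. The function name fingerprint, the UTF-8 encoding and the uppercase hexadecimal output are assumptions made for this example, not a description of our production code. The fingerprint depends on the exact bytes that are hashed, so every system that computes it has to agree on the encoding and on any trimming or case conventions.

```python
import hashlib


def fingerprint(business_key: str) -> str:
    """Return the MD5 digest of a business key as 32 uppercase hex characters.

    Assumption for this sketch: the key is hashed exactly as received,
    encoded as UTF-8, with no trimming or case folding.
    """
    digest = hashlib.md5(business_key.encode("utf-8"))
    return digest.hexdigest().upper()


claim_fp = fingerprint("KC2783")
print(claim_fp)       # a 32-character hexadecimal string
print(len(claim_fp))  # 32, whatever the length of the input key
```

Calling fingerprint("KC2783") again, on any machine, returns the same string, which is the property the rest of the architecture relies on.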
The use of this fingerprinting technique as a core component of our data architecture has provided us with significant advantages:
We can load massive amounts of claim data in parallel, hashing the business keys as we go, and link the data together later very reliably and efficiently using the fingerprints.
We can create a single fingerprint for a complex, composite business key, and that single value is much more efficient to join and filter on when running reports and analytics against the resulting data.
We can take advantage of processing efficiencies seen in most database management systems when joining or accessing data on a key that is always the same length.
Our data architecture can join data across multiple, heterogeneous systems since the computation of the fingerprint on any system using the same input data will produce the same fingerprint.
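The advantages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layouts, the field names (claim_id, claim_ref, billed, paid), the composite key parts and the "|" delimiter are all hypothetical. Each source is hashed on its own, with no knowledge of the other, and the two are linked afterwards purely on the fingerprint.

```python
import hashlib

DELIMITER = "|"  # assumed separator; it must never occur inside a key part


def composite_fingerprint(*key_parts: str) -> str:
    """Hash one or more key parts into a single fixed-length fingerprint."""
    # Joining with a delimiter keeps ("AB", "C") distinct from ("A", "BC").
    joined = DELIMITER.join(key_parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest().upper()


# Hypothetical rows from two unrelated source systems.
practice_claims = [
    {"claim_id": "KC2783", "billed": 150.00},
    {"claim_id": "KC2790", "billed": 80.00},
]
payer_payments = [
    {"claim_ref": "KC2783", "paid": 120.00},
]

# Each source is loaded and hashed independently (this could run in parallel).
claims_by_fp = {composite_fingerprint(r["claim_id"]): r for r in practice_claims}
payments_by_fp = {composite_fingerprint(r["claim_ref"]): r for r in payer_payments}

# Later, link the two sources on the fingerprint alone.
matched = [
    (claims_by_fp[fp]["claim_id"], claims_by_fp[fp]["billed"], payments_by_fp[fp]["paid"])
    for fp in claims_by_fp.keys() & payments_by_fp.keys()
]
print(matched)

# A composite key (hypothetical parts) collapses to one 32-character value.
line_fp = composite_fingerprint("KC2783", "2019-03-01", "99213")
print(len(line_fp))
```

The delimiter is a design choice worth noting: without it, different combinations of key parts could concatenate to the same string and therefore to the same fingerprint.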
