Sunday, November 11, 2007

Dataspaces for Veterans

In the U.S. we are marking Veterans Day tomorrow. There are many ways in which we should be thanking our veterans and making their lives better. I'd like to report on a rather unusual one.

I recently had the opportunity to visit the Veterans Administration Hospital in Washington, DC, and learn first-hand about their patient-record system. I was pleased to see the principles of dataspaces in action, clearly enabling better healthcare services.

The VA provides services to veterans of the American military and has around 150 hospitals, 800 clinics and 200 nursing centers scattered around the country. To support these services, the VA maintains electronic records for all its patients, a system that has won it many accolades in recent years. The system stores each patient's prescriptions, doctor visits, lab tests and other data. Because patients often move around and receive treatment in various locations, the data a doctor views about a patient needs to be integrated from multiple VA locations, each of which runs its own system. In addition, data about VA patients may reside in systems of the Department of Defense (and its healthcare providers) and of various drugstore chains.

Clearly, this is an incredible data integration problem. Today the VA is aware of at least 130 different "implementations" of its electronic record system, i.e., different schemas. Given the different local needs of hospitals and clinics, imposing a single schema on all the VA centers would not work, and deploying a conventional data integration solution at this scale and in such a dynamic environment would be extremely difficult.

Instead, the VA standardized on a very small subset of patient attributes, namely those describing patients' vital signs. Outside of this set, hospitals are free to develop their own local data organizations. However, the system lets healthcare providers see all the data even if it is not completely integrated. For example, if a doctor wants to see what happened to a patient at a remote location, the remote data may appear as plain text. The doctor then has to work a little harder to digest it and cannot pose the queries she could pose on the local data. But being able to see the data in some form is infinitely better than not seeing it at all, and the doctors are extremely happy with the system's capabilities.
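
To make the idea concrete, here is a minimal Python sketch of such a partially integrated view. It is only an illustration of the principle, not the VA's actual design: the attribute names in SHARED_VITALS, the site names, the record layout and the build_view function are all assumptions invented for this example.

```python
from dataclasses import dataclass, field

# Hypothetical set of attributes that every site agrees on (the vital signs).
SHARED_VITALS = {"heart_rate", "systolic_bp", "diastolic_bp", "temperature_c"}


@dataclass
class PatientView:
    structured: list = field(default_factory=list)  # rows a doctor can query
    text: list = field(default_factory=list)        # remote data shown as plain text


def build_view(records, local_site):
    """Combine one patient's records from several sites into a single view."""
    view = PatientView()
    for rec in records:
        site, data = rec["site"], rec["data"]
        if site == local_site:
            # The local schema is fully understood, so keep every attribute.
            view.structured.append({"site": site, **data})
            continue
        shared = {k: v for k, v in data.items() if k in SHARED_VITALS}
        rest = {k: v for k, v in data.items() if k not in SHARED_VITALS}
        if shared:
            view.structured.append({"site": site, **shared})
        if rest:
            # Not integrated: still visible, but only as readable text.
            line = "; ".join(f"{k}: {v}" for k, v in sorted(rest.items()))
            view.text.append(f"[{site}] {line}")
    return view


# Made-up sample data for two hypothetical sites.
records = [
    {"site": "site_a", "data": {"heart_rate": 72, "rx_code": "A12"}},
    {"site": "site_b", "data": {"heart_rate": 80, "med_notes": "aspirin daily"}},
]
view = build_view(records, local_site="site_a")

# Vital signs can be queried across sites; everything else remote is text.
fast = [row["site"] for row in view.structured if row.get("heart_rate", 0) > 75]
```

In this sketch the doctor at site_a can filter on heart rate across both sites, while the remote medication notes are still visible but have to be read by hand.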


The VA also demonstrated two examples of the pay-as-you-go principle that is at the foundation of dataspaces. First, they decided that vital signs are critical, so their data sources are aligned on the attributes relating to them (effectively creating semantic mappings over the vital-sign attributes), and they plan to continue agreeing on terminology as they see fit. Second, they had a culture that allowed for local innovation in the form of class-3 applications, which addressed needs at the local level. When these needs were perceived to be important throughout the organization, the applications were promoted to class-1 applications, and all VA systems were required to support them.
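
The first of these, agreeing on terminology incrementally, can be sketched in a few lines of Python. Again, this is a toy under stated assumptions rather than a description of the VA's software: the MappingRegistry class, the site name and the attribute names are all hypothetical.

```python
class MappingRegistry:
    """Semantic mappings from site-local attribute names to agreed shared names."""

    def __init__(self):
        self._maps = {}  # (site, local_name) -> shared_name

    def add_mapping(self, site, local_name, shared_name):
        # Each call represents one more piece of terminology being agreed on.
        self._maps[(site, local_name)] = shared_name

    def translate(self, site, record):
        """Split a record into (integrated, not yet integrated) attributes."""
        mapped, unmapped = {}, {}
        for name, value in record.items():
            shared = self._maps.get((site, name))
            if shared is None:
                unmapped[name] = value  # still visible, just not queryable
            else:
                mapped[shared] = value
        return mapped, unmapped


registry = MappingRegistry()
record = {"pulse": 80, "allergy_txt": "penicillin"}  # made-up remote record

# Start with vital signs only.
registry.add_mapping("site_b", "pulse", "heart_rate")
before = registry.translate("site_b", record)

# Later, once the organization agrees on another term, add one more mapping.
registry.add_mapping("site_b", "allergy_txt", "allergies")
after = registry.translate("site_b", record)
```

Nothing has to be mapped up front: the same record simply becomes more structured as mappings are added, which is the pay-as-you-go idea in miniature.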

Just to be clear, when I walked in the door they did not greet me with: "Pleased to see you, Dr. Halevy; we'd love to show you our dataspace system." What I'm describing is a post-rationalization of a system that was developed over more than a decade. I believe that their loose integration was the key to their success.

2 comments:

Ian Hsu said...

Hi Alon, I noticed that you're a Stanford alumnus, and I wanted to invite you to add your blog to the Stanford Blog Directory. The university is trying to raise the exposure of authentic voices from Stanford faculty, students, staff, and alumni -- and I'd love to include your blog in the directory. Just click on "Submit a Blog" from the directory homepage. Thanks! Ian

chuckl said...

Happy Veterans Day, Dr. Halevy, it must have been gratifying to see the concept of dataspaces being used in such a worthwhile way. I've been following the discussion of dataspaces as much as I've been able to, and find the concept fascinating. We look at it from a publisher's point of view, and I am wondering if you might comment on the value and use of dataspaces in a publishing environment. Thanks in advance.