Big Data – What’s That?

A quote from Anthony D. Williams: “Every animate and inanimate object on Earth will soon be generating data, including our homes, our cars, and yes, even our bodies.” (http://anthonydwilliams.com/2011/03/30/sciences-big-data-revolution-yields-lessons-for-all-open-data-innovators/)

Having grown up reading George Orwell’s 1984, I find that statements like this leave me with a bit of a queasy feeling. I realize that every keystroke or search on the internet tells some system where I am and what I am interested in, and that cell phones are always tracking locations, but still, I tend to think of those as small bits of data not terribly interesting to anyone. Apparently, though, I am wrong. Huge amounts of money and effort are being spent on learning the minutiae of everyone’s life.

As data managers, HIM professionals understand that database systems allow organizations to organize, categorize, store, analyze and compare information. We can do it on a small scale with a simple spreadsheet, or with larger, more complex database systems. We have used the data to trend significant health events such as post-op readmission rates or infection rates. We are familiar with our systems, and understand how our data is collected, how our clinicians record, how we code activities, and so on. Therefore, we can view our data with some level of confidence and use it for the myriad activities that support care delivery.

Our experience, though, is limited (primarily) to the data we collect within our organizations or regions, and although it is complex information, it is typically limited in scope to the administrative, demographic, and clinical information from each person’s encounter with our facility. We trust our data, and because of the reporting relationships with CIHI, we trust (to a certain level) the comparative data we can obtain from CIHI about other facilities. I guess we could say this is our first foray into Big Data.

But Big Data is taking things to a whole new level – a precipice! What if we were taking data from multiple sources, including sources that may not share our edits, our data standards, or our formats for collection? What if we were accessing family physician notes (not just billing records)? What if we amassed data from our clients’ PHRs, in which they have stored family histories? What if we accessed family financial data, or education data? What could be gleaned from someone’s Facebook or Twitter accounts? Everyone acknowledges that family history, social environments, finances and education impact health, so what if we could access all of this? What if we did start compiling information about liquor and food purchases and analyzed it from an individual’s health perspective? What if we could use Facebook and other social media to identify all the people a person with a communicable disease may have been in contact with – without relying on that person’s memory at a time of stress? It boggles my mind.

Even if we wanted to, AND could manage the privacy issues, AND could have some way to provide a ‘reliability rating’ to the data from other sources, we would need entirely new methods of data collection and evaluation, new tools to manage the data, and a whole new way of thinking about data collection. And that’s what makes HIM such a fabulous field – there is always something new!