BIG DATA
bedded in clinical notes. “If we’re serious
about big data mining, it’s silly to only
look at 20 percent of the data,” he says.
The technique was able to scan 10 million
notes in eight hours and determine
whether the target events were mentioned;
another scan looked for the timestamps
that showed their temporal relationships.
“If we can do this well with data from one
hospital, the possibilities are huge with a
health information exchange.”
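As a rough illustration of the two scans described above, the Python sketch below first finds the earliest note in which each patient's record mentions a term, then compares timestamps to see which event came first. The note layout, the patient IDs, and the terms "drug A" and "heart attack" are all invented for the example; the article does not describe how the actual system is built.

```python
from datetime import datetime

# Hypothetical notes as (patient_id, timestamp, text); invented for illustration.
NOTES = [
    ("p1", datetime(2012, 1, 5), "Started on drug A for reflux."),
    ("p1", datetime(2012, 3, 9), "Admitted with heart attack."),
    ("p2", datetime(2012, 2, 1), "Heart attack last week, now stable."),
    ("p2", datetime(2012, 4, 2), "Started on drug A."),
]

def first_mentions(notes, term):
    """Scan 1: earliest timestamp at which each patient's notes mention the term."""
    earliest = {}
    needle = term.lower()
    for patient, stamp, text in notes:
        if needle in text.lower():
            if patient not in earliest or stamp < earliest[patient]:
                earliest[patient] = stamp
    return earliest

def exposure_before_outcome(notes, exposure, outcome):
    """Scan 2: patients whose first exposure mention precedes their first outcome mention."""
    exposed = first_mentions(notes, exposure)
    outcomes = first_mentions(notes, outcome)
    return sorted(p for p in exposed if p in outcomes and exposed[p] < outcomes[p])

print(exposure_before_outcome(NOTES, "drug A", "heart attack"))
```

Here only patient p1 is reported, because p2's heart attack is mentioned before the drug appears in the record. A plain substring match is a simplification; it does not handle negation or misspellings.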
While it would be nice if physicians
could settle on one way to describe each
medical concept—rather than having 30
different terms for a heart attack—Shah
says it doesn’t really matter. It’s easier and
faster to have data analysts manually identify
those 30 ways and program them into
the algorithm than to reeducate an entire
profession.
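One way to picture "programming the 30 ways into the algorithm" is a hand-built lexicon that maps every variant to a single concept. The sketch below is an assumption about how such a lookup could work, not the method Shah's team uses; the concept name and the four sample terms are hypothetical stand-ins for the full list.

```python
import re

# Hypothetical lexicon: a few of the many ways a heart attack might be written.
SYNONYMS = {
    "myocardial_infarction": ["heart attack", "myocardial infarction", "MI", "STEMI"],
}

def compile_lexicon(synonyms):
    """Build one case-insensitive, whole-word pattern per concept."""
    patterns = {}
    for concept, terms in synonyms.items():
        # Longest terms first so multi-word phrases are tried before abbreviations.
        ordered = sorted(terms, key=len, reverse=True)
        alternatives = "|".join(re.escape(t) for t in ordered)
        patterns[concept] = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return patterns

def concepts_in(note, patterns):
    """Return the sorted list of concepts whose variants appear in the note."""
    return sorted(c for c, p in patterns.items() if p.search(note))

PATTERNS = compile_lexicon(SYNONYMS)
print(concepts_in("Pt presented with STEMI overnight.", PATTERNS))
print(concepts_in("Family notified of discharge plan.", PATTERNS))
```

The word-boundary anchors keep the abbreviation "MI" from matching inside words such as "family". Adding a newly discovered variant is a one-line change to the lexicon, which is the point of the approach: the analysts adapt to the clinicians, not the other way around.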
He says the technique can be used
with large databases to answer all kinds
of multi-factor questions. For example,
clinical trials are often difficult to populate because the researchers are trying
to study one condition in isolation, even
though most patients have co-morbidities
that have to be treated at the same time. By
analyzing many different combinations of
diseases and treatments, researchers can
identify patterns. “You’ve heard of evidence-based practice?” Shah says. “We’re
doing practice-based evidence.”
“If we’re serious about big data
mining, it’s silly to only look at 20
percent of the data.”
—Nigam Shah
Clinic, meet claims
Massive compilations of claims
data have been a health care
research source for decades. They’re often a deeply flawed source, riddled with
inconsistencies and coded for maximum
allowable reimbursement rather than unwavering accuracy. But with millions of
examples for even the most obscure diagnoses, claims databases were big data before there was Big Data.
Claims data would be much more useful
if researchers could tie it back to providers’
clinical records in order to validate the information
and find explanations for patterns
of disease and care, but those linkages
have been difficult to accomplish on
a large scale. In January, the Mayo Clinic
and Optum, the analytics company owned
by insurance giant UnitedHealth Group, announced
a collaboration that could take
claims-based research to the next level.
Mining device data
For a flood of data, look no further than the medical devices
that update patients’ information almost
continuously. Most of it disappears into
the ether, landing in the electronic record
maybe once an hour. But at The Ohio State
University Wexner Medical Center, Assistant IT Director Kevin Jones is planning