Big Data

Guest blogger Dugan Maddux, MD, FACP, is the Vice President for CKD Initiatives for FMC-NA. Before her foray into the business side of medicine, Dr. Maddux spent 18 years practicing nephrology in Danville, Virginia. During this time, she and her husband, Dr. Frank Maddux, developed a nephrology-focused Electronic Health Record. She and Frank also developed Voice Expeditions, which features the Nephrology Oral History project, a collection of interviews of the early dialysis pioneers. Dr. Maddux’s story has been captured in a previous post on this blog. We look forward to featuring posts from Dr. Maddux on a more regular basis in the future.


In early August I attended the eHealth Initiative National Forum on Data & Analytics. Interestingly, the talks were all about Big Data, with sessions titled “Policy & Privacy Issues in the Era of Big Data & Analytics,” “Integration of Big Data and EHR Systems” and “For Big Data to Realize its Promise, Industry Collaboration will be Key.” This star treatment elevated Big Data into something akin to the Great and Powerful Oz. After the conference I scurried to my computer to look behind the curtain and get the real skinny on Big Data.


Data sets


Big Data specifically refers to a collection of data sets such as claims data, clinical data and lifestyle data, which is aggregated into a large and complex single database for analysis. The individual data sets have independent value, but they have even greater value when they are brought together and analyzed since the data are related to each other in some way. Big Data typically refers to data sets that are too large for management by common software applications.
In short, Big Data is truly big. We can only contemplate using Big Data today because of the amazing ability of modern systems to manage huge amounts of information. Today, systems can manage and analyze “exabytes” of data. (For reference, 1 exabyte (EB) = 10^18 bytes, or one quintillion bytes = 1,000 petabytes = 1 billion gigabytes.)
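
For readers who like to check the arithmetic, here is a minimal Python sketch of the exabyte conversions. It assumes decimal (SI) units, where each step up is a factor of 1,000; the names BYTES_PER_UNIT and convert are made up for this illustration.

```python
# Decimal (SI) storage units expressed in bytes.
# Assumption: powers of 10, not the binary (1024-based) units.
BYTES_PER_UNIT = {
    "GB": 10**9,   # gigabyte
    "PB": 10**15,  # petabyte
    "EB": 10**18,  # exabyte
}


def convert(amount, from_unit, to_unit):
    """Convert a whole number of one unit into another unit.

    Uses integer division, so it is meant for converting a larger
    unit into a smaller one (for example EB to GB).
    """
    return amount * BYTES_PER_UNIT[from_unit] // BYTES_PER_UNIT[to_unit]


exabyte_in_bytes = BYTES_PER_UNIT["EB"]
exabyte_in_petabytes = convert(1, "EB", "PB")
exabyte_in_gigabytes = convert(1, "EB", "GB")

print(exabyte_in_bytes, exabyte_in_petabytes, exabyte_in_gigabytes)
```

Python integers have no fixed size limit, so a quintillion bytes is represented exactly.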


Data volume


Big Data is all about data volume, so dealing with Big Data has centered around the challenges of data capture, data “curation” and data storage. The definition of Big Data used by Gartner, a U.S. information technology research and advisory firm, is “Big Data is high volume, high velocity and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.” Big Data was originally described by Doug Laney in 2001 and has since been associated with the 3 Vs: increasing Volume (amount of data), increasing Velocity (the speed of the data in and out) and increasing Variety (the types and sources of datasets included). Recently a fourth V associated with Big Data has been identified: Veracity, the ability to ensure that the data that are collected and analyzed are a true representation of the actual data.


What Big Data might lack in meaningful direct information, it makes up for in sheer volume. The low information density of Big Data is offset by the huge volume of data, which lends itself to the use of “inductive statistics” to develop inferences. For example, a piece of lifestyle data such as food shopping habits might be coupled with clinical data to determine patterns associated with better or worse outcomes in diabetes management.
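
To make the diabetes example concrete, here is a toy Python sketch of coupling a lifestyle data set with a clinical one. Every value is invented for illustration: the patient IDs, the "sugary share" of grocery spending, the HbA1c readings and the 0.2 threshold are all assumptions, as are the helper names link and mean_hba1c_by_group. Real work would involve far larger data sets and proper statistics, and a pattern like this shows association, not cause.

```python
from statistics import mean

# Hypothetical lifestyle data: patient id -> share of grocery spending on sugary foods
shopping = {"p1": 0.05, "p2": 0.30, "p3": 0.10, "p4": 0.40}

# Hypothetical clinical data: patient id -> most recent HbA1c (%)
hba1c = {"p1": 6.4, "p2": 8.1, "p3": 6.9, "p4": 8.8, "p5": 7.2}


def link(lifestyle, clinical):
    """Join the two data sets on patient id, keeping patients present in both."""
    shared_ids = lifestyle.keys() & clinical.keys()
    return {pid: (lifestyle[pid], clinical[pid]) for pid in shared_ids}


def mean_hba1c_by_group(linked, threshold):
    """Split linked patients at the threshold and average HbA1c in each group.

    Assumes each group has at least one patient; mean() raises otherwise.
    """
    low = [a1c for share, a1c in linked.values() if share < threshold]
    high = [a1c for share, a1c in linked.values() if share >= threshold]
    return mean(low), mean(high)


linked = link(shopping, hba1c)
low_mean, high_mean = mean_hba1c_by_group(linked, threshold=0.2)
print(f"low sugary share: {low_mean:.2f}, high sugary share: {high_mean:.2f}")
```

Patient p5 has clinical data but no shopping data, so the join drops that record. This is a small reminder that combined data sets are only as complete as their overlap.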




One of the biggest Big Data sources in healthcare is genomics. The data from a single human genome is a very large data set. In healthcare, a patient’s genomic data set will be combined with other data sets from that individual to inform healthcare decisions. Big Data and the sophisticated analytics that should accompany it in healthcare can help to create a narrow grouping of people and improve the opportunity for more customized decisions about healthcare resources and interventions.
Sam Ho, CMO of United Healthcare, is ready for Big Data to “eliminate unwanted variations in care.” John Glaser, CEO of Siemens, sees Big Data and analytics as the opportunity to advance the limits of healthcare beyond the confines of better hardware. Hopefully Big Data will be a Big Opportunity for all of us as patients and providers.


If you have any thoughts on Big Data and its application to the field of nephrology, please share them with us in the comments.
