insideBIGDATA Guide to Healthcare & Life Sciences


The insideBIGDATA Guide to Healthcare & Life Sciences is a useful new resource directed toward enterprise thought leaders who wish to gain strategic insights into this exciting new area of technology. The guide provides an overview of the utilization of big data technologies as an emerging discipline in healthcare and life sciences. It explores the characteristics of this business strategy and the benefits of leveraging big data technologies within these sectors. It also touches on the challenges and future directions of big data and analytics in the healthcare and life sciences industries. The complete insideBIGDATA Guide to Healthcare & Life Sciences is available for download from the insideBIGDATA White Paper Library.

Big Data and Analytics for Healthcare and Life Sciences – An Overview

The healthcare and life sciences industries historically have generated vast amounts of data. These large volumes of data hold the promise of supporting a wide range of medical and healthcare tasks, including clinical analytics and decision support, patient profiling, disease surveillance, regulatory and compliance requirements, scientific research, and many others. Data in healthcare and life sciences is expected to grow exponentially in the coming years and will exceed the capabilities of traditional methods of data management and data analytics.

It is vitally important for organizations in these industries to acquire the available infrastructures, methodologies, and tools to leverage this vast amount of data effectively. Doing so helps ensure the highest possible standard of patient care; failing to do so risks significant lost revenue and potential profits.

Although some data in the healthcare sector is still stored in hard copy, most is now in electronic form. One issue, however, is that this data sits in electronic silos, with more and more data produced every day by new devices. Big data in the healthcare industry promises to support a diverse range of healthcare data management functions; however, the industry is still in the early stages of large-scale integration and analysis of big data.

Life sciences research continues to evolve rapidly in conjunction with an increasing focus on analytics and the more effective use of data. The race to understand patients and diseases at the molecular level to achieve precision medicine is fueling this shift. The figure on page three encapsulates many of the demand drivers, coupled with new business models in which outcomes and real-world data are providing health data and transforming what is possible. It’s clear that the race to find the causes of disease, and the subsequent treatments and cures, is paramount.


The New York-based research and consulting firm Institute for Health Technology Transformation estimates that in 2011 the U.S. healthcare industry generated 150 exabytes of data, enough to copy all of the printed materials in the Library of Congress 15 million times over. This data was mostly generated by patient care, record keeping, and various regulatory requirements. Since then, there has been an exponential increase in data, which has led to an expenditure of $1.2 trillion on data solutions in the healthcare industry. Healthcare expenses in the U.S. now represent 17.6 percent of GDP, nearly $600 billion more than the expected benchmark for a nation of its size and wealth. McKinsey & Company projects that the use of big data solutions in healthcare can reduce healthcare data management expenses by $300 billion to $500 billion.

Big data in healthcare originates from large electronic health data sets that are very difficult to manage with conventional hardware and software. The use of legacy data management methods and tools also makes it difficult to usefully leverage all this data. Big data in healthcare is an overwhelming concept not just because of the volume of data but also because of the variety of data types and the velocity at which healthcare data needs to be managed. Furthermore, the sheer total of data related to patients and their well-being is a growing problem for the healthcare industry.

Big data technologies allow leading healthcare and life sciences organizations to address and overcome a wide variety of business and clinical challenges, such as improving patient safety, reducing 30-day hospital readmissions, and enhancing drug discovery, by harnessing the power of their data with big data analytics solutions. These technologies enable organizations to import, unify, and analyze information from traditionally isolated data silos, such as patient electronic health records (EHR), claims information from payer organizations, and even data outside the traditional healthcare context such as socio-economic and patient-generated data, in a scalable, cost-effective manner. Importantly, these solutions handle both structured and unstructured information and help organizations move from historical reporting to real-time predictive analytics.

Big data technologies enable researchers to perform analyses and make informed, actionable decisions that are driving real change in the treatment of rare genetic diseases, making it easier for biologists to identify genetic disease markers and assess drug efficacy when visualizing cell data. This change also allows for faster time-to-value for pharmaceutical companies as well as a shorter path to patient benefits, e.g., identification, diagnosis, and predictive analytics in action. With the power of big data and data science, we are one step closer to a world where genetic diseases are more effectively managed and more frequently cured, changing patient lives forever.

Use Case Examples Abound

There is an increasing number of important use case examples where combining big data technology with healthcare and life sciences has become meaningfully beneficial:

  • Genome processing and DNA sequencing – there is tremendous growth occurring in the genomics sequencing market, as evidenced by increases in the data volume produced by DNA sequencers and in the number of individuals being sequenced (even though much of the data coming out of a sequencer is not actionable and not usable in the EHR). Additionally, “medicine” starts after the VCF (variant call format) file is annotated and is part of the interpretation phase, which could happen in part in Hadoop, although there are other options like SAP HANA and Microsoft Analytics Platform System (APS). Using SAP® Foundation for Health™, built on SAP HANA, helps turn big data into smart data, adding value for healthcare organizations and life sciences companies.
  • Neuroscience – the U.S.-based BRAIN Initiative uses big data technologies to map the human brain. By mapping the activity of neurons in the brain, researchers hope to discover fundamental insights into how the mind develops and functions, as well as new ways to address brain trauma and diseases. Researchers plan to build instruments that will monitor the activity of hundreds of thousands and perhaps 1 million neurons, taking 1,000 or more measurements each second. This goal will unleash a torrent of data. A brain observatory that monitors 1 million neurons 1,000 times per second would generate 1 gigabyte of data every second, 4 terabytes each hour, and 100 terabytes per day. Even after compressing the data by a factor of 10, a single advanced brain laboratory would produce 3 petabytes of data annually.
  • Personalized treatment planning – a way to customize treatment for a patient by continuously monitoring the effects of medication. The dose can be adapted or the medication changed based on how the medication is working for that particular individual. This analysis can be applied at the individual level and is tailored to each patient’s specific needs. But personalized medicine goes far beyond monitoring the effects of medication. Precision medicine is defined as an emerging approach for disease management and prevention that takes into account individual variability of genes, environment, and lifestyle for each person. Precision medicine looks to move the needle from reactive medicine to proactive medicine. It’s a move away from drugs and treatments for the broad population to a more precise, individualized treatment plan. A recent study from the University of California, San Diego, found that patients who are treated using a more precise or more personalized plan saw improved wellness periods of nearly 30% versus the non-precision medicine group.
  • Assisted diagnosis – being able to access a broad combination of knowledge across multiple data sources improves the accuracy of diagnosing patient conditions. Assisted diagnosis is accomplished using expert systems that contain detailed knowledge of conditions, symptoms, medications, and side effects. Bringing together individual data sets into big data algorithms provides more accurate insights.
  • Using predictive analytics to help accurately predict discharge dates, help identify patients at high risk for readmission, and help surgical teams keep patients safe by reducing surgical site infections by 58 percent while decreasing the cost of care (as reported by the University of Iowa).
  • Using machine learning tools to circumvent diagnosis codes in EHRs that are fraught with accuracy problems, to automate the detection of both false positives and missing codes in patient charts. Use text data from millions of doctors’ notes to train a machine learning classifier to pick out heart failure patients based on everything in their charts, not just their diagnosis codes.
  • Creating tools that integrate genetic data and accelerate its use at point of service. Integration, however, is not an easy task since genetic data must follow the patient over their lifetime throughout their episodes of care.
  • Monitoring patient vital signs – healthcare facilities are looking to provide more proactive care to their patients by constantly monitoring patient vital signs. The data from these various monitors can be used in real time to send alerts to nurses or care providers so they know instantly about changes in a patient’s condition.
  • Evidence-based medicine – involves making use of all clinical data available and factoring that into clinical and advanced analytics. The outcomes of this application of big data include improved ability to detect and diagnose diseases in their early stages, assigning more effective therapies based on a patient’s genetic makeup, and adjusting drug doses to minimize side effects and improve effectiveness.
  • Using bioanalytics platforms designed to improve the productivity of biologists.
  • Optimizing the EHR through consolidation to reduce costs and increase efficiency.
  • Gaining insight, preventing inefficiency, and adapting workflows for better healthcare.
  • Using sensing devices for data collection as a revolutionary step forward in Parkinson’s research.
  • Using collaborative analytics via the cloud to personalize treatment plans.
  • Health plans can move beyond traditional descriptive analytics and unwieldy data warehouse strategies. New solutions provide advanced business intelligence, analytics and strategic information management systems that can help derive meaningful insights from data to attract customers, as well as manage costs and risks.
  • Creating fraud detection solutions – healthcare organizations need to be able to detect fraud based on analysis of anomalies in patient records, billing data or procedural benchmark data.
  • Imaging analytics – opens up new diagnostic landscapes for interpreting x-rays, CAT scans, and MRIs, a task that has largely remained the responsibility of skilled clinicians who specialize in catching abnormalities and reporting on findings. As computing power increases and analytics algorithms become intelligent enough to analyze patterns in digital images, these test results are taking on a whole new meaning for the diagnostic process.
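
The data volumes quoted for the brain observatory in the neuroscience example above follow from simple arithmetic. The Python sketch below reproduces them under two assumptions that the guide does not state: each measurement occupies one byte, and units are decimal (1 GB = 10^9 bytes). The function name `brain_observatory_rates` is only illustrative.

```python
# Back-of-the-envelope check of the brain observatory data volumes.
# ASSUMPTIONS (not stated in the guide): 1 byte per measurement, decimal units.

GB = 10**9
TB = 10**12
PB = 10**15


def brain_observatory_rates(neurons=1_000_000,
                            samples_per_second=1_000,
                            bytes_per_sample=1,   # assumed
                            compression=10):
    """Return raw data rates and the compressed yearly volume."""
    per_second = neurons * samples_per_second * bytes_per_sample
    per_hour = per_second * 3600
    per_day = per_hour * 24
    per_year_compressed = per_day * 365 / compression
    return {
        "gb_per_second": per_second / GB,
        "tb_per_hour": per_hour / TB,
        "tb_per_day": per_day / TB,
        "pb_per_year_compressed": per_year_compressed / PB,
    }


if __name__ == "__main__":
    for name, value in brain_observatory_rates().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the exact figures come out to about 3.6 TB per hour, 86.4 TB per day, and roughly 3.15 PB per year after tenfold compression, which the text rounds to 4 TB, 100 TB, and 3 PB.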

Challenges for Adopting Big Data and Analytics in Healthcare and Life Sciences

The effective use of data analytics in healthcare has been hailed as a solution for saving time and dollars and improving patient outcomes for a healthcare organization. However, many health systems have a hard time capturing and using patient data in ways that make a real impact on patient outcomes. Part of the issue lies in EHR data, which can provide an incomplete picture of patient behavior. An EHR is a systematized collection of electronically stored patient health information in a digital format. EHRs are real-time, patient-centered records that make information available instantly and securely to authorized users. But EHRs are inadequate at capturing mental health diagnoses, visits, specialty care, hospitalizations, and medications. When big data can be used holistically to revamp healthcare processes including care coordination, patient relationships, and financial services, the results can save organizations time and money.

According to research by healthcare technology provider Evariant, Inc., the effective use of big data technologies in healthcare could save the industry $300 million a year. Analysis of real-time data saved one hospital $850,000 in overtime costs alone, using more intelligent discharge planning, disease management, quality assurance, and performance reporting. However, it has been difficult for healthcare providers to integrate EHR data with clinical transcripts and other notations that would add context to patient care. According to a 2015 eHealth Initiative survey, only 17% of providers have been able to couple population health analytics with EHR data.

As life sciences organizations face growing challenges, being effective with data becomes essential for sustained success now and into the future. Those who understand how to manage both the internal and external data relevant to their products, markets and customers will create the opportunity for competitive advantage based on improved insight. If life sciences organizations are able to apply their acumen with big data and analytics to drive decisions and engage in smart collaboration, they will find order and opportunity where others see chaos.

With 80% of healthcare data being unstructured and growing exponentially, it is a challenge for the healthcare industry to make sense of all this data and leverage it effectively for treatment courses, clinical operations, and medical research. It is extremely important for big data healthcare companies to make use of best-in-class technology that can leverage big data in healthcare effectively. This unstructured data, such as output from medical devices, doctors’ notes, lab results, imaging reports, medical correspondence, clinical data, and financial data, is an invaluable resource for improving patient care and increasing efficiency, provided organizations can access and use it.

In the next sections, we’ll examine two areas of technology that have become quite prevalent in the healthcare and life sciences industries: deep learning and distributed computing architectures (e.g., Apache Hadoop, Spark), as well as other technologies like SAP HANA solutions for aggregating data types into a format that can easily be queried.

The overarching goal is to advance along the analytics maturity progression as depicted in the figure below. There is a tremendous opportunity to use predictive analytics to automate processes in healthcare. In the provider case, you can consult thousands of physicians and hundreds of thousands of cases, process real-time data, and learn the repeated patterns that allow you to make predictions about best courses of action, best treatments, the probability and risk involved in sepsis, and more. We’re looking at a revolution in the way healthcare will be delivered, paid for, and organized, and this revolution will be driven by data and by predictive analytics that make the best use of available historical data.



Over the next few weeks we will explore these healthcare & life sciences topics:

  • Big Data and Analytics for Healthcare and Life Sciences – An Overview
  • Use Case Examples Abound
  • Challenges for Adopting Big Data and Analytics in Healthcare and Life Sciences
  • The Rise of Deep Learning
  • Distributed Systems – the Key to Success
  • Hadoop Use Cases
  • Spark Use Cases
  • The Impact of IoT on Healthcare and Life Sciences
  • The Convergence of Big Data and HPC
  • Case Studies: Dell Focused Customer Use Cases
  • Summary

If you prefer, the complete insideBIGDATA Guide to Healthcare & Life Sciences is available for download in PDF from the insideBIGDATA White Paper Library, courtesy of Dell and Intel.


