Why Big Data is Critical to the Pharmaceutical Industry


In this special guest feature, Inga Shugalo, a Healthcare Industry Analyst at Itransition, suggests that whether it’s an application for precision medicine, decreasing the failure rates in drug trials, or lowering the cost of research and developing better medicine, big data has a bright future in the pharmaceutical industry. Itransition is a custom software development company headquartered in Denver, Colorado. Inga focuses on Healthcare IT, highlighting the industry's challenges and the technology solutions that tackle them. Her articles explore the diagnostic potential of healthcare IoT, opportunities in precision medicine, robotics and VR in healthcare, and more.

When it comes to investing in big data, perhaps no other industry has as much at stake as pharmaceuticals. Big data not only provides the foundation for research and the discovery of new drugs but also helps patients and caregivers make better decisions. Predictive data modeling combined with rich data visualization can substantially reduce the cost of drug discovery and facilitate decision-making in the pharmaceutical industry and in healthcare in general.

Decreasing the Cost of Drug Discovery

Drug discovery can take immense amounts of time and resources, which contributes to the high price of many drugs and keeps some from reaching the market at all. Drugs to fight Amyotrophic Lateral Sclerosis (ALS), for example, aren’t being developed despite the 140,000 new cases of ALS diagnosed each year worldwide. This is because the cost of developing these drugs far outweighs the revenue that demand would generate, which means that such an investment wouldn’t pay off.

Big data and machine learning can be essential in lowering the cost of drug discovery by moving experiments from clinical researchers to a combination of AI, complex software, and powerful computers, minimizing the time needed for clinical trials. This, in turn, would drastically decrease the amount of research necessary, significantly lowering costs for manufacturers and, as a result, patients.

Drug-protein interactions, an area that has remained a mystery for some time, are another example of how big data can be applied in the pharmaceutical industry. Solving protein structures is time- and resource-intensive due to the sheer number of protein structures that exist and their different effects when combined with different drugs. In addition to testing the effectiveness of a new drug, researchers must test to ensure the drug does not harm patients. Together, these factors make the research extremely time-consuming and impractical given the available resources.

At Carnegie Mellon University, researchers developed a machine-learning experiment to analyze the results of different drug interactions with proteins. Without computer modeling, prediction of protein structures is quite difficult, as there are almost 15,000 protein families in the database. The data was divided into several sets, each reflecting a different level of understanding of the protein structures involved.

Of the total 9,216 experiments, about a third were completed through the automated system and the results predicted by the machine. The results of this segment had an accuracy rate of 92%, and scientists were able to receive them far faster than if the experiments had been conducted in a clinical laboratory. Because accurate, automated results save valuable manual research time, these machine learning techniques can be applied to a variety of medical tests, which would ultimately help lower the costs associated with drug research.
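To make these figures concrete, here is a minimal Python sketch of the arithmetic. It treats "about a third" as exactly one third, which is an assumption, and it defines accuracy as the share of predictions that match observed outcomes, which is one common definition and not necessarily the one the Carnegie Mellon team used. The function names and the toy labels are hypothetical and are not data from the study.

```python
# Figure quoted in the article.
TOTAL_EXPERIMENTS = 9216
# "About a third" is treated as exactly one third here (an assumption).
FRACTION_RUN = 1 / 3


def experiments_run(total: int, fraction: float) -> int:
    """Approximate number of experiments for a given share of the total."""
    return round(total * fraction)


def accuracy(predicted: list, observed: list) -> float:
    """Share of predictions that match the observed outcomes."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


if __name__ == "__main__":
    print(experiments_run(TOTAL_EXPERIMENTS, FRACTION_RUN))
    # Hypothetical toy labels, not data from the study.
    toy_predicted = ["active", "inactive", "active", "active"]
    toy_observed = ["active", "inactive", "inactive", "active"]
    print(accuracy(toy_predicted, toy_observed))
```

Under these assumptions, a third of the experiments comes to roughly 3,072, and a 92% accuracy rate would mean that about 92 of every 100 predictions matched what a laboratory would have observed.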

Making Critical Drug Development Decisions

Big data not only helps decrease drug discovery and manufacturing costs; it can also help government health agencies, payers, and providers make important decisions related to drug discovery, including which drugs should be developed in the first place.

One leading biotechnology company, Genentech, foresaw the advantages of big data and invested in building a powerful big data infrastructure. Data scientists working in the pharmaceutical industry face the challenge of gathering data that answers important research questions, the same questions that drive drug research and development. Only once they have this type of data can they gain a better understanding of how the many possible treatments work in different patients. They can use this information to develop better individual treatments for patients as well as apply what they’ve learned to future drug development decisions.

The National Cancer Institute gathered 4.1 petabytes (1 petabyte equals 1 million gigabytes) of data from 14,000 anonymized patient cases. The institute's Genomic Data Commons (GDC) project aims to enable the sharing of data across genomic studies in support of precision medicine. A large amount of data is necessary to understand the exact genomic signature of a tumor and to find the right combination of drugs to fight it.
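For a sense of scale, the conversion quoted above can be worked through in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the per-case figure is a simple average computed from the two numbers in the article, not a statistic published by the GDC.

```python
# Decimal conversion used in the article: 1 petabyte = 1 million gigabytes.
GB_PER_PB = 1_000_000


def petabytes_to_gigabytes(petabytes: float) -> float:
    """Convert petabytes to gigabytes using the decimal definition."""
    return petabytes * GB_PER_PB


def average_gb_per_case(total_petabytes: float, cases: int) -> float:
    """Average data volume per case, assuming data is spread evenly."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return petabytes_to_gigabytes(total_petabytes) / cases


if __name__ == "__main__":
    # Figures quoted in the article: 4.1 petabytes across 14,000 cases.
    print(petabytes_to_gigabytes(4.1))
    print(average_gb_per_case(4.1, 14_000))
```

With the article's figures, that is about 4.1 million gigabytes in total, or roughly 293 gigabytes per patient case on average.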

The GDC is the largest repository for cancer data in the world, delivering data in a digestible form to the research community at large. The project also enables others to build data commons that can scale as large as the GDC. One project, BloodPAC, built a data commons for liquid biopsies. The idea is that when patients are treated, their blood will be drawn and the cell DNA in their blood will help researchers understand how they are reacting to chemotherapy and whether their treatment needs to change. The data may also shed light on how tumors grow, whether certain tumors are similar, and which genes or proteins should be targeted: all information needed to develop new drugs for each specific situation.

A Bright Future Ahead

Whether it’s an application for precision medicine, decreasing the failure rates in drug trials, or lowering the cost of research and developing better medicine, big data has a bright future in the pharmaceutical industry. The ability of researchers all over the world to contribute their own data sets to huge projects is also key. For better or worse, humans are creating more and more data every day, which means that as time goes on these projects will gather more insights, leading to better and broader applications in the field of medicine for everyone. It will be exciting to see what the future will bring.



Speak Your Mind



  1. I like that you’ve explained how big data is the key to make Critical Drug Development Decisions, people need to understand how effective this use of big data is.

  2. Good explanation how big data collected anonymously from patients can help cure difficult diseases like Cancer with low costs. We will see in the future more insights coming from the utilisation of big data.

  3. This is a great article that really highlights the importance of Big Data in the pharmaceutical industry.

  4. I like the way you described how big data is essential for making important decisions on the development of new drugs; more people should be aware of this usage of big data.