GSK Aims High with Its Big Data Strategy

GlaxoSmithKline is aiming high with its big data strategy, drawing on decades' worth of clinical trial data. The pharmaceutical giant wants to turn this data into a drug-discovery asset so that new drugs reach the market more quickly.

Big Data

The pharmaceutical industry is renowned for its slow pace; this method of leveraging data could revolutionise the business.

In contrast to industries such as telecommunications and financial services, pharmaceutical companies have not yet used data as a strategic asset.

Using data from paid research studies, such as clinical trials at Trials4us, more effectively and efficiently could accelerate drug discovery.

Clinical Trials

Pharmaceutical companies, such as Bayer, one of the largest in Europe, are decades, if not hundreds of years, old. They accumulate vast hoards of data from the clinical trials they conduct, and most simply stow it away in various archives.

GlaxoSmithKline is over 300 years old and maintains its data in 2,100 silos, a potential goldmine of pharmaceutical insights. Like most pharmaceutical companies, GSK was not exploiting this data through analytics. Instead, data from each clinical trial was used only once, with the aim of bringing a new medicine to market.

GSK Big Data Information Platform

GSK saw a great opportunity to use and share data across its trials, but this required a comprehensive data platform. The company therefore launched the GSK Big Data Information Platform, a Cloudera Hadoop-based data store in which automated bots ingest data from hundreds of operational systems.

GSK then uses software to organise these chaotic, complex data sets into compilations that business users can analyse.

GlaxoSmithKline also uses Tamr, a machine learning tool, to transfer data into industry systems, and AtScale software to present the information. Zoomdata visualisation software enables business users to view the data.

The other tools hosted within the platform are Google's TensorFlow, TIBCO Spotfire and Anaconda. These technologies are integrated with the aim of making clinical trials easier by sharing data.

GSK moved approximately 12 terabytes of structured data and 8 petabytes of unstructured data into the platform as part of the project. This took 11 months, which is extremely quick for any company, and especially so for a pharmaceutical business. Although it is over 300 years old, GSK has adopted the mindset of an innovative start-up.

