Lowering the Barrier to Entry for Cloud Computing is the Key to Scientific Discovery

In this special guest feature, Ivan Ravlich, Co-Founder and CEO of Hypernet Labs, points out how the cloud industry needs to offer more accessible options to scientists and researchers who need to process large amounts of data. Containerizing scientific applications is a major step forward. Ivan previously worked as a theoretical physicist and literal rocket scientist, and this career trajectory is the driving force behind Hypernet Labs. The first product powered by Hypernet is Galileo. It is reinventing how code is deployed to make computing power widely accessible to those who need it. Ivan was recently included in Forbes’ 30 Under 30 in the Science category.

Computing has taken on a more prominent role in the sciences over the past couple of decades, and the advent of the cloud has enabled scientists and researchers to process large amounts of data … in theory, that is. In practice, current cloud solutions are ineffective when it comes to processing massive amounts of very specific data. They’re also notoriously difficult to use without a computer science degree or formal engineering education. The cloud industry needs to offer these scientists more options, and especially more accessible ones. Containerizing scientific applications is a major step forward.

For example, in spring 2019, nanotoxicology doctoral researcher Artur Kirjakulov was faced with a looming deadline, data from 500 sequential electron microscopy images, and limited access to the computing power he needed to process the data and finish his dissertation. Happily, he was able to finish his research on schedule and reduce his computing time by almost three quarters by using a containerized version of his scientific application to access machines with the necessary compute power through the cloud.

Kirjakulov was working in the research group of Prof. Howard Clark, Dr. Jens Madsen, and Dr. Sumeet Mahajan at the University of Southampton Faculty of Medicine. The group studies airways and the importance of innate immunity for the lung during infection, inflammation, repair processes, and normal healthy lung maintenance. Nanoparticles became an important part of this research and spurred the creation of nanotoxicology as a subdiscipline, due to the exponential rise in the synthesis and use of nanomaterials over the past twenty-plus years.

Nanotoxicology aims to address the potentially adverse effects of engineered and naturally occurring nanomaterials on living organisms and ecosystems. While humans are now intentionally (cosmetics, medicine) and unintentionally (environmental, occupational) exposed to nanoparticles on a daily basis, the study of these particles and how they interact with biological systems is not straightforward. Since nanoparticles are 1000 times smaller than a human hair, powerful imaging techniques, like electron microscopy, are required. In the process, tens or hundreds of gigabytes of data are generated, which cannot be analysed using conventional computers.

Kirjakulov employed his containerized software to leverage parallel processing for image analysis, specifically Trainable Weka Segmentation on ImageJ. His research involved exposing cell cultures to 20 nm gold particles and then imaging the cells using serial block-face scanning electron microscopy, which generated more than 500 sequential images, amounting to a layered 3D representation. He was then able to train the software to recognize individual particles that had been taken up by the cell culture. ImageJ thus allowed him to isolate specific image features and analyze them, eliminating any interference from the image background and avoiding multiple manual operations.

The containerized approach enabled him to utilize specialized cloud machines much more effectively. This reduced the time for a single analysis from more than 40 hours on his laptop (16 GB RAM, 3rd-generation i7) to less than 12 hours, with the potential for even greater time savings had he decided to split the work between multiple machines.
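
To make the arithmetic concrete, here is a minimal Python sketch of the two ideas in the paragraph above: the relative time saved, and one way a stack of images could be divided between machines. It is not the workflow Kirjakulov used. The function names are hypothetical, the 40 and 12 hour figures are the rounded bounds quoted above, and the sketch assumes each image can be processed independently of the others, which the article does not state.

```python
def time_reduction(before_hours, after_hours):
    """Fraction of run time saved when moving from before_hours to after_hours."""
    return 1.0 - after_hours / before_hours


def split_into_chunks(n_items, n_workers):
    """Split n_items into n_workers contiguous (start, stop) index ranges.

    Ranges are half-open, like Python slices, and differ in size by at most
    one item. Assumes items can be processed independently (an assumption
    made for this sketch, not a claim from the article).
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    base, extra = divmod(n_items, n_workers)
    ranges = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers each take one additional item.
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Rounded figures from the article: roughly 40 hours down to roughly 12.
    print(f"time saved: {time_reduction(40, 12):.0%}")
    # 500 sequential images divided between 4 hypothetical machines.
    for start, stop in split_into_chunks(500, 4):
        print(f"images {start}..{stop - 1}")
```

With the rounded figures, the saving works out to 70 percent, consistent with the "almost three quarters" reported earlier, since the real run times were above 40 hours and below 12 hours. Contiguous ranges are used here only because the images are sequential slices; any even partition would serve the same purpose.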

Kirjakulov concluded that containerization coupled with a seamless deployment tool offers a promising solution to researchers across industries and scientific fields. While universities like his often possess their own supercomputers or high-performance machines, these are most likely not available on demand. Specialised machines in the cloud, along with streamlined deployment software, fill the gap by connecting researchers to the cloud in just minutes. The combination also provides a huge degree of flexibility, in that it can serve as a unified access point to on-campus or on-premises computing resources in addition to the cloud.

If we are to solve the major problems of our time, we need to embrace new technologies and methods, such as containerization and automated deployment tools, to make it easier for those who are attempting to save us to do just that.
