High Performance Computing: Answering the Big Data Dilemma

In this special guest feature, Jeff Reser, Global Product Marketing Manager of SUSE, suggests that as HPC’s prowess in business expands, so does its ability to solve a variety of data management problems. Individuals struggling to tackle Big Data’s most complex challenges should increasingly look at HPC to deliver the power and sophistication required to manage large volumes and varieties of data. Jeff currently drives strategy and planning for Linux for High Performance Computing at SUSE. Jeff has a background in astrophysics and a wealth of big data experience at both IBM and Progress DataDirect.

If you consider yourself a tech news junkie like me, you probably see at least one article a day on how Big Data is transforming various industries and technologies. But the reality is Big Data analysis can be extremely time and resource consuming. Luckily, there is a solution that’s no stranger to complex analysis and data evaluation: High Performance Computing (HPC).

HPC can be a bit hard to define, but it “generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation to solve large problems in science, engineering, or business.” Many associate HPC with scientific studies, NASA projects or large pharma trials. But with more industries becoming dependent on complex data analysis, HPC is taking center stage.

A Long-term Relationship: Big Data and HPC

Before the term “Big Data” became popular, many scientific and engineering industries were already using HPC environments to help deal with the overwhelming amount of data and analysis involved in their fields. HPC environments are known for their ability to support the modeling and simulation of complex systems, which can be anything from product designs for cars, planes and pharmaceuticals to global weather patterns and interstellar phenomena.

While complicated data analysis may have been limited to scientific fields in the past, businesses across a variety of verticals are now dependent on their ability to access and analyze data quickly. This can be difficult, especially when it comes to determining actionable patterns and insights from unstructured data. And unstructured data is pouring in through multiple forms and from a wide variety of sources: sensors, logs, emails, social media, pictures, videos, medical images, transaction records and GPS signals.

The New Generation of HPC Industries

The rise in Big Data, particularly unstructured data, is causing a new wave of commercial organizations to use HPC for the first time to support their high-performance applications. In banking and financial services, for example, high-performance applications include risk modeling to determine aggregate risk across financial portfolios, real-time fraud detection for the millions of transactions processed across disparate systems, high-frequency trading, and pricing and regulatory compliance.

In manufacturing, high-performance applications include automotive modeling and design, oil and gas exploration, smart cities and power grids, and autonomous vehicle design. In healthcare, they include precision and personalized medicine, on-demand diagnosis and treatment plans, drug research and remote surgeries. Use cases like these are emerging across all of these industries.

The way to gain new insights and extract the most value from Big Data is a strong HPC infrastructure: one that can handle the tsunami of data analytics and machine learning workloads, uncover hidden patterns, and then track those patterns dynamically as they form and evolve.

Putting the Performance in HPC

If you’ve ever built a derby car, you know that success depends on the foundation. In a similar fashion, Big Data requires the right HPC infrastructure and resources to support the high-performance data analytics that power artificial intelligence applications.  Traditional enterprise IT technology can’t handle the complex and time-critical workloads that these applications require. The solution? High-quality infrastructure, delivered through a Linux operating system.

Disruptive technologies are driving the use cases of the near future, and many are enabled by a strong HPC and Linux infrastructure. These include cognitive computing, the Internet of Things, HPC in the cloud and smart cities.

The latest TOP500 list shows how dominant Linux is among the top 500 supercomputer sites around the world, holding a whopping 99.6 percent share. With HPC being adopted across more and more industries and growing rapidly, I expect Linux to grow just as pervasively to handle the new wave of high-performance workloads.

Linux is ideal for HPC environments because of its ability to manage complex clusters and exploit HPC levels of compute power and storage.  Designed for maximum performance and scaling, HPC configurations have separate compute and storage clusters connected via a high-speed interconnect fabric.  Whether it’s a massive supercomputer or clusters of more affordable computers, Linux can leverage the parallelism of those complex clusters.
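
To make that last point a bit more concrete, here is a minimal sketch, my own illustration rather than anything prescribed in this article, of the kind of message-passing program a Linux HPC cluster runs. Each process (or "rank") typically lands on a different node, works on its own slice of the data, and the partial results are combined over the cluster's interconnect using MPI, the de facto standard for this style of parallelism.

#include <mpi.h>
#include <stdio.h>

/* Illustrative only: each rank sums its own slice of a range in parallel,
 * then MPI_Reduce combines the partial sums over the interconnect. */
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's ID within the job */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    /* Each rank takes every size-th element, so the work divides evenly
     * no matter how many nodes the scheduler hands the job. */
    long local_sum = 0;
    for (long i = rank; i < 10000000; i += size) {
        local_sum += i;
    }

    long total = 0;
    MPI_Reduce(&local_sum, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Sum computed across %d ranks: %ld\n", size, total);
    }

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with something like mpirun -np 64 ./sum, the same binary runs unchanged on a handful of workstation cores or across thousands of cluster nodes; that scaling behavior is the parallelism Linux-based HPC configurations are built to exploit.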

Final Thoughts

As HPC’s prowess in business expands, so does its ability to solve a variety of data management problems. Individuals struggling to tackle Big Data’s most complex challenges should increasingly look at HPC to deliver the power and sophistication required to manage large volumes and varieties of data. With HPC delivering a solid backbone for Big Data analysis, today’s data scientists are equipped with the technology they need to tackle their businesses’ toughest data challenges.
