This guide to big data solutions in the cloud, from the editors of insideBIGDATA, explores high performance storage solutions in the cloud for an exploding commercial data universe. Each week we’ll release a section of this guide or you can download the complete PDF from the insideBIGDATA White Paper Library courtesy of Intel.
The recent release of a commercial version of the Lustre* parallel file system running on Amazon Web Services (AWS) was big news for business data centers facing ever-expanding data analysis and storage demands. Lustre, the predominant high-performance file system at most of the world's supercomputing sites, can now be delivered to business customers as a hardened, tested, easy-to-manage, and fully supported distribution in the cloud.
Proven to scale to extreme levels of storage performance and capacity, measured in tens or even hundreds of petabytes shared among tens of thousands of clients, Lustre offers high throughput with high availability on vendor-neutral server, storage, and interconnect hardware coupled with various distributions of Linux.
In this guide to big data solutions in the cloud, we take a look at what Lustre on AWS delivers for a broad community of business and commercial organizations struggling with the challenge of big data and demanding storage growth. The guide includes:
- Data – The Next Big Challenge
- Attaining High-Performance Scalable Cloud Storage
- Lustre 101
- Commercial-Grade Lustre in the Cloud
- Lustre: Scalability, Affordability, Manageability
- High Performance Data Analytics in the Cloud
Data – The Next Big Challenge
For a long time, the industry's biggest technical challenge was squeezing as many compute cycles as possible out of silicon chips so that the really important, and often gigantic, problems in science and engineering could be solved faster than was ever thought possible. Now, by clustering computers to work together on problems, scientists are free to take on even larger and more complex real-world problems to compute, and more data to analyze.
The new technical challenge is keeping those very fast compute engines fed with a constant, rich stream of data.
Until recently, this data challenge was limited to scientists at the national labs doing large-scale government-sponsored supercomputing research in astrophysics, weather prediction and climate modeling, or in data-intensive industries ranging from large scale manufacturing, such as aerospace and automotive, to oil and gas exploration and processing.
Today, detailed analysis over extreme volumes of data has become a major challenge for commercial enterprises as they attempt to solve complex business computing problems, from predicting financial risk and bringing drugs to market faster, to detecting patterns of fraud across millions of internet transactions. Enterprise data centers have discovered that their conventional hardware/software configurations and approaches are not nearly adequate to handle this explosion of data-intensive computing. Businesses are finding that the High Performance Computing (HPC) solutions developed in the labs need to be re-purposed to perform complex commercial computing tasks in the world of High Performance Data Analytics (HPDA).
Next week we’ll explore Attaining High-Performance Scalable Cloud Storage.
* Other names and brands may be claimed as the property of others.