In this special guest feature, Rob Farber from TechEnablement writes that the Intel Scalable Systems Framework is pushing the boundaries of Machine Learning performance. “Machine learning and other data-intensive HPC workloads cannot scale unless the storage filesystem can scale to meet the increased demands for data.”
Intel Enterprise Edition for Lustre* Software has taken a leap toward greater enterprise capabilities and improved features for HPC with the release of version 3.0. This latest version includes new security enhancements, dynamic LNET configuration support, ZFS snapshots, and other features requested by the HPC community inside and outside the enterprise. It also adds the Intel Omni-Path Architecture drivers.
Converging High Performance Computing (HPC) and Lustre* parallel file systems with Hadoop’s MapReduce for Big Data analytics can eliminate the need for Hadoop’s infrastructure and speed up the entire analysis. Convergence is a solution of interest for companies with HPC already in their infrastructure, such as the financial services industry and other industries adopting high performance data analytics.
Today SGI announced that global deployments of the SGI UV 300H single-node system provide in total over 200 terabytes of in-memory computing capacity to organizations running the SAP HANA platform. Since the system's introduction just one year ago, more than 50 SGI UV 300H systems have been installed in organizations to run a variety of applications on SAP HANA, including the SAP ERP, SAP Supply Chain Management (SCM), SAP Bank Analyzer, and SAP Business Warehouse applications, as well as advanced analytics.
A number of industries rely on high-performance computing (HPC) clusters to process massive amounts of data. As these same organizations explore the value of Big Data analytics based on Hadoop, they are realizing the value of converging Hadoop and HPC onto the same cluster rather than scaling out an entirely new Hadoop infrastructure.
Jim McHugh from Cisco describes what the new Intel Xeon processor E7 v3 family will bring to Cisco UCS systems in the big data and analytics arena. He emphasizes how new insights driven by big data can help businesses become intelligence-driven to create a perpetual and renewable competitive edge within their field.
“The Hadoop MapReduce framework grew out of an effort to make it easy to express and parallelize simple computations that were routinely performed at Google. It wasn’t long before libraries, like Apache Mahout, were developed to enable matrix factorization, clustering, regression, and other more complex analyses on Hadoop. Now, many of these libraries and their workloads are migrating to Apache Spark because it supports a wider class of applications than MapReduce and is more appropriate for iterative algorithms, interactive processing, and streaming applications.”
“When organizations operate both Lustre and Apache Hadoop within a shared HPC infrastructure, there is a compelling use case for using Lustre as the file system for Hadoop analytics, as well as HPC storage. Intel Enterprise Edition for Lustre includes an Intel-developed adapter which allows users to run MapReduce applications directly on Lustre. This optimizes the performance of MapReduce operations while delivering faster, more scalable, and easier to manage storage.”
In this video from the 2014 Lustre Administrators and Developers Conference, Brent Gorda from Intel describes how the company is adding enterprise features to the Lustre File System.
In this video from the LAD’14 Lustre Administrators and Developers Conference in Reims, Rekha Singhal from Tata Consultancy Services presents: Performance Comparison of Intel Enterprise Edition Lustre and HDFS for MapReduce Applications.