New SGI Scale-out Solution for SAP HANA Does Real-Time Applications

Today SGI announced that enterprises can now leverage the Intel-based SGI UV 300H server in a multi-node cluster (scale out) to run SAP Business Warehouse (SAP BW) on SAP HANA or new SAP BW/4HANA. Unique to SGI, the cluster nodes can later be reconfigured as single-node systems with 1 to 32TB of shared memory (scale up) to run SAP S/4HANA and other real-time applications. “For large enterprises that plan to migrate to SAP S/4HANA but wish to begin their journey to SAP HANA with SAP BW, our new SGI cluster offering is unquestionably the optimal solution,” said Jorge Titinger, president and CEO, SGI. “The scalability of the SGI UV 300H architecture coupled with our expertise in mission-critical environments provides an ideal path to real-time business with SAP HANA.”

Video: Why use Tables and Graphs for Knowledge Discovery System?

In this video from the 2016 HPC User Forum in Austin, John Feo from PNNL presents: Why use Tables and Graphs for Knowledge Discovery System? “GEMS software provides a scalable solution for graph queries over increasingly large data sets. As computing tools and expertise used in conducting scientific research continue to expand, so have the enormity and diversity of the data being collected. Developed at Pacific Northwest National Laboratory, the Graph Engine for Multithreaded Systems, or GEMS, is a multilayer software system for semantic graph databases. In their work, scientists from PNNL and NVIDIA Research examined how GEMS answered queries on science metadata and compared its scaling performance against generated benchmark data sets. They showed that GEMS could answer queries over science metadata in seconds and scaled well to larger quantities of data.”

Rescuing Lost History: Using Big Data to Recover Black Women’s Lived Experiences

“A lot of times when people think about big data, they think about it in ahistorical times…outside of this political context,” said Ruby Mendenhall, an associate professor of sociology at UIUC. “It’s really important to think about whose voice is digitized, in journals and newspapers. A lot of that for black women has been lost and you need to make a concerted effort to recover it.” Mendenhall’s study employs Latent Dirichlet allocation (LDA) algorithms and comparative text mining to search 800,000 periodicals in JSTOR (Journal Storage) and HathiTrust from 1746 to 2014 to identify the types of conversations that emerge about Black women’s shared experience over time.

Intel Xeon Phi Processor Code Modernization Nets Over 55x Faster NeuralTalk2 Image Tagging

“Benchmarks, customer experiences, and the technical literature have shown that code modernization can greatly increase application performance on both Intel Xeon and Intel Xeon Phi processors. Colfax Research recently published a study showing that image tagging performance using the open source NeuralTalk2 software can be improved 28x on Intel Xeon processors and by over 55x on the latest Intel Xeon Phi processors.”

Intel Scalable System Framework Facilitates Deep Learning Performance

In this special guest feature, Rob Farber from TechEnablement writes that the Intel Scalable System Framework is pushing the boundaries of machine learning performance. “Machine learning and other data-intensive HPC workloads cannot scale unless the storage filesystem can scale to meet the increased demands for data.”

Adding Security and More to Intel® Enterprise Edition for Lustre* Software version 3.0

Intel Enterprise Edition for Lustre* Software has taken a leap toward greater enterprise capabilities and improved features for HPC with the release of version 3.0. This latest version includes new security enhancements, dynamic LNET configuration support, ZFS snapshots, and other features requested by the HPC community inside and outside the enterprise. It also adds the Intel Omni-Path Architecture drivers.

Teradata Showcases Presto at 2016 Hadoop Summit

“Presto is a perfect fit with the Teradata Unified Data Architecture, an integrated analytical ecosystem for our enterprise customers. Presto enables companies to leverage standard ANSI SQL to execute interactive queries against Hadoop data. With Presto, utilizing Teradata’s Query Grid connector for Presto, customers can execute queries that originate in Teradata Integrated Data Warehouse that join data within the IDW and Hadoop leveraging Presto.”

Slidecast: Announcing the Nvidia Deep Learning SDK

In this slidecast, Marc Hamilton from Nvidia describes the latest updates to the company’s Deep Learning Platform. “Great hardware needs great software. To help data scientists and developers make the most of the vast opportunities in deep learning, we’re announcing today at the International Supercomputing show, ISC16, a trio of new capabilities for our deep learning software platform. The three — NVIDIA DIGITS 4, CUDA Deep Neural Network Library (cuDNN) 5.1 and the new GPU Inference Engine (GIE) — are powerful tools that make it even easier to create solutions on our platform.”

Video: Machine Learning Overview from NERSC

In this video from the HPC User Forum in Tucson, Prabhat from NERSC presents: Machine Learning. “Prabhat leads the Data and Analytics Services team at NERSC. His current research interests include scientific data management, parallel I/O, high performance computing and scientific visualization.”

Best Practices – Big Data Acceleration

“This talk will provide an overview of challenges in accelerating Hadoop, Spark and Memcached on modern HPC clusters. An overview of RDMA-based designs for multiple components of Hadoop (HDFS, MapReduce, RPC and HBase), Spark, and Memcached will be presented. Enhanced designs for these components to exploit in-memory technology and parallel file systems (such as Lustre) will be presented. Benefits of these designs on various cluster configurations using the publicly available RDMA-enabled packages from the OSU HiBD project will be shown.”