In this video from the HPC User Forum in Tucson, Prabhat from NERSC presents: Machine Learning. “Prabhat leads the Data and Analytics Services team at NERSC. His current research interests include scientific data management, parallel I/O, high performance computing and scientific visualization.”
“This talk will provide an overview of challenges in accelerating Hadoop, Spark and Memcached on modern HPC clusters. An overview of RDMA-based designs for multiple components of Hadoop (HDFS, MapReduce, RPC and HBase), Spark, and Memcached will be presented. Enhanced designs for these components to exploit in-memory technology and parallel file systems (such as Lustre) will be presented. Benefits of these designs on various cluster configurations using the publicly available RDMA-enabled packages from the OSU HiBD project (http://hibd.cse.ohio-state.edu) will be shown.”
Today Cornell University announced a five-year, $5 million project sponsored by the National Science Foundation to build the Aristotle Cloud Federation, a federated cloud composed of data infrastructure building blocks (DIBBs) designed to support scientists and engineers requiring flexible workflows and analysis tools for large-scale data sets.
Today SGI, a global leader in high-performance solutions for compute, data analytics, and data management, introduced the SGI UV 300RL for big data in-memory analytics. As a new model in the SGI UV server line certified and supported with Oracle Linux, the SGI UV 300RL provides up to 32 sockets and 24 terabytes of shared memory. The solution enables enterprises that have standardized on Intel-based servers to run Oracle Database In-Memory on a single system to help achieve real-time operations and accelerate data analytics at unprecedented scale.
Today SGI announced that global deployments of the SGI UV 300H single-node system provide a total of more than 200 terabytes of in-memory computing capacity to organizations running the SAP HANA platform. Since the system's introduction just one year ago, more than 50 SGI UV 300H systems have been installed in organizations to run a variety of applications on SAP HANA, including the SAP ERP, SAP Supply Chain Management (SCM), SAP Bank Analyzer, and SAP Business Warehouse applications, as well as advanced analytics.
Joseph George from HP presented this talk at the recent HPC User Forum. “This paper describes the HP Big Data Reference Architecture (BDRA) solution and outlines how a modern architectural approach to Hadoop provides the basis for consolidating multiple big data projects while, at the same time, enhancing price/performance, density, and agility. HP BDRA is a modern, flexible architecture for the deployment of big data solutions; it is designed to improve access to big data, rapidly deploy big data solutions, and provide the flexibility needed to optimize the infrastructure in response to the ever-changing requirements in a Hadoop ecosystem.”
“Pre-integrated with the Hadoop and Spark frameworks, the Urika-XA system combines the benefits of a turnkey analytics appliance with a flexible, open platform that you can modify for future analytics workloads. This single-platform consolidation of workloads reduces your analytics footprint and total cost of ownership.”
“There are a number of Bayesian modelling packages available, but how do you know which one to use? This talk will take you through the positives and negatives of the major packages, focusing on the specifics of my work in health statistics, as well as providing a general overview of what these packages can do.”
In this video from the PyData Seattle Conference, Lorena Barba from George Washington University presents: Data-driven Education and the Quantified Student. “Education has seen the rise of a new trend in the last few years: Learning Analytics. This talk will weave through the complex interacting issues and concerns involving learning analytics, at a high level. The goal is to whet the appetite and motivate reflection on how data scientists can work with educators and learning scientists in this swelling field.”
“The Seagate 1200.2 SSD family includes the next generation of high-capacity, high-performance SAS SSDs designed with multiple endurance offerings optimized for demanding enterprise applications and maximum TCO savings. The 1200.2 SAS SSD family delivers ultra-fast, consistent and easily scalable performance that exceeds 12Gb/s SAS single-port bandwidth. By removing the storage bottleneck, it closes the gap between processor and data storage performance and significantly improves overall system and application responsiveness.”