In this video from the ISC Big Data’14 Conference, DK Panda from Ohio State University presents: Performance Optimization of Hadoop Using InfiniBand RDMA.
“The Hadoop framework has become the most popular open-source solution for Big Data processing. Traditionally, Hadoop communication calls are implemented over sockets and do not deliver best performance on modern clusters with high-performance interconnects. This talk will examine opportunities and challenges in optimizing performance of Hadoop with Remote DMA (RDMA) support, as available with InfiniBand, RoCE (RDMA over Converged Enhanced Ethernet) and other modern interconnects. The talk will start with an overview of the RDMA for Apache Hadoop project (http://hibd.cse.ohio-state.edu). Then, high-performance designs using RDMA to accelerate the Hadoop framework on InfiniBand and RoCE clusters will be demonstrated. Specific designs and case-studies to accelerate multiple components of Hadoop (such as HDFS, MapReduce, RPC, and HBase) will be presented. An overview of a set of benchmarks (OSU HiBD Benchmarks, OHB) to evaluate performance of different components of the Hadoop framework in an isolated manner will be presented. The presentation will also include initial results on optimizing performance of the new Spark framework using RDMA.”