MapR Technologies, Inc., a provider of Apache™ Hadoop® technology for big data deployments, today announced at the O’Reilly Strata Conference: Making Data Work (booth #501) the latest MapR Distribution including Hadoop 2.2 with YARN. YARN delivers next-generation resource management and is taken to the next level within a MapR cluster by combining flexible resource management with the reliability and real-time capability of MapR’s next-generation data platform.
YARN’s resource management and scheduling capabilities allow Hadoop applications to share a cluster’s compute resources, thereby increasing the overall efficiency and utilization of the cluster. By combining YARN with MapR’s read-write (R/W) POSIX data platform, MapR enables YARN-based applications to not only run on a Hadoop cluster and share compute resources, but also read, write and update data in the underlying distributed file system and database tables. As a result, organizations now have the ability to develop and deploy a much broader set of Big Data Hadoop applications.
“YARN opens up Hadoop for processing patterns beyond just MapReduce,” said Evan Quinn, research director, Enterprise Management Associates. “MapR’s Hadoop distribution extends YARN even further by adding a full, open standard NFS interface in addition to HDFS, enabling non-MapReduce applications to optimally take advantage of a cluster’s storage.”
MapR is also announcing that it enables organizations to run the Hadoop MapReduce 1.x and YARN schedulers on the same nodes in the cluster simultaneously, providing a path for MapReduce 1.x users to upgrade to the new Hadoop scheduler. This reflects MapR’s commitment to backward compatibility and customer success. MapR also provides the ability to run third-party services that are not YARN-compatible on the same cluster.
“comScore runs more than 20,000 jobs each day on its production MapR cluster,” said Michael Brown, CTO, comScore. “We are excited that MapR is delivering Hadoop 2.0 and that MapR is providing a seamless upgrade path by supporting MapReduce 1.x and YARN on the same cluster.”
YARN-based applications on MapR inherit the high availability, data protection, disaster recovery, security, and performance of the MapR Distribution. Moreover, YARN-based applications operate closer to real time with MapR because the MapR file system uniquely enables streaming writes, giving YARN-based applications immediate access to the latest operational data.
With this release, MapR continues to provide broad support for open source projects of any Hadoop-powered distribution. The MapR Distribution now includes over one dozen open source projects, including Apache projects Hive, Pig, Solr, Oozie, Flume, Sqoop, HBase, and ZooKeeper, as well as Apache-licensed open source projects such as Multitool, Hue, Impala, and Cascading. In addition, MapR is an active participant and contributor in the Apache Hadoop community and continuously evaluates and adds new projects to its distribution, with many expected in 2014.
“As YARN expands Hadoop use cases in the enterprise, the need for enterprise-grade dependability, interoperability and performance increases exponentially,” said Tomer Shiran, vice president, product management, MapR Technologies. “The combination of YARN and the MapR Data Platform delivers the only distribution for Hadoop in which both YARN and non-YARN distributed Big Data applications share the compute and storage resources of large-scale clusters.”
The MapR Distribution including Apache Hadoop YARN will be available in March. For more information on the beta program, send an email to firstname.lastname@example.org.