Over at ReadWrite, Brian Proffitt writes that the coming release of Hadoop 2.0 will make information found within data warehouses and unstructured “data lakes” more accessible than ever.
For Arun Murthy, the release manager for Hadoop 2.0, the most important change will be the move from the original MapReduce framework to Apache YARN, which will broaden both which software can run on Hadoop and how much of it can run at once. Murthy, who is also YARN project lead and co-founder of Hortonworks, explained: “In Hadoop 1.0, everything was batch-oriented. In 2.0, you will now have multiple apps hitting the data inside all at once.” What YARN does, essentially, is divide the functionality of MapReduce even further, splitting the two major responsibilities of the MapReduce JobTracker component (resource management and job scheduling/monitoring) into separate daemons: a global ResourceManager and a per-application ApplicationMaster.
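To make that split concrete, here is a toy Python sketch of the idea. It is not the YARN API: the class names are borrowed from the daemons described above, but every method, field, and the notion of a simple "container" count are assumptions made up for illustration. The point is only that one global object hands out cluster capacity while each application brings its own object to track its own work.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy stand-in for the global daemon: it only tracks cluster capacity."""
    total_containers: int
    allocated: dict = field(default_factory=dict)  # app name -> containers held

    def free(self) -> int:
        # Capacity not currently held by any application.
        return self.total_containers - sum(self.allocated.values())

    def allocate(self, app: str, requested: int) -> int:
        # Grant as many containers as requested, capped by what is free.
        granted = min(requested, self.free())
        self.allocated[app] = self.allocated.get(app, 0) + granted
        return granted

    def release(self, app: str) -> None:
        # Return everything the application was holding.
        self.allocated.pop(app, None)


@dataclass
class ApplicationMaster:
    """Toy stand-in for the per-application daemon: it knows only its own tasks."""
    name: str
    tasks: list
    containers: int = 0

    def request(self, rm: ResourceManager) -> int:
        # Ask for one container per task; the manager may grant fewer.
        self.containers = rm.allocate(self.name, len(self.tasks))
        return self.containers

    def finish(self, rm: ResourceManager) -> None:
        rm.release(self.name)
        self.containers = 0


# Two different kinds of application share one cluster at the same time.
rm = ResourceManager(total_containers=4)
batch = ApplicationMaster("batch-job", ["map-0", "map-1", "reduce-0"])
query = ApplicationMaster("interactive-query", ["scan-0", "scan-1"])

batch_granted = batch.request(rm)   # 3 containers
query_granted = query.request(rm)   # only 1 left, so 1 is granted
free_while_busy = rm.free()         # 0
batch.finish(rm)
free_after_batch = rm.free()        # 3
```

In this sketch the resource manager never looks at tasks and the application masters never look at each other, which is the separation of concerns the JobTracker in Hadoop 1.0 did not have.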
Read the Full Story.