Running Hadoop Clusters on Supercomputers

Over at the San Diego Supercomputer Center, Glenn K. Lockwood writes that users of the Gordon supercomputer can use the myHadoop framework to dynamically provision Hadoop clusters within a traditional HPC cluster and run quick jobs.

For the purposes of testing mappers and reducers, doing a lot of smaller analyses, and debugging issues, I found being able to establish a semi-persistent Hadoop cluster on a traditional HPC resource to be very useful in its own right. While one can feasibly do this on Amazon EC2, doing so is annoying and costs money (unlike XSEDE and FutureGrid, which are free). I wanted to just get a Hadoop cluster running so that I could prototype code and learn features, and the process is quite simple. This page describes how to create a semi-persistent Hadoop cluster on a traditional HPC resource (supercomputer), and by semi-persistent, I mean that the Hadoop cluster will run for as long as you tell it to rather than just for the lifetime of a single map/reduce job.

Read the Full Story.

