Spark 101: Running Spark and MapReduce together in Production

July 15, 2015 by Daniel Gutierrez

Clusters must be tuned properly to run memory-intensive systems like Spark, H2O, and Impala alongside traditional MapReduce jobs. This Hadoop Summit 2015 talk describes Altiscale’s experience running these memory-intensive systems in production for its customers. The discussion focuses on the cluster tuning needed to create environments that run a mix of processing frameworks reliably and efficiently. The results show that there’s no need to rip out and replace MapReduce clusters in favor of Spark or any other memory-intensive system. The slides for this presentation are available HERE.
