Spark 101: Running Spark and MapReduce Together in Production

Clusters must be tuned properly to run memory-intensive systems such as Spark, H2O, and Impala alongside traditional MapReduce jobs. This Hadoop Summit 2015 talk describes Altiscale’s experience running these memory-intensive systems in production for its customers. The discussion focuses on the cluster tuning needed to run a mix of processing frameworks reliably and efficiently. The results show that there is no need to rip out and replace MapReduce clusters in favor of Spark or any other memory-intensive system. The slides for this presentation are available HERE.

