Introduction to Spark Webinar

This Introduction to Spark webinar will feature Daniel Gutierrez, Managing Editor of insideBIGDATA.

In the past year, the Apache Spark distributed computing architecture has continued its upward trajectory amongst the big data players. Its growth has been fueled by several innovative differentiators for big data applications, such as MapReduce 2.0 (or YARN), provisions for analytic workflows, and efficient use of memory. Databricks’ recent 2015 Spark industry survey reports that Spark adoption is outpacing Hadoop because of its accelerated access to big data. In support of this new computing architecture.

Spark 101: MapR Free On-Demand Training Now Includes Apache Spark

MapR Technologies, Inc., provider of a leading distribution for Apache™ Hadoop® that integrates web-scale enterprise storage and real-time database capabilities, announced the availability of the first free Apache Spark course as part of a new series in its Hadoop On-Demand Training program.

Spark 101: Estimating Financial Risk with Spark

The talk below, by Sandy Ryza, walks through a basic value at risk (VaR) calculation, aiming to give a feel for what it is like to approach financial modeling with Spark.
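
For readers new to VaR, a minimal sketch of the idea may help: simulate many possible portfolio losses, then read off the loss that is not exceeded at a chosen confidence level. This is plain Python rather than Spark, and it is not taken from the talk. The normal return model, the parameter values, and the function names `simulate_losses` and `value_at_risk` are all assumptions made for illustration.

```python
import random


def simulate_losses(num_trials, mean_return, volatility, position_value, seed=42):
    """Draw hypothetical one-period losses, assuming normally distributed returns."""
    rng = random.Random(seed)
    losses = []
    for _ in range(num_trials):
        simulated_return = rng.gauss(mean_return, volatility)
        # A negative return is a positive loss on the position.
        losses.append(-position_value * simulated_return)
    return losses


def value_at_risk(losses, confidence):
    """Return the empirical loss quantile at the given confidence level."""
    ordered = sorted(losses)
    index = min(int(confidence * len(ordered)), len(ordered) - 1)
    return ordered[index]


# Illustrative parameters only (assumed, not from the talk).
losses = simulate_losses(
    num_trials=10_000,
    mean_return=0.0005,
    volatility=0.02,
    position_value=1_000_000,
)
var_95 = value_at_risk(losses, 0.95)
var_99 = value_at_risk(losses, 0.99)
print(f"95% VaR: {var_95:,.0f}")
print(f"99% VaR: {var_99:,.0f}")
```

Because each trial is independent of the others, this kind of simulation can be split across many workers, which is one reason a distributed engine is a natural fit for it.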

Datameer’s Stefan Groschupf on the Future of Spark

In another episode of the Big Data & Brews industry perspectives series, Stefan Groschupf, CEO of our friends over at Datameer, shares his thoughts on the future of Spark and how it is part of an evolution in the Hadoop environment.

Spark 101: Anatomy of RDD – Deep Dive Into Spark RDD Abstraction

As Apache Spark continues its exponential rise in popularity as a big data platform, the presentation included below dives deeper into the architecture, with a detailed discussion of how an RDD is constructed, transformed, and executed over the cluster.
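
As a rough illustration of those three stages, the toy class below mimics the general shape of the abstraction in plain Python: data is held in partitions, transformations only record what should happen, and nothing is computed until an action is called. `ToyRDD` is a hypothetical stand-in written for this sketch. It is not Spark's API and is not taken from the presentation.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD: partitions plus a lineage of pending steps."""

    def __init__(self, partitions, lineage=()):
        self.partitions = partitions  # list of lists, one per partition
        self.lineage = lineage        # recorded (kind, function) steps, not yet run

    def map(self, fn):
        # Transformation: returns a new ToyRDD and records the step lazily.
        return ToyRDD(self.partitions, self.lineage + (("map", fn),))

    def filter(self, predicate):
        # Transformation: also lazy, only extends the lineage.
        return ToyRDD(self.partitions, self.lineage + (("filter", predicate),))

    def _compute(self, partition):
        # Replay the recorded lineage over a single partition.
        items = iter(partition)
        for kind, fn in self.lineage:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    def collect(self):
        # Action: triggers computation, one partition at a time.
        result = []
        for partition in self.partitions:
            result.extend(self._compute(partition))
        return result


numbers = ToyRDD([[1, 2, 3], [4, 5, 6]])
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
result = even_squares.collect()
print(result)
```

In this sketch the partitions are processed in a loop on one machine. In a real cluster, each partition could be handled by a different worker, and the recorded lineage is what would allow a lost partition to be recomputed.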

Spark 101: Spark Streaming and GraphX at Netflix

The Bay Area Spark Meetup was recently hosted at Netflix to feature talks by Netflix engineers about their use of Spark Streaming and GraphX, as well as a Q&A session with the Netflix folks plus the lead engineer of Spark Streaming. The presentation is provided here with the abstracts of the two talks below.

Spark 101: Running Spark and MapReduce together in Production

Clusters must be tuned properly to run memory-intensive systems like Spark, H2O, and Impala alongside traditional MapReduce jobs. This Hadoop Summit 2015 talk describes Altiscale’s experience running the new memory-intensive systems in production for its customers.

Hadoop 101: Learning the Core Elements of Hadoop

A background in Hadoop can serve as a valuable differentiator for professionals interested in building new areas of expertise in managing and analyzing data. To address this need, earlier this year MapR launched a free, comprehensive On-Demand Training program that can lead to becoming a certified Hadoop professional.