Apache Spark Survey 2016 Report

More than 1,600 members of the Apache Spark community from over 900 organizations have spoken, and Spark continues to be the most active open-source project in the big data space today. The 2016 Databricks Apache Spark Survey shows a rise in production deployments of Spark in the public cloud, as well as an increased usage […]

InsideBIGDATA: An Insider’s Guide to Apache Spark

Apache Spark is an open source cluster computing framework originally developed in 2009 at the AMPLab at the University of California, Berkeley, and donated in 2013 to the Apache Software Foundation, where it remains today. Spark allows for quick analysis and model development, and it provides access to the full data set, avoiding the subsampling that is often needed in environments like R.