Apache Spark Survey Reveals Increased Growth in Users and New Workloads Including Exploratory Data Science and Machine Learning

In order to better understand Apache Spark’s growing role in big data, Taneja Group conducted a major market research project, surveying approximately 7,000 people. The sample was made up of technical and managerial job roles from around the world directly involved in big data.

Splice Machine Announces Native PL/SQL Support to Accelerate Migrations from Oracle to Hadoop

Splice Machine, provider of the open-source SQL RDBMS powered by Hadoop and Spark, announced that it now supports native PL/SQL on Splice Machine.

Bigstep Launches High-Performance, Low-Latency Spark-as-a-Service for Real-Time Streaming Applications

Bigstep, the big data cloud provider, today launched a bare-metal Spark-as-a-Service offering.

Databricks Adds Deep Learning Support to Cloud-Based Apache Spark Platform

Databricks®, the company founded by the creators of the Apache® Spark™ project, today announced the addition of deep learning support to its cloud-based Apache Spark platform.

Distributed System Architectures for Healthcare and Life Sciences

The insideBIGDATA Guide to Healthcare & Life Sciences is a useful new resource directed toward enterprise thought leaders who wish to gain strategic insights into this exciting new area of technology. This segment focuses on the use of distributed system architectures – Hadoop and Spark.

Apache Spark Survey 2016 Report

More than 1,600 members of the Apache Spark community from over 900 organizations have spoken, and Spark continues to be the most active open-source project in the big data space today. The 2016 Databricks Apache Spark Survey shows a rise in production deployments of Spark in the public cloud, as well as an increased usage […]

GigaSpaces Launches the Next Generation Apache Spark Distribution

GigaSpaces, a provider of in-memory computing (IMC) technologies, launched InsightEdge, a data grid-enabled real-time analytics platform that incorporates Apache Spark to dramatically enhance fast data analytics.

InsideBIGDATA: An Insider’s Guide to Apache Spark

Apache Spark is an open source cluster computing framework originally developed in 2009 at the AMPLab at the University of California, Berkeley, and donated in 2013 to the Apache Software Foundation, where it remains today. Spark allows for quick analysis and model development, and it provides access to the full data set, avoiding the need to subsample, as is often required in environments like R.

Scalable Deep Learning Platform On Spark In Baidu

In the presentation below, Weide Zhang, a Senior Architect at Baidu, talks about his team’s work in using Spark to drive deep learning training and prediction using Paddle, the deep learning library developed by Baidu IDL.

Apache Spark MLlib 2.0 Preview: Data Science and Production

From the recent Spark Summit 2016 in San Francisco, the video presentation below by Joseph K. Bradley of Databricks focuses on “Apache Spark MLlib 2.0 Preview: Data Science and Production.”