Data Science 101: What’s Coming for Spark in 2015

Apache Spark took the data science world by storm in 2014 as a technology foundation for big data applications. In the talk below from the Bay Area Spark User Meetup, Patrick Wendell from Databricks speaks about new developments in Spark and identifies areas of focus for the coming year. A major focus for Spark in 2015 is providing extension points for integrating Spark with other systems, libraries, and programming models.

Spark has a new data sources API that allows for optimized input from and output to many storage systems (relational databases, NoSQL stores, etc.) with high throughput and minimal configuration. Spark’s machine learning and streaming libraries are also providing new extension points to allow for deeper integration of user libraries. Finally, to help users discover new user libraries for Spark, we’ve introduced a new community package index, Spark Packages. The talk covers these developments and highlights relevant features in the recent 1.2 release.
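
To give a rough sense of what a data source extension point can look like, here is a minimal sketch in plain Python. It is not the Spark API: the class name `CsvRelation`, its `scan` method, and the sample data are all hypothetical and exist only for illustration. The idea it shows is that a source told which columns and rows are needed can skip work instead of handing every field to the engine.

```python
import csv
import io


class CsvRelation:
    """Hypothetical toy data source (not a Spark class).

    It exposes a schema plus a scan that accepts the requested
    columns and a row predicate, so pruning happens at the source.
    """

    def __init__(self, text):
        self._text = text
        # Read only the header line to learn the column names.
        self.schema = next(csv.reader(io.StringIO(text)))

    def scan(self, columns=None, predicate=None):
        """Yield rows as dicts, keeping only `columns` and matching rows."""
        wanted = columns if columns is not None else self.schema
        for row in csv.DictReader(io.StringIO(self._text)):
            if predicate is None or predicate(row):
                yield {name: row[name] for name in wanted}


# Made-up sample data, purely for the example.
SAMPLE = "name,lang,stars\nspark,scala,9\nhadoop,java,7\nflink,java,5\n"

relation = CsvRelation(SAMPLE)
java_names = list(
    relation.scan(columns=["name"], predicate=lambda r: r["lang"] == "java")
)
print(java_names)  # [{'name': 'hadoop'}, {'name': 'flink'}]
```

A real integration would implement the interfaces Spark itself defines for external sources; the sketch only mirrors the general shape of "schema plus a scan that receives the required columns and filters".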

