Sign up for our newsletter and get the latest big data news and analysis.

Databricks Simplifies and Scales Deep Learning with New Apache Spark Library

Databricks, the company founded by the creators of the popular Apache Spark project, announced Deep Learning Pipelines, a new library to integrate and scale out deep learning in Apache Spark.

Databricks Launches New Edition of Its Spark-Based Cloud Platform for Data Engineers

Databricks, the company founded by the creators of the popular Apache Spark project and providers of the leading Spark-based cloud platform for data science, announced an edition of its cloud platform optimized specifically for data engineering workloads called Databricks for Data Engineering.

Data as a Critical Element in the Discovery and Delivery of Smart Energy

In this contributed article, Jules S. Damji, an Apache Spark Community Evangelist with Databricks, shows how as the value of data continues to grow, the next-generation smart grid should become a reality, benefiting utility companies and consumers alike.

The Leaky Pipeline Problem -
 Making your Mark as a Woman in Big Data

insideBIGDATA was on hand for the recent Spark Summit East 2017 conference in Boston, and one of the more compelling presentations was by Kavitha Mariappan, VP Marketing at Databricks. The talk focused on the premise that despite the tremendous growth and opportunities in big data today, women still play a small role in this arena.

Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming

In the talk below, Michael Armbrust, gives an overview of some of the exciting new API’s available in Spark 2.0, namely Datasets and Structured Streaming. Together, these APIs are bringing the power of Catalyst, Spark SQL’s query optimizer, to all users of Spark.

Databricks Sets New World Record for CloudSort Benchmark Using Apache Spark at $1.44 Per Terabyte

Databricks®, the company founded by the the team that created the popular Apache® Spark™ project, announced that in collaboration with industry partners, it has broken the world record in the CloudSort Benchmark, a third-party industry benchmarking competition for processing large datasets.

Databricks Adds Deep Learning Support to Cloud-Based Apache Spark Platform

Databricks®, the company founded by the creators of the Apache® Spark™ project, today announced the addition of deep learning support to its cloud-based Apache Spark platform.

How Viacom Built a Just-in-Time Data Warehouse

In the video presentation below from Spark Summit East 2016 conference, Viacom, the global media company, explains how they are using Apache Spark and Databricks to quickly adapt to their audience by building a just-in-time data warehouse.

Apache Spark MLlib 2.0 Preview: Data Science and Production

From the recent Spark Summit 2016 in San Francisco, the video presentation below by Joseph K. Bradley of Databricks give focus to “Apache Spark MLlib 2.0 Preview: Data Science and Production.”

Databricks Becomes the First Vendor to Provide Support for Apache® Spark™ 2.0 on Its Just-in-Time Data Platform

Databricks, the company founded by the team that created Apache® Spark™, today announced that Apache Spark 2.0 is generally available on its just-in-time data platform, making it the first vendor to offer Apache Spark 2.0 support.