Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming

In the talk below, Michael Armbrust gives an overview of some of the exciting new APIs available in Spark 2.0, namely Datasets and Structured Streaming. Together, these APIs bring the power of Catalyst, Spark SQL’s query optimizer, to all users of Spark.

Databricks Sets New World Record for CloudSort Benchmark Using Apache Spark at $1.44 Per Terabyte

Databricks®, the company founded by the team that created the popular Apache® Spark™ project, announced that, in collaboration with industry partners, it has broken the world record in the CloudSort Benchmark, a third-party industry benchmarking competition for processing large datasets.

Databricks Adds Deep Learning Support to Cloud-Based Apache Spark Platform

Databricks®, the company founded by the creators of the Apache® Spark™ project, today announced the addition of deep learning support to its cloud-based Apache Spark platform.

How Viacom Built a Just-in-Time Data Warehouse

In the video presentation below from the Spark Summit East 2016 conference, Viacom, the global media company, explains how it is using Apache Spark and Databricks to quickly adapt to its audience by building a just-in-time data warehouse.

Apache Spark MLlib 2.0 Preview: Data Science and Production

From the recent Spark Summit 2016 in San Francisco, the video presentation below by Joseph K. Bradley of Databricks focuses on “Apache Spark MLlib 2.0 Preview: Data Science and Production.”

Databricks Becomes the First Vendor to Provide Support for Apache® Spark™ 2.0 on Its Just-in-Time Data Platform

Databricks, the company founded by the team that created Apache® Spark™, today announced that Apache Spark 2.0 is generally available on its just-in-time data platform, making it the first vendor to offer Apache Spark 2.0 support.

Databricks Announces General Availability of Community Edition

Databricks, the company founded by the team that created Apache® Spark™, announced the General Availability of Databricks Community Edition (DCE), a free version of the just-in-time data platform built on top of open source Apache Spark.

Can Spark Data Tools Stamp Out Cyber Crime?

The video presentation below discusses how big data engines like Apache Spark are being deployed to help detect and put an end to ad fraud. Spark allows enterprises across various sectors, including security firms, to extract data in real time to catch patterns and help halt fraudulent activities and breaches earlier.

Databricks Offers APIs to Enable Agile Application Development with Apache Spark for the Enterprise

Databricks, the company behind Apache Spark, launched a new set of APIs that will enable enterprises to automate their Spark infrastructure to accelerate the deployment of production data-driven applications.

Databricks Announces Community Edition of Cloud-Based Platform

Databricks, the company behind Apache Spark, today announced at Spark Summit East the beta release of Databricks Community Edition, a free version of its cloud-based big data platform. This service will provide users with access to a micro-cluster as well as a cluster manager and notebook environment, making it ideal for developers, data scientists, data engineers and other IT professionals to learn Spark.