How Viacom Built a Just-in-Time Data Warehouse

In the video presentation below from the Spark Summit East 2016 conference, Viacom, the global media company, explains how it uses Apache Spark and Databricks to adapt quickly to its audience by building a just-in-time data warehouse.

Apache Spark MLlib 2.0 Preview: Data Science and Production

From the recent Spark Summit 2016 in San Francisco, the video presentation below by Joseph K. Bradley of Databricks focuses on “Apache Spark MLlib 2.0 Preview: Data Science and Production.”

Databricks Becomes the First Vendor to Provide Support for Apache® Spark™ 2.0 on Its Just-in-Time Data Platform

Databricks, the company founded by the team that created Apache® Spark™, today announced that Apache Spark 2.0 is generally available on its just-in-time data platform, making it the first vendor to offer Apache Spark 2.0 support.

Databricks Announces General Availability of Community Edition

Databricks, the company founded by the team that created Apache® Spark™, announced the General Availability of Databricks Community Edition (DCE), a free version of the just-in-time data platform built on top of open source Apache Spark.

Can Spark Data Tools Stamp Out Cyber Crime?

The video presentation below discusses how big data engines like Apache Spark are being deployed to help detect and put an end to ad fraud. Spark allows enterprises across various sectors, including security firms, to extract data in real time, catch patterns, and halt fraudulent activities and breaches earlier.

Databricks Offers APIs to Enable Agile Application Development with Apache Spark for the Enterprise

Databricks, the company behind Apache Spark, launched a new set of APIs that will enable enterprises to automate their Spark infrastructure to accelerate the deployment of production data-driven applications.

Databricks Announces Community Edition of Cloud-Based Platform

Databricks, the company behind Apache Spark, today announced at Spark Summit East the beta release of Databricks Community Edition, a free version of its cloud-based big data platform. The service will give users access to a micro-cluster as well as a cluster manager and notebook environment, making it well suited to developers, data scientists, data engineers and other IT professionals who want to learn Spark.

Data Exploration with Databricks

The “Data Exploration on Databricks” jump start video below will show you how to go from data source to visualization in a few easy steps. Specifically, you’ll see how to take semi-structured logs, extract and transform them, and then analyze and visualize the data using Spark SQL, so you can quickly understand your data.

Spark MLlib: Making Practical Machine Learning Easy and Scalable

In this talk, Xiangrui Meng of Databricks shares his experience in developing MLlib. The talk covers both the higher-level APIs (ML pipelines) that make MLlib easy to use and the lower-level optimizations that make MLlib scale to massive data sets.

Advanced Apache Spark

Big data is going Spark crazy! Here’s a whopping six-hour, intensive, fast-paced and vendor-agnostic look at Spark Core presented by Sameer Farooqui, a client services engineer at Databricks.