Databricks Announces ‘Jobs’ Feature for Databricks Cloud


Databricks, the company founded by the creators of the popular open-source big data processing engine Apache Spark, has introduced “Jobs,” a new feature for its flagship product, Databricks Cloud, at the inaugural Spark Summit East. Hosted by Databricks, the Summit includes more than thirty high-quality sessions that showcase Spark’s momentum and use cases, presented by members of the Spark community and leading production users, including Salesforce, Intel, DataStax, MyFitnessPal, and Box.

As the latest update to Databricks Cloud, Jobs enables data scientists and engineers to schedule and manage production pipelines that run Spark workloads without any human intervention. Built to integrate seamlessly with Databricks Cloud, the new feature can automatically perform periodic ingestion, transformation, and processing of data in Databricks Cloud.

Jobs supports the creation of production pipelines using Databricks Cloud notebooks as well as standalone Spark applications, enabling Databricks Cloud users to seamlessly transition from exploration to production workloads. As a result of the Jobs feature, time spent on developing, scheduling, and managing complex Spark workloads will be dramatically reduced.

Jobs also runs on clusters that use both Amazon Web Services on-demand and spot instances. Additional capabilities include:

  • The ability to set up new Spark clusters or reuse existing clusters for the execution of jobs.
  • A flexible job scheduler that guarantees timely execution of Spark applications.
  • A notification service that will email Jobs owners of important events, such as failures.
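
For readers who want a concrete picture of these three capabilities, here is a minimal Python sketch. It is an illustration only and not the Databricks Cloud API: the names JobSpec, next_runs, and run_once, the fixed-interval schedule, and the format of the notification text are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class JobSpec:
    """Hypothetical job description; NOT the Databricks Cloud API."""
    name: str
    task: Callable[[], None]                 # stands in for a notebook or Spark application
    every: timedelta                         # assumed fixed run period for the scheduler
    existing_cluster: Optional[str] = None   # None means "set up a new cluster"
    notify: List[str] = field(default_factory=list)  # job owners' email addresses


def next_runs(spec: JobSpec, start: datetime, count: int) -> List[datetime]:
    """Return the first `count` scheduled run times, beginning at `start`."""
    return [start + i * spec.every for i in range(count)]


def run_once(spec: JobSpec) -> List[str]:
    """Run the task once and return the failure notices that would be emailed."""
    # Reuse the named cluster if one was given, otherwise pretend to create one.
    cluster = spec.existing_cluster or f"new-cluster-for-{spec.name}"
    try:
        spec.task()
    except Exception as exc:
        # One notice per owner; a real service would send email here.
        return [
            f"to={addr}: job {spec.name} failed on {cluster}: {exc}"
            for addr in spec.notify
        ]
    return []
```

In this sketch, a job with a 24-hour period and no existing cluster is scheduled once a day on a freshly named cluster, and a task that raises an exception yields one notice per listed owner.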

“Jobs holistically automates and eliminates the repetitive, manual, human processing element typically required to properly schedule, sequence and execute these production pipelines, generating significant time and cost savings through improved productivity and better use of strategic resources,” said Ion Stoica, CEO of Databricks. “Databricks Cloud makes it easy for users to get started on analyzing their business-critical data within minutes. Spark Summit East is the perfect avenue for this announcement since it’s exciting to hear such a dynamic lineup of speakers discuss their unique way of innovating and simplifying Big Data with Spark.”

To learn more about Jobs, read the Databricks blog post.

