Databricks and Intel Collaborate to Optimize Apache Spark-Based Analytics for Intel® Architecture


Strata + Hadoop World News

Databricks, the company founded by the creators of the popular open-source Big Data processing engine Apache Spark and maker of the flagship product Databricks Cloud, today announced plans to collaborate with Intel to optimize Spark’s real-time analytics capabilities for Intel® architecture.

Enterprises are increasingly developing applications to extract prompt, actionable insights from large data sets, which makes real-time analytics on Intel architecture a vital piece of the Big Data puzzle. As an open-source framework that enables stream processing as well as fast queries on large data sets stored on a Hadoop cluster, Apache Spark supports new modes of analytics on Big Data platforms based on the Apache Hadoop ecosystem.

“Open source is undoubtedly the future of technological innovation, and Big Data tools and processing are at the forefront of that wave,” said Ion Stoica, CEO at Databricks. “Our collaboration with Intel will bring the unified Spark ecosystem to businesses of all sizes with new levels of analytic capabilities, real-time benefits, and simplicity.”

Apache Spark is a tool for iterative processing of large datasets that is compatible with Hadoop data. It can run in Hadoop clusters through YARN or in Spark’s standalone mode, and it is designed to handle both batch processing and newer workloads such as streaming, interactive queries, and machine learning. Spark recently won the 2014 GraySort competition, a third-party benchmark measuring how fast a system can sort 100 TB of data (1 trillion records). It is built for scalability, stability, and performance, with the ability to process datasets ranging from gigabytes to terabytes to petabytes.

“As more and more connected devices, including sensors, are introduced to the market, Big Data sets are growing exponentially every year, making processing and analyzing this data a more complex task,” said Michael Greene, Intel Vice President, Intel Software and Services Group and General Manager of System Technologies and Optimization. “To find new trends and strong patterns from large complex data sets, a strong analytics foundation is needed. Our work with Databricks to advance these analytics capabilities on Intel® architecture by utilizing the rich capabilities of Spark will help our customers dive deeper into their data and derive real-time insights and benefits in the cloud.”

Read Michael Greene’s blog post to learn more about this announcement.


