Cloudera, the provider of the secure data management and analytics platform built on Apache Hadoop and the latest open source technologies, released benchmark results validating that Cloudera’s modern analytic database solution, powered by Apache Impala (incubating), not only delivers unprecedented capabilities for cloud-native workloads but does so at better cost performance than alternatives. Impala uniquely offers elastic scalability, better flexibility, and direct querying of Amazon S3, capabilities unavailable from traditionally architected systems such as Redshift. With a modern design, Impala decouples data and compute to provide the same high-performance SQL analytics whether running cloud-natively over data in S3 or across a wide range of on-premise and cloud storage options. Furthermore, Impala enables all these capabilities while also delivering up to 275% more cost-efficiency and up to 10x greater performance compared to Amazon’s analytic database Redshift, equating to more value, all within an open platform.
Using queries from the TPC-DS industry standard benchmark, Cloudera compared Impala running on the cloud (both cloud-natively over S3 and over local EBS storage) to Amazon Redshift (only able to run over its own storage on dedicated AWS instances). Results from the benchmark show:
- Impala is over 200% less costly and over 10x faster on S3 compared to a general purpose tuned Redshift
- Impala is still 8% less costly and 90% faster on S3 compared to a pre-tuned Redshift for specific fixed reporting queries
- Impala is 28-275% less costly and 42-400% faster on EBS compared to either pre-tuned or general purpose tuned Redshift
“Increasingly, our customers are looking to move BI and analytic workloads to cloud environments to tap into the cost-effectiveness of elastic scale and greater flexibility. But they still require the high-performance analytics and big data agility they’re used to on-premises,” said Charles Zedlewski, Vice President, Products, at Cloudera. “Impala brings all the advantages it has over traditional, on-premise analytic databases to the cloud with a modern architecture that enables unprecedented agility no matter where the data lives. This comparison is clear evidence that Impala is unmatched for these BI and analytic workloads in the cloud.”
As businesses look to bring in more data from new sources, actively adjust models based on changing needs, and iteratively design for a variety of use cases, they need a modern analytic database that is built to address these requirements, without hindering business productivity. The rigid design and inelastic scale of traditionally architected, monolithic systems, whether on-premise or in the cloud, simply are not able to keep up with today’s ever-changing business needs. Cloudera’s analytic database, powered by Impala as the interactive SQL engine, is purpose-built to bring high-performance SQL analytics to big data, with elastic scalability for cloud and on-premise deployments, as and when it is needed.
Impala works natively with data stored on a number of storage engines, including Amazon S3 object store, eliminating the need to move or load data specifically into Impala clusters. Especially for cloud deployments, this translates to cost-savings and efficiencies as transient clusters can be spun up as needed for BI and reporting workloads and, with cost-effective storage from S3, more data is quickly and readily available for analysis.
Advancing Impala’s performance, concurrency, and scalability is a consistent area of focus for Cloudera. The company has widened the performance gap between Impala’s analytic database architecture and other alternatives for both single and multi-user workloads. The latest release delivers 12x better performance on secure workloads compared to its two prior versions. Cloudera plans to continue expanding Impala’s value and price performance benefits by adding support in the future for other object stores in the public cloud.