Dremio Introduces Self-Service Data Paradigm for Business Intelligence and Data Science

Dremio, the self-service data company, announced its entrance into the data analytics market with the immediate availability of the Dremio Self-Service Data Platform, a fundamentally new approach to data analytics. Working with existing data sources and business intelligence tools, Dremio’s solution eliminates the need for traditional ETL, data warehouses, cubes, and aggregation tables, as well as the infrastructure, copies of data, and effort these systems entail. Dremio combines consumer-grade ease-of-use with enterprise-grade security and governance, and includes ground-breaking execution and caching technologies that dramatically accelerate analytical processing. Dremio was released as a new open source project under the Apache license and is now available for download.

Founded in 2015 by a team of big data experts, Dremio has raised over $15 million. The company’s software is being used by leading organizations in the US, Europe, Asia, and Australia, such as Daimler, a leading producer of premium cars and the world’s largest manufacturer of commercial vehicles, and OVH, Europe’s leading cloud provider. Additionally, technology providers including Microsoft, Tableau, and Qlik, as well as open source communities like Python Pandas and R, are collaborating with Dremio to deliver end-to-end self-service for data analytics.

Despite promises of software designed to unlock the value of data, analysts and data scientists continue to struggle to harness data for business intelligence and data science. Dremio accelerates time to insight by empowering analysts and data scientists to be independent and self-directed in their use of data, from any source and at any scale, while preserving governance and security.

“In our personal lives, most people expect to get answers to questions in just a few seconds. But in the workplace, it can take months to answer a question,” said Tomer Shiran, Co-founder and CEO of Dremio. “While tools like Tableau, Power BI, and Qlik provide a self-service model for visualization, Dremio is the first to provide a self-service experience for the rest of the data analytics stack, empowering business users and analysts to discover, explore and analyze any data at any time, no matter where it is or how big it is.”

Dremio provides a future-proof strategy for data, allowing customers to choose the best tools for analysts, and the right database technologies for applications, without compromising on the ability to leverage data to power the business.

“Dremio is a new breed of data analytics platform that doesn’t require ETL, cubes, data warehouses, or even data virtualization tools to deliver self-service analytics to data analysts,” said Wayne Eckerson, founder and principal consultant, Eckerson Group. “The big data platform, designed from the ground up for the cloud and Hadoop, works with any BI product or data science tool, sits between users and data sources, eliminating the need for data movement. This speeds deployments and provides agile access to data.”

Key capabilities include:

  • Apache Arrow Execution Engine. Dremio is the first Apache Arrow-based distributed query execution engine. This represents a breakthrough in performance for analytical workloads as it enables extreme hardware efficiency and minimizes serialization and deserialization of in-memory data buffers between Dremio and client technologies like Python, R, Spark, and other analytical tools. Arrow is also designed for GPU and FPGA hardware acceleration, making it a powerful paradigm for machine learning workloads.
  • Native Query Push Downs. Instead of performing full table scans for all queries, Dremio optimizes processing into underlying data sources, maximizing efficiency and minimizing demands on operational systems. Dremio rewrites SQL in the native query language of each data source, such as Elasticsearch, MongoDB, and HBase, and optimizes processing for file systems such as Amazon S3 and HDFS.
  • Dremio Reflections™. Dremio accelerates processing and isolates operational systems from analytical workloads by physically optimizing data for specific query patterns, including columnarizing, compressing, aggregating, sorting, partitioning, and co-locating data. Dremio maintains multiple reflections of datasets, optimized for heterogeneous workloads, that are fully transparent to users. Dremio’s query planner automatically selects the best reflections to provide maximum efficiency, providing a breakthrough in performance that accelerates processing by up to a factor of 1000.
  • Comprehensive Data Lineage. Dremio’s Data Graph preserves a complete view of the end-to-end flow of data for analytical processing. Companies have full visibility into how data is accessed, transformed, joined, and shared across all sources and all analytical environments. This transparency facilitates data governance, security, knowledge management, and remediation activities.
  • Self-Service Model. Dremio was designed with analysts and data scientists in mind, providing a powerful and intuitive interface for users to easily discover, curate, accelerate, and share data for specific needs, without being dependent on IT. Users can also launch their favorite tools from Dremio directly, including Tableau, Qlik, Power BI, and Jupyter Notebooks.
  • Built for the Cloud. Dremio was designed for modern cloud infrastructure, and is able to take advantage of elastic compute resources as well as object storage such as Amazon S3 for its Reflection Store. In addition, Dremio can analyze data from a wide variety of cloud-native and cloud-deployed data sources.
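
The query push-down idea described above can be illustrated with a minimal sketch. It is not Dremio's code or API: the function name `to_mongo_filter`, the predicate tuple format, and the example columns are hypothetical, chosen only to show how a simple SQL WHERE clause might be expressed as a MongoDB filter document so that the data source does the filtering instead of the query engine scanning every record.

```python
# Hypothetical illustration of predicate push-down; not Dremio's implementation.
from typing import Any, Dict, List, Tuple

# SQL comparison operators mapped to the equivalent MongoDB query operators.
MONGO_OPS = {
    "=": "$eq",
    "!=": "$ne",
    ">": "$gt",
    ">=": "$gte",
    "<": "$lt",
    "<=": "$lte",
}

# A predicate is (column, operator, literal), e.g. ("total", ">", 100).
Predicate = Tuple[str, str, Any]


def to_mongo_filter(predicates: List[Predicate]) -> Dict[str, Any]:
    """Translate an AND-ed list of simple predicates into a MongoDB filter."""
    clauses = []
    for column, op, value in predicates:
        if op not in MONGO_OPS:
            # Anything we cannot translate would have to be evaluated
            # by the query engine after fetching the data.
            raise ValueError(f"cannot push down operator: {op}")
        clauses.append({column: {MONGO_OPS[op]: value}})
    if not clauses:
        return {}  # an empty filter matches every document
    if len(clauses) == 1:
        return clauses[0]
    return {"$and": clauses}


# Roughly: WHERE status = 'shipped' AND total > 100
mongo_filter = to_mongo_filter([("status", "=", "shipped"), ("total", ">", 100)])
print(mongo_filter)
```

A real planner would also have to handle OR, nested expressions, type coercion, and functions the source cannot evaluate; this sketch covers only a flat conjunction of comparisons.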

Customer Applications of Dremio

Because Dremio can be run in the cloud, on premises, or as a service provisioned and managed in a Hadoop cluster, customers can easily deploy Dremio to meet their needs at any scale. Popular use cases include BI on Modern Data, like Elasticsearch, S3, and MongoDB; Data Acceleration, making even the largest data sets interactive in speed; Self-Service Data, making consumers of data more independent and less reliant on IT; and Data Lineage, tracking the full lineage of data through all analytical jobs across tools and users.

“With over 1 million customers and 270,000 servers across our 20 data centers, telemetry data about our infrastructure is a critical asset we use to remain competitive while providing a great experience to our customers,” said Vincent Terrasi, head of data, analytics, and CRM for OVH. “Dremio helps our data managers and analysts work with our data, independently and effectively, and makes it available for analysis using Tableau Desktop and Tableau Server. We are proud to be a part of this important open source community.”

Dremio Partner Ecosystem

By working closely with partners, Dremio looks to change the current approach to data analytics by expanding the big data, business intelligence, and analytics ecosystem for the enterprise.

“Qlik is a pioneer in self-service BI and visual analytics,” said Hjalmar Gislason, VP of data at Qlik. “Dremio shares our vision of making analysts and data scientists increasingly independent and productive. I have been waiting for a solution like Dremio to emerge in the rapidly evolving landscape of modern data sources, and am excited about the benefits it will bring to our more than 40,000 customers.”

“With more than 100,000 curated datasets, Enigma is the leading provider of analysis-ready public data,” said Hicham Oudghiri, CEO of Enigma Technologies. “Customers rely on our open source intelligence to enrich their enterprise data to drive smarter decision making. Dremio’s approach for self-service data analytics can drive immense productivity in all types of organizations. We are excited to partner with this innovative open source company.”

Availability

Dremio is distributed as a Community Edition, which is open source and free for anyone, as well as an Enterprise Edition, which is available as part of an annual subscription with support, a commercial license, and enterprise features.

 
