ClearStory Data and Cloudera Join Forces with Spark

ClearStory Data, the company bringing business-oriented Data Intelligence to everyone, announced it has further integrated its Apache Spark-based business user application and data harmonization engine with CDH, the latest version of Cloudera’s Hadoop distribution platform. With this integration, ClearStory Data brings its Spark-based data processing capabilities to CDH customers as a native, integrated data source. ClearStory Data is a certified technology partner of Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop™.

Hadoop-based data hubs have increasingly emerged as a single location for enterprise and consumer internet companies to store all their data. As a result, data hubs hold a rich variety of data with the potential for extremely high-value insights when integrated with new-generation Spark-based analytics solutions.

By pairing ClearStory Data with Cloudera, customers are able to speed data access and analysis by leveraging ClearStory’s unique capabilities around data inference, data harmonization and metadata management. Further, given ClearStory’s intuitive application experience, it can be used by business analysts and other staff without the requirement for specialized data or technical skills.

“Enterprises across various sectors such as insurance, banking, retail, and healthcare are using data hubs to answer questions about the business,” says Tim Stevens, vice president of corporate and business development at Cloudera. “ClearStory Data helps deliver enterprise-scale Spark-enabled processing, self-service data analysis, and holistic, consumable insights for line of business users.”

The integrated solution is focused on answering important business questions on a fast cycle, leveraging data from CDH. ClearStory Data’s simple business application, user-guided interface, collaborative “Data Stories” and “Interactive, Collaborative StoryBoards™” let business users be more self-reliant with data analysis and free up IT resources, while enforcing appropriate user and data governance controls based on the nature and sensitivity of the data stored in CDH.

“Today’s enterprise data hubs require Spark’s fast in-memory processing because they have terabytes, if not petabytes, of information, and there are complexity and security concerns about who gets access to what data both inside and outside the organization,” says Vaibhav Nivargi, chief architect and technical co-founder of ClearStory Data. “The combination of Cloudera and our platform makes large volumes of disparate data more consumable. It allows business users to freely explore and collaborate on data in Hadoop with a user-guided experience through our simple business application.”

