Waterline Data Releases Smart Data Catalog 4.0 for Faster Use of Big Data

Waterline Data, a leader in Data Lifecycle Management, announced the immediate availability of its latest platform offering, Smart Data Catalog 4.0.

“It has become clear that the data catalog is a fundamental enabler not just of the management of the data within a data lake, but also for a variety of related business use cases,” said Matt Aslett, Research Director, Data Platforms and Analytics, 451 Research. “By creating an inventory of data and data lineage, tagging sensitive data to control access, and even identifying data redundancy, the data catalog can be used to identify data for analysis, enable data governance and rationalize excess data sets, unlocking the potential value of big data projects.”

Connecting the Right Data to the Right People

The 4.0 version of the Smart Data Catalog replaces manual tagging of metadata with an automated process that rapidly classifies and organizes all of an organization’s data assets and lineage, making data readily available for:

  1. Self-Service Analytics
  2. Data Governance and access control for regulatory compliance
  3. Data Rationalization for greater storage and cost efficiency.

Smart Data Catalog 4.0 answers fundamental questions that most organizations have about their data: Where do I find it? Where did it come from? What’s in the data? Who can use the data?

Smart Data Catalog 4.0 Key Features

The enhancements in Smart Data Catalog (SDC) 4.0 are designed to make trusted data usable across the enterprise more quickly. New capabilities include:

  • Direct fingerprinting and cataloging of data in Teradata, Oracle, MySQL, and other relational databases, extending Waterline beyond the Hadoop-only data sources supported in prior versions.
  • Support for data lakes running on Amazon Web Services (AWS).
  • Tag-based access control that identifies sensitive data fields and allows access to data tagged as “sensitive” to be controlled automatically by Apache Ranger and Cloudera Sentry, as well as by other access control tools via REST API integration.
  • A dramatically improved user experience for business professionals, with a new user interface “skin”; faster, more scalable search based on the industry-standard Apache Solr search platform; and improved crowdsourced ratings, annotations, reviews, and collaboration features.
  • The industry’s most extensible, open architecture, supporting Hadoop, Spark, and cloud deployment environments, with an RDBMS plug-in architecture for relational sources and extensive REST APIs for partner integration and extensibility.

With its unique combination of automated data inventorying plus crowdsourcing, Smart Data Catalog 4.0 allows data professionals to “fingerprint” data at scale by analyzing actual data values. The software automatically tags data fingerprints with glossary terms, matches terms through crowdsourcing, and then lets data stewards curate the results by accepting or rejecting tags. Meanwhile, business professionals can easily search and use data through a user-friendly interface or through a variety of third-party applications.

“Our mission at Waterline Data is to connect the right people to the right data while information is still fresh,” said Alex Gorelik, CEO at Waterline Data. “Most organizations have more than 50% of their data stagnating in quarantine zones or lost in data swamps, because nobody has the time or expertise to identify and organize the assets and decide who should have access to them. Waterline Smart Data Catalog 4.0 delivers a unique combination of automation and crowdsourcing that allows our customers to quickly get their data out of quarantine and into use with the confidence that the data is properly tagged so it can be governed and put into use in days instead of weeks or months.”

