AI-Driven Data Management Is Your Surfboard for Surviving the Big Data Tsunami

Data volume and variety have grown so radically in large organizations that passive, siloed data management automation systems, which at best speed up manual, labor-intensive processes, can no longer keep up. Fortunately, platforms like Apache Hadoop and its newer cloud variants help organizations build data lakes that enable newer and more innovative forms of analytics. As a result, organizations are beginning to recognize the value of using data for competitive advantage. As they do, the need to automate data management practices to successfully ride the “big data tsunami” becomes clear, and arguably essential.

Initially, companies reacted by hiring more data engineers and data scientists. The opportunity to use more innovative data infrastructures was essentially addressed by throwing people at the problem. But, like any viral phenomenon, the growth in data supply and demand has been relentlessly exponential, and it has started to tax organizations that cannot hire and train as fast as their data grows.

To gain a truly scalable edge in data management, organizations cannot afford to simply automate manual labor. They need to fundamentally remove manual labor and manual judgement from the equation using artificial intelligence (AI)-driven data management systems. This will transform the labor equation for information management teams by systematically and intelligently replacing labor-intensive data management with a machine-aided approach to discovering, preparing, and delivering data to end users.

Through this approach, organizations can scale their delivery of trusted data in step with the growth of the underlying data itself. The machine-aided approach, powered by recent innovations in graph database technology and distributed processing, is the only technological approach to data management that can logically scale indefinitely with the growth of data.

So, how can organizations put AI-driven data management to work? We have observed a four-step journey that organizations tend to take on their path to delivering more innovative analytics with AI-driven data management. These steps are:

1. Use AI-driven data cataloging

By intelligently scanning the structure of all data assets across the enterprise, combined with information about the usage of those data assets, organizations can build a centralized knowledge graph and inventory of all data assets. This then becomes the central focal point for the remaining data management processes, since a centralized knowledge graph can be leveraged beyond the tasks of cataloging data assets and tracking data provenance.
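
To make the idea of a catalog knowledge graph concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular catalog product works: the schemas, the usage log, and the function names are all hypothetical, and a real system would scan live sources and store the graph in a graph database. The sketch links each data asset to its columns and to the users who query it, then finds related assets two hops away.

```python
from collections import defaultdict

# Hypothetical inputs, invented for illustration: scanned schemas and a usage log.
SCHEMAS = {
    "crm.customers": ["customer_id", "email", "region"],
    "sales.orders": ["order_id", "customer_id", "amount"],
    "web.clicks": ["session_id", "url"],
}
USAGE = [("ana", "crm.customers"), ("ana", "sales.orders"), ("raj", "sales.orders")]


def build_catalog_graph(schemas, usage):
    """Return an undirected graph as {node: set(neighbors)}.

    Node names are prefixed with 'asset:', 'column:', or 'user:'.
    """
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    # Structure: connect every asset to each of its column names.
    for asset, columns in schemas.items():
        for column in columns:
            link(f"asset:{asset}", f"column:{column}")
    # Usage: connect every user to the assets they have queried.
    for user, asset in usage:
        link(f"user:{user}", f"asset:{asset}")
    return graph


def related_assets(graph, asset):
    """Assets two hops away: they share a column name or a user with `asset`."""
    start = f"asset:{asset}"
    found = set()
    for middle in graph[start]:
        for node in graph[middle]:
            if node.startswith("asset:") and node != start:
                found.add(node[len("asset:"):])
    return sorted(found)


graph = build_catalog_graph(SCHEMAS, USAGE)
print(related_assets(graph, "crm.customers"))
```

Because structure and usage live in the same graph, one traversal answers both "which assets share a key with this one?" and "what else do this asset's users work with?", which is why the graph can serve tasks beyond plain inventory.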

2. Fully automate all data management processes with a model-based approach

A model- or mapping-based approach codifies the critical data management processes into abstracted business logic instead of infrastructure-specific code. This ensures resilience against future changes in underlying infrastructure systems and provides an abstraction of data management processes with which machine learning systems can engage.
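
The difference between a mapping and infrastructure-specific code can be sketched as follows. This is a simplified illustration under assumed names: the mapping format, the table and field names, and both "engines" are hypothetical. The business logic is declared once as data, and two interchangeable backends interpret it, one running in memory and one rendering SQL.

```python
# Hypothetical mapping: target field -> (source field, transform name).
MAPPING = {
    "source": "raw_customers",
    "target": "customers",
    "fields": {
        "id": ("cust_id", "identity"),
        "email": ("email_addr", "lower"),
        "country": ("country_cd", "upper"),
    },
}

# Each named transform has one implementation per backend.
PY_TRANSFORMS = {"identity": lambda v: v, "lower": str.lower, "upper": str.upper}
SQL_TRANSFORMS = {"identity": "{col}", "lower": "LOWER({col})", "upper": "UPPER({col})"}


def run_in_memory(mapping, rows):
    """Apply the mapping to a list of dict rows in plain Python."""
    return [
        {tgt: PY_TRANSFORMS[fn](row[src]) for tgt, (src, fn) in mapping["fields"].items()}
        for row in rows
    ]


def render_sql(mapping):
    """Render the same mapping as a SELECT statement for a SQL engine."""
    cols = ", ".join(
        f"{SQL_TRANSFORMS[fn].format(col=src)} AS {tgt}"
        for tgt, (src, fn) in mapping["fields"].items()
    )
    return f"SELECT {cols} FROM {mapping['source']}"


rows = [{"cust_id": 7, "email_addr": "Ana@Example.COM", "country_cd": "pt"}]
print(run_in_memory(MAPPING, rows))
print(render_sql(MAPPING))
```

Moving to a new execution platform means adding another backend, not rewriting the mapping. And because the mapping is plain data, a learning system can read, compare, and suggest mappings in a way it could not with hand-written pipeline code.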

3. Build a fully trusted AI-driven data lake or data hub

Data lakes and data hubs have historically been considered dumping grounds for ungoverned data. While there is a case for having these raw staging zones as a part of an organization’s data footprint, they must be appended with a fully certified and trusted data zone that is automatically and intelligently filled with accurate, secure, and compliant data that can be published to end users.
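
A promotion gate between the raw zone and the trusted zone might look something like the sketch below. Note the assumptions: the checks here are simple hand-written rules standing in for the learned checks an AI-driven system would apply, and the zone dictionaries, the threshold, and the function names are hypothetical. A dataset reaches the trusted zone only if required fields are populated and no unmasked email addresses are present.

```python
import re

# Rough pattern for spotting unmasked email addresses (illustrative, not exhaustive).
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def certify(rows, required, max_null_rate=0.05):
    """Return (ok, reasons) for a list of dict rows."""
    if not rows:
        return False, ["empty dataset"]
    reasons = []
    total = len(rows)
    # Accuracy check: required fields must be populated in nearly every row.
    for field in required:
        nulls = sum(1 for r in rows if r.get(field) in (None, ""))
        if nulls / total > max_null_rate:
            reasons.append(f"{field}: null rate {nulls}/{total} exceeds limit")
    # Compliance check: no field may hold a raw email address.
    leaking = sorted(
        {k for r in rows for k, v in r.items() if isinstance(v, str) and EMAIL.fullmatch(v)}
    )
    reasons += [f"{k}: unmasked email address" for k in leaking]
    return (not reasons), reasons


def promote(raw_zone, trusted_zone, name, required):
    """Copy a dataset into the trusted zone only if it passes certification."""
    ok, reasons = certify(raw_zone[name], required)
    if ok:
        trusted_zone[name] = raw_zone[name]
    return ok, reasons


raw = {
    "orders": [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 5.5}],
    "customers": [{"customer_id": 1, "email": "ana@example.com"}],
}
trusted = {}
print(promote(raw, trusted, "orders", ["order_id", "amount"]))
print(promote(raw, trusted, "customers", ["customer_id"]))
```

The raw zone keeps everything, while the trusted zone only ever receives data that passed the gate, along with a record of why anything was held back.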

4. Wrap AI-driven processes in a layer of AI-driven self-service

With a self-service layer in place, end users can take part in preparing data in the data lake and subscribe to ongoing feeds of data once the preparation steps have been fully operationalized. In this AI-driven model, end users not only benefit from machine-aided recommendations from the data management system, but also discover data assets they would never have found on their own, thanks to the collaborative filtering and wisdom-of-crowds signals that AI provides.
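
The collaborative filtering mentioned above can be illustrated with a small user-based sketch. It assumes a hypothetical usage record of which users have worked with which data assets, and it uses Jaccard similarity as one simple choice among many. Production recommenders are considerably more sophisticated, so treat this as a toy model of the idea: assets used by people whose usage resembles yours are surfaced to you.

```python
from collections import defaultdict

# Hypothetical usage record: user -> set of data assets they have worked with.
USAGE = {
    "ana": {"sales.orders", "crm.customers", "finance.invoices"},
    "raj": {"sales.orders", "crm.customers"},
    "lee": {"web.clicks"},
}


def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(usage, user, top_n=3):
    """Suggest assets the user has not touched, weighted by peer similarity."""
    mine = usage[user]
    scores = defaultdict(float)
    for other, theirs in usage.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0:
            continue  # ignore users with no overlap at all
        for asset in theirs - mine:
            scores[asset] += similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [asset for asset, _ in ranked[:top_n]]


print(recommend(USAGE, "raj"))
```

Here raj is pointed to finance.invoices because ana, whose usage overlaps heavily with his, also works with it. He did not have to know the asset existed.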

It’s important to note that AI cannot replace the need for people to remain engaged in data management. Make no mistake, this is not just some futuristic Jetsons-like view of the world of data. The AI-driven approach to data management is becoming the new normal for organizations that simply cannot continue to hire and train data engineers and data scientists at the rate at which data is growing.

By allowing machines to facilitate and aid the work of people, organizations can more strategically up-level employees and invest in new talent, while effectively addressing exponentially growing data sets across the enterprise. Ultimately, this AI-driven and machine-aided approach to data management is what will help organizations unleash the power of data to drive market disruptions.

Holistic middleware data platforms are already helping organizations find, prepare, cleanse, master, govern, and protect all the data they need, so that the right data gets to the right people at the right time. As the tsunami of data keeps growing, the AI-driven approach to data management is the scalable way to ensure that IT can deliver trusted data, in the cloud and on-premises, to business analysts, so they can deliver more innovative, timely, relevant, and personalized digital experiences.

About the Author

Murthy Mathiprakasam is a Director of Product Marketing for Informatica’s Big Data solutions. He has a decade and a half of experience with high-growth software technologies, including roles at Mercury Interactive, Google, eBay, VMware and Oracle.

 

Sign up for the free insideBIGDATA newsletter.
