What’s so Important about Data Quality?

In this special guest feature, David Kolinek, Vice President of Product, Ataccama, asks why data quality is so important. Sometimes taking a step back and reviewing the basics can help clear things up. David leads the process of road mapping and defining what will be developed in Ataccama ONE, a self-driving Data Management and Data Governance platform. He is closely involved in the strategic planning of the long-term vision for the platform. David first joined Ataccama as a UX Designer and, after building a successful design team, became the Head of Product Design, a role in which he advocated for design-driven development.

It’s understandable that many in business believe data management and the need to improve quality are the result of ePrivacy laws in the U.S. and the European Union’s General Data Protection Regulation (GDPR). But make no mistake: although it has been thrust into the spotlight only in recent years, ensuring data quality (DQ) has long been a thorn in the side of enterprises.

In 2016, IBM had already estimated that bad data cost U.S. businesses and organizations more than $3 trillion per year, eaten up by tasks from identifying and fixing errors to vetting and confirming data sources. Four years later, a Gartner customer survey conducted to support a 2020 “Magic Quadrant for Data Quality Solutions” found poor DQ taking a $12.8 million annual toll on organizations. 

Why is data quality so important? Sometimes taking a step back and reviewing the basics can help clear things up. 

A little understanding 

The Data Management Body of Knowledge defines data quality as “activities that apply quality management techniques to data to ensure it is fit for consumption and meets the needs of data consumers.” Today, we might add the need to turn data into operational knowledge, which can then enable consumers to make better decisions based on single data points or patterns that otherwise would go unnoticed.

With that understanding, DQ is not a single activity but a series of actions that draw upon multiple resources and functions, all focused on making data usable in a purposeful way. It falls under data management, which has an overriding mission of delivering a view of datasets from various perspectives, enabling the type and quality level of data to be assessed. 

Accomplishing that alone is daunting. It encompasses data completeness, such as identifying missing details in records. There’s timeliness to consider in order to supply real-time data for things like CRM systems. Validity checks are necessary to ensure data conforms to the expected formats and rules, a key factor for automating processes. Uniqueness needs to be ascertained to eliminate issues like duplicate data sets, along with accuracy to reduce the likelihood of errors slipping by and spreading across your organization.
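As a rough illustration of two of these dimensions, completeness and uniqueness, here is a minimal Python sketch. The records, field names and helper functions are hypothetical and chosen only for the example; a real DQ tool would cover far more cases.

```python
from collections import Counter

# Hypothetical customer records; the field names are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "city": "Prague"},
    {"id": 2, "email": "", "city": "Boston"},
    {"id": 3, "email": "ana@example.com", "city": None},
]

def is_filled(value):
    """Treat None and empty strings as missing."""
    return value not in (None, "")

def completeness(rows, field):
    """Share of rows in which `field` holds a non-empty value."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if is_filled(row.get(field))) / len(rows)

def duplicates(rows, field):
    """Non-empty values of `field` that occur in more than one row."""
    counts = Counter(row.get(field) for row in rows if is_filled(row.get(field)))
    return sorted(value for value, n in counts.items() if n > 1)

print(completeness(records, "email"))  # share of records with an email
print(duplicates(records, "email"))    # emails shared by several records
```

Timeliness and accuracy usually need outside context, such as timestamps or a trusted reference source, so they are left out of this sketch.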

What have we got to lose?

Not all data sets deliver the same commercial or financial value: some produce stronger results, others offer reduced risk. But by determining quality, enterprises can prioritize use, data sources and domains. In doing so, they can enhance DQ and the odds of greater project success.

For instance, employee salary data is vital to HR but not sales, whereas customer information is crucial to sales but not HR, and prioritization needs to take that into account. As for risk, inadequately secured Personally Identifiable Information (PII) could not only result in fines but also damage a brand’s reputation.

Quality counts, and as enterprises become more sophisticated in their processes, they’ll learn more about their markets, even themselves. For instance, business decisions can be made more quickly and with greater precision. On the other hand, enterprises that don’t enlist data capture and management could be more vulnerable. Without validation fields, they’d have to rely on free-form capture and management methods that could introduce bad data into a system.

From a lack of correct names and addresses undermining costly email campaigns to flawed analytics ruining AI initiatives, there’s a lot to lose with poor data quality.

The benefits of better

When data quality is better, an enterprise reaps higher ROI in marketing and customer outreach, the product of effective delivery and more reliable targeting. Improved personalization can bring about greater sales, as well as brand building via improved customer service. Compliance can be streamlined, and analytics can drive decisions. The list of benefits from better DQ seems endless, and arguably so is the potential for financial and resource savings.

According to SiriusDecisions: “The longer incorrect records remain in the database, the more expensive it becomes to deal with them.” The researchers added, “In data management circles, this point is illustrated by the 1-10-100 rule: It takes $1 to verify a record as it is entered, $10 to cleanse and de-dupe it and $100 if nothing is done, as the ramifications of the mistakes are felt over and over again.”
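To make the 1-10-100 rule concrete, the short Python sketch below applies the three per-record costs from the quote to a batch of bad records. The batch size of 5,000 is a made-up figure for illustration, not a number from the research.

```python
# Per-record costs quoted in the 1-10-100 rule, in US dollars.
COST_PER_RECORD = {
    "verify at entry": 1,
    "cleanse and de-dupe later": 10,
    "do nothing": 100,
}

def rule_costs(bad_records):
    """Total cost of each option for a given number of bad records."""
    return {option: bad_records * cost for option, cost in COST_PER_RECORD.items()}

# Hypothetical batch size, chosen only for the example.
for option, total in rule_costs(5_000).items():
    print(f"{option}: ${total:,}")
```

Under that assumption, the same 5,000 records cost $5,000 to verify at entry, $50,000 to cleanse later and $500,000 if left alone.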

Set for success

Positioning yourself for data quality success can be facilitated with various technologies, but there are a few must-have capabilities. First are data profiling tools that will allow you to examine data sources much more quickly than SQL queries. They can also pinpoint process gaps and issues that need to be addressed moving ahead. Improving DQ may also require data structure change, so enlist tools for format standardization, data parsing and enrichment, deduplication and masking for added security.
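One thing a profiling tool typically surfaces is how many different formats hide in a single column. The Python sketch below shows the idea with a simple format mask; it is a toy stand-in for a real profiling product, and the function names and sample phone numbers are assumptions made for the example.

```python
from collections import Counter

def format_mask(value):
    """Reduce a value to its shape: letters become A, digits become 9."""
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch
        for ch in value
    )

def profile_formats(values):
    """Count how often each format mask occurs, most frequent first."""
    return Counter(format_mask(v) for v in values).most_common()

# Hypothetical phone column with inconsistent formatting.
phones = ["555-0100", "555-0101", "5550102", "555 0103"]
for mask, count in profile_formats(phones):
    print(mask, count)
```

A column that yields several masks, as this one does, is a candidate for the format standardization mentioned above.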

Most of these can also be automated, so make sure you validate and treat data before it enters your system. Also, consider deploying algorithms to support functions like checking website form data for format compliance – similar algorithms can be used in back-office processes and even customer-facing apps. 
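As a sketch of what checking form data for format compliance might look like, the Python snippet below validates two fields before a record is accepted. The field names and patterns are illustrative assumptions (a deliberately loose email check and a US-style ZIP code), not rules taken from any particular product.

```python
import re

# Illustrative format rules; real forms need rules agreed with the business.
RULES = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),  # loose email shape
    "zip_code": re.compile(r"\d{5}(-\d{4})?"),         # assumed US-style ZIP
}

def validate_form(form):
    """Return a dict mapping each failing field to a short reason."""
    errors = {}
    for field, pattern in RULES.items():
        value = (form.get(field) or "").strip()
        if not value:
            errors[field] = "missing"
        elif not pattern.fullmatch(value):
            errors[field] = "bad format"
    return errors

print(validate_form({"email": "ana@example.com", "zip_code": "02139"}))
print(validate_form({"email": "ana@example", "zip_code": "2139"}))
```

An empty result means the record passes; anything else can be sent back to the user or routed for review before it reaches downstream systems.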

Finally, the only way to improve something is to start by measuring it. By tracking data changes over an extended period with a DQ dashboard, organizations can better grasp how their operations are performing and what tweaks are needed for advancement. Monitoring can also reveal bad data and its source, while reducing the time and effort needed to comply with regulations. 

Comments

  1. My favorite part of your blog is when you said that better data quality could ensure higher ROI in marketing and therefore improve sales. This reminds me of manufacturing companies that need to ensure that their products will meet the satisfaction of their clients. I could imagine how the use of data quality monitoring software could allow them to efficiently monitor the quality of all their items to maintain a good reputation of a company.
