Why Companies Must Feed Data Unicorns to Maximize ROI

Your guide to making the most of your data assets

Not all enterprise data is equal. Some data assets are unicorns: they are so widely used that they account for the majority of data usage inside an organization. In other words, data asset popularity follows a power law.

Most organizations waste their budget by spreading it thinly across many data assets. Chief Data Officers (CDOs) will see better ROI by focusing on a small number of data assets. Even a small change to the right data assets will magnify your efforts and help you communicate value to the rest of the organization. This high-level framework helps you to identify the best candidates to deliver a high ROI.

Unicorns, cows, donkeys, and microbes — how to find them

When no other metrics are available, usage is an excellent indicator of value. Data assets are used and valued differently depending on the organization and, within that, the team. We can classify data assets based on usage into four categories.

Note: If you don’t have an exact usage metric, you could use simple t-shirt sizing (XL–XS) to approximate usage based on users, queries, bandwidth, etc.

  • Data Unicorns: A few data assets are so widely used that the top 10% of data assets account for 70% of all data usage inside the company. If you treat usage as a proxy for the value extracted, these select data assets contribute the majority of the value in the organization.
  • Cows: Cows are usually the most-used data assets for some smaller teams, but they are not widely used across the organization. These data are highly valuable to the specific teams that use them, but cows look small when compared to unicorns.
  • Microbes: These are data assets that are never used, aka dark data. Many studies indicate the majority (60–70%) of enterprise data is never used. These are termed ‘microbes’ as they remain unseen. Microbes can either be harmful — consuming resources to store and manage — or useful data that suffers from the lack of awareness and tools on the part of the organization. Likely, microbes are from stagnant systems that might have lost their relevance.
  • Donkeys: These are everything in between cows and microbes. Donkeys are data assets with very little use. If you looked up their usage metrics, you would often find that donkeys are used by only one or two people in the organization.
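
The four categories can be sketched as a simple rule-based classifier. This is a minimal illustration, not a definitive method: the `Asset` fields, the sample asset names and numbers, and the thresholds (`top_percent=10`, `donkey_max_users=2`) are all assumptions chosen to mirror the descriptions above. In particular, treating every remaining asset as a cow is a simplification of "top-used by a smaller team."

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    queries: int  # hypothetical usage metric: queries in the review period
    users: int    # distinct users in the same period


def classify(assets, top_percent=10, donkey_max_users=2):
    """Label each asset as unicorn, cow, donkey, or microbe by usage."""
    used = sorted((a for a in assets if a.queries > 0),
                  key=lambda a: a.queries, reverse=True)
    # Size of the "top N%" group, counted over all assets (at least one slot).
    n_top = max(1, len(assets) * top_percent // 100)
    unicorns = {a.name for a in used[:n_top]}

    labels = {}
    for a in assets:
        if a.queries == 0:
            labels[a.name] = "microbe"   # never used (dark data)
        elif a.name in unicorns:
            labels[a.name] = "unicorn"   # among the most-used assets
        elif a.users <= donkey_max_users:
            labels[a.name] = "donkey"    # only one or two users
        else:
            labels[a.name] = "cow"       # simplification: everything else
    return labels


def usage_share(assets, labels, category):
    """Fraction of total queries that go to assets in one category."""
    total = sum(a.queries for a in assets)
    part = sum(a.queries for a in assets if labels[a.name] == category)
    return part / total if total else 0.0


# Made-up sample data, for illustration only.
assets = [
    Asset("sales_orders", 7000, 120),
    Asset("customer_master", 900, 40),
    Asset("marketing_leads", 600, 15),
    Asset("ops_shipments", 500, 12),
    Asset("hr_roster", 30, 2),
    Asset("adhoc_export", 10, 1),
    Asset("old_logs", 0, 0),
    Asset("legacy_crm", 0, 0),
    Asset("temp_staging", 0, 0),
    Asset("archive_scans", 0, 0),
]
labels = classify(assets)
print(labels)
print(round(usage_share(assets, labels, "unicorn"), 2))  # 0.77
```

With this sample, one asset out of ten is labeled a unicorn and carries about 77% of all queries, which is the kind of power-law concentration the framework looks for. If you only have t-shirt sizes instead of exact counts, the same rules could be applied to a numeric mapping of those sizes.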

How to govern and nourish

Here are some key strategies to deal with different categories of data assets.

Unicorns

Because of the volume of use, any issues — such as data quality issues or privacy non-compliance incidents — would be easily magnified. Given the nature of power laws, popular data sources get more popular, hence nourishing your unicorns will result in better ROI.

  • Offer the highest level of redundancy and quality. Investing in tools and processes helps to improve the quality of decisions inside the organization.
  • Eliminating personal data significantly reduces privacy risks. So, de-identify your personal data to obtain extra security.
  • Find more ways to distribute and consume the data. Implementing new tools and pipelines that can visualize, analyze, and share the data better yields better ROI.

Cows

For the teams who rely on cows, cows will be revered as unicorns. The teams themselves can articulate the connection between the data and business objectives. “What is the value of data?” can be answered anecdotally as well as quantitatively. If the marketing team is using cows, then they might lead to better conversions. If it’s the operations team, then cows are tied to predictability or speed-to-market. Let the teams pick and explain.

  • Find a local champion for the data who can evangelize inside the organization. Given the conviction of the teams, they are the best people to convey to others how to use the data more effectively.
  • Improve discoverability by, for example, providing better documentation.
  • Teams tend to create silos and duplicates without interacting with other teams. If you can integrate silos and remove duplicates, you will eliminate many conflicting insights.

Donkeys

If users aren’t repeatedly using a data asset, it might indicate that the data is of low quality or low value.

  • Any efforts to build a dashboard, pipeline, or data marts for donkeys are likely low-ROI efforts.
  • Reduce duplicates and encourage teams to move to an alternative, if alternatives are available.
  • New data assets — not only infrequently-used data assets — will be in this category. Companies waste significant investment as teams allocate resources anticipating growth. Only a few will grow into cows and unicorns. When usage is still sparse, you must base your investment decisions on the most recent growth rate.
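
One way to read "most recent growth rate" is the period-over-period change between the last two usage measurements. The sketch below assumes that reading. The function name, the sample asset names, and the monthly counts are hypothetical.

```python
def recent_growth_rate(usage_by_period):
    """Relative change between the last two periods, or None if undefined."""
    if len(usage_by_period) < 2:
        return None  # not enough history yet
    prev, last = usage_by_period[-2], usage_by_period[-1]
    if prev == 0:
        return None  # growth from zero is undefined as a ratio
    return (last - prev) / prev


# Made-up monthly query counts for two new assets.
new_assets = {
    "clickstream_v2": [10, 20, 30],
    "survey_dump": [8, 8, 7],
}
growth = {name: recent_growth_rate(series) for name, series in new_assets.items()}
# Rank candidates for further investment, fastest-growing first.
ranked = sorted((n for n, g in growth.items() if g is not None),
                key=lambda n: growth[n], reverse=True)
print(growth)  # {'clickstream_v2': 0.5, 'survey_dump': -0.125}
print(ranked)  # ['clickstream_v2', 'survey_dump']
```

A single period-over-period figure is noisy when counts are small, so it is better treated as one input to the decision than as a rule.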

Microbes

Microbes could slow your organization down.

  • If your organization doesn’t have a legal requirement to hold the unused data, you must get rid of the unused data. By getting rid of the legacy data, you can reduce your costs and significantly reduce your attack surface.
  • If you create data catalogs and drive awareness across the organization, valuable parts of the unused data could be discovered and used.

In times when businesses are struggling to deal with COVID-19, showing ROI is critical for any leadership role, especially for Chief Data Officers (CDOs). CDOs’ budgets have shrunk, and new investments get greater scrutiny. Thus, CDOs must have a laser focus on the most crucial assets. Not only generating ROI, but also demonstrating ROI is critical. The above framework will help you to prioritize resources and increase ROI.

About the Author

Amar Kanagaraj is the founder and CEO of oneDPO, a PrivacyTech startup that applies AI and ‘privacy by design’ to protect enterprise data from breaches, insider risks, and privacy violations. He is a successful serial entrepreneur, passionate about building innovative products. Before oneDPO, he was the co-founder of FileCloud, a leading content collaboration solution with 3000+ enterprise customers. Amar has 20+ years of experience in building products, marketing, scaling companies, and leading teams. He has an MBA from Carnegie Mellon and MS from Louisiana State University.
