Dark Data: Shedding Light on a Growing Issue for Businesses

In this special guest feature, Mika Javanainen, Senior Director of Product Management at M-Files Corporation, observes that with the plethora of new data-gathering tools continually coming out, more and more data is being collected, and we’re reaching a point where a significant amount of it goes unused. This so-called “dark data” isn’t being analyzed, and potentially useful trends are being missed. Javanainen is in charge of managing and developing the M-Files product portfolio, roadmaps, and pricing globally. Prior to his executive roles, Javanainen worked as a Systems Specialist, where he integrated document management systems with ERP and CRM applications. A published author, Javanainen has an executive MBA in International Business and Marketing.

While somewhat ominous, the term “dark data” effectively describes an important class of information that exists within nearly every company: all of the files that have been forgotten or lost within an organization’s digital repositories.

Many define dark data as information assets that are created and used only once, such as accounting records or email conversations that are retained for long periods but seldom consulted. Content that is actively used for a period of time can also turn into dark data when organizational and project priorities change. Active information that becomes inactive is typically left where it was and is easily forgotten. To make matters worse, employees often recreate data when they can’t quickly find their copy. This duplication and recreation multiplies the volume of data that subsequently goes dark.

Learning to See in the Dark with Metadata

Before dark data can see the light, it has to be identified. After that, enterprise information management (EIM) solutions that leverage metadata can inject a layer of intelligence capable of eradicating the dark data.

By attaching metadata attributes (or tags) to content assets, an EIM solution can instantly identify the information assets that are related or relevant to other unstructured content assets as well as to structured data objects. For example, a sales proposal (an unstructured content asset) can be tagged with a metadata attribute for “Customer A” by the EIM system, which is linked to the CRM solution. That proposal then becomes visible from within the CRM system, and can be linked to the CRM account for Customer A (a structured data object). In this way, metadata shines a light on previously dark data. All of the information assets related to Customer A can be displayed to decision makers in context with other related information. In a similar fashion, an e-mail or technical support ticket from that customer can be linked to the customer object. Connecting the support ticket to the proposal helps support engineers assess the urgency of the issue, because they can see how much business is at stake with that customer.
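The tagging idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how M-Files or any particular EIM or CRM product works: the `ContentAsset` and `MetadataIndex` classes, the `customer` attribute name, and the sample asset names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ContentAsset:
    """An unstructured item such as a proposal, e-mail, or support ticket."""
    name: str
    kind: str
    tags: dict = field(default_factory=dict)  # metadata attribute -> value


class MetadataIndex:
    """Hypothetical in-memory index from (attribute, value) to assets."""

    def __init__(self):
        self._by_tag = defaultdict(list)

    def add(self, asset):
        # Register the asset under every metadata tag it carries.
        for attribute, value in asset.tags.items():
            self._by_tag[(attribute, value)].append(asset)

    def related(self, attribute, value):
        # Return all assets sharing the tag, in insertion order.
        return list(self._by_tag.get((attribute, value), []))


index = MetadataIndex()
index.add(ContentAsset("Sales proposal", "proposal", {"customer": "Customer A"}))
index.add(ContentAsset("Support ticket", "ticket", {"customer": "Customer A"}))
index.add(ContentAsset("Archived memo", "memo", {"customer": "Customer B"}))

# Everything tagged for Customer A, regardless of where or how it was stored.
customer_a_assets = [a.name for a in index.related("customer", "Customer A")]
print(customer_a_assets)  # ['Sales proposal', 'Support ticket']
```

The point of the sketch is that the lookup is driven by the tag rather than by a folder location, so the proposal and the support ticket surface together once both carry the same customer attribute.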

With the ability to see and harness dark data, companies can make better use of all of their data. At a time when the volume of information continues to skyrocket, it makes sense to pay attention to dark data. Since a portion of dark data can still provide value, there are positive incentives for making it broadly visible.

Some Data Belongs in the Dark

Some dark data, if not managed properly, can expose a business to numerous risks. Some content assets are deliberately designated as dark data and are intended to remain invisible to certain individuals because they contain sensitive or private information. For instance, HR collects and stores personal information about employees. Across all departments, confidential company information and proprietary digital assets such as design files, sales proposals, marketing strategies, and other intellectual property need to be secured and visible only to authorized individuals. Some financial records also belong “in the dark” because there is no business value in integrating these repositories with active content. However, laws set long-term retention periods for these records, so you cannot simply delete this data.

Shine the Light on Dark Data

Injecting more intelligence into your data essentially brings your information into the light: the assets live longer and can be used by more people. In many cases, dark data does not have to stay dark for long, since it can be regularly recycled for uses that go beyond the original intent. This reuse of data gives businesses huge gains in efficiency, as industry surveys suggest that up to 70% of documentation is re-created at some point.

The benefits and saved time add up quickly. Decision makers can achieve better results since they can find and use all relevant information, and productivity goes up for all of the knowledge workers in the organization since everyone spends less time looking for misplaced information. As an impetus for introducing enterprise-wide improvements like these, dark data can be a positive phenomenon at present. And by recycling data, today’s intelligent EIM solutions might soon eliminate the need for the term “dark data.”

