
Uncovering Opportunities at the Intersection of Public and Private Data

In this special guest feature, Hicham Oudghiri, CEO and Co-Founder at Enigma, discusses how data analysis as a tool for solving problems requires an understanding that the answer lies within the data set, either in the form of an insight or an actionable anomaly. As businesses turn more often to analytics to solve their everyday issues, it is important to understand what data can and can’t do, and how to identify which data sets in a sea of data can help solve the problem at hand. Headquartered in NYC, Enigma helps put data in the context of the real world and makes it connected, open and actionable for the Fortune 500. Prior to Enigma, Hicham managed the private sustainable finance program at BMCE Bank, in partnership with the World Bank Group, where he created energy models for large-scale alternative energy projects across Africa. Hicham developed a rating system and software for environmental and social risk management across the bank’s entire commercial loan portfolio. Hicham received a B.A. from Columbia University, where he studied Philosophy and Mathematics.

As organizations increasingly turn to analytics to solve business-critical issues, few are positioned to do so at scale. Many of the nation’s largest companies are still operating across sprawling, incompatible IT systems and organizations. Despite the massive volume of data now available with the advent of new technologies, by and large, that data is unstructured, heterogeneous, and siloed across legacy systems. As a result, a lot of data goes to waste – acquired but providing little value, leaving new opportunities and innovation on the table.

From enterprise asset to competitive advantage

To take advantage of data’s full value, companies must transition to measuring and managing business through the lens of data. They have to learn to think about each problem as a data problem.

While this cultural shift is often an obstacle in and of itself, an organization’s capacity to become data native boils down to two key things: access to data and the ability to interpret and apply it.

This means we need technology that lets teams across an organization efficiently acquire data and easily access it in a centralized place. It also requires data that’s connected and contextualized, so that it can go beyond solving one-off problems. With both of these at work, businesses will be able to generate repeatable insights.

Data is only valuable in context

New insights and opportunities often lie at the intersection of public and private data. In a sea of data, how do you know what’s relevant? That’s where context comes in. In this case, the application of context is twofold: 1) to connect a single data set to the real world (or to another data set), you need to provide context, in the form of normalized schema and metadata, to understand what the data is and how to interpret it; and 2) to make proprietary data meaningful outside the walls of an organization, you need to place it in the context of a larger data set (i.e., public data) to understand how it relates to an industry or problem set as a whole. There’s always more to the story.

For example, a pharmaceutical company has access to a significant amount of proprietary data surrounding adverse drug events, which it uses to triage potential deviations in its manufacturing process, pull drugs off the shelf if necessary, or investigate the clinical causation of those adverse events. That information is vital to managing safety. When it is linked with or analyzed alongside population data and adverse events data sets compiled from the FDA, CMS, etc., it offers a more complete knowledge base for evaluating new markets, optimizing manufacturing, and refining the understanding of adverse events for a given population, especially as big pharma moves into personalized medicine.

Opening data and schema across the enterprise enables organizations to move from merely storing data to interpreting and acting upon it. By creating a fluid framework in which you can cluster similar data types or bridge unconnected data sets, you drastically increase the likelihood of identifying new patterns or revealing previously hidden realities. And in connecting data to provide a more holistic, nuanced picture, you can extract substantially more value from data. Simply put, you have the ability to answer a question or solve a problem that a single data set could not.
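To make the idea of a normalized schema concrete, here is a minimal sketch in Python. It assumes two sources that describe the same thing with different field names; the source labels, field names, and the `normalize` and `link_by_drug` helpers are all hypothetical and chosen only for illustration, not taken from any real FDA or company data set.

```python
# Hypothetical field mappings: raw column name -> shared schema name.
# Real sources would need their own, carefully documented mappings.
FIELD_MAPS = {
    "internal": {"drug_nm": "drug", "evt_cnt": "events"},
    "public": {"DrugName": "drug", "ReportCount": "events"},
}


def normalize(record, source):
    """Rename a raw record's fields to the shared schema and clean the values."""
    mapping = FIELD_MAPS[source]
    out = {"source": source}
    for raw_key, value in record.items():
        if raw_key in mapping:
            out[mapping[raw_key]] = value
    # Normalize the join key and coerce the count to an integer.
    out["drug"] = out["drug"].strip().lower()
    out["events"] = int(out["events"])
    return out


def link_by_drug(records):
    """Group normalized records by drug so each source sits side by side."""
    linked = {}
    for rec in records:
        linked.setdefault(rec["drug"], {})[rec["source"]] = rec["events"]
    return linked


internal = normalize({"drug_nm": " Aspirin ", "evt_cnt": "12"}, "internal")
public = normalize({"DrugName": "ASPIRIN", "ReportCount": 340}, "public")
linked = link_by_drug([internal, public])
```

Once both records share a schema, the proprietary count and the public count for the same drug can be read together, which is the kind of bridge between unconnected data sets described above.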

Putting data to work

Let’s use small business underwriting as an example. While there’s ample public data available about public companies, the same can’t be said for small and medium private businesses. Often, insurance companies have little information about an applicant beyond data provided by FICO and the actual credit application. An approach that marries public and private data shows how much can be gained when public government data sets are pulled in to supplement that data.

For example, it’s possible to pull in state and federal contracts to identify both new and existing sources of revenue; IRS 401(k) filings to track number of employees and recent company growth; patent claims and applications, which serve as indicators of company activity and R&D; and perhaps H-1B visa applications, knowing that a trend in sponsored visa applications is a leading indicator of growth. Together, these disparate data sets create a much more robust business profile that the insurance carrier can correlate against its history of losses to identify negative indicators. These negative indicators can then be used to assess future applications and reduce the carrier’s loss ratio.
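The underwriting example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any carrier’s actual process: the record layouts, the `business_id` key, and the `build_profiles` and `loss_rate_by_flag` helpers are hypothetical, and a real analysis would use far more data and proper statistical testing.

```python
from collections import defaultdict


def build_profiles(applicants, contracts, headcounts, patents):
    """Combine several (hypothetical) public data sets into one profile per business."""
    profiles = {}
    for biz_id in applicants:
        profiles[biz_id] = {
            # Total value of government contracts found for this business.
            "contract_revenue": sum(
                c["amount"] for c in contracts if c["business_id"] == biz_id
            ),
            # Employee count from filings; None when no filing was found.
            "employees": headcounts.get(biz_id),
            # Number of patent filings, a rough proxy for R&D activity.
            "patent_filings": sum(
                1 for p in patents if p["business_id"] == biz_id
            ),
        }
    return profiles


def loss_rate_by_flag(profiles, losses, flag):
    """Share of businesses with a recorded loss, split by whether flag(profile) holds."""
    counts = defaultdict(lambda: [0, 0])  # flag value -> [with_loss, total]
    for biz_id, profile in profiles.items():
        key = bool(flag(profile))
        counts[key][1] += 1
        if biz_id in losses:
            counts[key][0] += 1
    return {key: lost / total for key, (lost, total) in counts.items()}


applicants = ["a", "b", "c", "d"]
contracts = [
    {"business_id": "a", "amount": 100.0},
    {"business_id": "a", "amount": 50.0},
]
headcounts = {"a": 10, "b": 3}
patents = [{"business_id": "b"}]
losses = {"c", "d"}  # businesses with a past loss in the carrier's own records

profiles = build_profiles(applicants, contracts, headcounts, patents)
rates = loss_rate_by_flag(
    profiles, losses, flag=lambda p: p["contract_revenue"] == 0
)
```

Here `rates` compares the loss rate of businesses with no government contract revenue against those with some. A large gap between the two groups would mark the flag as a candidate negative indicator worth testing on a larger history of losses.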

Perhaps most importantly, exposure to all these signals can give an organization the confidence to engage in the kind of statistical thinking needed to put data to work horizontally across its operations. As businesses push to become data-driven organizations, those that create operational data frameworks that ensure open and connected data throughout the enterprise will be the best positioned. Unlocking a new kind of intelligence requires leveraging multiple relevant sources that were not necessarily designed to speak to each other in the first place.

 
