Data Preparation: The Key to Unravelling the Big Data Opportunity


In this special guest feature, Cari Jaquet of Paxata spotlights the rise of IoT data flows and the importance of data preparation solutions designed to assist data scientists and data analysts in making these data more consumable. Cari Jaquet is Vice President of Marketing at Paxata, providers of the first purpose-built Adaptive Data Preparation™ solution. In this role, she is responsible for ensuring the alignment and delivery of strategic marketing around key corporate initiatives and revenue targets.

If we thought we had data issues before, things are about to get really crazy with the Internet of Things (IoT). While companies like Cisco and Intel are figuring out how to make “things” connected, smarter and even more useful than they are today, those things are also going to produce a lot of data exhaust. A toothbrush can collect data about a child’s brushing habits and tell parents how often they brush, for how long, and with what amount of pressure and coverage. But at the end of the day, does that information hold any real value? The true benefit will only be realized when that data becomes a permanent part of the child’s records at the dentist’s office, when it is shared with toothbrush manufacturers to influence product development, or when it is used to build averages across a large section of the population to shape future market opportunities.

While the “connected toothbrush” has gotten a lot of media attention, the Internet of Things is creating world-shifting innovation in how healthcare is administered, starting with Smart Patient Rooms, Smart Cities, and Aging in Place technologies. It is also fostering innovations around supervisory control and data acquisition (SCADA), smart meter reading, and even the reinvention of traditional workflows in the construction industry, so that workers can have supplies brought to them instead of going up and down ladders and risking injury each time. With each of these disruptive initiatives, the number and variety of devices that will make up the IoT are staggering and still growing.

Due to the increasing traction of these cutting-edge solutions, businesses have access to vast treasure troves of information about customers. Imagine the insurance company that could offer discounts on premiums to the construction firm using an IoT ladder over an older model. That is not so different from a utility company offering discounts for purchasing smart appliances, or car insurance companies offering lower premiums to drivers who use tracking devices like Snapshot.

Yet with all these new opportunities, there are challenges. Unlike static data, IoT data is highly contextual and more valuable when used in conjunction with other meaningful data sources. When the IoT transformation is complete, data centers will be faced with an overwhelming amount of information that will need to be synthesized, analyzed, and stored. Investments in IoT technologies are driving an ecosystem of complementary technologies, such as self-service data preparation, largely because these solutions quickly combine, clean and shape data prior to analysis. The burden is therefore lifted from IT, and individual business analysts will be able to reap the benefits of this new connectedness to drive efficiencies, reduce the costs of doing business, and increase market opportunities.

In fact, in a recent Gartner report, “Data Preparation Is Not An Afterthought,” analysts Lakshmi Randall and Mark Beyer encourage organizations to use self-service interactive data preparation tools to enhance analyst productivity. The report adds, “The iterative and explorative nature of data preparation results in a time-consuming process that demands considerable effort from data scientists and business analysts. In addition, preparation of data originating from new and diverse data sources can be challenging. In order to overcome these challenges and improve the productivity of data scientists and business analysts, enterprises should use data preparation tools.”

The Gartner report also cited the findings of their 2014 survey that looked at Big Data adoption. The research found that an increasing number of organizations are expected to analyze emerging data types that include social and machine-generated data. However, these emerging data types are not readily consumable using existing analytical tools since they comprise multiple data formats such as JSON, XML and Avro.
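To make the format problem concrete, here is a minimal Python sketch of why nested, machine-generated JSON does not drop straight into a row-and-column analytical tool. The device message, its field names, and the helper functions are all hypothetical, invented purely for illustration; this is not how any particular data preparation product works, only one simple way to flatten such a record by hand.

```python
import json

# Hypothetical message from a connected toothbrush. The structure and
# field names are assumptions made up for this illustration.
SAMPLE = """
{
  "device": {"id": "tb-001", "type": "toothbrush"},
  "sessions": [
    {"start": "2015-03-01T07:30:00Z", "duration_s": 95,
     "pressure": {"avg": 1.8, "max": 3.1}},
    {"start": "2015-03-01T20:15:00Z", "duration_s": 120,
     "pressure": {"avg": 1.5, "max": 2.4}}
  ]
}
"""


def flatten(record, prefix="", sep="."):
    """Collapse nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def sessions_to_rows(payload):
    """Turn one device message into one flat row per brushing session."""
    message = json.loads(payload)
    sessions = message.pop("sessions", [])
    device_fields = flatten(message)
    # Repeat the device-level fields on every session row.
    return [{**device_fields, **flatten(s, "session")} for s in sessions]


if __name__ == "__main__":
    for row in sessions_to_rows(SAMPLE):
        print(row)
```

Each resulting row carries columns such as `device.id` and `session.pressure.avg`, which a spreadsheet or BI tool can consume. Writing and maintaining this kind of code for every new device and format is the manual effort that self-service tools aim to remove.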

Let’s face it: businesses have a ferocious appetite when it comes to ingesting data and insights. But in order to leverage the proliferation of Big Data at their disposal, organizations must ensure that business users and analysts have access to the data itself. Data preparation tools simplify how the business gathers and uses data, regardless of its size or source. Using data preparation, business analysts, data scientists, developers, data curators, and IT teams can automate, collaborate on, and dynamically govern the data integration, data quality and enrichment process at unprecedented scale, so they can quickly and confidently build the Answer Sets needed for Big Data analytics without coding, scripting, data modeling or sampling.

