In this special guest feature, Ran Sarig, Co-founder and CEO of Datorama, discusses the importance of applying machine learning to data integration or ‘cleansing’ processes at speed and at scale in order to keep up with the ever-increasing number of data sources, and why Big Data needn’t be a big mess anymore. Ran has 14 years of management, product, engineering and leadership experience. He co-founded Datorama in 2012 and is its chief executive officer. Prior to this, he worked for MediaMind as its VP of R&D, where he managed a group of 130 engineers and product managers.
Today’s businesses are dealing with an onslaught of challenges, but undoubtedly the tallest hurdle is data. Every single company and every single industry is struggling with this problem. To understand how today’s leading organizations can put their best foot forward, though, you must first grasp the severity of the problem.
In the future we’ll most likely refer to the 2010s as the decade of data. While data has always existed, its complexity and volume have increased exponentially. Take, for example, this oft-cited factoid: in 2010, enough digital data was created to fill a stack of DVDs stretching from Earth to the moon and back. By the end of the decade, it’s estimated that this theoretical stack of DVDs will reach Mars.
And here’s the kicker: this is only the beginning of the data explosion. Until now, most data creation has been driven by new, more powerful mobile devices and the launch of various social media channels. The second wave of data growth will put the first to shame as the Internet of Things (IoT) gets into full swing and more source systems than ever come online.
Just think, this is only the volume problem. We haven’t even touched on data complexity. Yes, the hill gets steeper to climb.
The good news is that today’s cutting-edge companies have realized that data is actually a strategic asset that needs to be harvested for rich insights that can help fuel their respective businesses. Though the solution to this problem has been painted by many vendors as simple, I am here to tell you otherwise.
Tackling what I call the “messy data” problem requires the right organizational culture, the right approach and the right partner.
As seen in an April 2016 McKinsey & Company survey titled “The Need to Lead in Data and Analytics,” the difference between high- and low-performing respondents came down to how involved senior-level leaders were in data and analytics initiatives. In other words, there has to be a culture, driven from the top down, that is conducive to turning data into something actionable. Keep in mind that this in and of itself may be a multi-year investment of time, but without building this critical foundation first, your project is essentially doomed from the start.
Second, but of equal importance, is how today’s companies approach the messy data problem. What I’ve observed is pretty cut and dried. The old way of capturing data and making use of it to drive a business forward is starting to hit a roadblock.
Rather than taking a services-based approach that requires a significant investment of time and overhead, today’s businesses need to make use of machine learning. As data complexity grows, more data sources come online, and data volume outpaces Moore’s Law, it will become impossible for the staff behind services-based solutions to scale in order to meet this challenge. By putting machine learning techniques to use, today’s forward-thinking businesses will not need to rely on heavy-duty coding projects that take months to implement. When it comes to the integration of complex data sources, for example, the application of machine learning has been able to collapse the time-to-value drastically while giving business leaders the agility necessary to stay on top of all-new data points from emerging source systems. Of course, like all data-related challenges, it is not easy-peasy, but it’s a step in the right direction.
Imagine a solution that is as agile as the new pace of doing business.
Actually, you don’t have to. That’s because machine learning is here to stay. As more of today’s leading-edge organizations better understand how the power of machine learning can be harnessed and put to use, expect an increase in the number of Big Data-related success stories.