A recent Google Trends query shows how interest in machine learning has changed over time (see figure below). Hype emerged around 2005 and was followed by a cooling-off period, but once big data started heating up around 2010, interest swung upward and has continued to rise to this day. The good thing is that “machine learning” really is just a confluence of related disciplines like computer science, statistics and mathematics. These fields aren’t going anywhere and neither is machine learning. Statistical learning is here to stay!
A natural question to ask is why we are experiencing such explosive growth in machine learning and its applications today, even though the technology has been around for the past 30-40 years. One of the waves pushing machine learning forward is the biggest unintended consequence of the shift from desktop to the web: we now automatically collect vast amounts of data on just about everything. This sounds straightforward, but it’s hard to overstate the impact. If you were a brilliant AI researcher working on Microsoft Word in 1990, the only data you had was what you could collect in the lab. If you discovered a breakthrough, it would take years to ship it. Ten years later, the same researcher working at Google had access to a vast repository of search queries, clicks, page views, web pages, and links, and if she found an algorithmic improvement that boosted click-through events by 7%, she could deploy it instantly.
Tech companies on the web were the first to apply machine learning technology to vast amounts of data, which is why technologies like MapReduce and BigTable were created at places like Google. We’re now seeing the same techniques move into many other application areas, including manufacturing, finance, retail, energy, government, healthcare, life sciences, and security. It’s pervasive enough that “Big Data” conferences like Strata, Hadoop Summit, and Spark Summit have thousands of attendees.
In addition, computation today is cheap and plentiful. Just look at the rebirth of neural networks in the past few years. The backpropagation algorithm has not fundamentally changed since it was first used in 1974, but now we have a million times more CPU power.
Researchers were excited about machine learning as far back as the 1950s, and many of the implications were clear even then. But achieving that vision required continued technological innovation, including the microprocessor, the personal computer, a computer on every desktop, networked computing, widespread access to the Internet, the web browser, e-commerce, search, and social media.
Ultimately, machine learning excels today not because of any specific algorithmic advance, but because of a decades-long amalgamation of technologies that enabled the digitization and networking of data, which is only now bearing fruit.
Contributed by: Daniel D. Gutierrez, Managing Editor of insideBIGDATA. He is also a practicing data scientist through his consultancy AMULET Analytics. Daniel just had his most recent book published, Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.