I’ve been waiting for a good book that introduces the concepts of data science and machine learning for a lay audience. There’s so much being reported on these subjects nowadays that I’m sure many people are struggling to understand the underlying technology and its implications. We see talk about IBM’s Watson, the cognitive computing system once known for conquering Jeopardy!, now taking on the healthcare industry. And then there’s Google’s humanoid robot that just took its first step outdoors. “Machine learning,” “data science,” “big data,” and “artificial intelligence” dominate the headlines daily.
The prominence of the subject is especially apparent when you read that luminaries like Stephen Hawking and Elon Musk are afraid of Killer AI … <sigh>. As a machine learning practitioner, I’d like the general public to understand enough to make their own reasoned evaluation of the technology and what it means to humanity.
Then I read an announcement of a new book that seemed to fill this need: The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (Basic Books; September 22, 2015) by University of Washington professor Pedro Domingos. My hopes were high, and for the most part I think this book represents a good introduction to the area.
Machine learning algorithms increasingly affect our lives. They find books, movies, jobs and dates for us. They manage our investments and discover new drugs. The Master Algorithm explains how, more and more, these algorithms work by learning from the trails of data we leave in the digital world that surrounds us. Like curious children, they observe us, imitate us, and experiment. Of course, they’re guided by how we define them and ultimately program them.
The premise of the book is that at the world’s top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask. This holy grail of computer science is a primary research focus for the author. To believe that a single master algorithm will be able to make predictions in any problem domain may be a stretch. It would be like looking for one theorem in mathematics that solves all problems. After all, we already know the answer to the ultimate question of life, the universe, and everything – it’s 42 (if you find yourself hitchhiking across the Milky Way)!
Machine learning is the automation of discovery – the scientific method on steroids – that enables intelligent robots and computers to make predictions for new observations of the world based on previously known observations – again, based on how we humans program them. No field of science today is more important yet more shrouded in mystery. Domingos works to lift the veil to give us a peek inside of the learning machines that power Google, Amazon and your smartphone. In doing so, he charts a course through machine learning’s five major schools of thought, showing how they turn ideas from neuroscience, evolution, psychology, physics, and statistics into algorithms ready to serve you.
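To make "predictions for new observations based on previously known observations" a little more concrete, here is a minimal sketch of one of the simplest learners, a one-nearest-neighbor classifier. It is my own illustration, not code from the book; the function name and the toy daylight/temperature data are hypothetical. Note that a human still wrote the rule for how the prediction gets made.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_neighbor_predict(known, new_point):
    """Return the label of the known observation closest to new_point.

    `known` is a list of (features, label) pairs; this helper is a
    hypothetical illustration, not an API from any library or the book.
    """
    _, closest_label = min(known, key=lambda pair: dist(pair[0], new_point))
    return closest_label


# Made-up toy data: (hours of daylight, temperature in Celsius) -> season.
known_observations = [
    ((15.0, 27.0), "summer"),
    ((14.0, 24.0), "summer"),
    ((9.0, 3.0), "winter"),
    ((8.5, -1.0), "winter"),
]

# A new, previously unseen observation is labeled like its nearest neighbor.
prediction = nearest_neighbor_predict(known_observations, (13.5, 22.0))
print(prediction)
```

The "learning" here is nothing more than storing past observations and comparing new ones against them, which is about as far from magic as it gets.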
The premise of the book is definitely compelling, but I contend its real value is how well each major learning algorithm is described and coupled with real-life, identifiable use case examples. I plan to recommend the book to my beginning data science and machine learning students for precisely this reason.
Domingos has an interesting take on the field, with statements to the effect that math is not needed to make serious contributions to machine learning.
“The second goal of this book is thus to enable you to invent the Master Algorithm. You’d think this would require heavy-duty mathematics and severe theoretical work. On the contrary, what it requires is stepping back from the mathematical arcana to see the overarching pattern of learning phenomena.”
So in essence, you too can be a data scientist by just leaning back and dreaming up a “master algorithm,” all without any background in mathematics. Domingos goes on to say “we can fill in the mathematical details; but that is not for this book, and not the most important part.” I guess we’re all considered Einsteins who can come up with world-changing thought experiments to change the field forever. Amazing.
Probably the biggest question mark Domingos raises is his suggestion that machine learning does something magical. He says, “Now we don’t have to program computers; they program themselves,” and “machine learning is something new under the sun: a technology that builds itself.” I guess I’ve been spinning my wheels all these years by developing and programming statistical learning algorithms myself when they should have been programming themselves.
On balance, the book serves to open readers’ minds to how today’s technology works and how much computers do to make their lives better. It gives plenty of examples of how machine learning facilitates this. But don’t believe that by reading this book on the subject you’ll truly understand how machine learning works. For that, sorry, you really do need to know the computer science and the statistics and the math. You can start with a gentle introduction (with math) by taking Andrew Ng’s Machine Learning class on Coursera. If you can get through that, then there is hope, but it’s just the tip of the iceberg. To more fully understand the subject you should be able to read the machine learning Bible, “The Elements of Statistical Learning,” by Hastie, Tibshirani and Friedman, all from Stanford. But even this valuable resource only serves as an index to many other books and areas of research.
In summary, I would recommend The Master Algorithm to machine learning newbies and practitioners alike. It can be viewed as a solid and valuable introduction to a very technical subject. Its futuristic claims can be accepted with a bit of caution, but all in all, this new book constitutes a thought-provoking guide to technology that affects us all.
Contributed by: Daniel D. Gutierrez, Managing Editor of insideBIGDATA. He is also a practicing data scientist through his consultancy AMULET Analytics. Daniel just had his most recent book published, Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R, and will be teaching a new online course starting on Jan. 13 hosted by UC Davis Extension: Introduction to Data Science.