Whenever I wrap up one of my university or corporate training classes on data science, machine learning or R, I feel compelled to fess up and tell my students the harsh reality: to be a real data scientist, they will eventually need to embrace the mathematical foundations of the algorithms they so casually use in environments like R and Python. What’s behind a supervised learning algorithm like linear regression? Among other things, it is the partial derivatives of multivariable calculus, which drive a gradient descent process. And what’s behind other algorithms like support vector machines and unsupervised techniques like principal component analysis (PCA)? It is linear algebra.
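
To make the gradient descent idea concrete, here is a minimal sketch in plain Python that fits a line to a handful of points by repeatedly stepping against the partial derivatives of the mean squared error. The function name `fit_line`, the toy data, the learning rate and the step count are all illustrative assumptions of mine, not how R or Python libraries actually fit a regression.

```python
# Minimal sketch: fit y = w*x + b by gradient descent on mean squared error.
# The data, learning rate, and step count below are illustrative assumptions.

def fit_line(xs, ys, learning_rate=0.05, steps=5000):
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Residuals of the current line at every data point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2) w.r.t. w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Move a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # prints 2.0 1.0
```

On noise-free data like this, the loop recovers the slope and intercept used to generate the points. A learning rate that is too large would make the updates diverge instead.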

I go a step further and tell my students that in order to call themselves data scientists, they must truly understand the “bible” of machine learning: “The Elements of Statistical Learning” (ESL) by Hastie, Tibshirani and Friedman (available as a free PDF download from the authors). This authoritative text by three Stanford luminaries includes a lot of math, mostly linear algebra.

Unfortunately, many newbie data scientists don’t have the requisite background in computer science, mathematics, statistics and probability theory. So where to start? I always say, start with the basics, start with linear algebra! The only way to understand ESL is with a firm foundation in linear algebra. But how to obtain this knowledge? In most cases, I would recommend a tried-and-true text like “Linear Algebra and Its Applications,” by renowned MIT mathematics professor Gilbert Strang. But a book used at a place like MIT might not be the right starting point for most budding data scientists. I contacted Dr. Strang last year and told him how much I had enjoyed his text over the years. It turns out he was friends with my freshman Calculus professor at UCLA. Academics travel in small circles.

Years ago, when I was building up my machine learning chops, I revisited my coursework from when I was an undergrad in the UCLA mathematics department. I took Math 115A (Linear Algebra) as a junior, but the material had drifted away over the years. To help reclaim my command of the subject, I found an excellent reprint text at the UCLA bookstore, “Linear Algebra,” by Michael O’Nan and Herbert Enderton. This 500-page tome included all the proofs for each theorem and plenty of solved exercises. I took just over a year of my spare time to get back up to speed with linear algebra. This turned out to be an excellent expenditure of time. Sadly, this book is now out of print. Again, a resource like this wonderful book might not be the starting point for a newbie.

Given my admonitions above, I was happy to receive a review copy of a book employing a unique approach to teaching mathematics, “The Manga Guide to Linear Algebra,” published by No Starch Press. This is a comic book, perfect for new data scientists! I’ve read Japanese manga in the past and this book is authentic, including inquisitive characters like Reiji and Misa. On the serious side, the book is great for newbies because it clearly spells out each minute step in performing calculations involving vectors, matrices, determinants, linear transformations, kernels, eigenvalues and eigenvectors. These simple but useful steps are often left out of the university texts mentioned earlier.

Here is a list of chapters:

Chapter 1 – What is linear algebra?

Chapter 2 – The fundamentals

Chapter 3 – Intro to matrices

Chapter 4 – More matrices

Chapter 5 – Introduction to vectors

Chapter 6 – More vectors

Chapter 7 – Linear transformations

Chapter 8 – Eigenvalues and eigenvectors

I think that if you wade through this very well-written introduction to the subject, you’ll be ready to pursue more challenging aspects. Maybe you can watch (and maybe fully understand after a few attempts) Professor Strang’s MIT lecture on performing an important calculation in PCA, singular value decomposition. The lecture is available on MIT OpenCourseWare. Be forewarned, the good professor makes a mistake midstream and nobody in his MIT class saw it. Can you find it?
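
As a warm-up for that lecture, here is a small, dependency-free sketch of the PCA calculation for two-dimensional data. Note that it does not perform a singular value decomposition. It swaps in an eigen decomposition of the 2x2 covariance matrix, which gives the same principal direction as the leading right singular vector of the centered data. The function name `first_principal_component` and the toy points are my own hypothetical choices.

```python
import math

# Sketch: first principal component of 2-D points via the closed-form
# eigen decomposition of the 2x2 sample covariance matrix (a stand-in for SVD).

def first_principal_component(points):
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the symmetric covariance matrix [[a, b], [b, d]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    d = sum(y * y for _, y in centered) / (n - 1)
    # Eigenvalues solve lambda^2 - (a + d)*lambda + (a*d - b*b) = 0.
    half_trace = (a + d) / 2.0
    root = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    largest = half_trace + root
    # (b, largest - a) is an eigenvector for `largest` whenever b != 0.
    if b != 0.0:
        vx, vy = b, largest - a
    elif a >= d:
        vx, vy = 1.0, 0.0  # already axis-aligned, x has more variance
    else:
        vx, vy = 0.0, 1.0  # already axis-aligned, y has more variance
    norm = math.hypot(vx, vy)
    return largest, (vx / norm, vy / norm)

points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]  # toy data on the line y = 2x
variance, direction = first_principal_component(points)
print(round(variance, 3), [round(c, 3) for c in direction])  # prints 5.0 [0.447, 0.894]
```

The returned direction is proportional to (1, 2), the line the points lie on, and the eigenvalue 5 equals the squared leading singular value of the centered data (10) divided by n - 1.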

With “The Manga Guide to Linear Algebra,” I now feel like I have a good learning resource to recommend on the last day of my data science courses. I think this book is a great way to prepare for a deeper understanding of the data science field.

*Contributed by Daniel D. Gutierrez, Managing Editor of insideBIGDATA. In addition to being a big data journalist, Daniel is also a practicing data scientist and educator, and he sits on a number of advisory boards for various start-up companies.*
