My bookshelves are lined with materials that support my work in data science and machine learning. I have a large section of mathematics books, including several on linear algebra. For many years my “go-to” text on linear algebra was an old 2nd edition of MIT Professor Gilbert Strang’s seminal book on the subject that I picked up at a swap meet. To my surprise, the good professor recently sent me a copy of his latest and greatest 5th edition of “Introduction to Linear Algebra” (Wellesley-Cambridge Press).

I found the new edition to be even better than previous editions. For one thing, it is now 574 pages versus my old copy’s 374. I also found the book to be impressively re-tooled for educational purposes. The chapters contain useful “Review of the Key Ideas” sections, worked examples, and well-thought-out problem sets (with special “Challenge Problems” for those who want to dive deeper). Gilbert Strang’s textbooks have changed the entire approach to learning linear algebra, moving it away from abstract vector spaces and toward specific examples of the four fundamental subspaces: the column space and nullspace of A and of its transpose A’. Here’s a list of the chapters:

Chapter 1 – Introduction to Vectors

Chapter 2 – Solving Linear Equations

Chapter 3 – Vector Spaces and Subspaces

Chapter 4 – Orthogonality

Chapter 5 – Determinants

Chapter 6 – Eigenvalues and Eigenvectors

Chapter 7 – The Singular Value Decomposition

Chapter 8 – Linear Transformations

Chapter 9 – Complex Vectors and Matrices

Chapter 10 – Applications

Chapter 11 – Numerical Linear Algebra

Chapter 12 – Linear Algebra in Probability & Statistics

The chapters directly apply to the needs of data scientists wishing to establish a firm foundation for how machine learning happens behind the scenes. All chapters are superbly crafted, but my favorites are: Chapter 7, because the SVD plays an important role in Principal Component Analysis (PCA) for dimensionality reduction as well as in principal component regression; Chapter 10, as it enhances the math subject matter with practical applications; Chapter 11, which is a nice adjunct to the pure math content and reminds me of portions of the old “Numerical Methods” (Prentice-Hall) text by G. Dahlquist et al. that I used in my early days of data science; and Chapter 12, which is perfect for data scientists who want to see the relationship with statistics and probability.
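To make the SVD and PCA connection concrete, here is a minimal sketch in Python with NumPy. It is my own illustration, not code from Strang’s book: the function name `pca_via_svd`, the synthetic data, and the choice of `k=2` are all assumptions made for the example. The idea is that once the data matrix is centered, the right singular vectors are the principal directions and the squared singular values (divided by n − 1) are the variances along them.

```python
import numpy as np


def pca_via_svd(X, k):
    """Hypothetical helper: reduce the rows of X to k dimensions using the SVD."""
    # Center each column so the SVD describes variance around the mean.
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values sorted in descending order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # The first k rows of Vt are the top-k principal directions.
    components = Vt[:k]
    # Coordinates of each sample in the reduced k-dimensional space.
    scores = Xc @ components.T
    # Sample variance captured by each retained direction.
    explained_variance = s[:k] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance


# Synthetic data (an assumption for illustration): 200 samples, 5 features,
# with the second feature made strongly correlated with the first.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * X[:, 1]

scores, components, explained_variance = pca_via_svd(X, k=2)
```

Here `scores` holds the reduced representation (200 rows, 2 columns), and `components` holds two orthonormal directions in the original five-dimensional feature space. Using those scores as regressors in place of the raw features is the basic idea behind principal component regression.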

Strang’s new edition is a great launching point for newbies as well as practicing data scientists to gain a foothold in the theory behind the technology. If you feel a bit insecure about your mathematical prowess when reading the statistical learning bible “The Elements of Statistical Learning” by Hastie, Tibshirani and Friedman (a group of high-profile Stanford researchers), then Strang’s book is the best way to lay a firm foundation.

Gilbert Strang is a Professor of Mathematics at Massachusetts Institute of Technology and an Honorary Fellow of Balliol College, of the University of Oxford. His current research interests include linear algebra, wavelets and filter banks, applied mathematics, and engineering mathematics. He is the author or co-author of eight textbooks. He is a Fellow of the American Academy of Arts and Sciences and a member of the National Academy of Sciences.

The book also comes with an excellent web resource which includes downloadable sections (PDFs) of many chapters, a complete chapter-by-chapter solutions manual for the problem sets, and practice exam questions. The book is used as the textbook for MIT’s undergrad linear algebra course 18.06. It is also the book used in MIT’s OpenCourseWare class on the subject, complete with video lectures. This means you can take a full-fledged MIT course to help you become well-versed in this important subject matter. I highly recommend this book for any up-and-coming data scientist.

I do have a big complaint with this new book! It’s going to sap a lot of time from my busy schedule. With such a great learning resource in my hands, I know I’m going to spend time “re-learning” the subject for the nth time, doing the problem sets, and thinking hard about how important math is to a firm understanding of machine learning. I don’t have time for this!

*Contributed by: Daniel D. Gutierrez, Managing Editor of insideBIGDATA. He is also a practicing data scientist through his consultancy AMULET Analytics, as well as an educator and author whose latest title is “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.” Contact him at: daniel@insideBIGDATA.com*
