Book Review: Mathematics for Machine Learning

This article is the next installment in my series of reviews of books focused on the mathematics of machine learning. I’m energized by all the new learning resources coming out on this topic. As I tell my Introduction to Data Science students, it is important for all data scientists to have a command of the theoretical foundations of our field. Without this, we’re really just guessing when it comes to tasks like hyperparameter tuning. “Mathematics for Machine Learning” by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, published by Cambridge University Press, is an excellent way to learn the math behind the models. This review highlights the ways this book stands out from the competition. Of all the books I’ve reviewed thus far, this is my favorite. Read on to learn why.

Excellent Coverage

As exhibited in the Table of Contents below, this book covers all the important topic areas. I found Part I, Mathematical Foundations, to be a one-stop shop for the mathematical background needed to appreciate the ML-specific topics in Part II. There’s really no need for separate textbooks on linear algebra and vector calculus, for example. You can quickly get up to speed on these topics by methodically reading the chapters. I also appreciate the logical progression of topics, which builds a solid foundation for the mathematics of ML.

Part I: Mathematical Foundations

1. Introduction and Motivation
2. Linear Algebra
3. Analytic Geometry
4. Matrix Decompositions
5. Vector Calculus
6. Probability and Distributions
7. Continuous Optimization

Part II: Central Machine Learning Problems

8. When Models Meet Data
9. Linear Regression
10. Dimensionality Reduction with Principal Component Analysis
11. Density Estimation with Gaussian Mixture Models
12. Classification with Support Vector Machines

Beautifully Produced

When I receive a review copy of a new book from the publisher, I’m never sure what level of publication quality I will encounter. Some books are flimsy, some are poorly edited, and others do silly things like publish color data visualizations in black & white. This book, on the other hand, is spectacular! The production quality is very high, and the figures, oh the figures! I’ve never seen a math book come alive like this one does. The colorful, well-thought-out graphics feed the senses and help communicate such a deep and technical subject. For instance, every chapter includes a “Mind Map” that outlines the topics covered and how they’ll be used in subsequent chapters. Why can’t all books include this useful guide to learning?

Mathematical Clarity

The book presents very clear and concise mathematics with no “hand waving” in the derivations. Instead, every chapter has many long worked-out “Examples” that drill down into the theory. Again, the authors include beautiful visualizations designed to aid in understanding the math, as depicted in the adjacent figure. Further, each chapter includes well-crafted exercises to help readers hone their understanding of the topics. Some of my favorite treatments in the book include: Singular Value Decomposition (Section 4.5), Gradients of Vector-Valued Functions (Section 5.3), Optimization Using Gradient Descent (Section 7.1), Bayesian Linear Regression (Section 9.3), and Dimensionality Reduction with PCA (Chapter 10).