Since launching Metacademy, I've had a number of people ask,
What should I do if I want to get 'better' at machine learning, but I don't know what I want to learn?
Excellent question! My answer: consistently work your way through textbooks.
I then watch as they grimace in the same way an out-of-shape person grimaces when a healthy friend responds with, "Oh, I watch what I eat and consistently exercise." Progress requires consistent discipline, motivation, and an ability to work through challenges on your own. But you already know this.
But why textbooks? Because they're one of the few learning mediums where you'll really own the knowledge. You can take a course, a MOOC, join a reading group, whatever you want. But with textbooks, it's an intimate bond. You'll spill brain-juice on every page; you'll inadvertently memorize the chapter titles, the examples, and the exercises; you'll scribble in the margins and dog-ear commonly referenced areas and look for applications of the topics you learn -- the textbook itself becomes a part of your knowledge (the above image shows my nearest textbook). Successful learners don't just read textbooks. Learn to use textbooks in this way, and you can master many subjects -- certainly machine learning.
In this brief roadmap, I list a few excellent textbooks for advancing your machine learning knowledge and capabilities. I picked these texts after consulting with fellow graduate students, postdocs, and professors at UC Berkeley -- my own experience played a role as well. This list is purposefully sparse. Having 20 textbooks thrown at you is useless.
Level 0: Neophyte
My sister, an artist and writer by trade, asked me how she could understand the basics of data science in a nontrivial way. After reading several introductory and pop books in this area, I recommended Data Smart. My sister was able to work through it, and in fact, the next time I saw her we had a delightful conversation about logistic regression =).
Expectations: You'll understand some common machine learning algorithms at a high level, and you'll be able to implement some simple algorithms in Excel (and a bit in R if you get through the entire book).
Necessary Background: basic Excel familiarity -- this book is a great starting point if you don’t have a CS/math-based background. Plus, it's not nearly as dry as a typical textbook.
Key Chapters: It's a short read, and every chapter is fairly illuminating -- though you can skip the worksheet examples, and chapters 8 and 10, if you're only interested in a basic overview.
Capstone Project: Using this dataset, see if you can predict the MPG of the car given all of its other attributes. This will test your ability to manipulate data for a desired machine learning task, and also your ability to apply the correct machine learning technique to a somewhat vague problem.
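To make the shape of this task concrete, here is a minimal sketch of one possible technique: ordinary least squares with a single feature, written in plain Python rather than Excel. The `weights` and `mpg` values are made up for illustration and are not taken from the capstone dataset, and the helper names (`fit_line`, `predict`) are hypothetical. A real attempt would use several attributes and hold out some rows for testing.

```python
# Minimal sketch: one-feature ordinary least squares, standard library only.
# The numbers below are made-up illustrative values, NOT the capstone dataset.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict y for a single x using the fitted line."""
    return slope * x + intercept

# Hypothetical car weights (thousands of lbs) and fuel economy (MPG).
weights = [2.0, 2.5, 3.0, 3.5, 4.0]
mpg = [32.0, 28.0, 24.0, 20.0, 16.0]

slope, intercept = fit_line(weights, mpg)
print(f"mpg ~ {slope:.2f} * weight + {intercept:.2f}")
print("predicted mpg at weight 3.2:", predict(3.2, slope, intercept))
```

The point of the exercise is less the fit itself than deciding which attributes to feed in and how to clean them first.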
Level 1: Apprentice
This is an example-laden book for simultaneously learning practical machine learning techniques and the R programming language. I'm a longtime Scipy user, but after finishing the first few chapters (and remembering that R packages are so damn simple), I've mostly been turning to R for quick analyses.
Expectations: You'll be able to recognize when fundamental machine learning algorithms apply to certain problems and implement functioning machine learning code in R.
Necessary Background: No real prerequisites, though the following will help (these can be learned/reviewed as you go):
- some programming experience [in R]
- some algebra
- basic calculus
- a little bit of probability theory
Key Chapters: It's a short book, and I recommend all of the chapters -- be sure to actually think through the examples (and type them into R). If you're looking to shave off some time, you can safely skip chapters 8 and 12.
Capstone Project: Using this dataset, see if you can predict the food ratings given all of the other attributes. Use three different machine learning techniques for this task, and justify your top choice. Also, build a classifier that predicts whether a review is "good" or "bad" -- you should use reasonable "good/bad" thresholds. This will test your data munging capabilities, your strategy for analyzing a larger dataset, your knowledge of machine learning techniques, and your ability to write analysis code in R.
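One way to read the "good/bad" thresholds step is sketched below, in Python for illustration (the capstone itself asks for R). It assumes ratings on a 1-5 scale, which the project description does not state, and the cutoffs `bad_max` and `good_min` are hypothetical defaults you would need to justify for your own data. Ratings in the ambiguous middle are dropped rather than forced into a class.

```python
# Minimal sketch: turning numeric ratings into "good"/"bad" labels.
# ASSUMPTION: ratings are on a 1-5 scale; the thresholds are illustrative only.

def label_rating(rating, bad_max=2, good_min=4):
    """Map a rating to 'bad', 'good', or None for the ambiguous middle."""
    if rating <= bad_max:
        return "bad"
    if rating >= good_min:
        return "good"
    return None

# Hypothetical ratings, not drawn from the capstone dataset.
ratings = [1, 5, 3, 4, 2, 5]

labeled = [(r, label_rating(r)) for r in ratings]
# Keep only the rows with an unambiguous label for classifier training.
training = [(r, lab) for r, lab in labeled if lab is not None]

print(labeled)
print(len(training), "of", len(ratings), "ratings kept for training")
```

Whether to discard the middle ratings or assign them to one side is itself a modeling decision worth defending in your write-up.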
Level 2: Journeyman
This stage separates those with a surface-level understanding from those with rigorous, in-depth knowledge. It starts getting mathy at this stage, but if you plan on making machine learning a substantial part of your career, you'll have to cross this bridge. PRML is the classic bridge. Use it. Read it. Love it. But keep in mind that a Bayesian perspective isn't the only story (Bishop strongly tends towards the Bayesian approach to machine learning).
Expectations: Be able to recognize, implement, debug, and interpret the output of most off-the-shelf machine learning methods. Also, you should have an intuition about which advanced ML concepts to investigate for a given problem. Practicing data scientists should at least be at this level.
Necessary Background:
- you should be comfortable with off-the-shelf clustering and classification algorithms
- linear algebra: understand matrix algebra and determinants
- some multivariate and vector calculus experience -- know what a Jacobian is
- some machine learning implementation experience in R, Matlab, the SciPy stack, or Julia.
Key Chapters: Know and love chapters 1-12.1. Chapters 12.2 - 14 can be consulted as you need them.
Capstone Project: Implement the [Online Variational Bayes Algorithm for Latent Dirichlet Allocation] and analyze a large corpus of your choosing. Verify that your LDA implementation is correct. This will test your ability to understand and interpret cutting-edge machine learning algorithms, approximate and online inference techniques, as well as your implementation chops, your data munging abilities, and your ability to define an interesting application from a vaguely defined problem.
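For the "verify that your LDA implementation is correct" step, one common sanity check (my suggestion, not something the capstone prescribes) is to generate a synthetic corpus from known topics and see whether your implementation recovers them. The sketch below follows the standard LDA generative process using only Python's standard library. The topic distributions, `alpha`, and function names are all hypothetical choices for illustration, and the inference algorithm itself is not shown.

```python
# Minimal sketch: sampling a synthetic corpus from the LDA generative process,
# so a separate LDA implementation can be checked against known topics.
# All parameter values here are illustrative assumptions.
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(num_docs, doc_len, topics, alpha, rng):
    """Return a list of documents, each a list of word ids."""
    corpus = []
    topic_ids = range(len(topics))
    for _ in range(num_docs):
        # Per-document topic proportions.
        theta = sample_dirichlet(alpha, rng)
        doc = []
        for _ in range(doc_len):
            # Pick a topic for this word, then a word from that topic.
            z = rng.choices(topic_ids, weights=theta)[0]
            w = rng.choices(range(len(topics[z])), weights=topics[z])[0]
            doc.append(w)
        corpus.append(doc)
    return corpus

rng = random.Random(0)
# Two well-separated hypothetical topics over a 6-word vocabulary.
true_topics = [
    [0.30, 0.30, 0.30, 0.04, 0.03, 0.03],
    [0.03, 0.03, 0.04, 0.30, 0.30, 0.30],
]
corpus = generate_corpus(num_docs=50, doc_len=40, topics=true_topics,
                         alpha=[0.5, 0.5], rng=rng)
print(len(corpus), "documents; first document:", corpus[0][:10], "...")
```

If your fitted topics (up to a permutation) do not resemble `true_topics` on data this clean, the bug is likely in the implementation rather than in the corpus.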
Note: PRML spends quite a bit of time on Bayesian machine learning methods. If you're unfamiliar with Bayesian statistics, I recommend studying the first 5 chapters of Doing Bayesian Data Analysis.
Level 3: Master
There are a number of subjects you may want to study in depth at the master level: convex optimization, [measure-theoretic] probability theory, discrete optimization, linear algebra, differential geometry, or maybe computational neurology. But if you're at this level, you probably have a good sense of what areas you'd like to improve, so I'll stick with the single book recommendation. Probabilistic Graphical Models: Principles and Techniques is a classic, monstrous tome that should be within arm's length of any ML researcher worth his/her salt =). PGMs pervade machine learning, and with a strong understanding of this content, you'll be able to dive into most machine learning specialties without too much pain.
Expectations: You'll be able to construct probabilistic models for novel problems, determine a reasonable inference technique, and evaluate your methodology. You'll also have a much deeper understanding of how various models relate, e.g. how deep belief networks can be viewed as factor graphs.
Necessary Background:
- you should be comfortable with most off-the-shelf ML algorithms
- linear algebra -- know how to interpret eigenvalues
- multivariate and vector calculus experience
- some machine learning implementation experience in R, Matlab, the SciPy stack, or Julia.
Key Chapters: Chapters 1-8 cover content similar to Bishop's Pattern Recognition and Machine Learning Ch. 2 and 8, but at a much deeper level. Chapters 9-13 contain key content, and Ch. 19 on partially observed data is really helpful. Read Ch. 14 and Ch. 15 when/if they are relevant to your goals.
Level 4: Grandmaster
If you've achieved master status, you'll have a strong enough ML background to pursue any ML-related specialization at a novel level: e.g. maybe you're interested in pursuing novel deep learning applications or characterizations?