ML Resources

Outline:

-Where Should I Start?
-Computer Science
-Real Analysis
-Linear Algebra
-Probability
-Statistics
-Causal Inference
-Optimization
-Information Theory
-Classic Machine Learning
-Bayesian Machine Learning
-Deep Learning
-Natural Language Processing
-Reinforcement Learning
-Applications in Biology and Medicine
-Miscellaneous Websites

Where should I start (in ML)?

If you’re here looking for a general introduction to machine learning, I would proceed in the following order:

An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani. This textbook is a fantastic introduction to the field, written by some of its leading experts. It is short and well-written enough to be read cover-to-cover, high-level enough to be accessible to people from various backgrounds, yet rigorous in the sense that it teaches you to think about the problems rather than just giving you a cookbook. The textbook is free as a PDF at the book website, and the authors have also provided a collection of excellent videos that accompany the text on YouTube (the videos are nicely organized into a collection here). Note that this textbook also has a “big sister”, the classic Elements of Statistical Learning, which covers the same content in much greater mathematical depth. However, I would start with ISL and then move to ESL from there as your interest directs. Note that the code in this book and class is in R; together they cover most of the classical ML toolkit but not deep learning.
Fast.ai by Jeremy Howard and Rachel Thomas. This course provides an accessible but extremely effective introduction to deep learning, the most popular branch of modern machine learning. The course is hands-on and immensely practical, and each lesson will equip you with the tools to build a very effective model for some new branch of ML (computer vision, NLP, etc.). The course is taught in Python using PyTorch and their own library.

Once you make your way through ISL and fast.ai, you will have a solid handle on all the most commonly used techniques in ML (classic and cutting edge). You will have a decent intuition for which methods can work when, and an ability to at least understand and modify code for ML analysis in both R and Python. From there, you should be prepared to jump at greater depth into any subarea of the field that you fancy.

Depending on background and bandwidth, a motivated student could probably work through the above material in 1-4 months. Go get ’em! :)

Computer Science

Theory

File	Description
CS Theory Cheatsheet	CS theory cheat sheet, originally accessed here
Tim Roughgarden’s Lectures on Algorithms and Algorithms Illuminated	Tim Roughgarden is one of the most natural teachers I’ve ever seen. The first link is to lecture notes in PDF form from many classes. Videos for his Algorithms 2 class (CS 261) are here. The second is a link to the page for his new textbook, which also links out to all the YouTube videos from his Coursera version of CS 161 (Algorithms 1).

Programming cheatsheets

File	Description
R dplyr cheatsheet	Cheatsheet for Hadley’s amazing data wrangling package, dplyr. One of many from RStudio
R dplyr and data.table side by side	Nice side-by-side comparison of dplyr and data.table by Atrebas.
R ggplot2 cheatsheet	Cheatsheet for Hadley’s amazing plotting package, ggplot2. One of many from RStudio
SQL Joins cheatsheet	Graphical description of classic SQL joins w/ toy code
Python pandas cheatsheet	Cheatsheet for Python’s data wrangling package, pandas. Downloaded from here.
Python numpy cheatsheet	Cheatsheet for Python’s numerical package, numpy. Downloaded from DataCamp.
Python keras cheatsheet	Cheatsheet for Python’s NN package, keras. Downloaded from DataCamp.
Python scikit-learn cheatsheet	Cheatsheet for Python’s ML package, scikit-learn. Downloaded from DataCamp.
Python seaborn tutorial	Tutorial for Python’s plotting system, seaborn. Haven’t found a great one yet for matplotlib.
Graphic Design cheatsheet	Cute little graphic design cheatsheet downloaded from here
PyTorch cheat sheet and tutorials and docs	The PyTorch team has some world-class docs and tutorials.

Real Analysis

File	Description
Measure, Integration, and Real Analysis	Sheldon Axler’s textbook-under-development on measure theory and real analysis. (Website).

Linear Algebra

File	Description
Boyd Applied Linear Algebra	Freely available book from Boyd and Vandenberghe on Applied LA (website).
Fast.ai Computational Linear Algebra	Rachel Thomas has put together this great online textbook for computational linear algebra with accompanying YouTube videos.
CS 229 Linear Algebra Notes	Linear algebra reference from Stanford’s Machine Learning Course.
Matrix Calc for DL (pdf here)	Really nice overview of matrix calculus for deep learning from Parr/Howard. Citable on arXiv.
Strang: Matrix methods for Data, Signals, and ML	Gil Strang’s newer linear algebra course, focusing on the linear algebra relevant to data and ML. YouTube videos here.
Linear Algebra Done Right, Abridged	This is a free version of Sheldon Axler’s textbook Linear Algebra Done Right, which is a nice intro treatment of the subject that is accessible but more pure-mathy in flavor than the above.

Probability

File	Description
Probability Cheatsheet	Probability cheat sheet, from William Chen’s github
MIT 6.041 Intro Probability	John Tsitsiklis et al. have put together some great resources. Their classic MIT intro to probability has been archived on OCW and also offered on edX (Part 1, Part 2). The textbook is also excellent.
Joe Blitzstein’s Stat110	Joe Blitzstein’s undergrad probability course has a high overlap in content with 6.041. Like 6.041, it also has a great textbook, YouTube videos, and an edX offering. It’s a bit more playful, as well.
MathematicalMonk	This guy is amazing. Some 250 YouTube tutorials on ML, Probability, and Information Theory.
Trouble in high-dimensional Land	Fun little blog post going through intro high-dimensional geometry and its relevance to probability.

Statistics

File	Description
Common statistical tests are linear models (or: how to teach stats) and Statistical Test Cheatsheet	This little blog post does an incredible job explaining how a whole bunch of common statistical tests can be intuitively unified under a single framework (linear models)
Russell Poldrack’s Statistical Thinking for the 21st Century	This appears to be a pretty fantastic (albeit rather elementary) textbook for a one-quarter intro to statistics class (Stats 60 at Stanford). Despite assuming little, it touches upon a lot of great topics.
Doug Sparks’ Stats 200	Nice course notes on Statistical Inference from Doug Sparks’ 2014 offering of Stats 200.
Modern Statistics for Modern Biology	This online textbook is from Susan Holmes and Wolfgang Huber, and provides a nice and accessible intro to the parts of modern data science relevant to computational biologists. It also happens to be a piece of typographic art, created with bookdown.
Statistical Rethinking	Lecture videos on YouTube accompany this fantastic introductory textbook.
Seeing Theory Frequentist Inference	This is a really beautiful visual presentation of the basic ideas of frequentist inference, from the Seeing Theory textbook. I love it.
Estadística Multivariada	Beautiful note set on multivariate stats from María Teresa Ortiz and Felipe González. Covers Bayesian networks, Gaussian models, missing data, latent variable models, and Bayesian methods.
Estadística Computacional	Beautiful note set on computational stats from María Teresa Ortiz. Covers basic probability, simulation, visualization, inference, and some basic Bayesian methods.

Causal Inference

File	Description
Hernán and Robins Causal Inference Book	Long-upcoming textbook on causal inference (from the epidemiology perspective), with drafts fairly frequently updated on the web page.
All the DAGs from the Causal Inference Book and Glossary and Notes	In the first post, I’m compiling all the DAGs from Hernán and Robins’ book into one place for easier use. In the second, I have some additional notes to accompany part one of the book.
Robert Osazuwa Causal Modeling in ML Book	Looks to be a nice course in development on causal and generative modeling. Lecture notes are being produced in a bookdown here
Causal Inference Papers	Nice GitHub repo that compiles a bunch of arXiv papers on Causal Machine Learning.
Hernán Selection Bias	Nice summary of selection bias via DAGs by Hernán et al.

Optimization

File	Description
Boyd Convex Optimization Book	Famous and freely available textbook from Boyd and Vandenberghe, accompanied by slides and YouTube videos. More advanced follow-up class here.
NYU Optimization-based Data Analysis 2016 and 2017	Fantastic course notes on Optimization-based data analysis from NYU 2016 website and 2017 website.
Ruder Gradient Descent Overview (PDF here)	Great overview of gradient descent algorithms.
Bottou Large-Scale Optimization	Notes on Optimization from Bottou, Curtis, and Nocedal. Downloaded from arXiv.

Information Theory

File	Description
Chris Olah Visual Information Theory	As always, Chris Olah creates an amazing presentation both in words and images. Goal is to visualize key information theory concepts.
Cover and Thomas Ch2 - Entropy and Information	The extremely well-written introductory chapter from the classic information theory textbook.
Cover and Thomas Ch11 - Info Theory and Statistics	The information theory and statistics chapter from the classic information theory textbook.
Deriving Probability Distributions from Maximum Entropy Principle	It feels slimy and self-serving to include this, but I wrote this post to better understand how information theory can be used to understand/derive common probability distributions from first principles.
Deriving the information entropy of the multivariate Gaussian	Another blog post I wrote to try to understand information theory + statistics.

Classic Machine Learning

Textbooks, Lectures, and Course Notes

File	Description
Math for ML Book	Math-first but highly accessible intro textbook for machine learning by Deisenroth, Faisal, and Ong, available on GitHub.
Learning from Data by Abu-Mostafa	“A short course. Not a hurried course.” on machine learning. A nice first treatment that is concise but fairly rigorous. Also has videos organized by topic.
Bishop’s Pattern Recognition and Machine Learning	This is a classic ML text, and has now finally been released (legally) for free online.
CS 229 Lecture Notes	Classic note set from Andrew Ng’s amazing grad-level intro to ML: CS229.
CS 229 TA Cheatsheet 2018	TA cheatsheet from the 2018 offering of Stanford’s Machine Learning Course, Github repo here.
ESL and ISL from Hastie et al.	Beginner (ISL) and advanced (ESL) presentations of classic machine learning from world-class stats professors. Slides and videos for a MOOC on ISL are available here.
Foundations of Data Science textbook and videos	This mini-course appears to have developed out of CMU’s “CS Theory for the Information Age” 2012 site, which I think is a better name for this. It’s a strong upper-undergrad or intro-grad student math class covering foundations for high-dimensional data algorithms. Another class using the textbook is here. High-dimensional probability section is cool.
Tim Roughgarden’s Modern Algorithmic Toolbox	CS 168: Modern Algorithmic Toolbox has fantastic coverage of PCA, SVD, Compressive Sensing, Tensors, and other core ML tools.

Special Topics and Blog Posts

File	Description
Roughgarden SVD Notes	Really great presentation of SVD from Tim Roughgarden’s CS168 at Stanford.
Roughgarden PCA Notes	Really great presentation of PCA from Tim Roughgarden’s CS168 at Stanford.

Bayesian Machine Learning

File	Description
CS 228 PGM Notes	Really great course notes on Probabilistic Graphical Models from Stanford. PDF export wasn’t ideal so linking only to website.
CMU PGM Course 2019 and 2014	Nice course from CMU (10-708) that covers PGMs and – in newer offerings – relevant parts of DL, too. Has videos, scribe notes, and slides.
Blei Foundations of Graphical Models Course	Course notes on Foundations of Graphical Models from David Blei’s 2016 website.
Blei Exponential Families/Variational Inference	A couple of the course notes I particularly like from Blei’s 2011 Probabilistic Modeling Course.
Blei Variational Inference Review	Overview of Variational Inference from David Blei, available on arXiv.
Visual Exploration of Gaussian Processes	Masterclass exposition on Gaussian Processes from the always-amazing Distill.

Deep Learning

Textbooks, Lectures, and Course Notes

File	Description
Roger Grosse’s CSC321 Notes	Notes from Roger Grosse’s CSC 321 (full website here). Probably the single best intro to DL course I’ve found from any university. Notes and slides are gorgeous.
Fast.ai	Wonderful set of intro lectures + notebooks from Jeremy Howard and Rachel Thomas. In addition, Hiromi Suenaga has released excellent and self-contained notes of the whole series with timestamp links back to videos: FastAI DL Part 1, FastAI DL Part 2, and FastAI ML.
CS231N DL for Vision	Amazing notes from Andrej Karpathy, with lectures on YouTube as well.
Deep Learning Book	This textbook by Ian Goodfellow, Yoshua Bengio, and Aaron Courville is probably the closest we have to a de facto standard textbook for DL.
CS294-158 Deep Unsupervised Learning	Open course on deep unsupervised learning from Berkeley. Looks fantastic.

Special Topics and Blog Posts

File	Description
Karpathy’s Recipe for Training NNs	A great blog post that contains a bunch of little tricks for training deep neural networks
Troubleshooting Deep Neural Networks video and slides by Josh Tobin	“A field guide to fixing your model,” which has some nice tips.
Different types of convolutions by Kunlun Bai	Nice blog post providing an overview of many different types of convolutions used in deep learning.
Adversarial Examples/Robust ML Part 1, Part 2, and Part 3	The Madry lab is one of the top research groups in robust deep learning research. They put together a fantastic intro to these topics on their blog. I hope they keep making posts…
Distill Attention	Amazingly clear presentation of the attention mechanism and its (early) variants
Lilian Weng Attention post	Nice blog post on attention, self-attention, transformers, etc.
Distill Building Interpretability	Coolest visualizations of NN internals I’ve ever seen
Distill Feature Visualization	Running theme: If it’s on distill.pub, read it.
Chris Olah Understanding LSTMs	Chris Olah is a master of his craft, and here offers a fantastic overview of LSTMs and GRUs.
Intro to Federated Learning	Intro to federated learning and PySyft from Andrew Trask and others using PyTorch.
Triplet Loss and Online triplet mining blog post	Nice exposition on Olivier Moindrot’s blog
Graph Convolutional Neural Networks	Blog post on GCNNs by Thomas Kipf

Instructive Codebases

File	Description
Sebastian Raschka’s Deep Learning Models GitHub	An impressively comprehensive set of TensorFlow and PyTorch models, annotated and perusable in 80+ Jupyter Notebooks.
PyTorch Tutorials	The tutorials put out by the PyTorch developers are really fantastic. Easy to see why the community is growing so fast.
Wiseodd’s Website and Deep Generative Models GitHub	An amazing collection of deep learning implementations.

Natural Language Processing

Textbooks, Lectures, and Course Notes

File	Description
Fast.ai Intro to NLP	Code-first intro to NLP from the excellent folks at fast.ai.
CS224N Deep Learning for NLP 2017	Fantastic course notes on Deep Learning for NLP from Stanford’s CS224N. Updated noteset appears to live here.
CMU CS 11-747 Deep Learning for NLP	Fantastic course on Deep Learning for NLP from CMU’s Graham Neubig. Really great lecture videos on YouTube here.
CS224U Natural Language Understanding 2019	Another DL+NLP course at Stanford. Also has accompanying YouTube videos and a GitHub repo.

Special Topics and Blog Posts

File	Description
Chris Olah on Word Embeddings	Chris Olah explaining word embeddings and the like.
Transformers from Scratch by Peter Bloem	Nice overview of transformer architectures with some great diagrams and code.
The Illustrated Transformer	Nice visualization of how transformer networks work by Jay Alammar.
The Annotated Transformer	Harvard’s Sasha Rush created a line-by-line annotation of “Attention is All You Need” that also serves as a working notebook. Pedagogical brilliance, and it would be awesome to do this for a couple papers per year.
Goldberg’s Primer on NNs for NLP	Overview of Deep Learning for NLP from Yoav Goldberg downloaded from here.
Neubig’s Tutorial on NNs for NLP	Overview of Deep Learning for NLP from Graham Neubig. Downloaded from arXiv and pairs nicely with his course and videos.

Reinforcement Learning

Textbooks, Lectures, and Course Notes

File	Description
Sutton and Barto Open RL Book	De-facto standard intro to RL, even though the textbook is only now about to be published!
Stanford Reinforcement Learning Course by Emma Brunskill	A really great RL class from Stanford. The website has a really nice note set. Also, lecture videos are on YouTube.
Berkeley Deep Reinforcement Learning	RL class from Berkeley taught by top dogs in the field, lectures posted to YouTube.

Special Topics and Blog Posts

File	Description
Karpathy’s Pong From Pixels	Andrej Karpathy has a real gift for didactics. This is a self-contained explanation of deep reinforcement learning sufficient to understand a basic Atari agent.
Weng’s A (Long) Peek into RL	A nice blog post covering the foundations of reinforcement learning
OpenAI’s Intro to RL	The introductory tutorial for OpenAI’s new “Spinning Up in Deep RL” website.

Applications in Biology and Medicine

File	Description
Medical ML Datasets GitHub	GitHub repo of a bunch of medical ML datasets, compiled by Andrew Beam.
ML for protein design GitHub	Nice GitHub repo put together by Kevin Yang, covering a bunch of ground in the ML for proteins space.
Best Practices in Single-Cell RNA-Seq Tutorial	Excellent tutorial on single-cell RNA-seq, walking through current best practices at every stage of scRNA-seq analysis.

Miscellaneous Websites

File	Description
Chris Olah’s Blog	Essentially everything on here is gold. I am so grateful for the hours he must put into these posts.
distill.pub	Distill navigates a really interesting gap between super-blog and research journal. I wish that we had more publications like this.
Sebastian Ruder’s blog	Sebastian has produced a lot of really great explanations, like the one on gradient descent methods I linked to above. He also maintains a website tracking progress on NLP benchmarks
Lilian Weng’s Blog	Great blog on RL, meta-learning, and other topics.
ShortScience	This website contains public summaries/discussions of machine learning, CS, and biology papers.
Berkeley AI Research (BAIR) Blog	BAIR produces a lot of great research, and uses this blog to release more accessible presentations of their papers.
Off the Convex Path	Nice blog on machine learning and optimization.
Ferenc Huszár’s blog	Pretty popular blog that has a lot of explorations/musings on ML from an author with a rigorous mathematical perspective
Thibaut Lienart’s Blog	This website has some notes on math and optimization that seem interesting.