📖 14 books · 🎓 6 courses · ▶ 3 videos · ◎ 3 online resources

All Resources


Resource — Title (Authors) · type · role · phase

• Axler — Linear Algebra Done Right (Sheldon Axler) · book · primary · phase 1
• Strang — Introduction to Linear Algebra (Gilbert Strang) · book · secondary · phase 1
• MIT 18.06 — MIT 18.06 Linear Algebra (Gilbert Strang) · video · secondary · phase 1
• Hubbard & Hubbard — Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach (John H. Hubbard, Barbara Burke Hubbard) · book · primary · phase 1
• Spivak — Calculus on Manifolds (Michael Spivak) · book · alternative · phase 1
• Blitzstein & Hwang — Introduction to Probability (Joseph Blitzstein, Jessica Hwang) · book · primary · phase 1
• Harvard Stat 110 — Harvard Stat 110: Probability (Joseph Blitzstein) · video · secondary · phase 1
• Wasserman — All of Statistics: A Concise Course in Statistical Inference (Larry Wasserman) · book · reference · phase 1
• Skiena — The Algorithm Design Manual (Steven Skiena) · book · primary · phase 2
• CS:APP — Computer Systems: A Programmer's Perspective (Randal E. Bryant, David R. O'Hallaron) · book · primary · phase 2
• PRML — Pattern Recognition and Machine Learning (Christopher Bishop) · book · primary · phase 3
• ESL — The Elements of Statistical Learning (Trevor Hastie, Robert Tibshirani, Jerome Friedman) · book · alternative · phase 3
• Shalev-Shwartz & Ben-David — Understanding Machine Learning: From Theory to Algorithms (Shai Shalev-Shwartz, Shai Ben-David) · book · reference · phase 3
• Prince — Understanding Deep Learning (Simon Prince) · book · primary · phase 4
• Nielsen — Neural Networks and Deep Learning (Michael Nielsen) · online · secondary · phase 4
• CS25 — Stanford CS25: Transformers United (Jure Leskovec) · course · primary · phase 4
• Illustrated Transformer — The Illustrated Transformer (Jay Alammar) · online · secondary · phase 4
• Let's build GPT — Let's build GPT (Andrej Karpathy) · online · secondary · phase 4
• CMU 10-714 — CMU 10-714: Deep Learning Systems (J. Zico Kolter, Tianqi Chen) · course · primary · phase 5
• GPU Mode — GPU Mode (Jeremy Howard) · video · primary · phase 5
• Hwu & Kirk — Programming Massively Parallel Processors (David B. Kirk, Wen-mei W. Hwu) · book · secondary · phase 5
• CS324 — Stanford CS324: Large Language Models (Tatsunori Hashimoto, Percy Liang) · course · primary · phase 6
• COS 597G — Princeton COS 597G: Understanding Large Language Models (Sanjeev Arora) · course · alternative · phase 6
• Sutton & Barto — Reinforcement Learning: An Introduction (Richard S. Sutton, Andrew G. Barto) · book · primary · phase 6
• Spinning Up — OpenAI Spinning Up in Deep RL (OpenAI) · course · secondary · phase 6
• CS285 — UC Berkeley CS285: Deep Reinforcement Learning (Sergey Levine) · course · reference · phase 6

📖 Books

Primary textbooks and reference materials

primary

Axler

Linear Algebra Done Right

by Sheldon Axler

Axler's text is renowned for its decision to banish determinants to the end of the book. This forces the student to understand linear maps, eigenvalues, and inner product spaces based on their geometric properties rather than algebraic formulas. This "operator-centric" view aligns perfectly with modern deep learning, where layers are viewed as operators acting on function spaces. It builds the mental models necessary to understand concepts like Low-Rank Adaptation (LoRA) and the spectral properties of weight matrices, which are crucial for understanding model stability and compression.
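
The spectral point can be made concrete. A minimal numpy sketch (illustrative only; the matrix and decay rate are made up) showing that a matrix with fast-decaying singular values is well captured by a low-rank factorization, which is the premise behind LoRA-style adapters:

```python
import numpy as np

rng = np.random.default_rng(0)

# A synthetic "weight matrix" with a rapidly decaying spectrum.
U, _ = np.linalg.qr(rng.normal(size=(64, 64)))
V, _ = np.linalg.qr(rng.normal(size=(64, 64)))
s = np.exp(-np.arange(64) / 4.0)             # singular values decay geometrically
W = U @ np.diag(s) @ V.T

# Best rank-r approximation (Eckart-Young theorem): truncate the SVD.
r = 8
Uw, sw, Vwt = np.linalg.svd(W)
W_r = Uw[:, :r] @ np.diag(sw[:r]) @ Vwt[:r, :]

rel_err = np.linalg.norm(W - W_r) / np.linalg.norm(W)
print(f"rank-{r} relative error: {rel_err:.4f}")  # small, because the spectrum decays fast
```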

secondary

Strang

Introduction to Linear Algebra

by Gilbert Strang

While Axler provides rigor, Strang provides the connection to computation. His focus on the "Four Fundamental Subspaces" provides a concrete mental image of how matrices manipulate data.
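
The Four Fundamental Subspaces can be touched directly in numpy. A small illustrative sketch (the matrix is made up) showing the dimension count and the orthogonality of the null space to the rows:

```python
import numpy as np

# A 3x4 matrix of rank 2: its four fundamental subspaces live in R^3 and R^4.
A = np.array([[1., 2., 0., 1.],
              [2., 4., 1., 3.],
              [3., 6., 1., 4.]])   # row 3 = row 1 + row 2, so rank(A) = 2

r = np.linalg.matrix_rank(A)
m, n = A.shape
print(f"rank = {r}")
print(f"column space: dim {r} in R^{m};  left null space: dim {m - r}")
print(f"row space:    dim {r} in R^{n};  null space:      dim {n - r}")

# The null space is orthogonal to the row space: A @ x = 0 for null-space x.
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[r:].T                # last n - r right singular vectors
assert np.allclose(A @ null_basis, 0)
```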

primary

Hubbard & Hubbard

Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach

by John H. Hubbard, Barbara Burke Hubbard

This book is legendary among mathematics enthusiasts for treating the derivative not just as a number or a vector, but as a linear transformation (the Jacobian matrix) that best approximates a function near a point. This viewpoint is exactly how automatic differentiation engines (like PyTorch's autograd) operate: they compute Jacobian-vector products. It integrates linear algebra and calculus seamlessly, just as they appear together in machine learning, and its proofs let a researcher understand *when* optimization might fail (e.g., non-differentiable points like ReLU at 0, or saddle points).
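
That defining property of the Jacobian is checkable in a few lines. An illustrative numpy sketch (the function f is made up) verifying that f(v + h) ≈ f(v) + J(v)h for a small displacement h, where J(v)h is exactly a Jacobian-vector product:

```python
import numpy as np

def f(v):
    """A map R^2 -> R^2."""
    x, y = v
    return np.array([x * y, np.sin(x) + y**2])

def jacobian(v):
    """Hand-derived Jacobian of f at v: the linear map that best approximates f."""
    x, y = v
    return np.array([[y,         x    ],
                     [np.cos(x), 2 * y]])

v = np.array([1.0, 2.0])
h = np.array([1e-4, -2e-4])          # a small displacement

# Defining property of the derivative: f(v + h) = f(v) + J(v) @ h + o(|h|).
lhs = f(v + h)
rhs = f(v) + jacobian(v) @ h
print("error:", np.linalg.norm(lhs - rhs))   # second-order small, O(|h|^2)
```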

alternative

Spivak

Calculus on Manifolds

by Michael Spivak

A concise, dense classic. Though elegant, it is harder going on one's own; Hubbard & Hubbard is generally preferred for self-study due to its more explanatory style and unified approach.

primary

Blitzstein & Hwang

Introduction to Probability

by Joseph Blitzstein, Jessica Hwang

Based on the famous Harvard Stat 110 course. This book is unrivaled in building *intuition*. It emphasizes "story proofs"—understanding *why* a formula works through narrative logic rather than algebraic manipulation.
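
A flavor of the style, for readers who have not seen it: Vandermonde's identity

```latex
\sum_{k=0}^{r} \binom{m}{k} \binom{n}{r-k} = \binom{m+n}{r}
```

can be proved with no algebra. To choose $r$ people from a group of $m$ men and $n$ women, the right side counts directly; the left side conditions on the number $k$ of men chosen ($\binom{m}{k}$ ways) and picks the remaining $r-k$ from the women ($\binom{n}{r-k}$ ways), then sums over $k$.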

reference

Wasserman

All of Statistics: A Concise Course in Statistical Inference

by Larry Wasserman

This book covers a massive amount of ground—from basic probability to VC dimension and bootstrapping—very quickly. It is an excellent bridge to the "Elements of Statistical Learning."
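
Bootstrapping, one of the book's topics, benefits from seeing it run. A minimal numpy sketch (the data and statistic are made up): estimate a confidence interval for a median by resampling with replacement.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.exponential(scale=2.0, size=200)   # skewed sample; true median = 2 ln 2

# The bootstrap: resample the data with replacement, recompute the statistic,
# and use the spread of the replicates to estimate its sampling uncertainty.
B = 2000
medians = np.array([np.median(rng.choice(data, size=data.size, replace=True))
                    for _ in range(B)])

lo, hi = np.percentile(medians, [2.5, 97.5])   # percentile confidence interval
print(f"sample median = {np.median(data):.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```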

primary

Skiena

The Algorithm Design Manual

by Steven Skiena

Unlike the standard *Introduction to Algorithms* (CLRS), which is encyclopedic and theoretical, Skiena's book focuses on the *design* process and practical "war stories." It teaches you how to recognize a problem type and select the right tool, which is critical for research interviews and actual engineering work.

primary

CS:APP

Computer Systems: A Programmer's Perspective

by Randal E. Bryant, David R. O'Hallaron

This is the standard text for understanding how software interacts with hardware.

primary

PRML

Pattern Recognition and Machine Learning

by Christopher Bishop

This book is the gold standard for the **Bayesian** perspective. It explains regularization not just as a heuristic, but as a prior belief on the model parameters.
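
Concretely, L2 regularization (ridge regression) is MAP estimation under a zero-mean Gaussian prior on the weights. A minimal numpy sketch (the data and λ are made up), using the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

# In the Bayesian reading, lam = noise_variance / prior_variance.
lam = 1.0

# Ridge / MAP solution: argmin ||y - Xw||^2 + lam * ||w||^2.
# Identical to the posterior mode under prior w ~ N(0, (1/lam) I).
w_map = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print("ridge/MAP weights:", np.round(w_map, 3))
```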

alternative

ESL

The Elements of Statistical Learning

by Trevor Hastie, Robert Tibshirani, Jerome Friedman

This text is more "frequentist" and statistical, excellent for understanding the bias-variance tradeoff and decision trees.

reference

Shalev-Shwartz & Ben-David

Understanding Machine Learning: From Theory to Algorithms

by Shai Shalev-Shwartz, Shai Ben-David

This book is mathematically dense and focuses on **PAC Learning** (Probably Approximately Correct). It answers the fundamental question: "Under what conditions is learning even possible?"
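
The book's central objects are sample-complexity bounds. For a finite hypothesis class $\mathcal{H}$ in the realizable setting, for example, empirical risk minimization succeeds once

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right)
```

examples are seen: with probability at least $1-\delta$, the learned hypothesis has true error at most $\epsilon$.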

primary

Prince

Understanding Deep Learning

by Simon Prince

While Goodfellow's *Deep Learning* (2016) is a classic, it predates the Transformer revolution. Prince's book is modern, visually intuitive, and covers Transformers, Diffusion, and Generative AI. It is the superior choice for a student starting in 2025.

secondary

Hwu & Kirk

Programming Massively Parallel Processors

by David B. Kirk, Wen-mei W. Hwu

primary

Sutton & Barto

Reinforcement Learning: An Introduction

by Richard S. Sutton, Andrew G. Barto

The foundational text of the field.


🎓 Courses

University courses and MOOCs

primary

CS25

Stanford CS25: Transformers United

by Jure Leskovec

primary

CMU 10-714

CMU 10-714: Deep Learning Systems

by J. Zico Kolter, Tianqi Chen

This is arguably the most valuable course for an aspiring research engineer. You build a deep learning library (called "Needle") from scratch.
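
To convey the flavor of what such a course has you build (this is a toy sketch in the micrograd style, not Needle's actual API): a scalar reverse-mode autograd engine in a few dozen lines.

```python
class Value:
    """A scalar node in a computation graph with reverse-mode autodiff."""
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents      # upstream nodes
        self._grad_fns = grad_fns    # local derivative w.r.t. each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Topological order, then propagate gradients from the output back.
        topo, seen = [], set()
        def build(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            for parent, grad_fn in zip(v._parents, v._grad_fns):
                parent.grad += grad_fn(v.grad)

x, y = Value(3.0), Value(4.0)
z = x * y + x          # z = xy + x, so dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```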

primary

CS324

Stanford CS324: Large Language Models

by Tatsunori Hashimoto, Percy Liang

alternative

COS 597G

Princeton COS 597G: Understanding Large Language Models

by Sanjeev Arora

secondary

Spinning Up

OpenAI Spinning Up in Deep RL

by OpenAI

While the original repo is older and no longer evolving quickly, its educational essays remain a clear conceptual introduction, and modern single-file implementations such as CleanRL are among the best ways to learn PPO, DQN, and SAC in code.

reference

CS285

UC Berkeley CS285: Deep Reinforcement Learning

by Sergey Levine


▶ Videos

Video lectures and tutorials

secondary

MIT 18.06

MIT 18.06 Linear Algebra

by Gilbert Strang

secondary

Harvard Stat 110

Harvard Stat 110: Probability

by Joseph Blitzstein

primary

GPU Mode

GPU Mode

by Jeremy Howard

Practical, modern GPU optimization. Community-driven resource with lectures, reading groups, and an extensive collection of CUDA/GPU programming materials.


◎ Online Resources

Online articles, tutorials, and interactive resources

secondary

Nielsen

Neural Networks and Deep Learning

by Michael Nielsen

A gentle, free online introduction to neural networks and backpropagation.
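
The core idea Nielsen builds up to, in miniature: backpropagation is the chain rule applied stage by stage through the forward computation. An illustrative sketch for a single sigmoid neuron (the numbers are arbitrary), checked against a finite-difference approximation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One neuron: y = sigmoid(w*x + b), squared loss against target t.
w, b, x, t = 0.5, -0.2, 1.5, 1.0

# Forward pass
z = w * x + b
y = sigmoid(z)
loss = (y - t) ** 2

# Backward pass: chain rule, stage by stage.
dloss_dy = 2 * (y - t)
dy_dz = y * (1 - y)              # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
dloss_dw = dloss_dy * dy_dz * x  # dz/dw = x
dloss_db = dloss_dy * dy_dz      # dz/db = 1

# Sanity check against central finite differences in w.
eps = 1e-6
num = ((sigmoid((w + eps) * x + b) - t) ** 2
       - (sigmoid((w - eps) * x + b) - t) ** 2) / (2 * eps)
print("analytic:", dloss_dw, "numeric:", num)
```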

secondary

Illustrated Transformer

The Illustrated Transformer

by Jay Alammar

secondary

Let's build GPT

Let's build GPT

by Andrej Karpathy


Resources by Phase

See which resources are used in each curriculum phase

1 The Mathematical Substrate 8 resources
  • primary Axler — Linear Algebra Done Right
  • secondary Strang — Introduction to Linear Algebra
  • secondary MIT 18.06 — MIT 18.06 Linear Algebra
  • primary Hubbard & Hubbard — Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach
  • alternative Spivak — Calculus on Manifolds
  • primary Blitzstein & Hwang — Introduction to Probability
  • secondary Harvard Stat 110 — Harvard Stat 110: Probability
  • reference Wasserman — All of Statistics: A Concise Course in Statistical Inference
2 CS Fundamentals & Systems 2 resources
  • primary Skiena — The Algorithm Design Manual
  • primary CS:APP — Computer Systems: A Programmer's Perspective
3 Classical ML Theory 3 resources
  • primary PRML — Pattern Recognition and Machine Learning
  • alternative ESL — The Elements of Statistical Learning
  • reference Shalev-Shwartz & Ben-David — Understanding Machine Learning: From Theory to Algorithms
4 Deep Learning 5 resources
  • primary Prince — Understanding Deep Learning
  • secondary Nielsen — Neural Networks and Deep Learning
  • primary CS25 — Stanford CS25: Transformers United
  • secondary Illustrated Transformer — The Illustrated Transformer
  • secondary Let's build GPT — Let's build GPT
5 Frontier Systems 3 resources
  • primary CMU 10-714 — CMU 10-714: Deep Learning Systems
  • primary GPU Mode — GPU Mode
  • secondary Hwu & Kirk — Programming Massively Parallel Processors
6 Frontier Research Topics 5 resources
  • primary CS324 — Stanford CS324: Large Language Models
  • alternative COS 597G — Princeton COS 597G: Understanding Large Language Models
  • primary Sutton & Barto — Reinforcement Learning: An Introduction
  • secondary Spinning Up — OpenAI Spinning Up in Deep RL
  • reference CS285 — UC Berkeley CS285: Deep Reinforcement Learning
7 Research & Portfolio 0 resources

Recommended Resource Summary

Primary resources for each phase (from the curriculum conclusion)

Area — Recommended Resource: Reasoning

• Linear Algebra — Linear Algebra Done Right (Axler): geometric intuition for latent spaces.
• Calculus — Hubbard & Hubbard: rigorous treatment of the Jacobian/Hessian.
• Probability — Introduction to Probability (Blitzstein & Hwang): best intuition for random variables.
• Algorithms — The Algorithm Design Manual (Skiena): practical design focus over theory.
• ML Theory — Pattern Recognition and Machine Learning (Bishop): Bayesian foundation is essential.
• Deep Learning — Understanding Deep Learning (Prince): most up-to-date (Transformers/GenAI).
• Systems — CMU 10-714 (Needle): build a framework from scratch.
• RL — Sutton & Barto: the foundational text of the field.
• GPU/CUDA — GPU Mode (YouTube): practical, modern GPU optimization.