tinygrad/tinygrad

A deep learning stack that fits in your head

tinygrad is what happens when you keep PyTorch's ergonomics but make the entire compiler small enough to read in an afternoon.

33k stars · Python · ML Frameworks
Velocity (7d): +16 ★/day · Trend: steady

What it does

tinygrad is a full end-to-end deep learning framework: tensors with autograd, an IR and compiler that fuses and lowers kernels, JIT execution, plus nn, optim, and datasets for real training. It runs on everything from CPU and CUDA to Metal, AMD, Qualcomm, and WebGPU. The pitch is simple: PyTorch-like API, but every layer of the stack is visible and hackable.

The interesting bit

The project borrows from three heavyweights (PyTorch’s feel, JAX’s functional IR-based autodiff, TVM’s scheduling and codegen), then strips away the parts you can’t read on a plane. The “laziness” demo is telling: a matmul written in eager style gets fused into a single kernel, and you can set the DEBUG=3 or DEBUG=4 environment variable to watch the compiler think, with the higher level printing more detail. That’s the transparency the README keeps selling.
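
To make the laziness idea concrete, here is a toy sketch in plain Python. It is not tinygrad's implementation or API: the `Lazy` class, its methods, and the `KERNELS_RUN` counter are hypothetical names invented for illustration, and it handles only elementwise ops on flat lists. Operations merely record an expression graph; nothing is computed until `realize()` evaluates the whole graph in one pass, which stands in for emitting a single fused kernel.

```python
# Toy illustration of lazy evaluation with fusion; NOT tinygrad's real internals.
# All names here (Lazy, realize, KERNELS_RUN) are hypothetical.
import operator

KERNELS_RUN = 0  # counts how many "kernels" (evaluation passes) were launched


class Lazy:
    """A 1-D 'tensor' that records elementwise ops instead of running them."""

    def __init__(self, data=None, op=None, srcs=()):
        self.data = data  # concrete list of floats for leaves, else None
        self.op = op      # binary function for interior nodes
        self.srcs = srcs  # input nodes

    def __add__(self, other):
        return Lazy(op=operator.add, srcs=(self, other))

    def __mul__(self, other):
        return Lazy(op=operator.mul, srcs=(self, other))

    def _eval_at(self, i):
        # Evaluate the whole recorded expression for one element index.
        if self.data is not None:
            return self.data[i]
        return self.op(*(s._eval_at(i) for s in self.srcs))

    def _length(self):
        return len(self.data) if self.data is not None else self.srcs[0]._length()

    def realize(self):
        # One loop over the output computes every recorded op per element:
        # the toy equivalent of one fused kernel, with no intermediate buffers.
        global KERNELS_RUN
        KERNELS_RUN += 1
        return [self._eval_at(i) for i in range(self._length())]


a = Lazy([1.0, 2.0, 3.0])
b = Lazy([4.0, 5.0, 6.0])
c = (a * b + a) * b      # three ops recorded, nothing computed yet
assert KERNELS_RUN == 0
result = c.realize()     # a single pass evaluates the whole expression
print(result, KERNELS_RUN)  # [20.0, 60.0, 126.0] 1
```

An eager version of the same expression would run three separate loops and allocate two temporaries. A real compiler makes the same trade at a much larger scale, which is what the DEBUG output lets you inspect.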

Key highlights

  • ~25 low-level ops are all a new accelerator needs to implement
  • TinyJit captures and replays kernels at function scope
  • BEAM search over kernels for scheduling, plus process-replay tests to catch compiler regressions
  • Contributing guide is unusually blunt: no code golf, no whitespace PRs, benchmark your “speedups,” and AI agents must include the word ORANGE in commits
  • Cash bounties for improvements, with a stated preference for 3-line features over 300-line ones
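
The capture-and-replay idea behind a function-scoped JIT can be sketched in plain Python. This is a toy under stated assumptions, not TinyJit's implementation: the `toy_jit` decorator, the `launch` helper, and the `_recording` global are hypothetical names, and a "kernel" here is just a function that mutates a list in place. The first call runs the Python body and records each kernel it launches; later calls skip the Python body and replay the recorded kernels on the new input.

```python
# Toy capture-and-replay JIT; NOT TinyJit's real implementation.
# toy_jit, launch, and _recording are hypothetical names for illustration.

_recording = None  # list collecting kernels while a capture is in progress


def launch(kernel, buf):
    """Run a 'kernel' on buf, recording it if a capture is active."""
    if _recording is not None:
        _recording.append(kernel)
    kernel(buf)


def toy_jit(fn):
    captured = []                 # kernels recorded on the first call
    stats = {"python_runs": 0}    # how often the Python body actually ran

    def wrapper(buf):
        global _recording
        if not captured:
            # First call: execute the Python body and capture its launches.
            _recording = captured
            try:
                stats["python_runs"] += 1
                fn(buf)
            finally:
                _recording = None
        else:
            # Later calls: replay captured kernels; the Python body is skipped.
            for kernel in captured:
                kernel(buf)
        return buf

    wrapper.stats = stats
    return wrapper


def double(buf):
    for i in range(len(buf)):
        buf[i] *= 2


def add_one(buf):
    for i in range(len(buf)):
        buf[i] += 1


@toy_jit
def step(buf):
    launch(double, buf)
    launch(add_one, buf)


first = step([1, 2, 3])       # captured: runs the Python body once
second = step([10, 20, 30])   # replayed: same kernels, new input
print(first, second, step.stats["python_runs"])  # [3, 5, 7] [21, 41, 61] 1
```

The sketch skips what makes a real JIT hard: checking that new inputs match the shapes and types seen during capture, and handling Python-side control flow that would change which kernels get launched.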

Caveats

  • Not 1.0 yet; the README admits bugs are still being found
  • No full vmap/pmap equivalent yet, so JAX migrants may miss some functional transforms
  • Code outside the tinygrad/ core is “not well tested”, so treat extra/ as experimental

Verdict

Grab this if you want to understand how a modern DL stack actually works, or if you’re targeting an obscure accelerator and need to write a ~25-op backend. Skip it if you need production stability, full JAX-style transforms, or a large ecosystem of prebuilt models.
