karpathy/makemore

An educational autoregressive character-level language model, with architectures ranging from a bigram model to a Transformer.

4k stars · Python · Language Models · Learning
Velocity (7d): +2.7 ★ / day · Trend: steady

The project implements several neural network architectures for character-level language modeling, following seminal papers such as Vaswani et al. 2017 for the Transformer. It trains on a text dataset and generates new examples that resemble the training set. Supported models are bigram, MLP, RNN, LSTM, GRU, and Transformer. It is built with PyTorch as the sole dependency and is designed primarily for educational demonstration.
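To make the idea of character-level autoregressive generation concrete, here is a minimal sketch of the simplest model in that list, a bigram model. It is not makemore's code: it uses raw transition counts and the Python standard library rather than PyTorch, and the function names (`train_bigram`, `sample_word`), the `.` boundary token, and the toy word list are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def train_bigram(words):
    """Count character-to-character transitions; '.' marks word start and end."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        chars = ["."] + list(w) + ["."]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def sample_word(counts, rng, max_len=20):
    """Sample one word; each character depends only on the previous one."""
    out, prev = [], "."
    while len(out) < max_len:
        candidates = list(counts[prev].keys())
        weights = [counts[prev][c] for c in candidates]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        if nxt == ".":  # end-of-word token was sampled
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)

# Toy training set (hypothetical); a real run would use a much larger word list.
words = ["emma", "olivia", "ava", "isabella", "sophia"]
counts = train_bigram(words)
rng = random.Random(0)
samples = [sample_word(counts, rng) for _ in range(5)]
print(samples)
```

The neural variants listed above replace the count table with a learned network that predicts the next character, typically from a longer context, but the sampling loop keeps the same shape: predict a distribution, draw a character, append it, repeat until the end token.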
