
Om-Alve/smolGPT

A minimal PyTorch implementation for training a small GPT language model from scratch with modern architecture features.

1.5k stars · Python · Language Models
Velocity · 7d: +2.9 ★ / day · Trend: steady

This repository provides a from-scratch implementation of a GPT model in pure PyTorch with no abstraction overhead. It includes modern architectural components such as flash attention, RMSNorm, SwiGLU activation, and optional rotary embeddings (RoPE). The project supports the full training pipeline including mixed precision training, gradient accumulation, warmup scheduling, and a built-in TinyStories dataset processor with SentencePiece tokenizer integration.
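Two of the components named above, RMSNorm and warmup scheduling, are small enough to sketch in plain Python. This is an illustration under stated assumptions, not the repository's code: the function names `rms_norm` and `warmup_lr`, the `eps` default, and the choice of a linear warmup followed by a constant rate are all hypothetical, and the real project operates on PyTorch tensors rather than lists.

```python
import math


def rms_norm(x, weight=None, eps=1e-6):
    """RMSNorm for one vector: x / sqrt(mean(x^2) + eps), times an optional gain.

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    The eps default is an assumption for this sketch.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    if weight is None:
        weight = [1.0] * len(x)  # identity gain
    return [v * inv_rms * w for v, w in zip(x, weight)]


def warmup_lr(step, base_lr, warmup_steps):
    """Linear warmup to base_lr over warmup_steps, then constant.

    Any decay phase after warmup is left out; the schedule shape is assumed.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps


if __name__ == "__main__":
    normed = rms_norm([3.0, 4.0])
    print(normed)  # output vector has root-mean-square close to 1
    print([warmup_lr(s, 1e-3, 4) for s in range(6)])
```

After normalization the vector has a root-mean-square value close to 1 regardless of its input scale, which is the property RMSNorm relies on. The warmup function ramps the learning rate up over the first few steps so that early, noisy gradients do not cause large parameter updates.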
