
AviSoori1x/makeMoE

A PyTorch implementation of a sparse mixture-of-experts autoregressive character-level language model, inspired by Andrej Karpathy's makemore.

807 stars · Jupyter Notebook · Language Models · ML Frameworks
Velocity (7d): +0.9 ★/day · Trend: steady

This repository implements a sparse mixture-of-experts (MoE) language model from scratch using only PyTorch. The model replaces the standard feed-forward layer with a router and multiple expert networks, using top-k and noisy top-k gating mechanisms. The character-level autoregressive model trains on Shakespeare-like text, similar to the original makemore project. It includes both a standalone Python module and an explanatory Jupyter notebook walking through the architecture.
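The router-plus-experts layer described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the class names `NoisyTopKRouter` and `SparseMoE`, the parameter names, the expert MLP shape, and the softplus-scaled Gaussian noise are all assumptions chosen to show one common way top-k and noisy top-k gating are written in PyTorch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyTopKRouter(nn.Module):
    """Hypothetical router: keeps the top_k experts per token."""

    def __init__(self, n_embed, num_experts, top_k, noisy=True):
        super().__init__()
        self.top_k = top_k
        self.noisy = noisy
        self.route = nn.Linear(n_embed, num_experts)  # router logits
        self.noise = nn.Linear(n_embed, num_experts)  # learned noise scale

    def forward(self, x):
        logits = self.route(x)
        if self.noisy and self.training:
            # Assumed formulation: Gaussian noise scaled by softplus(noise(x)).
            logits = logits + torch.randn_like(logits) * F.softplus(self.noise(x))
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)
        # Non-selected experts get -inf so softmax assigns them zero weight.
        sparse = torch.full_like(logits, float("-inf")).scatter(-1, top_idx, top_vals)
        return F.softmax(sparse, dim=-1), top_idx


class SparseMoE(nn.Module):
    """Hypothetical drop-in replacement for a feed-forward layer."""

    def __init__(self, n_embed=32, num_experts=4, top_k=2, noisy=True):
        super().__init__()
        self.router = NoisyTopKRouter(n_embed, num_experts, top_k, noisy)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(n_embed, 4 * n_embed),
                nn.ReLU(),
                nn.Linear(4 * n_embed, n_embed),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x):
        gates, top_idx = self.router(x)  # gates: (batch, time, num_experts)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = (top_idx == i).any(dim=-1)  # tokens routed to expert i
            if mask.any():
                weight = gates[mask][:, i].unsqueeze(-1)
                out[mask] += weight * expert(x[mask])
        return out
```

Calling `SparseMoE()(torch.randn(2, 5, 32))` returns a tensor of the same shape as its input. Each token is processed only by its `top_k` selected experts, and their outputs are combined using the softmax gate weights. In this sketch the noise is applied only in training mode, so evaluation uses plain top-k gating.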
