AviSoori1x/makeMoE
A PyTorch implementation of a sparse mixture-of-experts autoregressive character-level language model, inspired by Andrej Karpathy's makemore.

This repository implements a sparse mixture-of-experts (MoE) language model from scratch using only PyTorch. The model replaces the standard feed-forward layer with a router and multiple expert networks; the router selects which experts process each token using top-k or noisy top-k gating. The model is character-level and autoregressive, and it trains on Shakespeare-like text, following the approach of the original makemore project. The repository includes both a standalone Python module and an explanatory Jupyter notebook that walks through the architecture.
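
The router-plus-experts idea described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class names (NoisyTopKRouter, SparseMoE), the parameter names (n_embd, num_experts, top_k), the 4x hidden width of each expert, and the choice to scale the noise with a softplus over a second linear layer are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyTopKRouter(nn.Module):
    """Hypothetical router: keeps the top_k experts per token, adding noise in training."""

    def __init__(self, n_embd, num_experts, top_k):
        super().__init__()
        self.top_k = top_k
        self.route_linear = nn.Linear(n_embd, num_experts)  # router logits
        self.noise_linear = nn.Linear(n_embd, num_experts)  # per-expert noise scale (assumed)

    def forward(self, x):
        logits = self.route_linear(x)
        if self.training:
            # Noisy top-k: perturb the logits with Gaussian noise of learned scale.
            noise_std = F.softplus(self.noise_linear(x))
            logits = logits + torch.randn_like(logits) * noise_std
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)
        # Fill non-selected experts with -inf so softmax assigns them zero weight.
        sparse = torch.full_like(logits, float("-inf")).scatter(-1, top_idx, top_vals)
        return F.softmax(sparse, dim=-1), top_idx


class SparseMoE(nn.Module):
    """Hypothetical drop-in replacement for a feed-forward layer."""

    def __init__(self, n_embd, num_experts, top_k):
        super().__init__()
        self.router = NoisyTopKRouter(n_embd, num_experts, top_k)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(n_embd, 4 * n_embd),
                nn.ReLU(),
                nn.Linear(4 * n_embd, n_embd),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x):
        gates, top_idx = self.router(x)  # gates: (B, T, E), top_idx: (B, T, top_k)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = (top_idx == i).any(dim=-1)  # (B, T): tokens routed to expert i
            if mask.any():
                weight = gates[mask][:, i].unsqueeze(-1)  # gate weight per routed token
                out[mask] += weight * expert(x[mask])
        return out
```

Under these assumptions, SparseMoE(n_embd=16, num_experts=4, top_k=2) applied to a tensor of shape (batch, time, 16) returns a tensor of the same shape, and each token's gate weights are nonzero for only two experts. Looping over experts and masking tokens keeps the sketch readable; it is not meant to be an efficient dispatch scheme.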