Is OpenMythos open source?

Yes — kyegomez/OpenMythos is open source, released under the MIT license.

What language is OpenMythos written in?

kyegomez/OpenMythos is primarily written in Python.

How popular is OpenMythos?

kyegomez/OpenMythos has 14.7k stars on GitHub, and its star growth is currently accelerating.

Where can I find OpenMythos?

kyegomez/OpenMythos is on GitHub at https://github.com/kyegomez/OpenMythos.


kyegomez/OpenMythos

Reverse-engineering Claude Mythos as a trainable looped transformer

OpenMythos is an independent attempt to reconstruct Anthropic’s rumored Claude Mythos architecture as a trainable Recurrent-Depth Transformer with switchable attention and sparse MoE layers.

★ 14.7k stars · Python · Language Models


Velocity (7d): +106 ★ / day · Trend: ↗ accelerating


What it does

OpenMythos implements a Recurrent-Depth Transformer (also called a Looped Transformer) in PyTorch. It splits the model into three stages: a Prelude, a recurrent middle block that can be looped up to max_loop_iters times per forward pass, and a Coda. The attention layer can be switched between Grouped Query Attention and Multi-Latent Attention, and the feed-forward network is a sparse Mixture-of-Experts with both routed and shared experts. Pre-configured model sizes range from 1B to 1T parameters, and the repository includes a training script for the 3B variant on FineWeb-Edu.
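
The Prelude / looped block / Coda control flow can be illustrated with a small pure-Python sketch. This is not code from the repository: ToyBlock, LoopedModel, and the n_loops argument are hypothetical names, and each "block" is a scalar multiplication standing in for a stack of transformer layers. Only the name max_loop_iters comes from the description above.

```python
from typing import List, Optional

Vector = List[float]


class ToyBlock:
    """Hypothetical stand-in for a stack of transformer layers.

    It only scales its input and counts how often it is called.
    """

    def __init__(self, scale: float) -> None:
        self.scale = scale
        self.calls = 0

    def __call__(self, x: Vector) -> Vector:
        self.calls += 1
        return [self.scale * v for v in x]


class LoopedModel:
    """Prelude -> recurrent block (looped) -> Coda."""

    def __init__(self, max_loop_iters: int) -> None:
        self.max_loop_iters = max_loop_iters
        self.prelude = ToyBlock(2.0)    # runs once per forward pass
        self.recurrent = ToyBlock(0.5)  # same weights reused every iteration
        self.coda = ToyBlock(3.0)       # runs once per forward pass

    def forward(self, x: Vector, n_loops: Optional[int] = None) -> Vector:
        # Default to the maximum loop count; allow fewer loops on request.
        n = self.max_loop_iters if n_loops is None else n_loops
        if not 0 <= n <= self.max_loop_iters:
            raise ValueError("n_loops must be in [0, max_loop_iters]")
        h = self.prelude(x)
        for _ in range(n):
            h = self.recurrent(h)  # no new parameters are added per loop
        return self.coda(h)


model = LoopedModel(max_loop_iters=4)
out = model.forward([1.0, 2.0], n_loops=2)
print(out, model.recurrent.calls)  # [1.5, 3.0] 2
```

The parameter count is fixed by the three blocks, while the amount of computation grows with the number of loops.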

The interesting bit

The project treats the looped transformer as a discrete dynamical system and constrains the learned injection matrix A so that its spectral radius stays below 1 by construction, aiming to prevent the residual explosions that typically destabilize recurrent training. The underlying hypothesis is that looping the same weights silently in latent space, without emitting intermediate tokens, acts as an implicit chain-of-thought, so that depth extrapolation and systematic generalization can emerge from a single forward pass.
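
To see why a spectral radius below 1 keeps the state bounded, consider the simplified linear recurrence h_{t+1} = A h_t + u with a fixed input u. The sketch below assumes a diagonal A whose entries are sigmoids of unconstrained parameters, so every entry lies in (0, 1) for any parameter value. This parameterization is an assumption chosen for illustration, and the repository's actual construction may differ. The function names constrained_diag and iterate are hypothetical.

```python
import math
from typing import List


def constrained_diag(theta: List[float]) -> List[float]:
    """Map unconstrained parameters to diagonal entries in (0, 1).

    For a diagonal matrix the spectral radius is the largest absolute
    entry, so it is below 1 no matter what values theta takes.
    (Assumes moderate theta; very negative values would overflow exp.)
    """
    return [1.0 / (1.0 + math.exp(-t)) for t in theta]


def iterate(a_diag: List[float], u: List[float], n_loops: int) -> List[float]:
    """Run h <- A h + u with diagonal A, starting from h = 0."""
    h = [0.0] * len(a_diag)
    for _ in range(n_loops):
        h = [a * hi + ui for a, hi, ui in zip(a_diag, h, u)]
    return h


u = [1.0, 1.0, 1.0]

# Constrained case: each coordinate converges toward u_i / (1 - a_i).
stable = constrained_diag([-2.0, 0.0, 3.0])
print(max(abs(v) for v in iterate(stable, u, 200)))

# Unconstrained case: one entry above 1 makes the state grow geometrically.
print(iterate([1.1], [1.0], 200)[0])
```

In the constrained case the state stays below the geometric-series bound max_i u_i / (1 - a_i) however many loops are run, while the unconstrained entry of 1.1 grows without limit.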

Key highlights

Switchable attention: GQAttention with optional Flash Attention 2, or MLAttention with compressed KV caching
Sparse MoE feed-forward with configurable routed and shared experts
Stability mechanism borrowed from the Parcae architecture to keep the recurrent state bounded
Configurable loop count at inference time, trading compute for effective depth without adding parameters
Training script supports PyTorch DDP and mixed-precision training on hardware from older GPUs up to H100s
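
The routed-plus-shared expert idea from the highlights above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: experts are scalar functions, the router logits are passed in directly rather than computed by a learned layer, and top-k weights are renormalized over the selected experts, which is one common convention. The names softmax and moe_forward are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[float], float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: float,
    router_logits: Sequence[float],
    routed: Sequence[Expert],
    shared: Sequence[Expert],
    top_k: int,
) -> Tuple[float, List[int]]:
    """Sum the always-active shared experts and the top_k routed experts."""
    probs = softmax(router_logits)
    # Pick the top_k routed experts by router probability.
    order = sorted(range(len(routed)), key=lambda i: probs[i], reverse=True)
    selected = order[:top_k]
    # Renormalize so the selected experts' weights sum to 1 (an assumption).
    norm = sum(probs[i] for i in selected)
    routed_out = sum(probs[i] / norm * routed[i](x) for i in selected)
    # Shared experts process every input regardless of routing.
    shared_out = sum(expert(x) for expert in shared)
    return shared_out + routed_out, selected


routed_experts = [
    lambda v: v * 1.0,
    lambda v: v * 10.0,
    lambda v: v * 100.0,
    lambda v: v * 1000.0,
]
shared_experts = [lambda v: v + 1.0]

y, chosen = moe_forward(1.0, [0.0, 2.0, 1.0, -1.0], routed_experts, shared_experts, top_k=2)
print(chosen, y)
```

Only top_k of the routed experts run for a given input, which is what makes the layer sparse. The shared experts add a dense path that every input passes through.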

Caveats

This is explicitly a speculative, unaffiliated reconstruction based on public research and guesswork, not an official or verified implementation of any Anthropic system
The README truncates mid-sentence while describing the stability fix, leaving the full technical justification incomplete
Claims about emergent reasoning behaviors (systematic generalization, depth extrapolation) are theoretical and drawn from cited literature rather than demonstrated results in the repository itself

Verdict

Worth a look if you are researching looped transformers, test-time compute scaling, or reverse-engineering rumored architectures. Skip it if you need a production-ready language model with proven training results.
