Is TransformerLens open source?

Yes — TransformerLensOrg/TransformerLens is open source, released under the MIT license.

What language is TransformerLens written in?

TransformerLensOrg/TransformerLens is primarily written in Python.

How popular is TransformerLens?

TransformerLensOrg/TransformerLens has 3.7k stars on GitHub.

Where can I find TransformerLens?

TransformerLensOrg/TransformerLens is on GitHub at https://github.com/TransformerLensOrg/TransformerLens.

TransformerLensOrg/TransformerLens

X-ray goggles for GPT-style models

TransformerLens lets you intercept, cache, and surgically edit the hidden activations of 50+ language models as they run.

★ 3.7k stars | Python | ML Frameworks | LLMOps · Eval | Language Models

What it does

TransformerLens is a Python library for mechanistic interpretability: the practice of reverse-engineering, from a trained model's weights, the algorithms it has learned. It wraps HuggingFace models in a “hook” system that exposes every internal activation and lets you cache, replace, or remove any of them on the fly. The new TransformerBridge API (v3.0) supports 50+ architectures and preserves raw HuggingFace numerics by default; call enable_compatibility_mode() if you need the old folded-LayerNorm behavior.
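To make the hook idea concrete, here is a dependency-free toy sketch. It is not TransformerLens code: the HookPoint and TinyModel classes and the save and zero_out hooks are hypothetical names invented for illustration. It only shows the general pattern of a named pass-through point where a hook can record an activation (caching) or return a replacement (editing or ablation).

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
Hook = Callable[[Vector, str], Optional[Vector]]


class HookPoint:
    """A named pass-through spot in the forward pass where hooks can run."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.hooks: List[Hook] = []

    def __call__(self, value: Vector) -> Vector:
        for hook in self.hooks:
            result = hook(value, self.name)
            if result is not None:  # a returned value replaces the activation
                value = result
        return value


class TinyModel:
    """Two toy 'layers' (double, then add one) with a hook point after each."""

    def __init__(self) -> None:
        self.hook_double = HookPoint("hook_double")
        self.hook_shift = HookPoint("hook_shift")

    def forward(self, x: Vector) -> Vector:
        h = self.hook_double([2.0 * v for v in x])
        return self.hook_shift([v + 1.0 for v in h])


model = TinyModel()
cache: Dict[str, Vector] = {}


def save(value: Vector, name: str) -> None:
    """Caching hook: copy the activation, leave the forward pass unchanged."""
    cache[name] = list(value)


def zero_out(value: Vector, name: str) -> Vector:
    """Ablation hook: replace the activation with zeros."""
    return [0.0 for _ in value]


# Clean run with caching only.
model.hook_double.hooks.append(save)
model.hook_shift.hooks.append(save)
clean = model.forward([1.0, 2.0])

# Second run: additionally ablate the first layer's output.
model.hook_double.hooks.append(zero_out)
ablated = model.forward([1.0, 2.0])

print(clean, ablated, cache["hook_double"])
```

The design choice worth noticing is that the model code never changes between the two runs; only the list of hooks attached to a named point does. That is what makes intervention experiments cheap to set up.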

The interesting bit

The library was built by Neel Nanda, formerly of Anthropic’s interpretability team, and its design was heavily inspired by Anthropic’s internal Garcon tool. That pedigree shows: the hook system turns a black-box forward pass into something resembling a debugger with breakpoints on every tensor. There’s also experimental Mamba/SSM support, including a utility that materializes Mamba-2’s SSD-derived “effective attention” matrix so you can compare it directly against transformer attention patterns.
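The “effective attention” idea can be illustrated with a small worked example. Under the state-space duality (SSD) view, a selective SSM with a scalar decay per step, h_t = a_t h_{t-1} + B_t x_t and y_t = C_t · h_t, produces the same output as multiplying the input sequence by a lower-triangular matrix with entries M[t][s] = (C_t · B_s) · a_{s+1} ··· a_t. The sketch below is a pure-Python toy for one head and one channel that checks this equivalence; the function names are hypothetical and this is not the library's utility, whose interface is not described here.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    return sum(p * q for p, q in zip(u, v))


def effective_attention(a: Vector, B: Matrix, C: Matrix) -> Matrix:
    """M[t][s] = (C[t] . B[s]) * prod(a[s+1..t]) for s <= t, else 0."""
    T = len(a)
    M = [[0.0] * T for _ in range(T)]
    for t in range(T):
        decay = 1.0  # empty product when s == t
        for s in range(t, -1, -1):
            M[t][s] = decay * dot(C[t], B[s])
            decay *= a[s]  # extend the product to cover a[s..t]
    return M


def ssm_scan(a: Vector, B: Matrix, C: Matrix, x: Vector) -> Vector:
    """Reference recurrence: h_t = a_t * h_{t-1} + B_t * x_t, y_t = C_t . h_t."""
    h = [0.0] * len(B[0])
    ys = []
    for t in range(len(x)):
        h = [a[t] * h_i + B[t][i] * x[t] for i, h_i in enumerate(h)]
        ys.append(dot(C[t], h))
    return ys


# Made-up toy values: 3 time steps, state size 2.
a = [0.9, 0.5, 0.8]
B = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]]
C = [[1.0, 1.0], [0.0, 1.0], [2.0, 0.5]]
x = [1.0, -2.0, 3.0]

M = effective_attention(a, B, C)
y_attention = [dot(row, x) for row in M]  # "attention-style" matrix multiply
y_recurrent = ssm_scan(a, B, C, x)        # step-by-step recurrence
```

Because M is lower-triangular, it can be plotted and read like a causal attention pattern, which is what makes a side-by-side comparison with transformer attention possible.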

Key highlights

Load 50+ open-source models (GPT-2, Llama, Mistral, Gemma, etc.) with a single call
Cache any internal activation; edit or replace activations mid-inference via hooks
TransformerBridge matches HuggingFace logits exactly; legacy HookedTransformer API still available but deprecated
Experimental Mamba-1 and Mamba-2 adapters with hook-based introspection of SSM internals
Extensive tutorial ecosystem: ARENA course, Neel Nanda’s video walkthroughs, 200+ concrete open problems
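The first two highlights can be sketched roughly as follows. This uses the legacy HookedTransformer API (from_pretrained, run_with_cache, run_with_hooks), which this page describes as still available but deprecated; the equivalent TransformerBridge calls are not shown because their exact form is not given here. The prompt, the choice of layer and head, and the helper names are assumptions made for illustration, and running cache_and_ablate needs transformer_lens installed and downloads GPT-2 weights on first use.

```python
from functools import partial


def z_hook_name(layer: int) -> str:
    """Name of the hook point holding per-head attention outputs in a layer."""
    return f"blocks.{layer}.attn.hook_z"


def zero_head(z, hook, head: int):
    """Forward hook: zero one head. z has shape [batch, pos, head, d_head]."""
    z[:, :, head, :] = 0.0
    return z


def cache_and_ablate(prompt: str = "The Eiffel Tower is in", layer: int = 0, head: int = 0):
    # Imported lazily so this file loads even without transformer_lens installed.
    from transformer_lens import HookedTransformer

    model = HookedTransformer.from_pretrained("gpt2")
    tokens = model.to_tokens(prompt)

    # Cache: one forward pass, every hooked activation stored by name.
    clean_logits, cache = model.run_with_cache(tokens)
    pattern = cache[f"blocks.{layer}.attn.hook_pattern"]  # [batch, head, query, key]

    # Intervene: rerun with a temporary hook that zeroes one attention head.
    ablated_logits = model.run_with_hooks(
        tokens,
        fwd_hooks=[(z_hook_name(layer), partial(zero_head, head=head))],
    )

    # Largest change in any logit caused by the ablation.
    max_shift = (clean_logits - ablated_logits).abs().max().item()
    return tuple(pattern.shape), max_shift


# Example call (not run here because it downloads model weights):
# shape, shift = cache_and_ablate()
```

Hooks passed through run_with_hooks apply to that call only, so the model is back to its unmodified behavior on the next forward pass.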

Caveats

Gated models require a HuggingFace token in your environment
SSM benchmark coverage is incomplete; the verify_models suite assumes transformer-shaped architectures and would need a refactor for full Mamba support
HookedSAETransformer was removed in v2.0 and moved to the separate SAELens project
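For the gated-model caveat, a small pre-flight check can fail early with a clear message instead of partway through a download. The sketch assumes the token is supplied through the HF_TOKEN environment variable, which is the variable the huggingface_hub client reads; the helper names are hypothetical, and whether TransformerLens needs any further setup is not stated here.

```python
import os
from typing import Mapping, Optional


def hf_token_from_env(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Hugging Face token from the environment, or None if unset or blank."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def require_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Raise a readable error before any gated model load is attempted."""
    token = hf_token_from_env(env)
    if token is None:
        raise RuntimeError(
            "HF_TOKEN is not set. Export a Hugging Face access token that has "
            "been granted access to the gated model before loading it."
        )
    return token
```

Calling require_hf_token() at the top of a script that loads a gated model turns a confusing authorization failure into an immediate, explicit one.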

Verdict

Essential if you’re doing mechanistic interpretability research or teaching it. Skip it if you just want inference-optimized model serving: its value is inspection and intervention, not speed.
