
openai/transformer-debugger

OpenAI's tool for investigating specific behaviors of small language models through automated interpretability and sparse autoencoders.

4.1k stars · Python · Other · AI · LLMOps · Eval
Velocity (7d): +5.0 ★ / day · Trend: steady

The Transformer Debugger (TDB) combines automated interpretability techniques with sparse autoencoders to enable rapid exploration of how language models arrive at specific outputs. It allows researchers to intervene in the forward pass and trace connections between model components such as neurons, attention heads, and autoencoder latents. The tool helps answer why a model made particular predictions by identifying which components contributed to the behavior and providing automatically generated explanations of component activations.
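
To make "intervening in the forward pass" concrete, here is a minimal sketch of zero-ablation on a toy one-layer network in plain Python. It is not TDB's API: the `forward` and `ablation_effects` functions, the weights, and the input are all hypothetical, chosen only to show how overwriting one component's activation and comparing against the unmodified run indicates which components contributed to an output.

```python
# Toy illustration only: this is NOT the Transformer Debugger API.
# All names, weights, and inputs below are made up for the example.
from typing import List, Optional


def forward(x: List[float], w_in: List[List[float]], w_out: List[float],
            ablate: Optional[int] = None) -> float:
    """One hidden ReLU layer; optionally zero-ablate one hidden neuron."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    if ablate is not None:
        hidden[ablate] = 0.0  # the intervention: overwrite the activation
    return sum(h * w for h, w in zip(hidden, w_out))


def ablation_effects(x: List[float], w_in: List[List[float]],
                     w_out: List[float]) -> List[float]:
    """Change in the output when each hidden neuron is zeroed in turn."""
    baseline = forward(x, w_in, w_out)
    return [baseline - forward(x, w_in, w_out, ablate=i)
            for i in range(len(w_in))]


x = [1.0, 2.0]
w_in = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # three hidden neurons
w_out = [0.5, 2.0, 3.0]

effects = ablation_effects(x, w_in, w_out)
print(effects)  # [0.5, 4.0, 0.0]
```

In this toy run the second neuron accounts for most of the output, while the third never fires on this input and so has no effect when ablated. Real tools work at the scale of neurons, attention heads, and autoencoder latents in a transformer, but the comparison against an unmodified baseline follows the same idea.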
