andyzoujm/representation-engineering

A research paper and codebase for analyzing population-level representations in deep neural networks to improve AI transparency.

1k stars · Jupyter Notebook · LLMOps · Eval · Language Models
representation-engineering
Star history: velocity (7d) +1.0 ★/day, trend steady

This repository contains the official implementation for a research paper introducing Representation Engineering (RepE), a top-down approach to AI transparency. It provides methods for monitoring and manipulating high-level cognitive phenomena in deep neural networks by analyzing population-level representations. The work draws from cognitive neuroscience and offers baselines and techniques for studying how representations encode information across transformer-based language models.
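As a rough illustration of what monitoring and manipulating a concept through population-level representations can look like, the sketch below finds a single direction in activation space from contrasting examples, scores new activations against it, and shifts an activation along it. It does not use the repository's code: the function names (`reading_direction`, `monitor`, `steer`) are hypothetical, the "hidden states" are synthetic NumPy arrays rather than real transformer activations, and using the top principal component of the pooled activations is an assumed choice made for this example.

```python
import numpy as np


def reading_direction(pos_acts, neg_acts):
    """Return a unit vector separating two sets of activations.

    Assumption for this sketch: the direction is the top principal
    component of the pooled, mean-centered activations.
    """
    pooled = np.concatenate([pos_acts, neg_acts], axis=0)
    centered = pooled - pooled.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    # PCA leaves the sign arbitrary; orient it so positives score higher.
    if (pos_acts @ direction).mean() < (neg_acts @ direction).mean():
        direction = -direction
    return direction


def monitor(acts, direction):
    """Score activations by projecting them onto the direction."""
    return acts @ direction


def steer(acts, direction, alpha):
    """Shift activations along the direction by a strength alpha."""
    return acts + alpha * direction


# Synthetic stand-in for hidden states (hypothetical data, not model output).
rng = np.random.default_rng(0)
hidden_dim, n_examples = 16, 64
true_dir = rng.normal(size=hidden_dim)
true_dir /= np.linalg.norm(true_dir)

pos = rng.normal(size=(n_examples, hidden_dim)) + 3.0 * true_dir
neg = rng.normal(size=(n_examples, hidden_dim)) - 3.0 * true_dir

direction = reading_direction(pos, neg)
pos_scores = monitor(pos, direction)
neg_scores = monitor(neg, direction)
steered = steer(neg, direction, alpha=6.0)
```

Because `direction` has unit norm, steering by `alpha` raises every monitoring score by exactly `alpha`, which makes the effect of the intervention easy to check. With a real model, the arrays would instead come from hidden states collected at a chosen layer.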
