
facebookresearch/large_concept_model

Meta's research on Large Concept Models: 1.6B-parameter sequence-to-sequence models that operate on semantic sentence representations in the SONAR embedding space.

2.4k stars · Python · Language Models · ML Frameworks
Velocity (7d): +4.4 ★ / day · Trend: steady

This repository provides the official implementations of Large Concept Models, a language modeling approach that processes sentences as semantic concepts rather than individual tokens. The models operate in the multilingual SONAR embedding space, which supports 200+ languages, and explore multiple generation approaches, including MSE regression and diffusion-based methods. The repository includes training recipes for 1.6B-parameter models trained on approximately 1.3T tokens, built on fairseq2 and PyTorch.
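
To make the MSE regression idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from the repository: the names `mse`, `next_concept_mse_loss`, and `copy_last` are hypothetical, the 2-dimensional vectors stand in for real SONAR sentence embeddings, and the trivial "copy the previous sentence" predictor stands in for the actual model. It shows only the objective: predict the next sentence embedding from the preceding ones and score the prediction by mean squared error.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two vectors of equal length."""
    if len(pred) != len(target):
        raise ValueError("vectors must have the same dimension")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def next_concept_mse_loss(
    concepts: Sequence[Vector],
    predict_next: Callable[[Sequence[Vector]], Vector],
) -> float:
    """Average MSE of next-embedding predictions over one document.

    `concepts` is a document as a sequence of sentence embeddings.
    `predict_next` maps a prefix of embeddings to a predicted next
    embedding (hypothetical stand-in for the model).
    """
    if len(concepts) < 2:
        raise ValueError("need at least two sentence embeddings")
    losses = [
        mse(predict_next(concepts[:i]), concepts[i])
        for i in range(1, len(concepts))
    ]
    return sum(losses) / len(losses)


def copy_last(prefix: Sequence[Vector]) -> Vector:
    """Toy baseline: predict that the next sentence repeats the last one."""
    return list(prefix[-1])


# Toy "document" of three sentence embeddings (assumed values).
toy_document = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
loss = next_concept_mse_loss(toy_document, copy_last)
print(loss)
```

In this sketch the unit of prediction is a whole sentence vector rather than a token, which is the distinction the description draws. A diffusion-based variant would replace the single regressed vector with a learned denoising process over the same embedding space.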
