MoonshotAI/MoBA

A novel mixture-of-block attention mechanism for training efficient long-context LLMs using parameter-less top-k gating to select relevant KV blocks.

Velocity (7d): +4.5 ★/day · Trend: steady

MoBA (mixture of block attention) is a trainable block-sparse attention mechanism for transformer-based LLMs. Query tokens learn to attend only to the most relevant key-value (KV) blocks instead of to every token, which reduces the quadratic cost of traditional full attention. A parameter-less top-k gating mechanism dynamically selects the most informative blocks for each query, enabling efficient long-context processing. The architecture can switch seamlessly between full and sparse attention during both training and inference.
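
To make the gating idea concrete, here is a minimal single-head NumPy sketch. It is not the repository's implementation. The function names (`topk_block_gate`, `block_sparse_attention`), the choice to score each block by the query's dot product with the mean of that block's keys, and the omission of causal masking are all assumptions made for illustration. The sketch also computes dense logits and masks them, so it shows the selection logic only, not the speedup.

```python
import numpy as np

def topk_block_gate(q, k, block_size, top_k):
    """Return a boolean mask [n_queries, n_blocks] of the selected KV blocks.

    Assumption: a block's score is the dot product between the query and
    the mean of the block's keys, so the gate has no trainable weights.
    """
    n_keys, d = k.shape
    assert n_keys % block_size == 0, "sketch assumes the keys split into full blocks"
    n_blocks = n_keys // block_size
    top_k = min(top_k, n_blocks)
    assert top_k >= 1, "each query must select at least one block"

    # One representative key per block: [n_blocks, d]
    block_keys = k.reshape(n_blocks, block_size, d).mean(axis=1)
    scores = q @ block_keys.T  # [n_queries, n_blocks]

    # Indices of the top_k highest-scoring blocks for each query.
    idx = np.argpartition(-scores, top_k - 1, axis=1)[:, :top_k]
    mask = np.zeros_like(scores, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

def block_sparse_attention(q, k, v, block_size, top_k):
    """Softmax attention restricted to the blocks chosen by the gate.

    Causal masking is omitted to keep the sketch short.
    """
    block_mask = topk_block_gate(q, k, block_size, top_k)
    token_mask = np.repeat(block_mask, block_size, axis=1)  # [n_queries, n_keys]

    logits = (q @ k.T) / np.sqrt(q.shape[-1])
    logits = np.where(token_mask, logits, -np.inf)  # drop unselected blocks
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))    # 4 queries, head dimension 8
k = rng.standard_normal((16, 8))   # 16 keys -> 4 blocks of 4
v = rng.standard_normal((16, 8))

sparse_out = block_sparse_attention(q, k, v, block_size=4, top_k=2)
full_out = block_sparse_attention(q, k, v, block_size=4, top_k=4)
```

In this sketch, setting `top_k` to the number of blocks selects every block and so reproduces ordinary full attention, which is one simple way to picture the transition between sparse and full modes.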