
deepseek-ai/DeepSeek-V2

DeepSeek-V2 is a Mixture-of-Experts large language model with sparse activation, developed by DeepSeek AI.

5k stars · Language Models
Velocity (7d): +6.4 ★/day · Trend: steady

The repository contains weights, training code, and configuration for DeepSeek-V2, a 236B-parameter MoE model. It combines Multi-head Latent Attention (MLA) with the DeepSeekMoE architecture, which routes each token to 6 of 160 routed experts plus 2 shared experts, so only about 21B parameters are active per token. This sparse activation makes the model cheaper to train and serve than a dense model of equivalent capacity.
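
To make the idea of sparse activation concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not taken from the repository and does not reproduce DeepSeekMoE's actual gating: the function names (`top_k_route`, `moe_forward`), the scalar "experts", and the toy sizes (8 experts, 2 active) are all assumptions chosen for illustration.

```python
import math

def top_k_route(scores, k):
    """Return (expert_index, gate_weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the gate weights sum to 1.
    # (Assumption: real MoE models may normalize differently.)
    m = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, router_scores, k):
    """Weighted sum of the k selected experts' outputs; other experts are never called."""
    out = 0.0
    for idx, gate in top_k_route(router_scores, k):
        out += gate * experts[idx](x)
    return out

# Toy setup (hypothetical sizes): 8 scalar experts, 2 active per token.
experts = [lambda x, w=w: w * x for w in range(1, 9)]
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

routes = top_k_route(router_scores, k=2)
y = moe_forward(1.0, experts, router_scores, k=2)
print(routes, y)
```

Only the two selected experts run for this token, which is why the compute per token tracks the number of active experts and not the total expert count.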
