ymcui/Chinese-Mixtral
Chinese Mixtral is a sparse mixture-of-experts large language model adapted for Chinese via continual pretraining and instruction tuning.

This project takes Mistral's Mixtral, a sparse mixture-of-experts (MoE) model, and continues pretraining it on large-scale unlabeled Chinese data to produce a Chinese base model. Instruction tuning on top of that base model yields the Chinese Mixtral-Instruct variant. The model natively supports a 32K context window, with reported capability up to 128K, and the project also reports improvements in math reasoning and code generation. Inference is supported through llama.cpp quantization, requiring as little as 16 GB of memory.
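
To make "sparse MoE" concrete, the following is a minimal pure-Python sketch of top-k expert routing in the style commonly described for Mixtral: a router scores every expert, only the k highest-scoring experts run, and their outputs are mixed using a softmax over the selected scores. It is an illustration only, not code from this project. The function names (`top_k_route`, `moe_layer`), the toy experts, the logit values, and the choice of 8 experts with k=2 are all assumptions made for the example.

```python
import math


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Hypothetical helper: weights are a softmax over the selected logits only.
    """
    # Indices of the k largest router logits (ties keep the lower index first).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, router_logits, experts, k=2):
    """Weighted sum of the outputs of the selected experts; the rest are skipped."""
    out = [0.0] * len(x)
    for idx, weight in top_k_route(router_logits, k):
        y = experts[idx](x)  # only k experts are ever evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


# Toy setup (assumed values): expert i simply scales its input by i + 1.
experts = [lambda x, s=i + 1: [s * v for v in x] for i in range(8)]
logits = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.3, 1.0]

routes = top_k_route(logits)
output = moe_layer([1.0, 2.0], logits, experts)
print(routes)
print(output)
```

Because only k experts run per token, the compute per token is a fraction of what the total parameter count suggests, although all expert weights still have to be held in memory. That is why quantization matters for the memory footprint mentioned above.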