zhanshijinwat/Steel-LLM
A personal project that trains a 1B-parameter Chinese LLM from scratch on 1T tokens, with all data processing and training code open-sourced.

Velocity (7d): +1.0 ★/day · Trend: steady
Steel-LLM is an open-source Chinese large language model trained from scratch on 1 trillion tokens. The project covers the entire pipeline: data collection, data processing, pretraining framework selection, and model architecture design. The model scores 42 on C-Eval and 36 on CMMLU, which the project reports as outperforming some larger models released by institutions.