
zhanshijinwat/Steel-LLM

A personal project that trains a 1B-parameter Chinese LLM from scratch on 1T tokens, with all data processing and training code open-sourced.

807 stars · Jupyter Notebook · Language Models · ML Frameworks
Velocity (7d): +1.0 ★/day · Trend: steady

Steel-LLM is an open-source Chinese large language model trained from scratch on 1 trillion tokens. The project covers the entire pipeline: data collection, data processing, pretraining framework selection, and model architecture design. The model scores 42 on C-Eval and 36 on CMMLU, outperforming larger models released by institutions.
