qibin0506/Cortex
A from-scratch LLM implementation featuring an MoE architecture, a full training pipeline (pretraining through RLHF), and LLM-as-Judge alignment.

Cortex is an LLM project that implements the complete lifecycle of building a large language model: pretraining, mid-training adaptation, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF). The model is a lightweight Mixture of Experts (MoE) architecture with ~0.1B total parameters, of which ~67M are active per token during inference. Cortex uses an LLM-as-Judge to provide feedback during PPO training, and it supports attention residuals and thinking control mechanisms. The training pipeline has been validated on Cambricon MLU370 accelerators, a Chinese domestic hardware platform.
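The gap between total and active parameters comes from sparse routing: each token is sent to only a few experts, so most expert weights sit idle on any given forward pass. The sketch below illustrates the idea with a generic top-k softmax gate and a simple parameter count. It is not Cortex's actual router; the function names (`top_k_gate`, `param_counts`) and all numbers are hypothetical, chosen only to show the arithmetic.

```python
import math


def top_k_gate(router_logits, k):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    # Softmax over all experts (subtract the max for numerical stability).
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    denom = sum(exps)
    probs = [e / denom for e in exps]
    # Keep the k most probable experts and renormalize their weights to sum to 1.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def param_counts(shared_params, params_per_expert, num_experts, k):
    """Total vs. per-token active parameters, treating all experts as one pool."""
    total = shared_params + num_experts * params_per_expert
    active = shared_params + k * params_per_expert
    return total, active


# Hypothetical toy values: 4 experts, 2 routed per token.
gates = top_k_gate([0.1, 2.0, -1.0, 1.5], k=2)
total, active = param_counts(shared_params=10, params_per_expert=5, num_experts=4, k=2)
print(gates)          # experts 1 and 3 are selected, with weights summing to 1
print(total, active)  # 30 20
```

With these toy values, only two of the four experts contribute to a token's output, so the active count is the shared parameters plus two experts' worth of weights. The same accounting, applied to the real layer sizes, is what yields a figure like ~67M active out of ~0.1B total.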