
QwenLM/ParScale

ParScale is an LLM scaling paradigm that applies P learnable transformations to the input, runs the resulting P forward passes in parallel, and aggregates the outputs. Its scaling law indicates that using P parallel streams is comparable to scaling the parameter count by O(log P).

Velocity (7d): +1.2 ★ / day
Trend: steady

This repository presents a theoretical and empirical framework for scaling language models along an axis other than the two traditional ones, parameter scaling and inference-time scaling. The method applies P diverse, learnable transformations to the input, executes the P forward passes in parallel, and dynamically aggregates the P outputs during both training and inference. The work establishes a logarithmic scaling law indicating that parallel computation can serve as an efficient substitute for parameter growth. Pre-trained models and code are provided via Hugging Face.
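The three steps described above (transform the input P ways, run the forward passes in parallel, aggregate dynamically) can be illustrated with a toy NumPy sketch. This is not the repository's implementation: the tiny MLP `shared_model`, the additive `offsets` used as the learnable input transformations, and the softmax gate `w_gate` are all hypothetical stand-ins chosen to keep the example short.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def shared_model(x, W1, W2):
    """Stand-in for the shared backbone: a two-layer MLP (assumption)."""
    return np.tanh(x @ W1) @ W2

def parscale_forward(x, offsets, W1, W2, w_gate):
    """Toy parallel-scaling forward pass.

    x:       (batch, d_in) inputs
    offsets: (P, d_in) per-stream learnable input transformations
             (hypothetical: simple additive offsets)
    w_gate:  (d_out,) gate vector used to score each stream's output
    Returns the aggregated output (batch, d_out) and the
    per-example stream weights (batch, P).
    """
    # 1. Apply P different transformations to the same input.
    streams = x[None, :, :] + offsets[:, None, :]      # (P, batch, d_in)
    # 2. Run all P streams through the same weights in one batched pass.
    outs = shared_model(streams, W1, W2)               # (P, batch, d_out)
    # 3. Dynamic aggregation: weights depend on the outputs themselves.
    scores = outs @ w_gate                             # (P, batch)
    weights = softmax(scores, axis=0)                  # normalize over P
    y = (weights[:, :, None] * outs).sum(axis=0)       # (batch, d_out)
    return y, weights.T

# Demo with random parameters (illustrative values only).
rng = np.random.default_rng(0)
P, batch, d_in, d_hidden, d_out = 4, 3, 8, 16, 5
x = rng.normal(size=(batch, d_in))
offsets = rng.normal(size=(P, d_in))
W1 = rng.normal(size=(d_in, d_hidden))
W2 = rng.normal(size=(d_hidden, d_out))
w_gate = rng.normal(size=(d_out,))

y, weights = parscale_forward(x, offsets, W1, W2, w_gate)
```

Note that the backbone weights `W1` and `W2` are shared by all streams; in this sketch only `offsets` and `w_gate` grow with P, which is what lets extra parallel computation stand in for extra backbone parameters.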
