
TencentARC/LLaMA-Pro

A research project introducing block expansion to progressively extend LLaMA models, published at ACL 2024.

513 stars · Python · Language Models · ML Frameworks
Velocity (7d): +0.6 ★ / day · Trend: steady

This project presents a method for progressively extending LLaMA models by inserting and training new transformer blocks, enabling efficient model capacity expansion without full retraining. The work includes released model weights on HuggingFace and demonstrates improvements on code and math benchmarks. Extensions to Mistral models are also provided.
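The block-expansion idea can be illustrated with a toy sketch. It is not the project's code: the scalar "blocks", the `expand_blocks` helper, and the `group_size` parameter are all hypothetical. It also assumes two details the summary above does not spell out: that each inserted block is a copy of its predecessor with a zeroed output weight, so it starts as an identity mapping, and that the original blocks stay frozen while only the new ones are trained.

```python
import copy

def make_block(w_in, w_out):
    """Toy residual block on scalars: y = x + w_out * relu(w_in * x)."""
    return {"w_in": w_in, "w_out": w_out, "trainable": False}

def run_block(block, x):
    hidden = max(0.0, block["w_in"] * x)   # ReLU stand-in for attention/MLP
    return x + block["w_out"] * hidden     # residual connection

def run_model(blocks, x):
    for block in blocks:
        x = run_block(block, x)
    return x

def expand_blocks(blocks, group_size):
    """Insert one new block after every `group_size` original blocks.

    Assumption: the new block copies the preceding block but zeroes its
    output weight, so it initially passes its input through unchanged.
    Original blocks are marked frozen; only new blocks are trainable.
    """
    expanded = []
    for i, block in enumerate(blocks):
        frozen = copy.deepcopy(block)
        frozen["trainable"] = False
        expanded.append(frozen)
        if (i + 1) % group_size == 0:
            new_block = copy.deepcopy(block)
            new_block["w_out"] = 0.0       # identity at initialisation
            new_block["trainable"] = True
            expanded.append(new_block)
    return expanded

base = [make_block(0.5, 0.3), make_block(-1.0, 0.2),
        make_block(1.5, -0.4), make_block(0.7, 0.1)]
expanded = expand_blocks(base, group_size=2)
```

Because each inserted block starts as an identity mapping, `run_model(expanded, x)` matches `run_model(base, x)` before any training, so the deeper model begins from the base model's behaviour and only the new blocks need gradient updates.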
