VinAIResearch/PhoGPT
A 4B-parameter generative language model for Vietnamese, released as a base model and a chat variant fine-tuned on instruction data.

Velocity (7d): +0.8 ★/day · Trend: steady
PhoGPT is a series of Vietnamese large language models comprising PhoGPT-4B, a base model pre-trained on 102B tokens, and PhoGPT-4B-Chat, which is fine-tuned on 70K instructional prompts and 290K conversations. The models support a context length of 8192 tokens and use a 20K-token vocabulary. The repository provides model weights, inference code, and fine-tuning guides for the Vietnamese NLP community.
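As a rough illustration of what inference with the chat variant might look like, here is a minimal sketch using the Hugging Face `transformers` library. The model id `vinai/PhoGPT-4B-Chat`, the prompt template, and the helper names `build_prompt` and `generate` are assumptions made for this example and are not taken from the description above; the repository's own inference code is the authoritative reference. Only the 8192-token context length comes from the description.

```python
# Assumed Hugging Face model id; check the repository README for the exact name.
MODEL_ID = "vinai/PhoGPT-4B-Chat"

# Context length stated in the model description.
MAX_CONTEXT = 8192

# Assumed chat prompt template; verify against the repository before relying on it.
PROMPT_TEMPLATE = "### Câu hỏi: {instruction}\n### Trả lời:"


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the (assumed) chat prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Generate a reply with greedy decoding (hypothetical helper)."""
    # Imported lazily so build_prompt works without these heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    model.eval()

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]

    # Refuse requests that cannot fit in the context window.
    if prompt_len + max_new_tokens > MAX_CONTEXT:
        raise ValueError("prompt plus requested output exceeds the context window")

    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate("Viết một đoạn văn ngắn về Hà Nội.")` would download the weights on first use, so it needs substantial disk space and memory; `build_prompt` can be tried on its own without any model files.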