QwenLM/qwen.cpp
A C++ implementation for running Qwen language models with support for quantization and streaming generation.

Velocity (7d): +0.6 ★/day · Trend: steady
qwen.cpp is a pure C++ implementation of Qwen-LM built on ggml, enabling efficient inference of Qwen models on CPU and GPU. It supports model quantization and streaming text generation with a typewriter effect, and it provides Python bindings. The project was merged into llama.cpp in December 2023 and has since been deprecated in favor of that ongoing effort.