
QwenLM/qwen.cpp

A C++ implementation for running Qwen language models with support for quantization and streaming generation.

Velocity (7d): +0.6 ★/day · Trend: steady

qwen.cpp is a pure C++ implementation of Qwen-LM based on ggml, enabling efficient inference of Qwen models on CPU and GPU. It supports model quantization and streaming text generation with a typewriter effect, and it provides Python bindings. The project was merged into llama.cpp in December 2023 and has since been deprecated in favor of that ongoing effort.
