RWKV/rwkv.cpp
A C library and Python wrapper for running quantized RWKV language models on CPU using the ggml framework.

Velocity (7d): +1.3 ★/day · Trend: steady
rwkv.cpp ports the RWKV language model architecture to the ggml machine learning library for efficient CPU inference. It supports the INT4, INT5, and INT8 quantization formats alongside FP16 and FP32. The project provides both a C library and a Python wrapper, targets CPU execution with optional cuBLAS support for GPU acceleration, and can merge LoRA checkpoints into a base model for customization.
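To make the idea of low-bit quantization concrete, here is a minimal pure-Python sketch of block-wise symmetric 4-bit quantization. It is an illustration only: the block length, the [-7, 7] integer range, and every function name below are assumptions made for this example, and the actual ggml formats used by rwkv.cpp differ in their storage layout and rounding details.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block length for this sketch, not taken from the project


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Map a block of floats to signed 4-bit integers in [-7, 7] sharing one scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale per block; fall back to 1.0 so an all-zero block does not divide by zero.
    scale = max_abs / 7.0 if max_abs > 0.0 else 1.0
    quants = [max(-7, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale: float, quants: List[int]) -> List[float]:
    """Recover approximate floats from one quantized block."""
    return [scale * q for q in quants]


def quantize(weights: List[float]) -> List[Tuple[float, List[int]]]:
    """Split the weights into fixed-size blocks and quantize each one independently."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks: List[Tuple[float, List[int]]]) -> List[float]:
    """Concatenate the dequantized blocks back into a flat list."""
    restored: List[float] = []
    for scale, quants in blocks:
        restored.extend(dequantize_block(scale, quants))
    return restored


if __name__ == "__main__":
    weights = [(i - 40) / 10 for i in range(80)]
    blocks = quantize(weights)
    restored = dequantize(blocks)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{len(blocks)} blocks, worst absolute error {worst:.4f}")
```

Because each block carries its own scale, one large weight only degrades the precision of its own block rather than the whole tensor. That is the general trade-off behind low-bit formats: less memory and bandwidth per weight in exchange for a bounded rounding error.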