marella/ctransformers
Python bindings for running transformer-based LLMs with GGML, supporting LLaMA, Falcon, GPT-NeoX, and other models.

Velocity (7d): +1.7 ★/day · Trend: steady
CTransformers provides Python bindings for transformer language models implemented in C/C++ using the GGML library. It offers a unified interface for loading and running various open-source LLMs including LLaMA, Falcon, GPT-J, GPT-NeoX, and StarCoder. The library supports GPU acceleration via CUDA and Metal, as well as GPTQ quantization for efficient inference. It integrates with the Hugging Face Transformers library and LangChain.
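The unified interface described above can be sketched roughly as follows. This is a minimal illustration, not the project's reference usage: `build_load_kwargs` and `generate` are hypothetical helpers introduced here, and it assumes the `ctransformers` package is installed and that `AutoModelForCausalLM.from_pretrained` accepts `model_type`, `model_file`, and `gpu_layers` as keyword arguments.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(
    model_type: str,
    model_file: Optional[str] = None,
    gpu_layers: int = 0,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for loading a model.

    model_type selects the architecture (for example "llama" or "gpt2");
    gpu_layers > 0 asks the backend to offload that many layers to the GPU.
    """
    kwargs: Dict[str, Any] = {"model_type": model_type}
    if model_file is not None:
        kwargs["model_file"] = model_file
    if gpu_layers > 0:
        kwargs["gpu_layers"] = gpu_layers
    return kwargs


def generate(
    model_path_or_repo_id: str,
    prompt: str,
    max_new_tokens: int = 64,
    **load_options: Any,
) -> str:
    """Hypothetical helper: load a GGML model and complete a prompt."""
    # Imported lazily so the helper above can be used without the package.
    from ctransformers import AutoModelForCausalLM

    llm = AutoModelForCausalLM.from_pretrained(
        model_path_or_repo_id, **build_load_kwargs(**load_options)
    )
    # Calling the model object with a prompt returns the generated text.
    return llm(prompt, max_new_tokens=max_new_tokens)
```

A call might look like `generate("/path/to/ggml-model.bin", "AI is going to", model_type="gpt2")`, where the path is a placeholder for a local GGML model file. Keeping the argument assembly separate from the load step makes it easy to switch architectures or enable GPU offloading without touching the generation code.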