EricLBuehler/mistral.rs
mistral.rs is a Rust library for fast, flexible LLM inference with GPU acceleration, quantization, and agentic runtime support.

Velocity · 7d
+8.7
★ / day
Trend
→steady
star history
mistral.rs provides high-performance LLM inference in Rust, supporting major model architectures with CUDA optimizations for NVIDIA GPUs. It offers multiple quantization formats including MXFP4 for memory efficiency, and includes an agentic runtime enabling web search, local Python code execution with model feedback, and custom tool hooks. The library exposes both OpenAI-compatible and Anthropic-compatible APIs.