EricLBuehler/mistral.rs

mistral.rs is a Rust library for fast, flexible LLM inference with GPU acceleration, quantization, and agentic runtime support.

7.3k stars · Rust · Inference · Serving · Agents
Velocity (7d): +8.7 ★/day · Trend: steady

mistral.rs provides high-performance LLM inference in Rust, supporting major model architectures with CUDA optimizations for NVIDIA GPUs. It offers multiple quantization formats including MXFP4 for memory efficiency, and includes an agentic runtime enabling web search, local Python code execution with model feedback, and custom tool hooks. The library exposes both OpenAI-compatible and Anthropic-compatible APIs.
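As a rough illustration of what "OpenAI-compatible" means for a client, the sketch below builds the JSON body of a chat completion request in the widely used OpenAI format (a model name plus a list of role/content messages, sent to /v1/chat/completions). It uses only the Rust standard library and does not call mistral.rs itself. The base URL, port, and model name are placeholders chosen for this example, not values taken from the mistral.rs documentation, and the helper functions json_escape and chat_request_body are hypothetical names.

```rust
/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the JSON body for an OpenAI-style chat completion request
/// containing a single user message.
fn chat_request_body(model: &str, user_prompt: &str) -> String {
    format!(
        "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
        json_escape(model),
        json_escape(user_prompt)
    )
}

fn main() {
    // Assumption: a server is listening locally; host and port are placeholders.
    let base_url = "http://localhost:8080";
    // Assumption: "my-model" stands in for whatever model id the server exposes.
    let body = chat_request_body("my-model", "Summarize Rust's ownership model.");

    // Print the request instead of sending it, so the sketch needs no HTTP crate.
    println!("POST {}/v1/chat/completions", base_url);
    println!("Content-Type: application/json");
    println!("{}", body);
}
```

Because the request shape follows the OpenAI convention, an existing OpenAI client can typically be pointed at such a server by changing only its base URL. In real code a JSON library such as serde_json would replace the hand-written escaping.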