intentee/paddler

An open-source load balancer and serving platform for running LLMs and VLMs on your own infrastructure using the llama.cpp engine.

Velocity (7d): +2.1 ★/day · Trend: steady

Paddler is an LLM load balancer and serving platform that enables self-hosted inference, deployment, and scaling of large language models. It includes a built-in llama.cpp engine for inference, LLM-specific load balancing, request buffering for scale-from-zero, dynamic model swapping, and a web admin panel for management and monitoring. Organizations can use it to maintain privacy, cost control, and independence from closed-source model providers while running LLMs on CPU or GPU.
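To make the ideas of slot-aware load balancing and request buffering for scale-from-zero more concrete, here is a minimal conceptual sketch in Python. It is not Paddler's implementation or API. The names `Agent`, `BufferedBalancer`, `max_buffered`, `submit`, `register`, and `release` are hypothetical and exist only for illustration. The sketch assumes that each worker exposes a fixed number of concurrent slots, that a request goes to the worker with the most free slots, and that requests arriving while no slot is free (including when no worker is running yet) wait in a bounded queue.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical worker that can serve `slots_total` requests at once."""
    name: str
    slots_total: int
    slots_busy: int = 0

    @property
    def slots_free(self) -> int:
        return self.slots_total - self.slots_busy


class BufferedBalancer:
    """Illustrative balancer: most-free-slots routing plus a bounded buffer."""

    def __init__(self, max_buffered: int = 8) -> None:
        self.agents: dict[str, Agent] = {}
        self.buffer: deque[str] = deque()      # request ids waiting for a slot
        self.assignments: dict[str, str] = {}  # request id -> agent name
        self.max_buffered = max_buffered

    def register(self, agent: Agent) -> None:
        # A new worker may free up capacity for requests that were waiting.
        self.agents[agent.name] = agent
        self._drain()

    def submit(self, request_id: str) -> str:
        agent = self._pick()
        if agent is not None:
            self._assign(request_id, agent)
            return agent.name
        if len(self.buffer) >= self.max_buffered:
            raise RuntimeError("buffer full")
        self.buffer.append(request_id)  # hold the request until a slot opens
        return "buffered"

    def release(self, request_id: str) -> None:
        # Called when a request finishes; the freed slot serves the buffer.
        agent_name = self.assignments.pop(request_id)
        self.agents[agent_name].slots_busy -= 1
        self._drain()

    def _pick(self) -> Agent | None:
        candidates = [a for a in self.agents.values() if a.slots_free > 0]
        return max(candidates, key=lambda a: a.slots_free, default=None)

    def _assign(self, request_id: str, agent: Agent) -> None:
        agent.slots_busy += 1
        self.assignments[request_id] = agent.name

    def _drain(self) -> None:
        # Dispatch buffered requests in arrival order while slots remain.
        while self.buffer:
            agent = self._pick()
            if agent is None:
                return
            self._assign(self.buffer.popleft(), agent)


balancer = BufferedBalancer()
print(balancer.submit("r1"))          # no workers yet, prints "buffered"
balancer.register(Agent("a", slots_total=1))
print(balancer.assignments)           # prints {'r1': 'a'}
```

The bounded buffer is the design choice worth noting: it lets requests survive the gap before the first worker comes up, while the size limit keeps the queue from growing without bound when capacity never arrives.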
