ParisNeo/lollms_hub
A secure, high-performance orchestration proxy that unifies disparate AI inference backends into a single managed API.

LoLLMs Hub acts as a fortress gateway for AI compute resources, providing enterprise-grade security, intelligent model routing, and advanced workflow automation across multiple inference engines. It supports native Ollama, vLLM, llama.cpp, and OpenAI-compatible endpoints, enabling hierarchical master-slave clustering for distributed AI serving. The system offers load balancing, rate limiting, key-based authentication, and multi-node federation capabilities.