GradientHQ/parallax
A Python framework for distributed LLM inference that enables building AI clusters across multiple devices sharing GPUs.

Velocity · 7d
+5.0
★ / day
Trend
→steady
star history
Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere. It enables hosting large language models across devices with shared GPU resources. The framework supports various LLM providers including DeepSeek, Qwen, Kimi, MiniMax, and integrates with established inference engines like vLLM and SGLang for production-grade LLM serving.