ray-project/ray-llm
RayLLM provides APIs for deploying and serving large language models on Ray's distributed compute platform.

Velocity (7d): +1.1 ★/day
Trend: steady
RayLLM was a dedicated repository for running LLMs on Ray. Its functionality has been upstreamed into the main Ray project, where it is available as ray.serve.llm for model serving and ray.data.llm for integrating LLMs with data pipelines. The repository has been archived, and the functionality is now maintained directly in Ray itself.