data-infra/cube-studio
Open-source, cloud-native, one-stop platform for machine learning, LLM training and fine-tuning, and MLOps, with multi-tenant resource scheduling.

Cube Studio is an ML/LLM platform that covers the full algorithm lifecycle. It supports distributed training across multiple machines and GPUs, fine-tuning of large models (including SFT and reinforcement learning), and inference serving through vLLM and Ollama. The platform also provides pipeline orchestration, hyperparameter optimization, and vGPU virtualization, and it runs on heterogeneous Chinese domestic accelerators such as Ascend and Cambricon. For enterprise deployments, it offers intelligent agents backed by private knowledge bases, along with multi-tenant resource scheduling.