
ModelTC/LightLLM

A Python-based inference and serving framework for large language models, built around a lightweight design and high-speed performance.

Velocity (7d): +3.9 ★/day · Trend: steady

LightLLM is a framework for running and serving large language models, with a focus on performance and scalability. It incorporates optimizations from established projects such as vLLM, FasterTransformer, TGI, and FlashAttention to provide efficient inference. The framework supports a range of LLM architectures, including GPT and LLaMA variants.
