
X-LANCE/SLAM-LLM

A training framework and toolkit for building custom multimodal LLMs that process speech, language, audio, and music.

Velocity (7d): +1.1 ★/day · Trend: steady

SLAM-LLM is a deep learning framework for training custom multimodal large language models (MLLMs) on speech, language, audio, and music tasks. It provides training recipes, parameter-efficient fine-tuning (PEFT) support, and high-performance checkpoints for inference. The framework supports multi-task training for automatic speech recognition (ASR) and speech translation, and scales to datasets with hundreds of thousands of hours of speech.
