stas00/ml-engineering

An open technical handbook providing methodologies, tools, and step-by-step instructions for training and running large language models and vision-language models.

18.1k stars · Python · Learning · LLMOps · Eval
Velocity (7d): +8.6 ★/day · Trend: steady

This repository documents practical ML engineering knowledge accumulated while training large models including BLOOM-176B and IDEFICS-80B. It covers hardware selection (GPUs, storage, networking), orchestration with systems like SLURM, model training optimization, and inference deployment. The content is structured as technical guides with scripts and commands intended for LLM/VLM training engineers and ML operators.
