axolotl-ai-cloud/axolotl

YAML your way to a fine-tuned LLM

Axolotl wraps the entire LLM fine-tuning pipeline (preprocessing, training, quantization, inference) into a single config file so you don't have to wrestle with PyTorch directly.

12k stars · Python · ML Frameworks · Language Models
Velocity (7d): +10 ★/day · Trend: steady

What it does: Axolotl is an open-source Python framework that streamlines post-training and fine-tuning for large language models. You define one YAML configuration, and it handles dataset preprocessing, training, evaluation, quantization, and inference, so you don't hand-roll distributed training code.
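To make the "one YAML configuration" idea concrete, here is a minimal sketch of what a QLoRA-style config might contain, built as a Python dict and rendered to YAML text. The key names are as I recall them from Axolotl's example configs and should be checked against the current docs; the model ID, dataset ID, output path, and all numeric values are placeholders chosen for illustration, and the tiny `to_yaml` helper is hypothetical (it only handles the flat shapes used here).

```python
# Placeholder values throughout: swap in a real base model and dataset.
config = {
    "base_model": "your-org/your-base-model",  # hypothetical Hub ID
    "load_in_4bit": True,                      # 4-bit base weights for QLoRA
    "adapter": "qlora",
    "lora_r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "lora_target_linear": True,
    "datasets": [
        {"path": "your-org/your-dataset", "type": "alpaca"},  # hypothetical dataset ID
    ],
    "val_set_size": 0.05,
    "sequence_len": 2048,
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "num_epochs": 1,
    "learning_rate": 0.0002,
    "bf16": "auto",
    "output_dir": "./outputs/qlora-sketch",
}


def scalar(value):
    """Render a Python scalar as a YAML scalar (bools become true/false)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def to_yaml(cfg):
    """Emit YAML for a dict of scalars and lists of flat dicts only."""
    lines = []
    for key, value in cfg.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            for item in value:
                for i, (k, v) in enumerate(item.items()):
                    prefix = "  - " if i == 0 else "    "
                    lines.append(f"{prefix}{k}: {scalar(v)}")
        else:
            lines.append(f"{key}: {scalar(value)}")
    return "\n".join(lines) + "\n"


yaml_text = to_yaml(config)
print(yaml_text)
```

Saved to a file, text like this is what you would hand to the Axolotl CLI (for example `axolotl train <config file>`). Treat the exact subcommands and accepted keys as something to confirm in the project's documentation for your installed version.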

The interesting bit: The breadth is the point. It supports full fine-tuning, LoRA, QLoRA, GPTQ, QAT, preference tuning (DPO, IPO, KTO, ORPO), RL (GRPO, GDPO), reward modeling, and even multimodal vision-language and audio models, all behind the same config interface. The README reads like a release-notes firehose: MoE expert quantization, ScatterMoE LoRA with custom Triton kernels, FP8 via torchao, ND parallelism composing CP/TP/FSDP, SageAttention, text diffusion training. Someone is clearly merging PRs fast.

Key highlights

  • Single YAML config drives the entire pipeline from data prep to inference
  • Supports dozens of model families: LLaMA, Mistral, Qwen, Gemma, GPT-OSS, and many Hugging Face Hub variants
  • Multi-GPU and multi-node training via FSDP1/FSDP2, DeepSpeed, Torchrun, and Ray
  • Performance integrations: Flash Attention 2/3/4, Flex Attention, Liger Kernel, Cut Cross Entropy, LoRA optimizations
  • Cloud-ready with Docker images, PyPI packages, and a Google Colab notebook

Caveats

  • Requires an NVIDIA Ampere-or-newer GPU (or an AMD GPU) for bf16 and Flash Attention; older hardware is out
  • The README is feature-list heavy but light on architectural explanation—expect to read docs and examples to understand trade-offs
  • Now uv-first (April 2026), so if you’re still on pip/conda, adjust your workflow
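The hardware caveat above can be turned into a quick pre-flight check. This is a sketch under one assumption: that "Ampere or newer" maps to a CUDA compute capability of 8.0 or higher, which is how NVIDIA numbers those architectures. The function name `meets_ampere_floor` is made up for illustration, and the check says nothing about AMD GPUs.

```python
def meets_ampere_floor(capability):
    """Return True if a (major, minor) CUDA compute capability is 8.0 or higher."""
    major, _minor = capability
    return major >= 8


# Well-known compute capabilities, listed here for illustration.
# With PyTorch installed, the tuple for the local GPU would come from
# torch.cuda.get_device_capability().
known_gpus = {
    "V100": (7, 0),      # Volta
    "T4": (7, 5),        # Turing
    "A100": (8, 0),      # Ampere
    "RTX 4090": (8, 9),  # Ada Lovelace
    "H100": (9, 0),      # Hopper
}

for name, cap in known_gpus.items():
    verdict = "ok" if meets_ampere_floor(cap) else "too old for bf16 / Flash Attention"
    print(f"{name}: sm_{cap[0]}{cap[1]} -> {verdict}")
```

Running this before installing anything is a cheap way to find out whether a given machine falls on the "older hardware is out" side of the line.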

Verdict: Worth a look if you need to fine-tune production LLMs and would rather not maintain a custom training stack. Skip it if you’re doing research that needs low-level control over every gradient, or if you’re GPU-poor and just want to call APIs.
