ZHZisZZ/dllm

Diffusion meets language: a unified toolkit for non-autoregressive text generation

dLLM wraps training, inference, and evaluation recipes for diffusion language models into one reproducible codebase built on familiar Hugging Face tooling.

Velocity (7d): +9.6 ★/day · Trend: steady

What it does

dLLM is a Python library that unifies the scattered world of diffusion language models (masked diffusion, block diffusion, edit flows) into a single training and evaluation pipeline. It builds on the Hugging Face transformers Trainer, supports LoRA, DeepSpeed, and FSDP out of the box, and plugs into lm-evaluation-harness for benchmarking. The repo ships ready-made recipes for models like LLaDA, Dream, and even a BERT-turned-chatbot, plus utilities to convert autoregressive checkpoints (Qwen, LLaMA, GPT-2) into diffusion variants.
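For readers new to masked diffusion, here is a minimal sketch of the corruption step such models train against, assuming the LLaDA-style objective as commonly described: each token is masked independently with probability t, and the loss on masked positions is weighted by 1/t. This is an illustration in plain Python, not dLLM's API; `MASK`, `mask_tokens`, and `loss_weights` are hypothetical names.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder; real models use a tokenizer-specific mask id


def mask_tokens(tokens, t, rng):
    """Replace each token with MASK independently with probability t."""
    noisy, is_masked = [], []
    for tok in tokens:
        hit = rng.random() < t
        noisy.append(MASK if hit else tok)
        is_masked.append(hit)
    return noisy, is_masked


def loss_weights(is_masked, t):
    """Per-token loss weights: 1/t on masked positions, 0 on visible ones."""
    return [1.0 / t if m else 0.0 for m in is_masked]


rng = random.Random(0)
tokens = "diffusion models fill in masked tokens".split()
t = 0.5  # in training, t is typically sampled uniformly from (0, 1]
noisy, is_masked = mask_tokens(tokens, t, rng)
weights = loss_weights(is_masked, t)
print(noisy)
print(weights)
```

The model is then asked to predict the original token at every masked position. Visible positions contribute nothing to the loss.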

The interesting bit

The project treats diffusion for text as an infrastructure problem, not just a research novelty. It includes GRPO reinforcement-learning training for reasoning tasks (GSM8K, MATH, Sudoku, Code) and Fast-dLLM inference acceleration with cache-aware decoding—suggesting the authors expect these models to actually be used, not just cited.

Key highlights

  • Training recipes for LLaDA, Dream, LLaDA2.x, BERT-Chat, and Edit Flows with insertion/deletion/substitution operations
  • A2D pipeline converts any autoregressive model to masked or block diffusion; Tiny-A2D releases 0.5B/0.6B checkpoints
  • Distributed training via Accelerate (DDP, ZeRO-1/2/3, FSDP) with optional 4-bit quantization and LoRA
  • Evaluation through lm-evaluation-harness submodule; Slurm cluster scripts included
  • diffu-GRPO reinforcement learning for diffusion models on reasoning benchmarks
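One way to picture the difference between the autoregressive, masked diffusion, and block diffusion variants mentioned above is through which positions may attend to which. The sketch below is a conceptual illustration under that assumption, not dLLM's A2D code. All function names are hypothetical, and the masks used in actual block-diffusion training are more involved than this inference-time picture.

```python
def causal_mask(n):
    """Autoregressive: position i attends to positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Masked diffusion: every position attends to every position."""
    return [[True] * n for _ in range(n)]


def block_causal_mask(n, block_size):
    """Block diffusion (inference view): position i attends to every
    position in its own block and in all earlier blocks."""
    return [
        [(j // block_size) <= (i // block_size) for j in range(n)]
        for i in range(n)
    ]


def show(mask):
    """Render a mask as rows of 'x' (may attend) and '.' (blocked)."""
    return "\n".join("".join("x" if v else "." for v in row) for row in mask)


print(show(block_causal_mask(6, 2)))
```

With a block size of 1 the block mask reduces to the causal mask, and with a block size equal to the sequence length it becomes fully bidirectional. That is why block diffusion is often described as sitting between the two.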

Caveats

  • The README notes this is “primarily for educational purposes” and does not aim for exact reproduction of official models
  • Setup requires manual CUDA/PyTorch alignment and submodule initialization for evaluation
  • Several demo GIFs and assets are commented out in the source, suggesting documentation is still being polished
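The setup caveat can be checked up front. Here is a minimal sketch of a pre-flight check, assuming PyTorch is the relevant framework and that the evaluation submodule lives in a directory you pass in; `check_environment` and the example path are hypothetical, not part of dLLM.

```python
import importlib.util
from pathlib import Path


def check_environment(submodule_dir):
    """Report the PyTorch/CUDA pairing and whether a submodule directory is populated."""
    report = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if report["torch_installed"]:
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_build"] = torch.version.cuda  # None on CPU-only builds
        report["cuda_available"] = torch.cuda.is_available()
    path = Path(submodule_dir)
    # An uninitialized git submodule usually shows up as a missing or empty directory.
    report["submodule_populated"] = path.is_dir() and any(path.iterdir())
    return report


# "lm-evaluation-harness" is a hypothetical path; use the repo's actual submodule location.
print(check_environment("lm-evaluation-harness"))
```

If `submodule_populated` comes back false, `git submodule update --init --recursive` is the standard Git command for fetching submodules.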

Verdict

Worth a look if you’re experimenting with non-autoregressive text generation or need a standardized baseline to compare diffusion architectures. Skip it if you want battle-tested, drop-in replacements for production autoregressive APIs.
