jianzhnie/LLamaTuner
An efficient fine-tuning toolkit for large language models, supporting QLoRA, RLHF, and DPO across various LLM architectures.

LLamaTuner provides efficient fine-tuning for large language models including Llama, Qwen, ChatGLM, and Mixtral. It supports QLoRA, which trains low-rank adapters on top of a frozen, quantized base model, and preference-alignment methods such as RLHF and DPO. It also integrates with DeepSpeed for ZeRO optimization across multi-node setups. The toolkit enables fine-tuning 7B models on a single 8 GB GPU and also supports distributed training for models exceeding 70B parameters.
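
To give a rough sense of why a 7B model can fit on a small GPU with QLoRA, the sketch below compares weight storage at 16-bit and 4-bit precision and counts the trainable adapter parameters. It is a back-of-the-envelope estimate only, not LLamaTuner code: the layer count, hidden size, adapter rank, and choice of adapted projections are illustrative assumptions, and the estimate ignores activations, optimizer state, quantization constants, and framework overhead.

```python
# Back-of-the-envelope memory estimate; not part of LLamaTuner.
# All model dimensions below are illustrative assumptions.

GIB = 1024 ** 3


def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Memory needed to store `num_params` weights at the given precision."""
    return num_params * bits_per_param / 8 / GIB


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in one LoRA adapter.

    A LoRA adapter adds two matrices: A with shape (rank, d_in)
    and B with shape (d_out, rank).
    """
    return rank * (d_in + d_out)


base_params = 7_000_000_000  # nominal "7B" parameter count

fp16_gib = weight_memory_gib(base_params, 16)  # frozen weights in 16-bit
nf4_gib = weight_memory_gib(base_params, 4)    # frozen weights in 4-bit

# Hypothetical setup: 32 layers, hidden size 4096, rank-64 adapters
# on two square (4096 x 4096) projections per layer.
adapter_params = 32 * 2 * lora_param_count(4096, 4096, 64)
adapter_gib = weight_memory_gib(adapter_params, 16)

print(f"16-bit base weights: {fp16_gib:.2f} GiB")
print(f"4-bit base weights:  {nf4_gib:.2f} GiB")
print(f"adapter parameters:  {adapter_params:,}")
print(f"adapter weights:     {adapter_gib:.4f} GiB")
```

Under these assumptions the 4-bit base weights take roughly a quarter of the 16-bit footprint, and the trainable adapters are well under 1% of the base parameter count. The remaining GPU memory goes to activations, gradients, and optimizer state for the adapters, which this estimate does not model.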