Xirider/finetune-gpt2xl
A guide for fine-tuning GPT2-XL and GPT-NEO language models on a single GPU using Huggingface Transformers and DeepSpeed.

This repository provides step-by-step instructions for fine-tuning large language models that would not otherwise fit in the memory of a single GPU. It uses the Huggingface Transformers library together with DeepSpeed for memory optimization, plus gradient checkpointing, to reduce VRAM requirements. The guide also includes instructions for setting up a Google Cloud VM with a V100 GPU for those without sufficient local hardware.
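To make the memory-optimization idea concrete, here is a minimal sketch of the kind of DeepSpeed configuration such a setup might use. It is not taken from the repository: the helper name `build_ds_config`, the ZeRO stage, and the batch-size values are illustrative assumptions, and the repository's actual settings may differ.

```python
import json


def build_ds_config(offload_optimizer_to_cpu: bool = True) -> dict:
    """Build a DeepSpeed config dict.

    Hypothetical helper for illustration; all values below are assumptions,
    not the repository's settings.
    """
    # ZeRO stage 2 partitions optimizer states and gradients.
    zero = {"stage": 2}
    if offload_optimizer_to_cpu:
        # Keep optimizer states in CPU RAM to free GPU memory.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
    return {
        "fp16": {"enabled": True},            # mixed-precision training
        "zero_optimization": zero,
        "train_micro_batch_size_per_gpu": 1,  # assumed small batch to limit VRAM
        "gradient_accumulation_steps": 8,     # assumed; raises the effective batch size
    }


config = build_ds_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

Saved to a JSON file, a config like this can typically be passed to the Transformers `Trainer` through `TrainingArguments(deepspeed=..., gradient_checkpointing=True)`. Gradient checkpointing trades extra computation for lower activation memory, while CPU offload moves optimizer state out of VRAM.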