
okuvshynov/slowllama

A fine-tuning tool for Llama2 and CodeLlama models, including the 70B/35B variants, that runs on a MacBook Air or consumer GPUs without quantization.

449 stars · Python · Language Models · ML Frameworks
Velocity (7d): +0.4 ★/day · Trend: steady

slowllama enables fine-tuning of large language models on memory-constrained devices by offloading model components to SSD or main memory during both forward and backward passes. It uses LoRA (Low-Rank Adaptation) to limit parameter updates to a smaller set of weights, making training feasible on limited hardware. The project supports Llama2 and CodeLlama variants up to 70B parameters on Apple M1/M2 devices and consumer NVIDIA GPUs.
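
To make the LoRA idea above concrete, here is a minimal sketch in plain Python. It is not slowllama's code: the helper names (`matvec`, `lora_param_counts`, `lora_forward`), the tiny matrix sizes, the rank of 8, and the 4096 x 4096 layer shape are all assumptions chosen for illustration. It shows a frozen weight matrix `W` with a trainable low-rank update `B @ A` added on top, and how few parameters that update needs.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_param_counts(d_out, d_in, rank):
    """Return (frozen, trainable) parameter counts for one adapted matrix."""
    frozen = d_out * d_in               # full weight matrix W, never updated
    trainable = rank * (d_in + d_out)   # A is rank x d_in, B is d_out x rank
    return frozen, trainable

def lora_forward(W, A, B, x, scale=1.0):
    """Compute y = W x + scale * B (A x); only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d_out, d_in, rank = 4, 6, 2  # toy sizes, for illustration only
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]  # zero init: adapter starts as a no-op
x = [1.0] * d_in

y = lora_forward(W, A, B, x)

# Hypothetical 4096 x 4096 projection with rank 8.
frozen, trainable = lora_param_counts(4096, 4096, 8)
print(f"frozen={frozen}, trainable={trainable}, ratio={trainable / frozen:.4%}")
```

Because only `A` and `B` receive gradients, the frozen weights can stay on SSD or in main memory and be loaded block by block, which is the pairing of offloading and LoRA that the description refers to.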
