Is qlora open source?

Yes — artidoro/qlora is open source, released under the MIT license.

What language is qlora written in?

artidoro/qlora is primarily written in Jupyter Notebook.

How popular is qlora?

artidoro/qlora has 10.9k stars on GitHub.

Where can I find qlora?

artidoro/qlora is on GitHub at https://github.com/artidoro/qlora.

artidoro/qlora

Fine-tune a 65B model on one GPU without selling your house

QLoRA squeezes giant language models into consumer hardware by backpropagating through frozen 4-bit weights using LoRA adapters.

★ 10.9k stars · Jupyter Notebook · Language Models · ML Frameworks

What it does

QLoRA lets you fine-tune large language models, up to 65B parameters, on a single 48GB GPU by keeping the base model frozen and quantized to 4 bits, then training small Low-Rank Adaptation (LoRA) adapter layers on top. It builds on the bitsandbytes quantization library and plugs into Hugging Face's PEFT and transformers libraries. The repo includes training scripts, Colab notebooks, and the Guanaco family of fine-tuned model weights.
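To make the "frozen base plus small trainable adapter" idea concrete, here is a minimal pure-Python sketch of a LoRA-style linear layer. It is an illustration, not the repo's code: the class name `LoRALinear`, the helper `matvec`, and the initialization scale are assumptions chosen for readability, and the 4-bit quantization of the base weight is left out.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoRALinear:
    """Frozen base weight W plus a trainable low-rank update (alpha / r) * B @ A.

    Hypothetical illustration class; real adapters live in the PEFT library.
    """

    def __init__(self, w, r, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(w), len(w[0])
        self.w = w                  # frozen: never updated during fine-tuning
        self.scale = alpha / r
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until training moves B away from zero.
        self.a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.b = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.w, x)                     # frozen path
        update = matvec(self.b, matvec(self.a, x))   # low-rank path: B @ (A @ x)
        return [y + self.scale * u for y, u in zip(base, update)]

    def trainable_params(self):
        # Only A (r x d_in) and B (d_out x r) are trained.
        return len(self.a) * len(self.a[0]) + len(self.b) * len(self.b[0])


w = [[1.0, 2.0, 3.0, 4.0],
     [0.5, -1.0, 0.0, 2.0],
     [0.0, 0.0, 1.0, 1.0]]
layer = LoRALinear(w, r=2, alpha=16)
print(layer.forward([1.0, 0.0, -1.0, 2.0]))  # equals the base output while B is zero
print(layer.trainable_params())
```

The toy matrix is too small for the adapter to save anything. The saving shows up at realistic sizes: for a 4096 x 4096 weight and r = 64, the adapter has 64 * (4096 + 4096) = 524,288 trainable values against 16,777,216 frozen ones.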

The interesting bit

The trick is a stack of memory hacks that sound absurd but apparently work: a custom 4-bit "NormalFloat" (NF4) data type that is information-theoretically optimal for normally distributed weights, double quantization (quantizing the quantization constants themselves), and paged optimizers that move optimizer state to CPU memory when GPU memory spikes. The authors report that this matches full 16-bit fine-tuning performance while cutting memory enough to train models that normally need multi-GPU setups.
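The idea behind a normal-shaped 4-bit data type can be sketched with the standard library alone. The code below is a simplified stand-in, not the actual NF4 table from bitsandbytes: it builds 16 levels from evenly spaced quantiles of a standard normal distribution and applies blockwise absmax scaling. The function names are hypothetical, and unlike real NF4 this codebook has no exact zero level.

```python
from statistics import NormalDist


def normal_quantile_codebook(bits=4):
    """Simplified codebook: evenly spaced N(0, 1) quantiles rescaled to [-1, 1]."""
    n = 2 ** bits
    nd = NormalDist()
    qs = [nd.inv_cdf((i + 0.5) / n) for i in range(n)]
    top = max(abs(q) for q in qs)
    return [q / top for q in qs]


def quantize_block(block, codebook):
    """Scale one block by its absolute maximum, then keep the nearest level's index."""
    absmax = max(abs(v) for v in block) or 1.0   # avoid dividing by zero
    idx = [
        min(range(len(codebook)), key=lambda k: abs(codebook[k] - v / absmax))
        for v in block
    ]
    return idx, absmax   # 4-bit indices plus one scaling constant per block


def dequantize_block(idx, absmax, codebook):
    """Recover approximate values from the indices and the block constant."""
    return [codebook[i] * absmax for i in idx]


codebook = normal_quantile_codebook()
idx, absmax = quantize_block([0.1, -0.5, 0.25, 0.9], codebook)
print(idx, absmax)
print(dequantize_block(idx, absmax, codebook))
```

Because the levels are denser near zero, where normally distributed weights cluster, rounding error is spent where most values live. The per-block `absmax` constant is what double quantization then compresses.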

Key highlights

Fine-tune 65B models on one 48GB GPU; 7B/13B models run in free Colab tiers
Ships with Guanaco model family (7B–65B) trained on OpenAssistant data, plus evaluation scripts using GPT-4 and human ratings
Supports LLaMA, LLaMA 2, and T5; multi-GPU training via Accelerate with device_map='auto'
Includes inference and fine-tuning Colab notebooks, Gradio demo hosting, and reproduction scripts for Guanaco hyperparameters
MIT-licensed code; Guanaco weights require LLaMA license compliance
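The 65B-on-48GB claim is easier to judge with some back-of-the-envelope arithmetic. The sketch below estimates storage for the frozen base weights only. The block sizes are assumptions taken from the defaults described in the QLoRA paper (64 weights per block with a 32-bit constant, and 8-bit second-level constants in groups of 256), and the function names are illustrative.

```python
def bits_per_param(weight_bits=4, block=64, const_bits=32,
                   double_quant=False, dq_bits=8, dq_block=256):
    """Average storage per weight, including the quantization constants."""
    if double_quant:
        # First-level constants shrink to dq_bits each; one const_bits
        # second-level constant is shared by dq_block first-level constants.
        overhead = dq_bits / block + const_bits / (block * dq_block)
    else:
        overhead = const_bits / block
    return weight_bits + overhead


def weight_gb(n_params, bits):
    """Storage in GB (10**9 bytes) for n_params weights at the given bit width."""
    return n_params * bits / 8 / 1e9


n = 65e9
print(weight_gb(n, 16))                                 # 16-bit baseline: 130 GB
print(weight_gb(n, bits_per_param()))                   # 4-bit, plain constants
print(weight_gb(n, bits_per_param(double_quant=True)))  # 4-bit, double quantization
```

Under these assumptions the constants cost 0.5 bits per weight without double quantization and about 0.127 bits with it. Activations, adapter weights, gradients, and optimizer state come on top of the base-weight figure, so this is a lower bound on what the GPU has to hold.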

Caveats

4-bit inference is currently slow — not integrated with optimized 4-bit matmul kernels
fp16 as the compute dtype can destabilize training (only ~80% of 7B LLaMA fine-tuning runs complete without error); bfloat16 is the recommended compute dtype, and NF4 is the recommended quantization type (a separate setting)
Resuming LoRA training runs not supported by Hugging Face Trainer
Adding new tokens requires manual embedding updates and storage/reload workaround
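The compute dtype and the quantization type from the caveats above are two separate settings, which a short configuration sketch can make explicit. This assumes the Hugging Face transformers `BitsAndBytesConfig` class and its `bnb_4bit_*` arguments. The helper names `bnb_4bit_kwargs` and `build_bnb_config` are hypothetical, and the defaults shown are one reasonable choice rather than the repo's exact settings.

```python
def bnb_4bit_kwargs(compute_dtype="bfloat16", quant_type="nf4", double_quant=True):
    """Plain-Python keyword arguments for a 4-bit quantization config."""
    if compute_dtype not in {"bfloat16", "float16", "float32"}:
        raise ValueError(f"unsupported compute dtype: {compute_dtype}")
    if quant_type not in {"nf4", "fp4"}:
        raise ValueError(f"unsupported quantization type: {quant_type}")
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": quant_type,          # how weights are stored
        "bnb_4bit_use_double_quant": double_quant,  # quantize the constants too
        "bnb_4bit_compute_dtype": compute_dtype,    # dtype used for the matmuls
    }


def build_bnb_config(**overrides):
    """Build the real config object; requires torch and transformers.

    The imports are deferred so bnb_4bit_kwargs works without either library.
    """
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = bnb_4bit_kwargs(**overrides)
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)


print(bnb_4bit_kwargs())
```

With the libraries installed, the object returned by `build_bnb_config()` would typically be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained`. Passing `compute_dtype="float16"` reproduces the setup the caveat warns about.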

Verdict

Researchers and practitioners who need to fine-tune large models on limited hardware should grab this. If you already have an A100 cluster or only need inference, the rough edges around 4-bit speed and stability make it less compelling — though the pre-trained Guanaco weights and evaluation tools are still useful for benchmarking.
