darrencxl0301/StageRAG

A lightweight RAG framework offering switchable 3-step (Speed) and 4-step (Precision) pipelines built on quantized Llama 3.2 1B/3B models.

Velocity (7d): +4.2 ★/day · Trend: steady

StageRAG is a production-ready framework for building hallucination-resistant RAG applications. It offers two pipeline modes so users can choose between speed and precision based on their needs: a 3-step Speed mode (~3-5 s) and a 4-step Precision mode (~6-12 s). Knowledge bases are supplied as JSONL files, from which the framework automatically builds vector indices for retrieval. A multi-component confidence score detects uncertainty and helps reduce hallucinations. It runs on quantized Llama 3.2 1B and 3B models and requires 5-10 GB of GPU memory.
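To make the retrieval and confidence ideas concrete, here is a minimal standalone sketch in plain Python. It does not use StageRAG's API. The `question`/`answer` field names, the bag-of-words "embedding", the helper names, and the 0.7/0.3 confidence weights are all assumptions chosen for illustration. A real setup would use a proper embedding model and the framework's own scoring.

```python
import json
import math
from collections import Counter

# Hypothetical JSONL knowledge base; the field names are assumed, not StageRAG's schema.
KB_JSONL = """\
{"question": "How do I reset my password?", "answer": "Use the reset link on the login page."}
{"question": "What are your support hours?", "answer": "Support is available 9am to 5pm on weekdays."}
{"question": "How do I cancel my subscription?", "answer": "Go to billing settings and choose cancel."}
"""


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def embed(text):
    # Toy bag-of-words vector standing in for a real embedding model.
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def load_kb(jsonl_text):
    # One JSON object per non-empty line.
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def build_index(records):
    # The "index" is just a list of (vector, record) pairs.
    return [(embed(r["question"]), r) for r in records]


def retrieve(index, query, k=2):
    q = embed(query)
    scored = [(cosine(q, vec), rec) for vec, rec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def confidence(hits):
    # Two illustrative components: top similarity, and its margin over the runner-up.
    if not hits:
        return 0.0
    top = hits[0][0]
    runner_up = hits[1][0] if len(hits) > 1 else 0.0
    return 0.7 * top + 0.3 * (top - runner_up)


index = build_index(load_kb(KB_JSONL))
hits = retrieve(index, "how do I reset my password")
print(hits[0][1]["answer"], round(confidence(hits), 2))
```

A query with no overlap with the knowledge base scores zero on both components, which is the kind of signal a pipeline could use to decline to answer rather than guess.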