PrithivirajDamodaran/FlashRank
Ultra-lightweight Python library for re-ranking search results in RAG pipelines using cross-encoders and LLM-based listwise rerankers, running on CPU without Torch or Transformers.

FlashRank provides state-of-the-art pairwise and listwise re-ranking for search and retrieval pipelines. It offers two kinds of rerankers: cross-encoder-based rerankers for fast pointwise/pairwise scoring of each query-passage pair, and LLM-based listwise rerankers that order the whole candidate list at once. The library is designed as a lightweight (~4 MB models), CPU-friendly addition to RAG pipelines and runs without heavy ML framework dependencies such as Torch or Transformers. Users pass their initial retrieval results through a reranker and hand the reordered results to the downstream LLM, which improves answer quality.
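The retrieve, rerank, then generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not FlashRank's API: `overlap_score` and `rerank` are hypothetical helpers, and the token-overlap score is only a toy stand-in for a real cross-encoder model. The passage dictionaries and their `id`/`text` keys are likewise assumed for the example.

```python
from typing import Callable, Dict, List


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens that appear in the passage.

    Hypothetical stand-in for a cross-encoder; not FlashRank's scoring model.
    """
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(
    query: str,
    passages: List[Dict],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_k: int = 2,
) -> List[Dict]:
    """Score each (query, passage) pair independently, then keep the top_k."""
    scored = [{**p, "score": score_fn(query, p["text"])} for p in passages]
    scored.sort(key=lambda p: p["score"], reverse=True)
    return scored[:top_k]


# Initial retrieval results, e.g. from a vector store or BM25 (assumed data).
query = "how to speed up python code"
retrieved = [
    {"id": 1, "text": "Bananas are rich in potassium"},
    {"id": 2, "text": "Use vectorized numpy operations to speed up python code"},
    {"id": 3, "text": "Python is a programming language"},
]

# Rerank, then build the context that would be passed to the downstream LLM.
top = rerank(query, retrieved, top_k=2)
context = "\n".join(p["text"] for p in top)
print(context)
```

Passing the scoring function as a parameter keeps the pipeline shape fixed while the scorer changes, so a real reranker model could replace the toy scorer without touching the retrieval or prompt-building code.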