
zilliztech/GPTCache

Semantic cache layer for LLM query responses that reduces API costs and improves latency.

Velocity (7d): +6.9 ★/day · Trend: steady

GPTCache is a semantic cache for LLM applications: it stores responses and retrieves them by the meaning of the query rather than by exact string match. It integrates with popular LLM frameworks such as LangChain and llama_index, and uses vector similarity search to decide whether a query is a cache hit. It supports multiple vector stores, including Milvus and Redis, for scalable deployment. It can also run as a Docker-based server, which lets applications written in languages other than Python use the cache.
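
To make the cache-hit decision concrete, here is a minimal sketch of semantic caching in plain Python. It is not GPTCache's API: `SemanticCache`, `embed`, `answer`, and the 0.8 threshold are hypothetical names and values chosen for illustration, and the bag-of-words `embed` function stands in for a real embedding model and vector store.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words counts.
    A real deployment would use an embedding model instead."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by query similarity."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # minimum similarity that counts as a hit
        self.entries = []           # list of (embedding, response) pairs

    def get(self, query):
        """Return the response of the most similar stored query, or None."""
        q = embed(query)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, query, response):
        self.entries.append((embed(query), response))


def answer(query, cache, call_llm):
    """Serve from the cache when a similar query exists, else call the LLM."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    response = call_llm(query)
    cache.put(query, response)
    return response
```

The linear scan over `entries` is the part a vector store would replace with an indexed nearest-neighbor search. The threshold trades API savings against the risk of returning an answer to a query that only looks similar.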
