vectorize-io/hindsight

Your AI agent's memory, but it actually learns

Hindsight replaces conversation-history dumps with a biomimetic memory system that extracts facts, experiences, and mental models so agents improve over time.

16k stars · Python · Agents · RAG · Search
Velocity (7d): +72 ★/day · Trend: steady

What it does

Hindsight is a server-based memory layer for AI agents. You feed it text via retain, query it via recall, or ask it to synthesize insights via reflect. Behind the scenes it parses inputs into entities, relationships, and time series, then indexes them with sparse and dense vectors. Retrieval runs four strategies in parallel (semantic, keyword, graph, and temporal), merges their results with reciprocal rank fusion, and reranks the merged list with a cross-encoder. It exposes Python and Node.js SDKs, plus a two-line LLM wrapper for drop-in use.
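The fusion step can be illustrated with a minimal sketch of reciprocal rank fusion. This is not Hindsight's code: the function name, the memory IDs, and the constant k = 60 (a common default for this technique) are assumptions made for illustration, and the cross-encoder rerank is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into a single ranking.

    Each item scores 1 / (k + rank) for every list it appears in;
    k = 60 is an assumed default, not a value taken from Hindsight.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical result lists from the four retrieval strategies.
semantic = ["m3", "m1", "m7"]
keyword = ["m1", "m9", "m3"]
graph = ["m1", "m4"]
temporal = ["m7", "m1"]

fused = reciprocal_rank_fusion([semantic, keyword, graph, temporal])
print(fused)
```

An item that ranks well in several lists (here `m1`) rises to the top even if no single strategy places it first everywhere, which is why the technique suits merging retrievers whose raw scores are not comparable.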

The interesting bit

The project explicitly rejects the “chat log as memory” model. Instead it mimics human memory architecture: world facts, personal experiences, and higher-level mental models generated by reflection. The reflect operation is the unusual piece: it lets an agent form new connections and derive insights without new external input, like an AI project manager spotting risks from old notes.

Key highlights

  • Ships as a Docker container with a web UI; embedded Python mode needs no server
  • Supports OpenAI, Anthropic, Gemini, Groq, Ollama, LM Studio, and Minimax as backing LLMs
  • Claims the top score on LongMemEval, with independent reproduction by Virginia Tech and The Washington Post (other vendors' scores are self-reported)
  • Storage backends: local PostgreSQL, external PostgreSQL, or Oracle AI Database for enterprise
  • Per-user memory isolation via metadata filtering on memory banks
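
The per-user isolation in the last highlight can be pictured with a generic sketch of metadata filtering. Everything here is hypothetical: `MemoryRecord`, `filter_by_metadata`, and the `user_id` key are illustrative names, not part of Hindsight's SDK, and a real memory bank would apply the filter inside the storage query rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    """Hypothetical stored memory: text plus free-form metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def filter_by_metadata(records, **required):
    """Keep only records whose metadata matches every required key/value."""
    return [
        record
        for record in records
        if all(record.metadata.get(key) == value for key, value in required.items())
    ]


# One shared bank holding memories for two users (made-up data).
bank = [
    MemoryRecord("Prefers weekly status emails", {"user_id": "alice"}),
    MemoryRecord("Works in the Berlin office", {"user_id": "bob"}),
    MemoryRecord("Asked about the Q3 roadmap", {"user_id": "alice"}),
]

alice_only = filter_by_metadata(bank, user_id="alice")
print([record.text for record in alice_only])
```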

Caveats

  • The README’s benchmark chart shows scores “as of January 2026”—a future date, likely a typo, which undermines the precision claim
  • The “two lines of code” wrapper is advertised but not actually shown in the provided snippets
  • Heavy infrastructure for simple workflows: the authors themselves note it may be “overkill” for basic n8n-style automations

Verdict

Worth evaluating if you’re building long-running autonomous agents that need to accumulate knowledge and adapt. Skip it if you just need conversational context-window management or simple RAG.
