atomicstrata/llm-wiki-compiler

RAG forgets; this compiler remembers

A TypeScript CLI that turns raw notes and docs into a persistent, interlinked markdown wiki with citation-traceable pages.

Velocity (7d): +23 ★ / day · Trend: steady

What it does

llmwiki ingests raw sources—papers, notes, READMEs—and compiles them into a typed markdown wiki (concept, entity, comparison, overview) with paragraph- and claim-level citations back to source line ranges. It then runs hybrid retrieval (semantic embeddings + BM25 reranking + wikilink-graph expansion) over the compiled artifact, not the raw chunks. A local web viewer, eval harness with CI-gateable thresholds, and an MCP server for Claude/Cursor/etc. are included.
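To make the three retrieval stages concrete, here is a minimal sketch of semantic recall, BM25 reranking, and one hop of wikilink expansion. It is not the project's code: the `WikiPage` shape, the `hybridSearch` name, the candidate pool size, and the BM25 parameters (`k1 = 1.2`, `b = 0.75`) are all assumptions chosen for illustration.

```typescript
// Hypothetical page shape; the real project's types may differ.
interface WikiPage {
  id: string;
  text: string;
  embedding: number[];
  links: string[]; // ids of pages referenced via wikilinks
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Plain BM25 scored over the candidate set only (assumed parameters).
function bm25(query: string, docs: WikiPage[], k1 = 1.2, b = 0.75): Map<string, number> {
  const tokens = docs.map((d) => tokenize(d.text));
  const avgLen = tokens.reduce((n, t) => n + t.length, 0) / Math.max(docs.length, 1);
  const scores = new Map<string, number>();
  docs.forEach((doc, i) => {
    let score = 0;
    for (const term of new Set(tokenize(query))) {
      const tf = tokens[i].filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = tokens.filter((t) => t.includes(term)).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * tokens[i].length) / avgLen));
    }
    scores.set(doc.id, score);
  });
  return scores;
}

function hybridSearch(
  query: string,
  queryEmbedding: number[],
  pages: WikiPage[],
  topK = 2,
): string[] {
  // 1. Semantic recall: keep the 2 * topK nearest pages by cosine similarity.
  const candidates = [...pages]
    .sort((x, y) => cosine(queryEmbedding, y.embedding) - cosine(queryEmbedding, x.embedding))
    .slice(0, topK * 2);
  // 2. Lexical rerank of those candidates with BM25.
  const lexical = bm25(query, candidates);
  const ranked = candidates
    .sort((x, y) => (lexical.get(y.id) ?? 0) - (lexical.get(x.id) ?? 0))
    .slice(0, topK);
  // 3. Graph expansion: append pages one wikilink away from the ranked hits.
  const known = new Set(pages.map((p) => p.id));
  const result = new Set(ranked.map((p) => p.id));
  for (const page of ranked) {
    for (const link of page.links) {
      if (known.has(link)) result.add(link);
    }
  }
  return [...result];
}
```

The point of the third stage is that a linked page can be returned even when it matches the query neither semantically nor lexically, which only works because the links were compiled ahead of time.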

The interesting bit

The project explicitly inverts the RAG pattern: instead of re-discovering relationships at query time, it compiles knowledge once into a persistent artifact that compounds. Saved queries become new wiki pages, so the knowledge base grows smarter with use. The two-phase compile (extract concepts, then generate pages) eliminates order-dependence and merges duplicates across sources before anything hits disk.
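The two-phase idea can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the `Extractor` and `Generator` callbacks stand in for LLM calls, and `slugify`, `extractConcepts`, and `compile` are hypothetical names. Phase 1 collects concepts from every source and merges duplicates by normalized name; phase 2 generates all pages in memory, so a failure surfaces before any file would be written.

```typescript
interface Source { path: string; text: string }
interface Concept { name: string; sources: string[] }
interface Page { slug: string; body: string }

// Stand-ins for the LLM calls (assumed signatures).
type Extractor = (source: Source) => string[];
type Generator = (concept: Concept) => Page;

const slugify = (name: string): string =>
  name.trim().toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");

// Phase 1: extract concepts globally and merge duplicates across sources.
function extractConcepts(sources: Source[], extract: Extractor): Concept[] {
  // Sort by path first so the result does not depend on input order.
  const ordered = [...sources].sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  const merged = new Map<string, Concept>();
  for (const source of ordered) {
    for (const name of extract(source)) {
      const key = slugify(name);
      const existing = merged.get(key);
      if (!existing) {
        merged.set(key, { name, sources: [source.path] });
      } else if (!existing.sources.includes(source.path)) {
        existing.sources.push(source.path);
      }
    }
  }
  return [...merged.values()];
}

// Phase 2: generate every page in memory; the caller writes the returned
// map to disk only if this function returns without throwing.
function compile(sources: Source[], extract: Extractor, generate: Generator): Map<string, string> {
  const concepts = extractConcepts(sources, extract);
  const pages = concepts.map(generate);
  return new Map(pages.map((p): [string, string] => [`${p.slug}.md`, p.body]));
}
```

Because nothing is written until `compile` returns, a generator failure on the last concept leaves the existing wiki untouched.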

Key highlights

  • Two-phase LLM pipeline extracts concepts globally, then generates pages; failures are caught before anything is written, and duplicates are merged
  • Citation-traceable to line ranges with llmwiki lint validation and llmwiki eval scoring (0–100 health, precision, optional LLM-as-judge)
  • Incremental everywhere: hash-based source change detection, content-hash-aware embedding updates, cached citation judgements
  • Provider-portable: Anthropic (default), OpenAI-compatible (incl. local llama.cpp/vLLM), Ollama, GitHub Copilot
  • MCP server exposes get_context_pack for budgeted, citation-aware evidence packs to any MCP-compatible agent
  • Export bridge to @atomicmemory/llmwiki runtime memory system
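Hash-based source change detection, mentioned in the highlights above, generally looks like the sketch below. The manifest format and the `detectChanges` name are assumptions for illustration; only `createHash` from Node's built-in `node:crypto` module is a real API. Each source is hashed with SHA-256 and compared against the hash recorded on the previous run, so only new or modified sources need recompiling.

```typescript
import { createHash } from "node:crypto";

// Hypothetical manifest: source path -> content hash from the last compile.
type Manifest = Record<string, string>;

const sha256 = (content: string): string =>
  createHash("sha256").update(content, "utf8").digest("hex");

function detectChanges(
  files: Record<string, string>, // path -> current content
  previous: Manifest,
): { changed: string[]; removed: string[]; next: Manifest } {
  const next: Manifest = {};
  const changed: string[] = [];
  for (const [path, content] of Object.entries(files)) {
    next[path] = sha256(content);
    // New files have no previous hash, so they count as changed too.
    if (previous[path] !== next[path]) changed.push(path);
  }
  const removed = Object.keys(previous).filter((path) => !(path in files));
  return { changed, removed, next };
}
```

A caller would persist `next` after a successful compile and pass it back as `previous` on the following run.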

Caveats

  • GitHub Copilot provider lacks an embeddings endpoint; semantic search falls back to full-index selection without vectors
  • Per-concept prompt budget defaults to ~50k tokens; truncation warnings print to stderr when popular shared concepts hit the cap
  • An Anthropic auth token or API key is required for the default provider; local servers need dummy API keys to satisfy SDK requirements
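The prompt-budget caveat is easier to picture with a small sketch. Everything here is assumed rather than taken from the project: the `Excerpt` shape, the `fitToBudget` name, the warning text, and the rough four-characters-per-token estimate. It keeps source excerpts for one concept until the budget would be exceeded, then warns on stderr and drops the rest.

```typescript
interface Excerpt { source: string; text: string }

// Rough heuristic (assumption): about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function fitToBudget(concept: string, excerpts: Excerpt[], budget = 50_000): Excerpt[] {
  const kept: Excerpt[] = [];
  let used = 0;
  for (const excerpt of excerpts) {
    const cost = estimateTokens(excerpt.text);
    if (used + cost > budget) {
      // console.warn writes to stderr in Node.
      console.warn(
        `warning: "${concept}" hit the ${budget}-token budget; ` +
          `kept ${kept.length} of ${excerpts.length} excerpts`,
      );
      break;
    }
    kept.push(excerpt);
    used += cost;
  }
  return kept;
}
```

A concept cited by many sources is the likely case to hit the cap, since each citing source contributes its own excerpt.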

Verdict

Worth a look if you’re drowning in bookmarks, research papers, or scattered team docs and want a browsable, growing knowledge base rather than a graveyard of tabs. Skip it if you need real-time retrieval over rapidly changing corpora—traditional RAG still wins there.
