Your documents, compiled once, queried forever
A desktop app that turns Andrej Karpathy's LLM wiki pattern into a persistent, self-organizing knowledge base with graph analysis and a two-step ingest pipeline.

What it does
LLM Wiki is a cross-platform desktop app that reads your documents and incrementally builds a structured, interlinked wiki, instead of re-reading everything on every query the way a chatbot does. It follows Andrej Karpathy’s three-layer architecture (raw sources → LLM-generated wiki → schema/rules) and keeps the output as plain Markdown with YAML frontmatter and [[wikilinks]], fully compatible with Obsidian.
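To make the output format concrete, here is a minimal sketch of reading one such page: it separates the YAML frontmatter from the body and collects the [[wikilink]] targets. The frontmatter fields (title, type) and the sample page are hypothetical, chosen for illustration; the app's real schema is not described in the sources.

```python
import re

# Matches [[Target]], [[Target#Heading]] and [[Target|alias]]; captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def split_frontmatter(text):
    """Return (frontmatter_text, body) for a page that starts with a '---' fence."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # unterminated fence: treat everything as body


def extract_wikilinks(body):
    """Return link targets in order of first appearance, without duplicates."""
    seen = []
    for match in WIKILINK.finditer(body):
        target = match.group(1).strip()
        if target and target not in seen:
            seen.append(target)
    return seen


# Hypothetical page; the field names are assumptions, not the app's schema.
page = """---
title: Attention
type: concept
---
Introduced in [[Transformer]] models; see also [[Self-Attention|self-attention]] and [[Transformer]].
"""

frontmatter, body = split_frontmatter(page)
links = extract_wikilinks(body)
```

Because the pages are plain text, this kind of parsing needs nothing beyond the standard library, which is much of the appeal of the Obsidian-compatible format.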
The interesting bit
The ingest pipeline splits work into two sequential LLM calls: first an analysis pass that extracts entities, spots contradictions with existing knowledge, and plans structure; then a generation pass that writes pages, updates cross-references, and queues review items. A SHA256 cache skips unchanged files, and a persistent disk-backed queue with retry logic keeps the process serial and crash-resistant. The app also builds a navigable knowledge graph with a four-signal relevance model (direct links, source overlap, Adamic-Adar, type affinity) and runs Louvain community detection to surface clusters and gaps you didn’t know you had.
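The cache idea is simple enough to sketch. The snippet below is an illustration of hash-based change detection, not the app's actual code: the IngestCache class, its method names, and the JSON cache file are all assumptions. A file is re-ingested only when its SHA256 digest differs from the one recorded after the last successful ingest.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large documents are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class IngestCache:
    """Hypothetical on-disk map of source path -> digest at last successful ingest."""

    def __init__(self, cache_path):
        self.cache_path = Path(cache_path)
        self.entries = {}
        if self.cache_path.exists():
            self.entries = json.loads(self.cache_path.read_text(encoding="utf-8"))

    def needs_ingest(self, source_path):
        # New files have no entry; changed files have a different digest.
        return self.entries.get(str(source_path)) != sha256_of(source_path)

    def mark_ingested(self, source_path):
        # Record the digest only after the ingest succeeded, then persist it.
        self.entries[str(source_path)] = sha256_of(source_path)
        self.cache_path.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
```

Recording the digest only after a successful ingest means a crash mid-run leaves the file marked as pending, which fits the retry behaviour described above.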
Key highlights
- Two-step chain-of-thought ingest with incremental SHA256 cache and auto-retry queue
- Knowledge graph visualization using sigma.js + ForceAtlas2, with 4-signal relevance weighting and Louvain community detection
- Graph insights flag isolated pages, sparse communities, bridge nodes, and surprising cross-community links — each with one-click “Deep Research” triggers
- Multimodal image ingestion extracts PDF images, captions them with a vision LLM, and surfaces them in search
- Local HTTP API (127.0.0.1:19828) plus a ready-made agent skill for Claude Code / Codex integration
- Obsidian-compatible output: the wiki directory is a valid Obsidian vault with index.md, log.md, and overview.md auto-maintained
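Of the four relevance signals, Adamic-Adar is the least self-explanatory, so here is a small sketch of the standard index: two pages score higher when they share neighbors, and a shared neighbor counts for more when it has few links of its own. The graph below and the helper names are made up for illustration; how the app weights this signal against the other three is not stated in the sources.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map from (page, page) link pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def adamic_adar(adjacency, u, v):
    """Sum 1 / log(degree) over the neighbors that u and v have in common."""
    shared = adjacency.get(u, set()) & adjacency.get(v, set())
    # A shared neighbor of two distinct pages has degree >= 2, so log() is positive;
    # the guard only protects against malformed input such as u == v.
    return sum(
        1.0 / math.log(len(adjacency[z]))
        for z in shared
        if len(adjacency[z]) > 1
    )


# Hypothetical wiki: "Hub" links to three pages, "Niche" to only two.
edges = [("A", "Hub"), ("B", "Hub"), ("C", "Hub"), ("A", "Niche"), ("B", "Niche")]
graph = build_adjacency(edges)

score_ab = adamic_adar(graph, "A", "B")  # shares Hub and Niche
score_ac = adamic_adar(graph, "A", "C")  # shares only Hub
```

Here A and B outscore A and C, since they share the rarely linked Niche page in addition to the busier Hub page.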
Caveats
- The README is extensive but leaves some operational details unclear: exact LLM API costs for large libraries, whether vector search (LanceDB) is on by default, and how the app handles very large files in the ingest queue
- Deep Research depends on external search APIs (Tavily, SerpApi, or SearXNG) — configuration burden and rate limits are not discussed
- “Cross-platform” is claimed but specific OS support and installation packaging are not detailed in the provided sources
Verdict
Worth a look if you maintain a large personal research corpus and want structured output you can browse offline, not just chat logs. Skip it if you need real-time collaborative editing or already have a heavily customized Zotero/Obsidian workflow you’re happy with.