The Compile-Time Case for Remembering Technical Books

Staff Writer

An open-source Claude Code skill that converts PDFs and EPUBs into structured, on-demand frameworks—extracting signal once so you can reason with it later.

virgiliojr94/book-to-skill

6.7k stars · 7-day velocity: +282 stars/day (accelerating)

The Shelf-to-Skill Problem

Technical books decay in human memory faster than their file formats. You finish a dense volume on distributed systems, mark a few passages, and within a quarter you cannot recall whether chapter seven addressed consensus algorithms or replication strategies. The usual workarounds are equally unsatisfying. Searching a PDF returns page numbers, not synthesized answers. Asking a general-purpose chatbot risks hallucinated chapter titles or quotes averaged from internet discussion rather than the text you own. Hand-written notes ossify into hundred-line documents that never get reopened. The knowledge is technically on your hard drive, practically out of reach, and epistemologically untrustworthy once a model starts improvising.

This is the precise frustration that book-to-skill targets. The project, which has gained traction among Claude Code users, treats a technical book not as a document to be retrieved but as a codebase of mental models to be compiled. It accepts a wide range of source formats, among them PDF, EPUB, DOCX, Markdown, HTML, RTF, and MOBI, and produces a structured skill that lives inside the Claude Code environment. The premise is that a 400-page volume, which might cost roughly 200,000 tokens if injected raw into a conversation, should instead become a lightweight, callable extension of your workflow. A LinkedIn post describing the tool summarized the appeal succinctly: the author effectively sits beside you while you work rather than waiting in a separate browser tab to be queried.

Compile-Time Extraction vs. Query-Time Retrieval

The project’s core technical bet is a deliberate rejection of retrieval-augmented generation. RAG systems split a book into chunks, embed the fragments, and at query time surface whatever vector similarity judges relevant. It is a search technology, optimized to find the paragraph that talks about X. The project’s documentation argues this misses the point. An author does not merely describe concepts; they crystallize years of experience into named frameworks, specific anti-patterns, and calibrated decision tables. A similarity search over sentences returns proximity, not structure. When you ask about replication, you do not want the three pages that mention the word most often; you want the author’s specific framework for choosing between leader-follower and multi-leader topologies.
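The proximity-versus-structure point can be made concrete with a toy sketch. It stands in for a RAG retriever using bag-of-words cosine similarity rather than learned embeddings, an assumption made only to keep the example self-contained. The chunk texts and the `vectorize`, `cosine`, and `retrieve` helpers are hypothetical and are not taken from the project.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words vector; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = vectorize(query)
    return sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)[:k]


# Hypothetical chunks: one repeats the keyword, the other holds the decision rule.
chunks = [
    "Replication copies data. Replication lag matters. Replication is everywhere.",
    "Use leader-follower when writes are centralized; use multi-leader for multi-region writes.",
]
top = retrieve("replication", chunks)[0]
```

In this toy setup the keyword-heavy chunk ranks first, while the chunk that actually carries the decision rule scores zero because it never uses the query word.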

Instead, book-to-skill operates at compile time. A single deep analysis pass ingests the source document, identifies the table of contents and metadata, and generates a suite of interlinked files: a master SKILL.md containing core mental models; per-chapter summaries that load only when invoked; a glossary of key terms with chapter references; a patterns file capturing techniques and algorithms; and a cheatsheet of decision tables. The extraction pipeline even selects its parser based on content characteristics, using Docling when it detects technical material rich in tables and code blocks, and falling back to faster prose extractors for text-heavy volumes. A benchmark cited in the project’s notes illustrates the trade-off: for a 103-page technical book, a fast text extractor finishes in tenths of a second but destroys tabular and code formatting, while the document-aware parser takes roughly 164 seconds yet preserves dozens of tables and code blocks as structured markdown. The human pays the cost once, during compilation, so that every subsequent query reasons over clean signal rather than noisy raw text.
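The parser-selection step described above might look something like the following sketch. The article does not show the project's actual detection logic, so the heuristic here (share of lines that look like code or table rows), the `choose_parser` and `looks_structured` names, the 5% threshold, and the two labels are all assumptions for illustration. No real parser library is called.

```python
import re

# Hypothetical signal: lines that start like source code.
CODE_HINT = re.compile(r"(def |class |import |#include|[{}]$)")


def looks_structured(line):
    """Guess whether a line belongs to a code block or a table row."""
    stripped = line.strip()
    if CODE_HINT.match(stripped):
        return True
    if stripped.count("|") >= 2:  # pipe-delimited table row
        return True
    # Two or more wide gaps suggest whitespace-aligned columns.
    return len(re.findall(r"\S {2,}(?=\S)", stripped)) >= 2


def choose_parser(sample_text, threshold=0.05):
    """Pick a parser label from the share of structured lines in a sample."""
    lines = [ln for ln in sample_text.splitlines() if ln.strip()]
    if not lines:
        return "fast-text"
    share = sum(looks_structured(ln) for ln in lines) / len(lines)
    return "document-aware" if share >= threshold else "fast-text"
```

The design choice mirrors the trade-off in the benchmark: sample cheaply first, and pay for the slow document-aware pass only when the sample suggests there is structure worth preserving.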

The project’s playbooks describe three modes of operation: a full conversion that generates the complete skill suite; an analyze-only pass that previews the extracted frameworks before committing tokens; and a generate-from-analysis mode that lets users curate the intermediate output before final compilation. This tiered approach acknowledges that the deep analysis is expensive and that users should be able to inspect the signal before paying to structure it. The design principles are strict: density over completeness; a practitioner voice that says “use X when Y” rather than “the book explains X”; and front-loaded context that keeps the most important mental models within the first few thousand tokens. Chapters remain on disk until explicitly called. The result is not a searchable archive; it is a reasoning partner that understands the author’s exact formulations and can apply them.
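The "chapters remain on disk until explicitly called" behavior can be sketched as a small lazy loader. Only the `SKILL.md` file name comes from the article; the `CompiledSkill` class, the `chapters/chNN.md` naming scheme, and the method names are hypothetical, and in practice the Claude Code environment, not user code, decides what gets loaded.

```python
from pathlib import Path


class CompiledSkill:
    """Front-loads the master file; reads chapter files only on request."""

    def __init__(self, root):
        self.root = Path(root)
        # The core mental models are always in context.
        self.core = (self.root / "SKILL.md").read_text(encoding="utf-8")
        self._loaded = {}

    def chapter(self, number):
        """Load one chapter summary from disk the first time it is asked for."""
        name = f"chapters/ch{number:02d}.md"  # hypothetical naming scheme
        if name not in self._loaded:
            self._loaded[name] = (self.root / name).read_text(encoding="utf-8")
        return self._loaded[name]

    def context(self):
        """Everything currently loaded: the core plus any requested chapters."""
        return "\n\n".join([self.core, *self._loaded.values()])
```

Until `chapter()` is called, `context()` returns only the master file, which is the front-loading principle in miniature.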

The Token Economics of Deep Work

Large context windows have seduced developers into thinking that stuffing an entire corpus into a prompt is free. It is not. Context is currency, and long inputs dilute attention. The project’s playbooks explicitly estimate conversion costs before any generation begins, breaking down input and output token counts against Claude Sonnet and Haiku pricing. A dense technical book can easily consume tens of thousands of tokens during the initial compilation. The payoff is that subsequent queries become surgical: a targeted question might pull only a few thousand tokens of structured summary instead of forcing the model to reason over hundreds of thousands of raw source tokens.
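A pre-flight estimate of this kind reduces to simple arithmetic. The sketch below is an assumption about its shape, not the playbooks' actual code: `estimate_cost` is a hypothetical helper, and the per-million-token rates are placeholders rather than real Sonnet or Haiku prices. The token counts reuse the article's rough figures for a raw 400-page book and its compiled output.

```python
def estimate_cost(input_tokens, output_tokens, usd_per_million_in, usd_per_million_out):
    """Estimated cost in USD of one compilation pass."""
    total = input_tokens * usd_per_million_in + output_tokens * usd_per_million_out
    return total / 1_000_000


# Placeholder rates (NOT real pricing): 1.00 USD in, 5.00 USD out per million tokens.
compile_cost = estimate_cost(
    input_tokens=200_000,   # rough raw size of a 400-page book
    output_tokens=25_000,   # rough size of the compiled skill files
    usd_per_million_in=1.0,
    usd_per_million_out=5.0,
)
```

Running the same function with each model's current published rates gives the side-by-side breakdown a user would want before committing to a conversion.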

The FAQ section of the project makes this case bluntly. Yes, you could dump the PDF into your Claude project context, but every conversation would burn that budget upfront. With the skill approach, only the chapters relevant to your question load; the rest stays on disk. More importantly, raw text injection is retrieval. A skill is reasoning. When you load a chapter file, the model is not searching for keyword matches—it is working with pre-extracted named frameworks, principles, and mental models structured for application, not for reading. The master SKILL.md holds roughly four thousand tokens. Each chapter file sits around a thousand. The glossary, patterns, and cheatsheet add another few thousand. The entire compiled artifact for a substantial book might total roughly twenty-five thousand tokens of structured output—an order of magnitude smaller than the source, and loaded piecemeal. It is a bet on signal over noise, and it mirrors how experienced practitioners actually use reference material: they want the rule, the anti-pattern, and the decision boundary, not the introductory paragraph that originally presented them.
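The per-query arithmetic behind that claim can be worked through in a few lines. The constants are the article's approximate figures (about four thousand tokens for the master file, about a thousand per chapter, about 200,000 for a raw 400-page book); the function names are hypothetical, and glossary or cheatsheet loads are left out for simplicity.

```python
MASTER_TOKENS = 4_000      # approximate size of SKILL.md
CHAPTER_TOKENS = 1_000     # approximate size of one chapter file
RAW_BOOK_TOKENS = 200_000  # approximate size of the raw book


def query_context_tokens(chapters_loaded):
    """Tokens in context when a question pulls in a given number of chapters."""
    return MASTER_TOKENS + chapters_loaded * CHAPTER_TOKENS


def reduction_factor(chapters_loaded):
    """How many times smaller the loaded context is than the raw book."""
    return RAW_BOOK_TOKENS / query_context_tokens(chapters_loaded)


two_chapter_query = query_context_tokens(2)  # 6000 tokens
one_chapter_factor = reduction_factor(1)     # 40.0
```

Under these assumptions a question that touches a single chapter works with a context about forty times smaller than the raw text.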

Not a Library, But a Workbench

Positioning this tool against the broader landscape of document AI reveals its narrow, intentional focus. Adobe Acrobat’s AI Assistant and Nutrient.io’s Document SDK target enterprise workflows: contract analysis, form filling, redaction, and multi-file workspaces across web and mobile platforms. Solution Tree offers an AI Book Assistant tied to a specific educational title, essentially a publisher-branded tutor. NotebookLM excels when you want to synthesize insights across dozens of sources at once. Even the broader cultural trend—evidenced by tutorials on learning technologies rapidly with AI assistants—points toward speed of absorption across many sources.

book-to-skill does none of that. It is open source, runs locally inside Claude Code, and is built for the single-book deep dive. The documentation explicitly concedes that if your workflow is searching across fifty books, RAG or NotebookLM wins. But if your workflow is applying one author’s frameworks while you debug a service or design a schema, the skill format wins because it embeds that specific author’s voice and precision into your terminal session. It shines precisely where general training data fails: niche technical references, internal company documentation, recent publications, and translated works that the base model has never seen. It is less a library search engine and more a specialized copilot compiled from a single trusted source.

The Skepticism

The project is not without rough edges or critics. A LinkedIn commenter suggested that a recursive language model approach combined with an MCP layer could be even more token-efficient, loading the book into a programmatic variable and reasoning over it without pre-summarization. Another voiced direct doubt about whether this constitutes learning at all, implying that callable summaries might substitute for comprehension rather than reinforce it. The repository itself carries an “effort: high” badge, signaling that the initial compilation is computationally and financially costly. And the entire ecosystem is tethered to Claude Code; this is not a universal standard for agentic document handling.

These criticisms highlight an unresolved tension in the personal-knowledge-AI space. Pre-compiled structure saves query-time tokens and reduces hallucination, but it risks ossifying the material. If the initial extraction misses a nuance or misnames a framework, the skill will repeat that omission faithfully until recompiled. It trades the flexibility of raw text for the speed of structured reasoning. The project’s own FAQ anticipates the RAG comparison and dismisses it cleanly: RAG answers with chunks close to your query, while a skill answers with the twelve frameworks the author built, ready to reason with. That is a philosophical claim as much as a technical one. Reddit discussions and marketplace listings around document-to-skill conversion suggest the community is still experimenting with where exactly this trade-off should land.

The Outlook

What book-to-skill represents is a shift in how technically literate users relate to static documents. The PDF is no longer the artifact; the skill is. The value proposition lies in turning consumption into application—making the author’s years of crystallized expertise callable during the act of creation rather than sequestered on a digital shelf. As agentic coding environments become the default workspace, tools that compile human expertise into machine-readable structure will likely proliferate, whether they take the form of Claude skills, MCP servers, or some hybrid architecture. The underlying insight is durable: reasoning with a book requires more than access to its text. It requires understanding its architecture, and that understanding, it turns out, can be compiled.