grep's overachieving cousin now speaks embeddings
A Rust CLI that pipes PDFs and DOCX files into semantic search without leaving your terminal.

What it does
SemTools is a Rust CLI that parses documents (PDF, DOCX, PPTX) into markdown via LlamaParse, then runs local semantic search over them using static multilingual embeddings. It also bundles an AI agent (ask) that can search and read documents to answer questions, plus workspace management for caching embeddings across large collections.
The interesting bit
The design is aggressively Unix-native: everything speaks stdin/stdout, so you can chain semtools parse | xargs search | grep like any other shell filter. The search itself runs locally with model2vec embeddings and cosine similarity, so the retrieval step involves no API calls and no network latency. The workspace subcommand adds an IVF_PQ index that auto-updates when files change, which is the boring-but-valuable part that keeps repeated searches over big corpora fast.
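To make "static embeddings plus cosine similarity" concrete, here is a minimal sketch of the idea, not SemTools' actual code. It assumes a static embedding model is just a token-to-vector lookup table whose vectors are mean-pooled per line. The tiny vocabulary, the 3-dimensional vectors, and the names embed, cosine_distance, and search are all hypothetical and exist only for illustration.

```python
import math
import re

# Hypothetical toy lookup table. A real static model (e.g. a model2vec one)
# ships a far larger vocabulary with much higher-dimensional vectors.
TOKEN_VECTORS = {
    "invoice": [0.9, 0.1, 0.0],
    "payment": [0.8, 0.2, 0.1],
    "due": [0.7, 0.3, 0.0],
    "cat": [0.0, 0.1, 0.9],
    "sleeps": [0.1, 0.0, 0.8],
    "sofa": [0.0, 0.2, 0.7],
}
DIM = 3


def embed(text):
    """Mean-pool the vectors of known tokens; unknown tokens are skipped."""
    tokens = re.findall(r"[a-z]+", text.lower())
    vectors = [TOKEN_VECTORS[t] for t in tokens if t in TOKEN_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity, or 1.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def search(lines, query, max_distance=0.3, context=1):
    """Return (line_index, distance, context_lines) hits, closest first."""
    query_vector = embed(query)
    hits = []
    for i, line in enumerate(lines):
        distance = cosine_distance(query_vector, embed(line))
        if distance <= max_distance:
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            hits.append((i, distance, lines[start:end]))
    return sorted(hits, key=lambda hit: hit[1])


docs = ["the cat sleeps on the sofa", "payment is due", "see the invoice"]
for index, distance, nearby in search(docs, "invoice payment", max_distance=0.1):
    print(index, round(distance, 4), nearby)
```

Because embedding a line is only a table lookup and an average, there is no neural network forward pass at query time, which is why this style of search can run quickly on a laptop. The max_distance and context parameters here stand in for the distance threshold and per-line context the tool advertises; the real flag names may differ.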
Key highlights
- Local semantic search with per-line context matching and configurable distance thresholds
- Document parsing through LlamaParse API (cloud-backed, with caching and concurrent requests)
- The ask subcommand runs an agent loop with search/read tools, defaulting to OpenAI but accepting any OpenAI-compatible API
- Workspace mode caches embeddings in ~/.semtools/workspaces/ with automatic re-embedding on file changes
- Installs via npm or cargo; npm falls back to a local Rust build if no prebuilt binary exists
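The workspace idea in the highlights above (cache embeddings, re-embed only when a file changes) can be sketched as follows. This is an illustration under stated assumptions, not how SemTools implements it: the source does not say how changes are detected, so the sketch assumes a content hash, keeps the cache in memory rather than on disk, and uses a hypothetical EmbeddingCache class with a stand-in embedding function.

```python
import hashlib
import tempfile
from pathlib import Path


class EmbeddingCache:
    """Hypothetical cache mapping file path -> (content hash, line vectors)."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.entries = {}
        self.recomputed = 0  # how many times a file was (re-)embedded

    def get(self, path):
        data = Path(path).read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        cached = self.entries.get(str(path))
        if cached is not None and cached[0] == digest:
            return cached[1]  # file unchanged: reuse stored vectors
        # New or modified file: embed every line and overwrite the entry.
        lines = data.decode("utf-8", errors="replace").splitlines()
        vectors = [self.embed_fn(line) for line in lines]
        self.entries[str(path)] = (digest, vectors)
        self.recomputed += 1
        return vectors


# Demo with a stand-in "embedding" (line length) so the example is runnable.
with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "notes.md"
    doc.write_text("first line\nsecond line\n")
    cache = EmbeddingCache(embed_fn=lambda line: [float(len(line))])

    first = cache.get(doc)
    cache.get(doc)  # second read hits the cache
    runs_before_edit = cache.recomputed

    doc.write_text("first line\nsecond line\nthird\n")
    after_edit = cache.get(doc)  # content changed, so it is re-embedded
    runs_after_edit = cache.recomputed

print(runs_before_edit, runs_after_edit, len(first), len(after_edit))
```

Hashing content is one of several plausible invalidation strategies; comparing modification time and size would avoid reading the whole file but can miss some edits.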
Caveats
- parse requires a LlamaParse API key (free tier available); ask requires an OpenAI key; only search and workspace are fully local
- The README notes “more parsing backends (something local-only would be great!)” as explicit future work, so offline parsing isn’t here yet
- The default embedding model has 128M parameters: fast, but not the most nuanced for specialized domains
Verdict
Worth a look if you live in the terminal and want semantic search without spinning up a vector database. Skip it if you need fully offline document parsing or heavy-duty embedding models; the cloud dependencies are real.