run-llama/semtools

grep's overachieving cousin now speaks embeddings

A Rust CLI that pipes PDFs and DOCX files into semantic search without leaving your terminal.

Velocity (7d): +6.3 ★ / day · Trend: steady

What it does

SemTools is a Rust CLI that parses documents (PDF, DOCX, PPTX) into markdown via LlamaParse, then runs local semantic search over them using static multilingual embeddings. It also bundles an AI agent (ask) that can search and read documents to answer questions, plus workspace management for caching embeddings across large collections.

The interesting bit

The design is aggressively Unix-native: everything speaks stdin/stdout, so you can chain semtools parse | xargs search | grep like any other shell filter. The search itself runs locally with model2vec embeddings and cosine similarity, so the retrieval step makes no API calls and adds no network latency. The workspace subcommand adds an IVF_PQ index that auto-updates when files change, the boring-but-valuable part that keeps repeated searches over large corpora fast.
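To make the retrieval step concrete, here is a minimal sketch of cosine-distance matching over per-line embeddings with a distance cutoff. It is not SemTools' actual code: the function names `cosine_distance` and `search_lines` are hypothetical, and the vectors are assumed to come from some embedding model that is out of scope here.

```rust
/// Cosine distance between two equal-length vectors: 1 - cos(a, b).
/// Returns 1.0 (treated as "unrelated") if either vector has zero norm.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}

/// Hypothetical helper: returns (line_index, distance) for every line whose
/// embedding lies within `max_distance` of the query, closest first.
fn search_lines(query: &[f32], line_vecs: &[Vec<f32>], max_distance: f32) -> Vec<(usize, f32)> {
    let mut hits: Vec<(usize, f32)> = line_vecs
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine_distance(query, v)))
        .filter(|&(_, d)| d <= max_distance)
        .collect();
    // total_cmp gives a total order on floats, so sorting never panics on NaN.
    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    hits
}
```

Everything here is plain arithmetic over in-memory vectors, which is why this step needs no network. A brute-force scan like this is linear in the number of lines; an approximate index such as IVF_PQ trades a little accuracy to avoid scanning everything.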

Key highlights

  • Local semantic search with per-line context matching and configurable distance thresholds
  • Document parsing through LlamaParse API (cloud-backed, with caching and concurrent requests)
  • ask subcommand runs an agent loop with search/read tools, defaulting to OpenAI but accepting any OpenAI-compatible API
  • Workspace mode caches embeddings in ~/.semtools/workspaces/ with automatic re-embedding on file changes
  • Installs via npm or cargo; npm falls back to local Rust build if no prebuilt binary exists
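The "re-embed on file change" behaviour in workspace mode can be pictured with a small staleness check. This is a sketch under assumptions, not the project's implementation: `Fingerprint`, `CacheEntry`, `fingerprint`, and `needs_reembedding` are hypothetical names, and comparing file size plus modification time is just one plausible change-detection strategy (a content hash would be another).

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical change marker: file size plus last-modified time.
#[derive(Debug, Clone, PartialEq)]
struct Fingerprint {
    len: u64,
    modified: SystemTime,
}

/// Hypothetical cache record: the fingerprint seen at embedding time
/// and the per-line vectors computed from that version of the file.
struct CacheEntry {
    fingerprint: Fingerprint,
    embeddings: Vec<Vec<f32>>,
}

/// Reads the current fingerprint of a file from filesystem metadata.
fn fingerprint(path: &Path) -> io::Result<Fingerprint> {
    let meta = fs::metadata(path)?;
    Ok(Fingerprint {
        len: meta.len(),
        modified: meta.modified()?,
    })
}

/// A file needs re-embedding if it was never cached, or if its current
/// fingerprint differs from the one stored alongside the cached vectors.
fn needs_reembedding(
    cache: &HashMap<PathBuf, CacheEntry>,
    path: &Path,
    current: &Fingerprint,
) -> bool {
    match cache.get(path) {
        Some(entry) => entry.fingerprint != *current,
        None => true,
    }
}
```

Under this scheme, a search over a large collection only pays the embedding cost for files that are new or have changed since the last run; everything else is served from the cache.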

Caveats

  • parse requires a LlamaParse API key (free tier available), and ask needs an OpenAI key or another OpenAI-compatible endpoint; only search and workspace are fully local
  • The README notes “more parsing backends (something local-only would be great!)” as explicit future work, so offline parsing isn’t here yet
  • The default embedding model has 128M parameters: fast, but not the most nuanced for specialized domains

Verdict

Worth a look if you live in the terminal and want semantic search without spinning up a vector database. Skip it if you need fully offline document parsing or heavy-duty embedding models; the cloud dependencies are real.
