yomorun/yomo

Rust framework runs LLM tools at the edge, not in Virginia

YoMo wants your AI agent's weather-checking code closer to the user than the data center.

1.9k stars · Rust · Agents · LLMOps · Eval
Velocity (7d): +0.9 ★/day · Trend: steady

What it does

YoMo is a Rust-based server for hosting LLM function-calling tools. You write a TypeScript handler (say, one that fetches weather), start it with yomo run, and the framework registers it as an OpenAI-compatible tool that any LLM can invoke. The server handles routing, encrypts traffic with TLS 1.3, and can proxy to local or remote models via a YAML config.
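To make the handler shape concrete, here is a minimal sketch of what such a tool module might look like. The export names (description, Argument, handler) follow the highlights listed in this review; the argument fields, the return type, and the in-memory sample data are assumptions made for illustration and are not taken from the YoMo README.

```typescript
// Sketch of a tool module. The export names follow this review's summary;
// the exact shapes YoMo expects are assumed, not confirmed.

// Natural-language description the LLM reads when deciding whether to call the tool.
export const description = "Get the current weather for a city";

// Arguments the LLM must supply. Field names here are hypothetical.
export type Argument = {
  city: string;
  unit?: "celsius" | "fahrenheit";
};

// Stand-in data so the sketch runs offline; a real handler would call a weather API.
const SAMPLE_TEMPS_C: Record<string, number> = {
  london: 15,
  tokyo: 20,
};

// Called with the arguments the LLM produced; returns text for the model to use.
export async function handler(args: Argument): Promise<string> {
  const celsius = SAMPLE_TEMPS_C[args.city.toLowerCase()];
  if (celsius === undefined) {
    return `No weather data for ${args.city}`;
  }
  if (args.unit === "fahrenheit") {
    const fahrenheit = Math.round((celsius * 9) / 5 + 32);
    return `${args.city}: ${fahrenheit}°F`;
  }
  return `${args.city}: ${celsius}°C`;
}
```

Keeping the description and argument type next to the handler means the tool's schema and its implementation live in one file, which is presumably what lets the framework register it without a separate manifest.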

The interesting bit

The pitch is geo-distribution: running inference and tools near users instead of centralizing them in one region. The README shows a diagram of edge nodes talking to each other, but it does not explain how multi-node deployment or synchronization actually works. The underlying transport is QUIC, which fits the low-latency ambition.

Key highlights

  • OpenAI-compatible chat completions endpoint (/v1/chat/completions) with streaming support
  • TypeScript-first tool definitions with exported description, Argument types, and handler
  • TLS 1.3 by default; Bearer token auth for HTTP APIs
  • Local model support via Ollama (example shows qwen3.5 and gemma-4-31B-it)
  • Single-binary Rust CLI: yomo serve, yomo run, yomo init
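
Because the endpoint is described as OpenAI-compatible and protected by Bearer tokens, a client call can be sketched with the standard chat completions request shape. The helper name buildChatRequest, the base URL, and the token below are hypothetical; only the /v1/chat/completions path, the Bearer scheme, streaming support, and the qwen3.5 model name come from this review.

```typescript
// Sketch: build a request for an OpenAI-compatible chat completions endpoint.
// buildChatRequest is a hypothetical helper, not part of YoMo.

type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

export function buildChatRequest(
  baseUrl: string, // assumed address of a self-hosted server
  token: string,   // Bearer token for the HTTP API
  model: string,
  prompt: string,
  stream = false,  // set to true to request a streamed response
): ChatRequest {
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return {
    // Trim trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}
```

The result can be passed straight to fetch, for example `const { url, init } = buildChatRequest("http://localhost:8000", "my-token", "qwen3.5", "Weather in Tokyo?"); await fetch(url, init);`, where the host, port, and token are placeholders for whatever your own deployment uses.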

Caveats

  • The “serverless” claim is aspirational: you run the binary yourself; there’s no hosted platform evident in the README
  • Geo-distributed architecture is described conceptually, but no docs explain how to actually deploy a mesh of nodes
  • The README's Development section contains typos (“Devleopment”), and the getting-started example references a model that does not appear to exist (gemma-4-31B-it)

Verdict

Worth a look if you’re self-hosting agent tools and want a lightweight Rust alternative to heavier frameworks. Skip it if you need managed multi-region deployment or mature documentation today.
