← all repositories
undertheseanlp/underthesea

A Vietnamese NLP library that learned to build AI agents

Underthesea started as a Vietnamese NLP toolkit and now ships a stdlib-only agent framework with multi-provider LLM support, tool calling, and A2A serving.

1.7k stars Python AgentsLanguage Models
underthesea
Velocity · 7d
+0.5
★ / day
Trend
steady
star history

What it does

Underthesea is a Python package that does two things: a full Vietnamese NLP pipeline (word segmentation, POS tagging, NER, sentiment, etc.) and, since v9.3.0, a lightweight agentic AI toolkit. The agent layer talks to OpenAI, Azure, Anthropic, or Gemini using only urllib and json — no vendor SDKs required. It handles streaming, tool calling, multi-session orchestration, local JSON tracing, and A2A protocol serving via a raw ASGI app.

The interesting bit

The zero-dependency stance is the real flex. The authors skipped every LLM SDK and built their own provider classes, then added 12 built-in tools, session harnesses, and an A2A server with optional bundled UI — all without forcing you to install openai, anthropic, or even a web framework. The Vietnamese NLP side is mature; the agent side is a bet that glue code should stay thin.

Key highlights

  • Stdlib-only LLM clientsurllib + json for all four providers; auto-detect via LLM() and environment variables
  • 12 built-in tools — calculator, datetime, web search, wikipedia, file I/O, shell, python exec, plus custom @Tool wrapper
  • Tracing out of the box — every call writes to ~/.underthesea/traces/; optional Langfuse integration
  • A2A serving — raw ASGI app with JSON-RPC + SSE, discoverable AgentCard, and optional chat UI; no web framework in base install
  • Multi-session orchestration — long-running tasks with progress files and context reset, following Anthropic harness patterns
  • Vietnamese NLP — 12 pipelines including word segmentation, NER, dependency parsing, and TTS

Caveats

  • The A2A server convenience extra requires pip install 'underthesea[agent-server]' for uvicorn + starlette + httpx
  • The README notes the NLP docs live in a separate NLP.md; agent features are front-and-center now

Verdict

Worth a look if you want a minimal, vendor-agnostic agent toolkit or work with Vietnamese text. Skip it if you need a heavy orchestration layer like LangChain or LlamaIndex — this is intentionally thin glue, not a platform.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.