chonkie-inc/chonkie

A pygmy hippo that chunks text at 100 GB/s

Chonkie wraps every text-splitting strategy you keep rewriting into one install-what-you-need Python library.

4.1k stars · Python · RAG · Search · Data Tooling
Velocity (7d): +9.5 ★/day · Trend: steady

What it does

Chonkie is a Python chunking toolkit for RAG pipelines. It bundles nine chunkers, from naive token splitting to LLM-based “Slumber” chunking, plus refineries, vector-DB handshakes, and a self-hosted REST API. The default install is 505 KB; extras are opt-in so you don’t drag in half of PyTorch just to split a README.

The interesting bit

The Pipeline class lets you chain chunkers and refineries declaratively through a fluent API: recursive chunk at 2K tokens, semantic chunk at 512, add overlap, embed, and ship to Qdrant. Pipelines can also be stored and reused via the REST API’s SQLite-backed registry, so running chonkie serve turns a Python library into a chunking microservice.
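To make the chaining idea concrete, here is a minimal pure-Python sketch of a fluent pipeline: a coarse split, a finer split, then an overlap pass. It is not Chonkie's API. `SketchPipeline`, `then`, `split_by_words`, and `add_overlap` are hypothetical names invented for illustration, and whitespace-separated words stand in for tokens.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A step takes the current list of chunks and returns a new list of chunks.
Step = Callable[[List[str]], List[str]]


def split_by_words(max_words: int) -> Step:
    """Hypothetical chunker: re-split every chunk into pieces of at most max_words words."""
    def step(chunks: List[str]) -> List[str]:
        out: List[str] = []
        for chunk in chunks:
            words = chunk.split()
            for i in range(0, len(words), max_words):
                out.append(" ".join(words[i:i + max_words]))
        return out
    return step


def add_overlap(n_words: int) -> Step:
    """Hypothetical refinery: prefix each chunk with the last n_words of the previous chunk."""
    def step(chunks: List[str]) -> List[str]:
        if n_words <= 0:
            return list(chunks)
        out: List[str] = []
        for i, chunk in enumerate(chunks):
            if i == 0:
                out.append(chunk)
            else:
                tail = chunks[i - 1].split()[-n_words:]
                out.append(" ".join(tail + [chunk]))
        return out
    return step


@dataclass
class SketchPipeline:
    """Illustrative stand-in for a fluent pipeline; each call to then() returns self."""
    steps: List[Step] = field(default_factory=list)

    def then(self, step: Step) -> "SketchPipeline":
        self.steps.append(step)
        return self

    def run(self, text: str) -> List[str]:
        chunks = [text]
        for step in self.steps:
            chunks = step(chunks)
        return chunks


pipe = (
    SketchPipeline()
    .then(split_by_words(8))  # coarse pass
    .then(split_by_words(4))  # finer pass over the coarse chunks
    .then(add_overlap(1))     # carry one word of context forward
)
chunks = pipe.run("one two three four five six seven eight nine ten")
print(chunks)
```

Because every step has the same shape (chunks in, chunks out), steps can be reordered or swapped without touching the rest of the chain, which is the property that makes a declarative pipeline easy to store and replay.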

Key highlights

  • FastChunker claims SIMD-accelerated, byte-based chunking at “100+ GB/s” on CPU
  • 32+ integrations including 8 vector DB handshakes (Chroma, Pinecone, pgvector, etc.) and multiple tokenizer backends
  • Optional installs per component—chonkie[semantic] for embeddings, chonkie[tiktoken] for OpenAI token counting, etc.
  • Self-hosted API with Docker Compose support and interactive /docs
  • 56-language support out of the box

Caveats

  • The “100+ GB/s” claim for FastChunker lacks reproducible benchmark details in the README; treat as a marketing figure until verified
  • The README is truncated mid-sentence in the transformers tokenizer section, so full tokenizer coverage is unclear
  • chonkie[all] is explicitly “not recommended for production environments”
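One way to check a throughput claim yourself is a small timing harness like the sketch below. It measures a stand-in fixed-size byte chunker, not FastChunker, whose call signature this write-up does not confirm. `fixed_size_byte_chunks` and `throughput_gb_per_s` are hypothetical helpers; swap in the real chunker as the `chunker` argument once you have it installed.

```python
import time
from typing import Callable, List


def fixed_size_byte_chunks(data: bytes, size: int = 4096) -> List[bytes]:
    """Stand-in chunker (an assumption, not FastChunker): fixed-size byte windows."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def throughput_gb_per_s(
    chunker: Callable[[bytes], List[bytes]],
    data: bytes,
    repeats: int = 5,
) -> float:
    """Best-of-N throughput in GB/s, with 1 GB taken as 1e9 bytes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        chunker(data)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very coarse timers.
    return len(data) / max(best, 1e-12) / 1e9


payload = b"lorem ipsum dolor sit amet. " * 200_000  # 28 bytes x 200,000 = 5.6 MB
rate = throughput_gb_per_s(fixed_size_byte_chunks, payload)
print(f"{rate:.2f} GB/s on {len(payload) / 1e6:.1f} MB")
```

Numbers from a harness like this depend heavily on input size, cache effects, and whether the chunker copies data or returns offsets, so a fair comparison should use inputs much larger than the CPU cache and report the hardware used.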

Verdict

Worth a look if you’re maintaining yet another bespoke chunking script and want one library with swap-in strategies. Skip it if you already have a deeply customized NLP pipeline that you trust. Chonkie is glue, not magic, and the mascot knows it.
