isaacus-dev/semchunk
A Python library that splits text into semantically coherent chunks optimized for retrieval-augmented generation pipelines.

The library provides hierarchical text chunking that preserves semantic context within chunks for better downstream LLM retrieval. It supports AI-powered semantic analysis, configurable chunk overlap, and works with any tokenizer including Tiktoken and Hugging Face Transformers. Benchmarks claim 15% better RAG performance than competing chunking strategies, and it is integrated into production tooling including Microsoft Intelligence Toolkit.