r2d4/rellm
A Python library that constrains LLM token generation by filtering non-matching completions against a regex pattern.

ReLLM uses regular expressions as constraints to extract exactly structured output from language model completions. Before each token is generated, it masks the logits of candidate tokens that would stop the output from partially matching the pattern, so non-matching completions are never sampled. The library integrates with Hugging Face Transformers and can enforce syntactic structure such as JSON or XML, semantic patterns such as dates or numbers, or templated content. It aims to improve completion quality by shrinking the space of allowed tokens and to make output easier to parse programmatically.
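
The masking step can be illustrated with a minimal, standard-library-only sketch. This is not ReLLM's actual code or API: the names `POSITION_CLASSES`, `is_viable_prefix`, `mask_logits`, `constrained_greedy_decode`, `VOCAB`, and `toy_scores` are all hypothetical, and the toy scoring function stands in for a real model. Because Python's built-in `re` module has no partial-match mode, the sketch also assumes a fixed-length pattern written as one character class per position, which is a simplification of true partial regex matching.

```python
import math
import re

# Assumption: a fixed-length pattern equivalent to \d{4}-\d{2},
# spelled out as one character class per output position.
POSITION_CLASSES = [r"\d", r"\d", r"\d", r"\d", r"-", r"\d", r"\d"]


def is_viable_prefix(text):
    """Return True if `text` could still be extended into a full match."""
    if len(text) > len(POSITION_CLASSES):
        return False
    return all(re.fullmatch(cls, ch) for cls, ch in zip(POSITION_CLASSES, text))


def mask_logits(logits, vocab, generated):
    """Set the logit of every token that would break the pattern to -inf."""
    return [
        score if is_viable_prefix(generated + token) else -math.inf
        for score, token in zip(logits, vocab)
    ]


def constrained_greedy_decode(score_fn, vocab):
    """Greedy decoding that only ever picks tokens surviving the mask."""
    generated = ""
    while len(generated) < len(POSITION_CLASSES):
        masked = mask_logits(score_fn(generated), vocab, generated)
        best = max(range(len(vocab)), key=masked.__getitem__)
        if masked[best] == -math.inf:
            break  # no token keeps the pattern viable
        generated += vocab[best]
    return generated


# Hypothetical multi-character vocabulary, as a subword tokenizer might produce.
VOCAB = ["20", "2", "-", "ab", "23", "1", " the", "0"]


def toy_scores(generated):
    """Stand-in for a model: it prefers free text (" the", "ab") over digits."""
    return [1.0, 0.5, 0.2, 3.0, 0.9, 0.4, 5.0, 0.1]


result = constrained_greedy_decode(toy_scores, VOCAB)
print(result)
```

Although the toy scorer ranks `" the"` and `"ab"` highest at every step, those tokens are masked out, so the decoded string always fits the pattern. With a real model, the same idea would apply to the logits tensor at each generation step, and a regex engine with partial-match support would replace the per-position check.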