Is openevals open source?

Yes — langchain-ai/openevals is open source, released under the MIT license.

What language is openevals written in?

langchain-ai/openevals is primarily written in Python.

How popular is openevals?

langchain-ai/openevals has 1.1k stars on GitHub.

Where can I find openevals?

langchain-ai/openevals is on GitHub at https://github.com/langchain-ai/openevals.

langchain-ai/openevals

LLM-as-judge, but with the training wheels still on

OpenEvals ships prebuilt prompts and scorers so you can stop hand-rolling LLM-as-judge evaluators for every new app.

★ 1.1k stars · Python · LLMOps · Eval

What it does

OpenEvals is a library of ready-made evaluators for LLM applications. It wraps the “LLM-as-judge” pattern into reusable functions such as create_llm_as_judge, and bundles prebuilt prompts for checking correctness, safety, RAG groundedness, code quality, and agent trajectories. It also includes deterministic checks (exact match, Levenshtein distance, embedding similarity) and can sandbox and type-check generated Python or TypeScript code.
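
To make the deterministic checks concrete, here is a minimal standalone sketch of exact match and a Levenshtein-based score in plain Python. It is not the library's code: the function names and the normalization (1 minus distance over the longer length) are assumptions chosen for illustration.

```python
def exact_match(output: str, reference: str) -> bool:
    """True only when the two strings are identical."""
    return output == reference


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_score(output: str, reference: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(output), len(reference))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(output, reference) / longest


print(exact_match("Paris", "Paris"))
print(levenshtein("kitten", "sitting"))
```

Checks like these need no model call, so they are cheap to run on every output and useful as a baseline next to an LLM judge.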

The interesting bit

The prebuilt prompts are exposed as plain f-strings, so you can inspect and mutate them instead of treating them like black-box magic. It also goes beyond text: it can evaluate multimodal inputs, simulate multi-turn user conversations, and run Pyright or mypy against generated code.
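
As a rough illustration of why plain-string prompts are easy to customize, the sketch below defines a hypothetical judge prompt with named placeholders, appends an extra rule, and renders it. The prompt text, the placeholder names, and the `render` helper are all assumptions made for this example and are not copied from the library. Passing the result to create_llm_as_judge is left out because that needs a model and API key.

```python
# Hypothetical prompt in the style described above: an ordinary string with
# named placeholders. This is NOT one of the library's prebuilt prompts.
EXAMPLE_CONCISENESS_PROMPT = """You are grading an answer for conciseness.

<input>
{inputs}
</input>

<output>
{outputs}
</output>

Reply with true if the answer is concise, otherwise false."""

# Because the prompt is just a string, customizing it is ordinary string work.
CUSTOM_PROMPT = (
    EXAMPLE_CONCISENESS_PROMPT
    + "\nTreat any greeting or sign-off as padding and reply false."
)


def render(prompt: str, *, inputs: str, outputs: str) -> str:
    """Fill the placeholders (hypothetical helper, for inspection only)."""
    return prompt.format(inputs=inputs, outputs=outputs)


rendered = render(CUSTOM_PROMPT, inputs="What is 2 + 2?", outputs="4")
print(rendered)
```

Printing the rendered prompt before wiring it into a judge is a cheap way to confirm that the rubric says what you intend.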

Key highlights

Prebuilt prompts for quality, safety, RAG, code, voice, and image evals.
Deterministic evaluators (exact match, Levenshtein, embedding similarity) alongside LLM judges.
Sandboxed code execution and static type checking for Python and TypeScript outputs.
Agent trajectory matching with strict, unordered, subset, and tool-argument modes.
Multi-turn simulation with LangGraph and built-in LangSmith logging.
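
The trajectory matching modes listed above can be sketched in plain Python. This is an illustration under assumed semantics, not the library's implementation: here "strict" means the same tool calls in the same order, "unordered" means the same calls in any order, and "subset" means every call the agent made also appears in the reference. The step format, `match_trajectory`, and the `match_args` flag are hypothetical.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def step_key(step: Dict, match_args: bool) -> Tuple[str, str]:
    """Reduce a step to a hashable key, optionally including its arguments."""
    args = json.dumps(step.get("args", {}), sort_keys=True) if match_args else ""
    return (step["tool"], args)


def match_trajectory(
    outputs: List[Dict],
    reference: List[Dict],
    mode: str = "strict",
    match_args: bool = False,
) -> bool:
    out = [step_key(s, match_args) for s in outputs]
    ref = [step_key(s, match_args) for s in reference]
    if mode == "strict":
        return out == ref                      # same calls, same order
    if mode == "unordered":
        return Counter(out) == Counter(ref)    # same calls, any order
    if mode == "subset":
        # Counter subtraction keeps only positive counts, so an empty
        # result means nothing in `out` is missing from `ref`.
        return not (Counter(out) - Counter(ref))
    raise ValueError(f"unknown mode: {mode}")


reference = [
    {"tool": "search", "args": {"q": "weather"}},
    {"tool": "summarize", "args": {}},
]
swapped = list(reversed(reference))
print(match_trajectory(swapped, reference, "strict"))
print(match_trajectory(swapped, reference, "unordered"))
```

Comparing counts rather than sets means a repeated tool call is not silently treated as a single call.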

Caveats

Examples reference a nonexistent gpt-5.4 model, so you’ll need to substitute a real model name before anything runs.
Several features (Pyright, mypy, sandbox execution) are split by language, so not every evaluator works in both Python and TypeScript.
The library is tightly coupled to the LangChain ecosystem; LangSmith logging is treated as a first-class citizen.

Verdict

Best for teams already using LangChain who need to bootstrap evals quickly without designing rubrics from scratch. Skip it if you want a framework-agnostic evaluation pipeline or already have mature, custom scorers.
