marcotcr/checklist
A behavioral testing framework for evaluating NLP models with templates, perturbation functions, and test suites.

Velocity (7d): +0.9 ★/day · Trend: steady
CheckList provides a methodology and tooling for behavioral testing of NLP models, going beyond standard accuracy metrics. It includes templates for generating test cases, perturbation functions for creating modified variants of existing inputs, and integration with Hugging Face Transformers pipelines. The framework helps practitioners systematically test model capabilities and weaknesses across different linguistic phenomena.
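To make the template and perturbation ideas concrete, here is a minimal, standard-library-only sketch. It does not use CheckList's own API: `expand_template`, `swap_adjacent_typo`, `toy_sentiment_model`, and `invariance_failures` are hypothetical names invented for illustration, and the "model" is a keyword lookup standing in for a real classifier.

```python
import itertools
import string


def expand_template(template, **slots):
    """Fill every combination of slot values into the template."""
    # Collect slot names in order of appearance, without duplicates.
    names = list(dict.fromkeys(
        field for _, field, _, _ in string.Formatter().parse(template) if field
    ))
    combos = itertools.product(*(slots[name] for name in names))
    return [template.format(**dict(zip(names, combo))) for combo in combos]


def swap_adjacent_typo(text, position):
    """Perturbation: swap the characters at `position` and `position + 1`."""
    if position < 0 or position + 1 >= len(text):
        return text  # Nothing to swap; return the input unchanged.
    chars = list(text)
    chars[position], chars[position + 1] = chars[position + 1], chars[position]
    return "".join(chars)


def toy_sentiment_model(text):
    """Hypothetical stand-in classifier (keyword lookup, not a real model)."""
    lowered = text.lower()
    if "great" in lowered or "good" in lowered:
        return "positive"
    if "terrible" in lowered or "bad" in lowered:
        return "negative"
    return "neutral"


def invariance_failures(model, pairs):
    """Return the (original, perturbed) pairs whose predictions differ."""
    return [(orig, pert) for orig, pert in pairs if model(orig) != model(pert)]


# Generate test cases from one template with two slots.
cases = expand_template(
    "The {noun} was {adj}.",
    noun=["food", "service"],
    adj=["great", "terrible"],
)

# Perturb each case with a typo in the first word ("The" becomes "Teh").
pairs = [(case, swap_adjacent_typo(case, 1)) for case in cases]

# Invariance check: the prediction should not change under the typo.
failures = invariance_failures(toy_sentiment_model, pairs)
```

In this sketch the typo lands outside the sentiment word, so the toy model's predictions are unchanged and `failures` is empty. A typo inside the keyword itself (for instance `swap_adjacent_typo("great", 2)`, which yields `"graet"`) would flip the toy model to `"neutral"`, which is the kind of brittleness an invariance test is meant to surface.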