
open-compass/LawBench

A benchmark suite for evaluating the legal knowledge and reasoning capabilities of large language models.

432 stars · Python · LLMOps · Eval · Domain Apps
Velocity (7d): +0.4 ★/day · Trend: steady

LawBench is an evaluation framework that benchmarks LLMs on 20 legal tasks grounded in China’s judicial system. It tests multiple dimensions of legal cognition, including entity recognition, reading comprehension, crime calculation, and legal consultation. The benchmark addresses a gap in understanding how well LLMs perform in highly specialized, safety-critical legal applications.
