camel-ai/crab
A Python framework for building and running benchmark environments to evaluate multimodal LLM agents.

Velocity (7d): +0.6 ★/day
Trend: steady
CRAB provides a framework for creating standardized benchmarks to assess language model agents in cross-platform environments. It allows defining agent tasks through Python decorators and includes a novel graph-based evaluation methodology. The framework supports deploying agent environments via Docker, virtual machines, or in-memory processes while maintaining a unified interface for evaluation.
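
To make the decorator-based task definition and graph-based evaluation more concrete, here is a minimal, standard-library-only sketch. It does not use CRAB's actual API: the `evaluator` decorator, the `EVALUATORS` registry, the `GRAPH` layout, the `completion_ratio` function, and the shape of the `state` dictionary are all hypothetical names chosen for illustration. The assumption is that a task can be broken into checkpoints arranged as a directed acyclic graph, and that a checkpoint counts only once all of its prerequisites have been met.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set

# Hypothetical registry of checkpoint evaluators (not CRAB's real API).
EVALUATORS: Dict[str, Callable[[dict], bool]] = {}


def evaluator(func: Callable[[dict], bool]) -> Callable[[dict], bool]:
    """Register a function as a checkpoint evaluator under its own name."""
    EVALUATORS[func.__name__] = func
    return func


@evaluator
def file_created(state: dict) -> bool:
    # Checkpoint 1: the agent created the expected file.
    return "report.txt" in state.get("files", [])


@evaluator
def file_has_content(state: dict) -> bool:
    # Checkpoint 2: the file is not empty.
    return bool(state.get("contents", {}).get("report.txt"))


@evaluator
def message_sent(state: dict) -> bool:
    # Checkpoint 3: the file was attached to an outgoing message.
    return "report.txt" in state.get("sent_attachments", [])


# Each checkpoint maps to the set of checkpoints that must be met first.
GRAPH: Dict[str, Set[str]] = {
    "file_created": set(),
    "file_has_content": {"file_created"},
    "message_sent": {"file_has_content"},
}


def completion_ratio(graph: Dict[str, Set[str]], state: dict) -> float:
    """Return the fraction of checkpoints met, honoring prerequisites."""
    completed: Set[str] = set()
    # static_order() yields every prerequisite before its dependents.
    for node in TopologicalSorter(graph).static_order():
        if graph[node] <= completed and EVALUATORS[node](state):
            completed.add(node)
    return len(completed) / len(graph)


if __name__ == "__main__":
    # An agent that wrote the file but never sent it gets partial credit.
    partial_state = {
        "files": ["report.txt"],
        "contents": {"report.txt": "quarterly numbers"},
    }
    print(completion_ratio(GRAPH, partial_state))
```

Scoring against a graph rather than a single pass/fail check gives partial credit for intermediate progress, which is one plausible motivation for a graph-based methodology. The same `state` dictionary could in principle be produced by an environment running in Docker, a virtual machine, or in-process, so the evaluation code stays the same across deployment modes.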