microsoft/waza
A Microsoft Go CLI for benchmarking and evaluating AI agent skills with support for multiple models and executors.
Waza is a command-line framework for creating, running, and measuring AI agent skill evaluations. It provides tooling to scaffold evaluation suites, run benchmarks across different AI models, and compare the results to assess skill quality and effectiveness. The framework supports multiple executors, including GitHub Copilot, and offers skill templates for rapid evaluation setup.