microsoft/waza

A Microsoft Go CLI for benchmarking and evaluating AI agent skills with support for multiple models and executors.

1k stars · Go · Agents · LLMOps · Eval

Waza is a command-line framework for creating, running, and measuring evaluations of AI agent skills. It provides tooling to scaffold evaluation suites, run benchmarks across different AI models, and compare the results to assess skill quality and effectiveness. Supported executors include GitHub Copilot, and the framework ships skill templates for rapid evaluation setup.
