Skip to content

Releases

Released: 2026-02-23

  • A/B baseline testing--baseline flag runs tasks with and without skill, computes improvement scores
  • Pairwise LLM judgingpairwise mode on prompt grader with position-swap bias mitigation
  • Tool constraint gradertool_constraint type with expect_tools, reject_tools, max_turns, max_tokens
  • Auto skill discovery--discover flag walks directories for SKILL.md + eval.yaml
  • Resolved errcheck and ineffassign lint warnings
  • Added competitive analyses (OpenAI Evals, skill-validator) and eval registry design doc
  • Converted remaining ASCII diagrams to Mermaid

Pre-built binaries for all major platforms:

PlatformArchitectureDownload
macOSApple Silicon (ARM64)waza-darwin-arm64
macOSIntel (AMD64)waza-darwin-amd64
Linuxx64 (AMD64)waza-linux-amd64
LinuxARM64waza-linux-arm64
Windowsx64 (AMD64)waza-windows-amd64.exe
WindowsARM64waza-windows-arm64.exe

Terminal window
curl -fsSL https://raw.githubusercontent.com/microsoft/waza/main/install.sh | bash

Verify the installation:

Terminal window
waza --version

Install waza as an Azure Developer CLI extension.

Current version: v0.9.0

Terminal window
azd extension install azd-waza \
--source https://github.com/microsoft/waza/releases/download/azd-ext-microsoft-azd-waza_0.9.0/azd-ext-microsoft-azd-waza_0.9.0.tar.gz

Once installed, run evaluations through azd:

Terminal window
azd waza run eval.yaml

For older versions, changelogs, and pre-release builds, see the full GitHub Releases page.