Spec Verification
waza spec verify checks whether your eval suite exercises the promises made in SKILL.md. It is designed for agentic skills where routing quality depends on clear USE FOR and DO NOT USE FOR boundaries.
What gets verified
The command parses SKILL.md deterministically and emits requirement IDs with source spans:
| Requirement kind | Example ID | Source |
|---|---|---|
| Description | req-description-001 | description: frontmatter |
| Positive trigger | req-use-001 | USE FOR: phrase |
| Negative trigger | req-dont-001 | DO NOT USE FOR: phrase |
| Parameter | req-param-001 | parameters, inputs, or arguments block |
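
As a rough illustration of how trigger phrases could map to the IDs in the table, here is a minimal Python sketch. It is not waza's actual parser: the function name `extract_trigger_requirements`, the comma-splitting rule, and the counter-based numbering are assumptions inferred from the example IDs, and source spans are left out for brevity.

```python
import re
from collections import Counter

# Assumed ID prefixes, taken from the example IDs in the table above.
TRIGGER_LINE = re.compile(r"^(DO NOT USE FOR|USE FOR):\s*(.+)$")
PREFIX = {"USE FOR": "req-use", "DO NOT USE FOR": "req-dont"}


def extract_trigger_requirements(description):
    """Return (requirement_id, phrase) pairs in source order (hypothetical helper)."""
    counters = Counter()
    found = []
    for line in description.splitlines():
        match = TRIGGER_LINE.match(line.strip())
        if not match:
            continue
        kind, phrases = match.groups()
        # Assumption: one requirement per comma-separated phrase.
        for phrase in phrases.rstrip(".").split(","):
            phrase = phrase.strip()
            if not phrase:
                continue
            counters[kind] += 1
            found.append((f"{PREFIX[kind]}-{counters[kind]:03d}", phrase))
    return found


description = """Summarize PR diffs.
USE FOR: summarize a PR diff, summarize PR discussion.
DO NOT USE FOR: code review security PRs.
"""

for req_id, phrase in extract_trigger_requirements(description):
    print(req_id, repr(phrase))
```

Because the parse uses no model calls, the same SKILL.md always yields the same IDs, which is what makes them stable enough to reference in reports.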
Deterministic matching runs first. Semantic matching is opt-in with --semantic and uses the configured judge model, resolved in this order: --judge-model, config.judge_model, then config.model.
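
The judge-model fallback order can be pictured as a first-non-empty lookup. The sketch below is illustrative only: `resolve_judge_model` is a hypothetical helper, the config is modeled as a plain dict, and the model names are placeholders.

```python
def resolve_judge_model(cli_judge_model, config):
    """Pick the judge model by precedence (hypothetical helper).

    Order: --judge-model flag, then config.judge_model, then config.model.
    Returns None when nothing is configured.
    """
    candidates = (cli_judge_model, config.get("judge_model"), config.get("model"))
    for candidate in candidates:
        if candidate:
            return candidate
    return None


# Placeholder model names, for illustration only.
print(resolve_judge_model(None, {"judge_model": "judge-x", "model": "base-y"}))
print(resolve_judge_model("cli-z", {"judge_model": "judge-x", "model": "base-y"}))
print(resolve_judge_model(None, {"model": "base-y"}))
```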
Worked example
Given this skill description:

    ---
    name: pr-summarizer
    description: |
      Summarize PR diffs.
      USE FOR: summarize a PR diff, summarize PR discussion.
      DO NOT USE FOR: code review security PRs.
    ---

    ## Parameters

    - repository: GitHub repository URL
    - pr_number: Pull request number

And an eval with one positive and one negative task:

    tasks:
      - tasks/*.yaml

The positive task:

    id: pr-summary-basic
    name: PR summary basic
    inputs:
      prompt: Please summarize this PR diff for repository microsoft/waza.
    expected:
      should_trigger: true

The negative task:

    id: security-review-negative
    name: Security review negative trigger
    inputs:
      prompt: Please do code review security PRs.
    expected:
      should_trigger: false

Run:

    waza spec verify skills/pr-summarizer evals/pr-summarizer/eval.yaml

Example output:

    Spec Verification
    Coverage: 4/5 requirements covered (1 uncovered)

    OK req-use-001 "summarize a PR diff" -> covered by tasks: [pr-summary-basic]
    OK req-dont-001 "code review security PRs" -> covered by tasks: [security-review-negative]
    MISS req-use-002 "summarize PR discussion" -> no task exercises this

Add a task for req-use-002, or use --semantic if the task covers the requirement without sharing obvious keywords.
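
To see why req-use-002 is reported as uncovered, it can help to picture deterministic matching as keyword overlap. The sketch below is one plausible reading, not waza's documented algorithm: the stopword list, the "every keyword must appear in the prompt" rule, and the pairing of positive triggers with should_trigger: true tasks are all assumptions made for illustration.

```python
import re

# Assumed stopword list; waza's real matching rules may differ.
STOPWORDS = {"a", "an", "the"}


def tokens(text):
    """Lowercase alphanumeric keywords, minus stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def covering_tasks(phrase, expect_trigger, tasks):
    """IDs of tasks whose polarity matches and whose prompt contains every keyword."""
    needed = tokens(phrase)
    return [
        task["id"]
        for task in tasks
        if task["should_trigger"] == expect_trigger and needed <= tokens(task["prompt"])
    ]


requirements = [
    ("req-use-001", "summarize a PR diff", True),
    ("req-use-002", "summarize PR discussion", True),
    ("req-dont-001", "code review security PRs", False),
]
tasks = [
    {
        "id": "pr-summary-basic",
        "prompt": "Please summarize this PR diff for repository microsoft/waza.",
        "should_trigger": True,
    },
    {
        "id": "security-review-negative",
        "prompt": "Please do code review security PRs.",
        "should_trigger": False,
    },
]

report = {rid: covering_tasks(phrase, expect, tasks) for rid, phrase, expect in requirements}
for rid, task_ids in report.items():
    print("OK" if task_ids else "MISS", rid, task_ids)
```

Under these assumptions, no prompt contains the word "discussion", so req-use-002 has no covering task. A task phrased as "recap the conversation on this PR" would also miss under keyword overlap, which is the case --semantic is meant for.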
CI snippet
Use --format github-actions to emit annotations on uncovered requirements. Add --fail to make uncovered requirements gate the workflow.

    name: Verify Skill Spec Coverage

    on:
      pull_request:
        paths:
          - 'skills/**'
          - 'evals/**'

    jobs:
      spec-verify:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Install waza
            run: |
              curl -fsSL https://raw.githubusercontent.com/microsoft/waza/main/install.sh | bash
              echo "$HOME/bin" >> "$GITHUB_PATH"
          - name: Verify SKILL.md coverage
            run: |
              waza spec verify skills/pr-summarizer evals/pr-summarizer/eval.yaml \
                --fail \
                --threshold 1 \
                --format github-actions
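
The gating behavior in the workflow above can be approximated as follows. This is a hedged sketch: it assumes --threshold is a coverage fraction between 0 and 1 and that --fail turns a shortfall into a non-zero exit code, neither of which is spelled out here. The `::error` line uses GitHub Actions' workflow-command syntax, but the exact annotations waza emits may differ, and both function names are hypothetical.

```python
def gate(covered, total, threshold=1.0):
    """Return an exit code: 0 if coverage meets the threshold, else 1 (assumed semantics)."""
    ratio = covered / total if total else 1.0
    return 0 if ratio >= threshold else 1


def annotation(req_id, phrase):
    """Format an uncovered requirement as a GitHub Actions error annotation."""
    return f'::error title=Uncovered requirement::{req_id} "{phrase}" has no task exercising it'


# Using the counts from the worked example: 4 of 5 requirements covered.
print(annotation("req-use-002", "summarize PR discussion"))
print("exit code:", gate(covered=4, total=5, threshold=1.0))
```

With a threshold of 1, any uncovered requirement fails the job; a lower threshold would let a partially covered suite pass while still surfacing the annotations.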