Reference
CLI Commands lint, eval, grade, compare, experiment, export, serve, ingest — flags, arguments, exit codes.
eval.yaml Schema Complete specification for eval spec files.
Experiment File Schema YAML format for `vally experiment run` (variants, matrix, vary, baseline).
SKILL.md Format How to write a valid skill definition file.
Grader Catalog All built-in graders with config options and taxonomy.
Trajectory Format The JSON schema for captured agent behavior.
Scoring Functions pass@k, pass^k, and multi-trial metrics.