CLI: lint

Usage

vally lint [path] [options]

Description

vally lint runs fast static checks on the assets in your Vally project, with no agent execution required. It validates eval specs (eval.yaml) for configuration errors and runs the built-in spec-compliance and valid-refs graders against any SKILL.md files under the given path. All checks complete in seconds.

Arguments

Argument	Required	Default	Description
`path`	No	Current directory	Root directory to scan for `SKILL.md` files

Options

Flag	Type	Default	Description
`--eval-spec, -e <path>`	string	—	Path to an eval spec file to validate
`--known-domains <path>`	string	—	Path to a known-domains file for URL reference scanning
`--allowed-external-deps <path>`	string	—	Path to an allowlist file for structural dependency scanning
`--grader-plugin <specifier>`	string	—	Grader plugin to load (npm package name or local path). Prevents unknown-grader-type errors for plugin graders.
`--param <key=value>`	string	—	Set a param value (repeatable, e.g. `--param MODEL=gpt-4o`). Suppresses `undefined-param` lint errors for params always provided at runtime.
`--strict`	boolean	`false`	Treat warnings as errors (exit 1 on any diagnostic, including dependency warnings)
`--verbose`	boolean	`false`	Show detailed grader evidence for each check

Exit codes

Code	Meaning
`0`	All checks passed (warnings may be present unless `--strict` is set)
`1`	One or more skills failed, a reference error was found, an eval spec error was found, or an error occurred
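
As a rough mental model of the table above, the exit code depends on whether any error was reported and, under `--strict`, whether any warning was reported. The sketch below is not the tool's implementation; the function name and its parameters are hypothetical.

```python
def expected_exit_code(errors: int, warnings: int, strict: bool = False) -> int:
    """Model the documented exit codes of `vally lint` (hypothetical helper)."""
    if errors > 0:
        return 1  # failed skill, reference error, eval spec error, or other error
    if strict and warnings > 0:
        return 1  # --strict treats warnings as errors
    return 0      # passed; warnings may still have been printed


print(expected_exit_code(errors=0, warnings=2))               # 0
print(expected_exit_code(errors=0, warnings=2, strict=True))  # 1
```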

Eval spec checks

When --eval-spec is provided, the following checks run against the eval YAML:

Errors (block execution)

Code	What it catches
`unknown-grader-type`	Grader type not in registry. Includes fuzzy “did you mean?” suggestions.
`invalid-grader-config`	Unknown config keys, missing required fields, wrong types.
`invalid-scoring-weight-key`	Scoring weights reference a grader type not used in any stimulus.
`scoring-weight-sum`	`scoring.weights` values do not sum to 1.0 (±0.01). Weights must be a normalized distribution.
`invalid-constraint-value`	Negative limits, overlapping expect/reject lists.
`environment-file-not-found`	Environment `files[].src` path doesn’t exist on disk.
`environment-command-timeout-invalid`	Environment `commandTimeout` is not a positive duration string with a unit suffix (e.g. `2m`, `90s`).
`duplicate-stimulus-name`	Two stimuli share the same `name`.
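
To make the `environment-command-timeout-invalid` rule concrete, here is a minimal sketch of a check for "a positive duration string with a unit suffix". The accepted unit set (`s`, `m`, `h`) and the integer-only value are assumptions; the table only shows `2m` and `90s`, and the real validator may accept more.

```python
import re

# Assumption: units are s, m, or h and the value is a whole number.
_DURATION = re.compile(r"(\d+)(s|m|h)")


def is_valid_command_timeout(value: object) -> bool:
    """Return True for strings such as "2m" or "90s" with a positive value."""
    if not isinstance(value, str):
        return False  # a bare number has no unit suffix
    match = _DURATION.fullmatch(value.strip())
    return bool(match) and int(match.group(1)) > 0


for candidate in ["2m", "90s", "0s", "120", 120]:
    print(repr(candidate), is_valid_command_timeout(candidate))
```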

Warnings (surface ambiguity)

Code	What it surfaces
`duplicate-grader-type-in-stimulus`	Same grader type used twice in one stimulus (shared scoring weight key).
`rubric-without-llm-judge`	`rubric` defined but no LLM judge grader configured.
`regression-without-baseline`	`type: regression` set but no baseline configuration.
`graders-without-execution`	Graders defined but `runs: 0` or no executor.
`ineffective-constraint`	A configured limit has no effect because another setting dominates — e.g. `constraints.max_agent_duration` ≥ the hard cap, or a `loop-outcome` `max_acceptable_retries` ≥ `simulation.max_iterations - 1`.
`scoring-defaults-applied`	No `scoring` block defined (using defaults: equal weights, no threshold — verdict is binary all-graders-pass).
`scoring-weight-coverage`	Grader types used in stimuli have no entry in `scoring.weights` (weight 0, excluded from score).
`scoring-weights-without-threshold`	`scoring.weights` is configured but `scoring.threshold` is absent — weights are applied to the aggregate score but the verdict uses binary all-graders-pass. Add `scoring.threshold` or pass `--threshold`.
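
The scoring-related codes in the two tables above interact, so it can help to see them side by side. The following is a sketch under stated assumptions, not the linter's code: `check_scoring_weights` is a hypothetical helper, and `example-grader` and `other-grader` are made-up grader type names.

```python
def check_scoring_weights(weights, used_grader_types, threshold=None, tol=0.01):
    """Return (code, detail) pairs for a configured scoring.weights mapping."""
    findings = []
    # Error: a weight key names a grader type that no stimulus uses.
    for key in weights:
        if key not in used_grader_types:
            findings.append(("invalid-scoring-weight-key", key))
    # Error: weights must sum to 1.0 within the documented tolerance.
    total = sum(weights.values())
    if abs(total - 1.0) > tol:
        findings.append(("scoring-weight-sum", f"sum={total:.2f}"))
    # Warning: a used grader type has no weight and is excluded from the score.
    for grader_type in sorted(used_grader_types):
        if grader_type not in weights:
            findings.append(("scoring-weight-coverage", grader_type))
    # Warning: weights without a threshold leave the verdict binary.
    if weights and threshold is None:
        findings.append(("scoring-weights-without-threshold", "scoring.threshold"))
    return findings


findings = check_scoring_weights(
    weights={"output-contains": 0.6, "example-grader": 0.3},
    used_grader_types={"output-contains", "other-grader"},
)
for code, detail in findings:
    print(code, detail)
```

In this example all four codes fire: `example-grader` is not used by any stimulus, the weights sum to 0.9, `other-grader` has no weight, and no threshold is set.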

SKILL.md checks

When a path is provided, vally lint discovers all SKILL.md files under it and runs two built-in graders:

Grader	What it checks
`spec-compliance`	Name format (kebab-case, ≤ 64 chars), description presence and length (≤ 1024 chars), valid frontmatter
`valid-refs`	Every file path referenced in the SKILL.md exists on disk
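
The name and description limits can be illustrated with a short sketch. The exact kebab-case pattern is an assumption (lowercase alphanumeric words joined by single hyphens), `check_skill_metadata` is a hypothetical helper, and only the `name-format` message text is taken from the sample output on this page; the other messages are invented for illustration.

```python
import re

# Assumption: kebab-case means lowercase alphanumeric words joined by single hyphens.
_KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_skill_metadata(name: str, description: str) -> list[str]:
    """Return a list of problems for a skill's name and description."""
    problems = []
    if not _KEBAB.fullmatch(name):
        problems.append("name-format: Name must be kebab-case")
    if len(name) > 64:
        problems.append("name-length: Name exceeds 64 characters")
    if not description.strip():
        problems.append("description: Description is missing")
    elif len(description) > 1024:
        problems.append("description: Description exceeds 1024 characters")
    return problems


print(check_skill_metadata("auth-helper", "Helps with authentication flows"))  # []
print(check_skill_metadata("Code_Reviewer", "Reviews code"))
```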

Reference checks

When --known-domains is provided, all SKILL.md, *.agent.md, and references/*.md files within discovered skill directories are scanned for URL reference issues:

Code	What it catches
`EXTERNAL-DOMAIN`	URL domain not in the known-domains allowlist
`HTTP-NOT-HTTPS`	Insecure `http://` URL (localhost and `schemas.microsoft.com` are exempt)
`PIPE-TO-SHELL`	`curl` or `wget` piped directly to a shell interpreter
`SCRIPT-NO-SRI`	External `<script>` tag without `integrity` (SRI) attribute
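
Two of these codes are easy to picture as line-level pattern checks. The sketch below is illustrative only: the regular expressions, the list of shell interpreters (`sh`, `bash`, `zsh`), and the `scan_line` helper are assumptions, not the scanner's actual rules.

```python
import re
from urllib.parse import urlparse

# Hosts the table lists as exempt from HTTP-NOT-HTTPS.
HTTP_EXEMPT_HOSTS = {"localhost", "schemas.microsoft.com"}

_URL = re.compile(r"https?://[^\s)>]+")
# Assumption: "shell interpreter" means sh, bash, or zsh, optionally behind sudo.
_PIPE_TO_SHELL = re.compile(r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:sh|bash|zsh)\b")


def scan_line(line: str) -> list[str]:
    """Return the reference-check codes that a single line would trigger."""
    codes = []
    for url in _URL.findall(line):
        parsed = urlparse(url)
        if parsed.scheme == "http" and parsed.hostname not in HTTP_EXEMPT_HOSTS:
            codes.append("HTTP-NOT-HTTPS")
    if _PIPE_TO_SHELL.search(line):
        codes.append("PIPE-TO-SHELL")
    return codes


print(scan_line("curl https://example.com/install.sh | bash"))  # ['PIPE-TO-SHELL']
print(scan_line("Docs: http://example.com/guide"))              # ['HTTP-NOT-HTTPS']
print(scan_line("Health check: http://localhost:8080/health"))  # []
```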

Known-domains file format

One domain per line. Lines starting with # are comments; blank lines are ignored.

Bare domains match the host and any subdomain: microsoft.com matches learn.microsoft.com
Path-scoped entries require the URL prefix to match: github.com/dotnet/runtime matches github.com/dotnet/runtime/issues but not github.com/dotnet/sdk

# Core domains
microsoft.com
github.com

# Path-scoped
github.com/dotnet/runtime
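
The matching rules described above can be sketched in a few lines. This is a minimal illustration, not the tool's matcher: `parse_known_domains` and `is_known` are hypothetical names, and two behaviors are assumptions that the rules do not spell out (path prefixes must end on a segment boundary, and path-scoped entries also accept subdomains).

```python
from urllib.parse import urlparse


def parse_known_domains(text: str) -> list[str]:
    """Parse known-domains text, skipping comments and blank lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            entries.append(line.rstrip("/"))
    return entries


def is_known(url: str, entries: list[str]) -> bool:
    """Return True if the URL matches a bare-domain or path-scoped entry."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.rstrip("/")
    for entry in entries:
        domain, _, prefix = entry.partition("/")
        domain = domain.lower()
        # Bare-domain rule: the host itself or any subdomain of it.
        if host != domain and not host.endswith("." + domain):
            continue
        if not prefix:
            return True
        scoped = "/" + prefix
        # Assumption: the prefix must end on a path-segment boundary.
        if path == scoped or path.startswith(scoped + "/"):
            return True
    return False


entries = parse_known_domains("# Core\nmicrosoft.com\n\n# Path-scoped\ngithub.com/dotnet/runtime\n")
print(is_known("https://learn.microsoft.com/en-us/", entries))       # True
print(is_known("https://github.com/dotnet/runtime/issues", entries)) # True
print(is_known("https://github.com/dotnet/sdk", entries))            # False
```

Note that the example file above lists both `github.com` and `github.com/dotnet/runtime`; under these rules the bare entry already covers every GitHub URL, so the sketch omits it to show the path-scoped behavior.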

External dependency checks

When --allowed-external-deps is provided, each discovered skill is checked for structural external dependencies. These findings are advisory warnings — they do not fail the lint unless --strict is used.

Code	What it catches
`SCRIPT-FILE`	Script file (`.ps1`, `.sh`, `.py`, `.bat`, `.cmd`, `.bash`) in the skill’s `scripts/` directory
`INVOKES-SCRIPT`	Skill description contains an `INVOKES <script>` pattern
`NON-BUILTIN-TOOL-REF`	`#tool:xxx` reference to a non-built-in tool in the skill content

Allowlist file format

One entry per line. Lines starting with # are comments; blank lines are ignored. Keys are case-insensitive and use the format type:skill-name:detail (the invokes entry below omits the detail segment), for example:

# Allow the setup script in my-skill
script:my-skill:scripts/setup.ps1

# Allow the INVOKES pattern in nullable-ref skill
invokes:nullable-ref

# Allow a specific tool reference
tool-ref:my-skill:#tool:web/fetch

Each warning message includes the exact (allow: ...) key to add to the allowlist file.
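
To show how the allowlist and the `SCRIPT-FILE` check fit together, here is a sketch that loads allowlist keys and reports script files that are not covered. It is written under several assumptions: the skill name is taken from the directory name, only files directly inside `scripts/` are considered, and both helper names are hypothetical.

```python
from pathlib import Path

# Extensions listed for SCRIPT-FILE in the table above.
SCRIPT_EXTENSIONS = {".ps1", ".sh", ".py", ".bat", ".cmd", ".bash"}


def load_allowlist(text: str) -> set[str]:
    """Parse allowlist text into a set of lowercased keys."""
    keys = set()
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            keys.add(line.lower())  # keys are case-insensitive
    return keys


def unallowed_scripts(skill_dir: Path, allowlist: set[str]) -> list[str]:
    """Return `script:` keys for files in scripts/ that are not allowlisted."""
    scripts_dir = skill_dir / "scripts"
    if not scripts_dir.is_dir():
        return []
    findings = []
    for path in sorted(scripts_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in SCRIPT_EXTENSIONS:
            # Assumption: the skill name equals the directory name.
            key = f"script:{skill_dir.name}:scripts/{path.name}".lower()
            if key not in allowlist:
                findings.append(key)
    return findings


allow = load_allowlist("# Allow the setup script\nSCRIPT:my-skill:scripts/setup.ps1\n")
print(allow)  # {'script:my-skill:scripts/setup.ps1'}
```

Calling `unallowed_scripts(Path("skills/my-skill"), allow)` would then list any remaining script files as keys ready to paste into the allowlist.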

Examples

Validate an eval spec

# Validate eval spec only
vally lint --eval-spec eval.yaml

# Validate eval spec AND lint SKILL.md files
vally lint ./skills --eval-spec eval.yaml

Lint SKILL.md files

# Lint current directory
vally lint

# Lint a specific skill
vally lint ./skills/my-skill

# Lint with detailed output
vally lint ./skills --verbose

Strict mode for CI

# Fail on warnings as well as errors
vally lint . --eval-spec eval.yaml --strict
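
If your CI pipeline is driven from Python rather than a shell script, the same invocation could be wrapped as below. This is a sketch: `build_lint_command` and `run_lint` are hypothetical helpers, only flags documented on this page are used, and `run_lint` assumes `vally` is available on `PATH`.

```python
import subprocess


def build_lint_command(path=".", eval_spec=None, known_domains=None,
                       allowed_external_deps=None, strict=False, verbose=False):
    """Assemble a `vally lint` argument list from documented flags."""
    cmd = ["vally", "lint", path]
    if eval_spec:
        cmd += ["--eval-spec", eval_spec]
    if known_domains:
        cmd += ["--known-domains", known_domains]
    if allowed_external_deps:
        cmd += ["--allowed-external-deps", allowed_external_deps]
    if strict:
        cmd.append("--strict")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_lint(**options) -> int:
    """Run the command and return its exit code (0 = passed, 1 = failed)."""
    return subprocess.run(build_lint_command(**options)).returncode


print(" ".join(build_lint_command(path=".", eval_spec="eval.yaml", strict=True)))
# vally lint . --eval-spec eval.yaml --strict
```

A CI step can then call `sys.exit(run_lint(path=".", eval_spec="eval.yaml", strict=True))` to propagate the exit code.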

Reference scanning

# Check URLs against a known-domains list
vally lint ./skills --known-domains known-domains.txt

# Combine with eval validation
vally lint ./skills --known-domains known-domains.txt --eval-spec eval.yaml

External dependency scanning

# Check for external dependencies
vally lint ./skills --allowed-external-deps allowed-external-deps.txt

# Fail on dependency warnings in CI
vally lint ./skills --allowed-external-deps allowed-external-deps.txt --strict

Output format

Eval spec output

On success:

✔ eval.yaml is valid

With issues:

Eval: eval.yaml
✘ error [unknown-grader-type] Unknown grader type "output-contain"
  at stimuli[0].graders[0].type
  Did you mean "output-contains"?

⚠ warning [scoring-defaults-applied] Scoring config omitted
  at scoring
  Using defaults: equal weights, no threshold (verdict is binary all-graders-pass).

  1 error(s), 1 warning(s)
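
If you need to post-process this output, for example to count diagnostics by code, the header line of each diagnostic can be picked out with a regular expression. This is a sketch based only on the sample above; the text format is meant for human readers and may differ in practice, and `parse_diagnostics` is a hypothetical helper.

```python
import re

# Assumption: each diagnostic starts with a marker symbol, a severity word,
# and a bracketed code, as in the sample output above.
_DIAGNOSTIC = re.compile(r"\S+\s+(error|warning)\s+\[([a-z0-9-]+)\]\s+(.*)")


def parse_diagnostics(output: str) -> list[dict]:
    """Extract severity, code, and message from diagnostic header lines."""
    diagnostics = []
    for line in output.splitlines():
        match = _DIAGNOSTIC.fullmatch(line.strip())
        if match:
            severity, code, message = match.groups()
            diagnostics.append({"severity": severity, "code": code, "message": message})
    return diagnostics


sample = """Eval: eval.yaml
✘ error [unknown-grader-type] Unknown grader type "output-contain"
  at stimuli[0].graders[0].type
⚠ warning [scoring-defaults-applied] Scoring config omitted
  at scoring
"""
for item in parse_diagnostics(sample):
    print(item["severity"], item["code"])
```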

SKILL.md output

Found 2 skill(s): auth-helper, code-reviewer

✔ auth-helper
  ✔ spec-compliance (4/4 checks passed)
  ✔ valid-refs (3 references, 0 broken)

✘ code-reviewer
  ✘ spec-compliance (3/4 checks passed)
    ✗ name-format: Name must be kebab-case
  ✔ valid-refs

1/2 skills passed