Agent Skill

winml-cli ships a Copilot Skill (use-winml-cli) that lets AI coding agents drive the entire model-building pipeline on your behalf. When a coding agent has this skill attached, it can inspect models, generate configs, run builds, and interpret results — without you having to remember exact flags or stage ordering.

What the skill provides

The skill teaches the agent:

Capability	What the agent learns
Pipeline shape	The stage order (`inspect → export → analyze → optimize → quantize → compile → perf`) and when to enter mid-pipeline
Flag discovery	Always run `winml <command> --help` before quoting a command — never fabricate flags
Output mapping	Which command's `-o` produces the artifact the user actually needs
Scope awareness	Which model architectures are supported (classic DL) vs. out-of-scope (LLMs, diffusion)
Hardware detection	Use `winml sys --list-ep` to confirm what's available before targeting an EP
Two paths	When to use primitives (debugging, exploring) vs. config + build (production, CI)
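The "Pipeline shape" row can be made concrete with a small sketch. The stage names below are copied from the table; the `stages_from` helper is hypothetical and only illustrates what "entering mid-pipeline" means, namely that every stage from the entry point onward still has to run, in order.

```shell
# Stage order as listed in the "Pipeline shape" row.
STAGES="inspect export analyze optimize quantize compile perf"

# Hypothetical helper: print the stages that remain when the pipeline is
# entered at the given stage. Returns non-zero for an unknown stage name.
stages_from() {
  entry="$1"
  found=0
  for stage in $STAGES; do
    if [ "$stage" = "$entry" ]; then
      found=1
    fi
    if [ "$found" -eq 1 ]; then
      printf '%s\n' "$stage"
    fi
  done
  [ "$found" -eq 1 ]
}
```

For example, `stages_from quantize` prints `quantize`, `compile`, and `perf`, one per line. Before running any of them, the "Flag discovery" row still applies: check `winml <command> --help` rather than guessing flags.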

How to use it

With GitHub Copilot Coding Agent

To make the Copilot Coding Agent (the cloud agent that creates PRs) follow the skill's guidance, reference it in .github/copilot-instructions.md. The Coding Agent reads that file automatically when working on this repository.
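One way to add such a reference from the shell is sketched below. The function name `add_skill_reference` and the wording of the note are hypothetical; the only details taken from this page are the two file paths.

```shell
# Hypothetical helper: append a pointer to the skill file to the repository's
# Copilot instructions. The note text is illustrative, not prescribed wording.
add_skill_reference() {
  repo_root="${1:-.}"
  mkdir -p "$repo_root/.github"
  cat >> "$repo_root/.github/copilot-instructions.md" <<'EOF'

When a task involves winml-cli, follow the guidance in
skills/use-winml-cli/SKILL.md before running any `winml` command.
EOF
}
```

Running `add_skill_reference .` from the repository root appends the note and leaves any existing instructions in the file untouched.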

With other AI agents

For agents that support custom instructions (e.g., Copilot Extensions, Claude, ChatGPT with file uploads, or custom MCP tool servers), attach the skill file as context:

skills/use-winml-cli/SKILL.md

You can copy the file contents into your agent's system prompt, upload it as a reference document, or include it in .github/copilot-instructions.md for VS Code Copilot Chat. The skill is standard Markdown with YAML front matter, so any agent that accepts text context can use it.
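If an agent's system prompt should contain only the instructions and not the YAML metadata, the front matter can be dropped before pasting. The sketch below assumes the file opens with a `---` line and that the front matter ends at the next `---` line. The `skill_body` name is hypothetical, and stripping is optional since many agents accept the file as-is.

```shell
# Hypothetical helper: print SKILL.md without its leading YAML front matter.
# Assumes the front matter is delimited by "---" lines at the top of the file.
skill_body() {
  awk '
    NR == 1 && $0 == "---" { in_fm = 1; next }   # opening delimiter
    in_fm && $0 == "---"   { in_fm = 0; next }   # closing delimiter
    !in_fm                 { print }             # body after the front matter
  ' "${1:-skills/use-winml-cli/SKILL.md}"
}
```

With no argument the function reads `skills/use-winml-cli/SKILL.md` relative to the current directory. A `---` horizontal rule later in the body is kept, because only the first delimited block is treated as front matter.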

Skill location

winml-cli/
└── skills/
    └── use-winml-cli/
        └── SKILL.md          ← the skill definition

Example agent interaction

User: Can I run ConvNeXt on my Snapdragon X Elite NPU?

Agent (with skill):
1. Runs `winml sys --list-ep` → confirms QNNExecutionProvider is registered
2. Runs `winml inspect -m microsoft/convnext-tiny-224` → confirms the model is supported
3. Runs `winml config --onnx ... -d npu -o config.json`
4. Runs `winml build -c config.json -m microsoft/convnext-tiny-224 -o output/`
5. Runs `winml perf -m output/model.onnx -d npu --monitor`
6. Reports latency and NPU utilization to the user
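The same sequence can be sketched as a shell function. It uses only the commands and flags shown in the steps above, with two assumptions that are not confirmed by this page: that `winml sys --list-ep` prints execution provider names as plain text (so `grep` can find them), and that the path elided as `...` in step 3 is supplied by the caller. The function name `run_convnext_on_npu` is hypothetical.

```shell
# Sketch of the interaction above. Assumption: `winml sys --list-ep` prints
# EP names as plain text. The ONNX path elided as "..." in step 3 is not
# guessed here; the caller must pass it as the first argument.
run_convnext_on_npu() {
  onnx_path="${1:?path to the ONNX model is required}"
  model="microsoft/convnext-tiny-224"

  # Step 1: stop early if the QNN execution provider is not registered.
  if ! winml sys --list-ep | grep -q "QNNExecutionProvider"; then
    echo "QNNExecutionProvider not found; NPU target unavailable" >&2
    return 1
  fi

  # Steps 2-5: same commands, same order; && stops at the first failure.
  winml inspect -m "$model" &&
  winml config --onnx "$onnx_path" -d npu -o config.json &&
  winml build -c config.json -m "$model" -o output/ &&
  winml perf -m output/model.onnx -d npu --monitor
}
```

Step 6 (reporting latency and NPU utilization) is left to the agent, since the format of the `winml perf` output is not described here.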