Governance
Twelve teams. Four agent stacks. One security review. Then it became 400 repos.
This guide is the spec for that scale. It tells you, with code-level honesty, exactly what APM governance can guarantee, where it can be bypassed, and which fields in the schema are not yet wired to enforcement. If you are deciding whether to make apm audit --ci a required check across an org, read sections 7, 8, and 14 first — they own the bypass contract, the install-gate guarantee, and the known gaps.
1. Read this if
For the CISO
You own the trust boundary and need defensible answers when an auditor asks “what was running, and who allowed it?”
APM gives you a git-tracked record of every agent dependency deployed (apm.lock.yaml) and a policy file your security team controls (<org>/.github/apm-policy.yml). The forensic answer to “what was active during the incident?” is one git log command. The trust boundary is your .github repo’s branch protection.
Most relevant: section 7 (bypass contract), section 8 (install gate guarantees), section 12 (auditing the auditor), section 13 (enforcement audit log), section 14 (known gaps).
For the VP of Engineering
You need to roll governance out across N repos without breaking the developer flow that earned you those N repos in the first place.
APM’s policy engine is opt-in per repo until your org policy file lands in <org>/.github. The recommended path is enforcement: warn first, measure violations through GitHub Code Scanning, then flip to block once the noise is gone. Developers do code review; you don’t ship a new tool. Air-gapped CI is supported with a one-line workaround.
Most relevant: section 1 (this), section 5 (enforcement points), section 9 (air-gapped), section 11 (rollout playbook).
For the Platform Tech Lead
You will own the rollout, the policy YAML, the CI wiring, and the on-call escalation when a repo is unexpectedly blocked.
Read sections 5, 6, and 10 closely — they tell you where enforcement runs, how policies merge, and what happens when the network is flaky. Section 9 gives you the offline matrix. Section 11 gives you the staged playbook. Section 14 is the list of sharp edges; budget for them.
Most relevant: section 5 (enforcement points), section 6 (composition), section 9 (air-gapped), section 10 (failure semantics), section 11 (rollout), section 14 (gaps).
2. The 30-second mental model
Two files do all the work:
- `apm.lock.yaml`: what was deployed. Pinned to exact commit SHAs, git-tracked, regenerated by every `apm install`.
- `apm-policy.yml`: what is allowed. Lives at `<org>/.github/apm-policy.yml`, auto-discovered from the project’s git remote.
Four enforcement points read those files:
- The `apm install` pipeline gate (after dependency resolve, before file targets).
- The `apm install --mcp <ref>` direct-install preflight (separate code path).
- The `apm install` transitive-MCP preflight (a second pass after APM packages resolve their own MCP dependencies).
- `apm audit --ci [--policy <scope>]` (the only enforcer of the audit-only checks).
The trust boundary is your <org>/.github repository. CODEOWNERS and branch protection on that repo are what make the policy authoritative. Section 12 covers how to lock that down.
apm compile and apm run enforce zero policy. They trust the artifacts that apm install placed on disk. APM is an install-time gate, not a runtime sandbox.
APM addresses three structural problems in agent tooling: fragile context, manual setup, and ungoverned configuration. This guide is the spec for the third.
3. What you can govern
The scope matrix below is the contract. Every row maps a security or operational concern to the schema field that controls it, the named check that enforces it, and where that check actually runs. Rows marked [i] audit-only are NOT enforced by apm install; you must run apm audit --ci --policy <scope> in CI to enforce them. Rows marked [!] parsed but not enforced are accepted by the schema today but not consumed by any check; treat them as forward-compatibility, not as live controls.
| Concern | Schema field | Check name | Install enforces | Audit enforces |
|---|---|---|---|---|
| Dependency allowlist | dependencies.allow | dependency-allowlist | Yes | Yes |
| Dependency denylist | dependencies.deny | dependency-denylist | Yes | Yes |
| Required packages present | dependencies.require | required-packages, required-packages-deployed | Yes | Yes |
| Required package version | dependencies.require[].version | required-package-version | Yes | Yes |
| Transitive depth cap | dependencies.max_depth | transitive-depth | Yes (when < 50) | Yes |
| MCP server allowlist | mcp.allow | mcp-allowlist | Yes (direct + transitive) | Yes |
| MCP server denylist | mcp.deny | mcp-denylist | Yes (direct + transitive) | Yes |
| MCP transport allowlist | mcp.transport.allow | mcp-transport | Yes | Yes |
| Self-defined MCP control | mcp.self_defined | mcp-self-defined | Yes | Yes |
| Compilation target allowlist | compilation.target.allow (with enforce: true) | compilation-target | Yes (post-targets phase) | Yes |
| Compilation strategy | compilation.strategy.enforce | compilation-strategy | [i] audit-only | Yes |
| Source attribution | compilation.source_attribution | source-attribution | [i] audit-only | Yes |
| Required manifest fields | manifest.required_fields | required-manifest-fields | [i] audit-only | Yes |
| Manifest scripts policy | manifest.scripts | scripts-policy | [i] audit-only | Yes |
| Unmanaged files in governed dirs | unmanaged_files.action, .directories | unmanaged-files | [i] audit-only | Yes |
| Cache TTL override | policy.cache.ttl | — | [!] parsed but not enforced (cache reader uses hardcoded 1h) | — |
| Transitive MCP trust (policy field) | mcp.trust_transitive | — | [!] parsed but not enforced (gate is the --trust-transitive-mcp CLI flag) | — |
| Manifest content types | manifest.content_types | — | [!] parsed but not enforced | — |
The full schema and the canonical 6+16 check enumeration live in the Policy Reference. The 6 baseline lockfile checks (lockfile presence, ref consistency, deployed files present, no orphaned packages, MCP config consistency, content integrity) run on every apm audit --ci regardless of policy and are non-bypassable — they are covered in section 7.
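The matrix can also be read as data. The sketch below copies the check names from the table into two plain Python sets and answers "which command enforces this check?". The set layout and the `enforced_by` helper are illustrative only; they are not APM data structures, and the Policy Reference remains the canonical enumeration.

```python
# Check names copied from the scope matrix above. The grouping into two
# sets is an illustration, not something APM exposes.
INSTALL_ENFORCED = {
    "dependency-allowlist", "dependency-denylist",
    "required-packages", "required-packages-deployed",
    "required-package-version", "transitive-depth",
    "mcp-allowlist", "mcp-denylist", "mcp-transport",
    "mcp-self-defined", "compilation-target",
}
AUDIT_ONLY = {
    "compilation-strategy", "source-attribution",
    "required-manifest-fields", "scripts-policy", "unmanaged-files",
}


def enforced_by(check: str) -> list:
    """Return the commands that enforce a check, according to the matrix."""
    if check in INSTALL_ENFORCED:
        return ["apm install", "apm audit --ci"]
    if check in AUDIT_ONLY:
        return ["apm audit --ci"]
    raise KeyError(f"unknown check: {check}")
```

Any check for which `enforced_by` returns only `apm audit --ci` is invisible to a pipeline that runs `apm install` alone.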
4. What you cannot govern
Be clear with stakeholders about what is out of scope today:
- Prompt and instruction content semantics. APM scans for hidden Unicode (zero-width chars, bidirectional overrides) via `apm audit` content scanning. It does NOT do LLM-based prompt review, prompt-injection detection, or semantic safety review.
- Runtime versions and model selection. Policy does not constrain which LLM model an agent runs against, which Copilot version is installed locally, or which runtime executes the workflow.
- MCP command and args content. The MCP matcher only inspects the registry name (e.g. `microsoft/playwright`). It does not validate the `command:` or `args:` fields of a self-defined MCP server, only the name and the self-defined flag.
- File integration paths. Where files land on disk inside the repo is decided by the integrators in each APM package. Policy cannot rewrite a package’s file layout.
- Custom agent tools beyond MCP. If an agent stack ships its own non-MCP tool plugins, they sit outside policy scope. Govern them indirectly through `dependencies.allow/deny` and `unmanaged_files`.
- Token scopes and OAuth scopes. APM does not audit the scope of the GitHub PAT or app token used to fetch policies and packages. Manage that through the standard GitHub controls on the token issuer.
- Drift on deployed files after install. `deployed_file_hashes` are recorded in the lockfile but never re-verified. A developer who hand-edits a deployed instruction file post-install will not be detected by `apm install` or `apm audit --ci`. See section 14.
- Anything `apm compile` or `apm run` does. Those commands trust whatever `apm install` placed on disk. They do not re-check policy.
For the underlying threat model (what the content scanner protects against, MCP trust boundary, dependency provenance), see the Security Model.
5. How enforcement works
Four enforcement points share one route_discovery_outcome table so the rules behave consistently regardless of entry point. The diagram below traces a single apm install invocation through the pipeline.
graph TD
Start["apm install"] --> Resolve["Phase: resolve<br/>(populate deps_to_install)"]
Resolve --> Gate["[*] Enforcement point 1<br/>policy_gate phase<br/>deps + direct MCP + (...)"]
Gate -->|enforce: block + violation| Fail1["Exit 1<br/>PolicyViolationError"]
Gate -->|warn or pass| Targets["Phase: targets"]
Targets --> TargetCheck["[*] (cont'd EP1)<br/>policy_target_check phase<br/>compilation-target only"]
TargetCheck --> Download["Phase: download / integrate"]
Download --> TransMCP["[*] Enforcement point 3<br/>transitive MCP preflight<br/>(2nd pass on APM-package MCPs)"]
TransMCP -->|block| Fail2["sys.exit(1)<br/>APM packages stay,<br/>MCP configs not written"]
TransMCP -->|pass| Lockfile["Phase: lockfile<br/>(write apm.lock.yaml)"]
Lockfile --> Done["Exit 0"]
MCPBranch["apm install --mcp ref"] --> MCPPre["[*] Enforcement point 2<br/>install_preflight<br/>(separate code path)"]
MCPPre -->|block| Fail3["Exit 1"]
MCPPre -->|pass| MCPWrite["Write MCP config"]
DryRun["apm install --dry-run"] --> DryPre["[*] Preview only<br/>install_preflight dry_run=True<br/>'Would be blocked' lines, no raise"]
Audit["apm audit --ci<br/>--policy <scope>"] --> AuditRun["[*] Enforcement point 4<br/>6 baseline + 16 policy checks<br/>(only enforcer of audit-only fields)"]
style Start fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
style Gate fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
style TargetCheck fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
style TransMCP fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
style MCPPre fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
style AuditRun fill:#fff3e0,stroke:#ff9800,stroke-width:3px,color:#000
style Fail1 fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#000
style Fail2 fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#000
style Fail3 fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#000
style Done fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
style DryPre fill:#fff3e0,stroke:#ff9800,stroke-width:2px,color:#000
5a. Install pipeline gate
Runs after dependency resolution and before file targets. Enforces the dependency, MCP, and (post-targets) compilation-target rules against the resolved set. On enforcement: block, the CLI emits an inline [x] Policy violation: ... line per finding, raises PolicyViolationError, and aborts before any file is written. On enforcement: warn, every finding is recorded as a [!] warn diagnostic that surfaces in the end-of-install summary; install continues to completion.
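The block-versus-warn behavior described above can be sketched in a few lines of Python. This is a simplified model, not APM's gate code: the `PolicyViolationError` class here is a local stand-in for the error named in the text, and `GateResult`, `run_gate`, and the exact diagnostic strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


class PolicyViolationError(Exception):
    """Local stand-in for the error the text says the gate raises."""


@dataclass
class GateResult:
    # Diagnostics that a warn-mode run would surface in the install summary.
    warnings: List[str] = field(default_factory=list)


def run_gate(findings: List[str], enforcement: str) -> GateResult:
    """Apply block/warn semantics to a list of violation messages."""
    result = GateResult()
    if not findings:
        return result
    if enforcement == "block":
        # One inline line per finding, then abort before any file is written.
        for finding in findings:
            print(f"[x] Policy violation: {finding}")
        raise PolicyViolationError(f"{len(findings)} policy violation(s)")
    # warn mode: record diagnostics and let the install continue.
    result.warnings = [f"[!] warn: {finding}" for finding in findings]
    return result
```

The point of the sketch is the asymmetry: block mode raises before the caller can write files, while warn mode always returns normally and leaves it to the caller to print the summary.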
5b. Install --mcp <ref> preflight
apm install --mcp owner/repo is a separate command branch that constructs a temporary MCP dependency and runs the preflight directly. Same checks as the gate (allow/deny/transport/self-defined) but a different code path. On block, the process exits with code 1 before any MCP config file is written.
5c. Install transitive MCP preflight
When you install an APM package that itself declares MCP dependencies, those MCPs are first resolved by the APM resolver and then passed through a SECOND policy preflight. APM packages already passed the gate, so on transitive-MCP block the APM packages stay installed but the MCP configs are NOT written and the process exits 1. This preserves the rule that no transitive MCP server reaches your runtime config without passing the same mcp.* rules as a direct one.
5d. apm audit --ci --policy <scope>
The only enforcer of the audit-only checks (compilation-strategy, source-attribution, required-manifest-fields, scripts-policy, unmanaged-files). Runs the 6 baseline lockfile checks unconditionally, then, if a policy is discovered or supplied, runs the 16 policy checks. This is the check you wire into branch protection.
5e. What does NOT enforce policy
apm compile, apm run, and apm pack enforce zero organizational policy. They read what install placed on disk and proceed. If you assume “compile cannot bypass policy”, that is only true because the artifacts compile reads were placed there by an apm install that DID enforce policy. Compile itself does not re-check.
This is the most commonly misunderstood point in the model. The four enforcement points listed in section 2 are exhaustive. Anything outside apm install, apm install --mcp, the transitive-MCP preflight, and apm audit --ci is trust-by-construction, not trust-by-check.
6. Policy composition (inheritance)
Policies can extend other policies up to 5 levels deep (MAX_CHAIN_DEPTH = 5, enforced both during the walk and after). Cross-host extends: is rejected at resolution time: a policy on github.com cannot extend one on ghe.example.com, as a credential-leakage mitigation. Cycles are detected and refused. The merge is tighten-only: children can narrow allowlists, add deny entries, escalate enforcement, and shorten max depth, but never relax a parent constraint.
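The three chain rules (depth cap, cycle refusal, same-host only) can be illustrated with a small walker. This is a sketch under stated assumptions, not APM's resolver: policies are identified by URL, the `extends` argument is a hypothetical in-memory map standing in for real fetches, "5 levels" is read as at most five policies in the chain including the leaf, and a single local `PolicyInheritanceError` class is used for all three failures for brevity.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

MAX_CHAIN_DEPTH = 5  # limit quoted in the text


class PolicyInheritanceError(Exception):
    """Local stand-in; the real exception lives inside APM."""


def resolve_chain(leaf: str, extends: Dict[str, Optional[str]]) -> List[str]:
    """Walk extends: links from the leaf policy up to the root.

    `extends` maps a policy URL to its parent URL (None for a root).
    A real resolver would fetch each policy instead of reading a dict.
    """
    host = urlparse(leaf).hostname
    chain = [leaf]
    current = leaf
    while extends.get(current) is not None:
        parent = extends[current]
        # Cross-host links are refused before anything is "fetched".
        if urlparse(parent).hostname != host:
            raise PolicyInheritanceError(f"cross-host extends rejected: {parent}")
        if parent in chain:
            raise PolicyInheritanceError(f"extends cycle detected at {parent}")
        chain.append(parent)
        if len(chain) > MAX_CHAIN_DEPTH:
            raise PolicyInheritanceError(
                f"extends chain exceeds {MAX_CHAIN_DEPTH} levels"
            )
        current = parent
    return chain
```

Checking the host before the cycle and depth checks mirrors the stated intent that a cross-host parent is never contacted at all.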
graph TD
Hub["enterprise-hub-org/.github/<br/>apm-policy.yml<br/>broad allow lists,<br/>enforcement: warn"] --> Org["contoso/.github/<br/>apm-policy.yml<br/>extends: enterprise-hub-org<br/>adds deny + tightens<br/>enforcement: block"]
Org --> Repo["contoso/web-app/<br/>apm.yml policy stanza<br/>policy.hash pin,<br/>fetch_failure_default: block"]
Hub --> Merge["[*] merge_policies()<br/>tighten-only:<br/>allow=intersect, deny=union,<br/>enforcement=max(...),<br/>max_depth=min(...)"]
Org --> Merge
Repo --> Merge
Merge --> Effective["Effective policy<br/>used by all 4<br/>enforcement points"]
style Hub fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
style Org fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
style Repo fill:#fff3e0,stroke:#ff9800,stroke-width:2px,color:#000
style Merge fill:#fce4ec,stroke:#c2185b,stroke-width:3px,color:#000
style Effective fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000
Worked example
Enterprise hub (enterprise-hub-org/.github/apm-policy.yml):

    name: "Enterprise baseline"
    version: "1.0.0"
    enforcement: warn
    dependencies:
      allow:
        - "microsoft/*"
        - "contoso/*"
        - "partner-corp/*"
    mcp:
      allow:
        - "microsoft/*"
      transport:
        allow: ["stdio", "http"]

Org (contoso/.github/apm-policy.yml):

    name: "Contoso engineering"
    version: "1.0.0"
    extends: "enterprise-hub-org"
    enforcement: block
    dependencies:
      deny:
        - "untrusted-org/*"
    mcp:
      transport:
        allow: ["stdio"]
      self_defined: deny

Repo (contoso/web-app/apm.yml):

    packages:
      - name: contoso/web-standards
        version: "^1.0"
    policy:
      hash: "sha256:abc123..."        # pin the fetched org policy
      fetch_failure_default: block    # fail-closed if discovery fails

Effective policy seen by every enforcement point in contoso/web-app:
| Field | Value | Why |
|---|---|---|
| `enforcement` | block | Org escalated warn -> block (max(warn, block)); repo cannot relax. |
| `dependencies.allow` | microsoft/*, contoso/*, partner-corp/* | Inherited from hub; org did not narrow it. Intersect rule (no child set means parent wins). |
| `dependencies.deny` | untrusted-org/* | Org added a deny; parent had none. Union rule. |
| `mcp.allow` | microsoft/* | Inherited from hub. |
| `mcp.transport.allow` | stdio | Org narrowed [stdio, http] to [stdio]. Intersect rule. |
| `mcp.self_defined` | deny | Org escalated. |
Counter-example: a child cannot relax a parent. If contoso/web-app/apm.yml tried to override the org’s block back down to warn:

    policy:
      enforcement: warn   # rejected: org policy is block; child cannot relax

The merge rule for enforcement is max(parent, child) ordered warn < block, so the org’s block wins. The child’s warn is silently dropped from the effective policy. The same applies to allow-list widening (intersect rule) and deny-list removal (union rule): tightening flows down, relaxation does not.
For the full 12-row merge rule table, see Tighten-only merge rules in the Policy Reference.
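The four merge rules named in the diagram (allow = intersect, deny = union, enforcement = max, max_depth = min) can be sketched as follows. This is a simplified illustration, not the real `merge_policies()`: policies are flattened into dicts with dotted keys, list entries are compared as literal strings (a real implementation has to reason about glob patterns such as `microsoft/*`), and a missing `enforcement` is assumed to mean `warn`. The function name `merge_tighten_only` is hypothetical.

```python
from typing import List, Optional

ENFORCEMENT_ORDER = {"warn": 0, "block": 1}
ALLOW_KEYS = ("dependencies.allow", "mcp.allow", "mcp.transport.allow")
DENY_KEYS = ("dependencies.deny", "mcp.deny")


def merge_allow(parent: Optional[List[str]], child: Optional[List[str]]):
    """Intersect rule; an unset side (None) leaves the other side unchanged."""
    if parent is None:
        return child
    if child is None:
        return parent
    return [entry for entry in parent if entry in child]


def merge_deny(parent: List[str], child: List[str]) -> List[str]:
    """Union rule, preserving order and dropping duplicates."""
    return parent + [entry for entry in child if entry not in parent]


def merge_tighten_only(parent: dict, child: dict) -> dict:
    """Combine one parent and one child policy; the result is never looser."""
    effective = {
        "enforcement": max(
            parent.get("enforcement", "warn"),
            child.get("enforcement", "warn"),
            key=ENFORCEMENT_ORDER.__getitem__,
        )
    }
    for key in ALLOW_KEYS:
        effective[key] = merge_allow(parent.get(key), child.get(key))
    for key in DENY_KEYS:
        effective[key] = merge_deny(parent.get(key, []), child.get(key, []))
    depths = [
        p["dependencies.max_depth"]
        for p in (parent, child)
        if p.get("dependencies.max_depth") is not None
    ]
    effective["dependencies.max_depth"] = min(depths) if depths else None
    return effective
```

Feeding the hub and org policies from the worked example through `merge_tighten_only`, and then merging a repo-level `enforcement: warn` on top, keeps `enforcement` at `block` and `mcp.transport.allow` at `["stdio"]`, which matches the effective-policy table.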
7. The bypass / non-bypass contract
This is the certitude section. Read it twice if you are deciding whether apm audit --ci is good enough for branch protection.
| Surface | What it bypasses LOCALLY | What it CANNOT bypass | Reviewable in |
|---|---|---|---|
| `apm install --no-policy` | All 16 policy checks at install (incl. transitive MCP, hash pin) | The 6 baseline checks in apm audit --ci | git diff of apm.lock.yaml in PR |
| `APM_POLICY_DISABLE=1` env | Same as --no-policy plus the 16 audit policy checks | The 6 baseline checks in apm audit --ci | PR diff; CI env vars in Actions logs |
| Manual edit to `apm.lock.yaml` | Nothing; install regenerates the file each run | Audit baseline ref-consistency and deployed-files-present | git diff |
| Manual edit to deployed file post-install | Content-equality drift (no re-hash) | Hidden-Unicode scan in apm audit content mode | git diff of the deployed file in PR |
| Direct `git clone` of an APM package, bypassing install | Everything; nothing detects out-of-band file drops | Audit baseline no-orphaned-packages and audit-only unmanaged-files | git diff |
| Fork repo to a personal org | Org policy auto-discovery (resolves to fork’s .github) | Whatever your CI requires on the canonical repo | branch protection on canonical repo |
| `--trust-transitive-mcp` CLI flag | The transitive MCP preflight (second pass) | Direct MCP preflight; baseline content scan; audit MCP checks | CI command lines and Actions logs |
| `--allow-insecure` CLI flag | The HTTP-MCP refusal (lets an http:// MCP through) | All mcp.* policy rules; audit MCP checks | CI command lines and Actions logs |
| `apm install --force` | On-disk collision detection AND content-scan blocks | The 16 policy checks; baseline checks at next audit | CI command lines; PR diff of overwritten files |
Notes on specific rows:
- `apm install --no-policy` also bypasses the `apm install --mcp` preflight, the transitive-MCP preflight, and any project-side `policy.hash` pin.
- `APM_POLICY_DISABLE=1` short-circuits discovery to `outcome="disabled"` everywhere, including `apm audit --ci`, where the 16 policy checks are skipped (the 6 baseline checks still run).
- Manual lockfile edits: `content_hash` mismatch on registry-proxy deps is caught at the next install when downloads resume.
- Direct `git clone`: `unmanaged-files` only flags governed dirs and only when configured to `warn`/`deny`.
- Fork-to-personal-org: discovery resolves via `git remote get-url origin`; branch protection on the upstream repo is the trust boundary.
The non-bypass contract. The 6 baseline lockfile checks run on every apm audit --ci invocation, whether or not --no-policy was used at install time or APM_POLICY_DISABLE=1 is set. They do not consult the policy file, do not depend on org discovery, and are not affected by either escape hatch. Combined with branch protection that requires apm audit --ci to pass, no developer override is invisible: a --no-policy install leaves a lockfile that audit will reject if the result is inconsistent, and an APM_POLICY_DISABLE=1 audit run cannot itself bypass the baseline checks. Every override appears in the PR diff, in the workflow file, or in the Actions environment configuration, all of which are reviewable in code review.
Workstation blast radius. Because “file presence is execution” for agent files (an instruction or chat-mode file on disk is consumed by the agent runtime as soon as it is opened), branch protection on the canonical repo contains the fork-to-personal-org bypass at the org’s trust gate but not on the individual workstation: between fork-clone-install and PR creation, the developer’s machine already has the tainted files. Compensating control: an MDM-deployed mirror of <org>/.github/apm-policy.yml consulted by a wrapper script around apm install, or a workstation-level allowlist of permitted git remotes for APM-managed repos.
8. What the install gate guarantees — precisely
These guarantees assume APM_POLICY_DISABLE is unset and --no-policy is not passed in the CI environment. See section 7 for the full bypass contract.
When apm install returns exit code 0 and the effective org policy is in enforcement: block mode, you ARE guaranteed:
- Every APM dependency declared in `apm.yml` matches `dependencies.allow` and is not matched by `dependencies.deny`.
- Every package in `dependencies.require` is present and at the required version (with one nuance: `require_resolution: project-wins` downgrades version mismatches to warnings, by design).
- Every direct MCP server matches `mcp.allow`, is not matched by `mcp.deny`, uses an allowed transport, and respects the `mcp.self_defined` rule.
- Every transitive MCP server discovered from APM packages was re-checked against the same `mcp.*` rules in a second pass (unless `--trust-transitive-mcp` was passed).
- Compilation targets that would be written match `compilation.target.allow` (when `enforce: true` on that field).
- The fetched policy file matched any `policy.hash` pin in `apm.yml`. If the hash did not match, install failed closed regardless of `fetch_failure_default`: hash-mismatch is unconditionally fail-closed in non-dry-run install paths. (`apm install --dry-run` logs the mismatch but does not exit non-zero; see the section 14 gap.)
- The lockfile (`apm.lock.yaml`) was regenerated from the resolved, gated set.
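The first guarantee, "matches allow and is not matched by deny", amounts to a two-step predicate. The sketch below assumes shell-style glob matching via Python's `fnmatch` and treats an unset allowlist (`None`) as "no allowlist restriction"; both are assumptions made for illustration, and APM's actual matcher may differ (for instance in how `*` treats `/`). The helper name `is_permitted` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Optional


def is_permitted(name: str, allow: Optional[List[str]], deny: List[str]) -> bool:
    """Return True when `name` passes both the deny and the allow rule."""
    # Deny always wins: a single matching deny pattern rejects the name.
    if any(fnmatchcase(name, pattern) for pattern in deny):
        return False
    # Assumption: no allowlist configured means nothing is restricted by it.
    if allow is None:
        return True
    return any(fnmatchcase(name, pattern) for pattern in allow)
```

With the worked example's lists, `contoso/web-standards` passes, `untrusted-org/anything` is rejected by the deny rule, and a package from an org that appears in neither list is rejected because it fails the allow rule.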
You are NOT guaranteed:
- That files on disk are still what install wrote. There is no drift detection on `deployed_file_hashes`.
- That prompt or instruction content is semantically safe. Only the hidden-Unicode scan runs.
- That the audit-only checks (`compilation-strategy`, `source-attribution`, `required-manifest-fields`, `scripts-policy`, `unmanaged-files`) passed. Run `apm audit --ci --policy <scope>` in CI for those.
- That non-APM files in the repo conform to anything. APM only governs files it placed.
- Anything about runtime behavior. APM is install-time only.
- That a `policy.cache.ttl` shorter or longer than 1 hour took effect. The cache reader uses a hardcoded 1-hour TTL; the `policy.cache.ttl` field is parsed but not honored.
9. Air-gapped and offline
This section covers offline policy enforcement (the apm-policy.yml cache). For offline dependency traffic (routing installs through Artifactory), see Registry Proxy & Air-gapped.
For air-gapped CI, run apm audit --ci --policy ./vendored-policy.yml as your gating check; do not rely on apm install enforcement.
| Network state | Install gate | Install --mcp | apm audit --ci --policy <file> | apm audit --ci (auto-discovery) |
|---|---|---|---|---|
| Online | Discovers + enforces | Discovers + enforces | Loads from path, enforces | Discovers + enforces |
| Cache fresh (< 1h) | Cache hit, enforces | Cache hit, enforces | n/a (file path skips cache) | Cache hit, enforces |
| Cache stale (1h - 7d) | Refresh attempted; on fail, cached_stale outcome — proceed with cached unless policy.fetch_failure: block | Same | n/a | Same |
| Offline, cache > 7d | cache_miss_fetch_fail — fail-OPEN by default; fail-closed only if policy.fetch_failure_default: block in apm.yml | Same | Loads from path, full enforce | Same as install |
Workarounds when the network is unreliable:
- Audit in CI is fully offline-capable with `apm audit --ci --policy /path/to/vendored-policy.yml`. The `--policy` argument accepts a local file path and bypasses GitHub discovery entirely. Vendor your org policy into the repo (or a sidecar mount) and audit works in any air-gapped environment.
- Install does not have a `--policy <path>` flag. This is a known gap (section 14). The current workaround is `extends: <internal-mirror-url>` from a reachable `<org>/.github/apm-policy.yml`, but the leaf is still fetched via the GitHub API.
- Cache prewarm for repeatable offline builds. The cache lives at `<project_root>/apm_modules/.policy-cache/<key>.yml` where `<key>` is `sha256(repo_ref)[:16]`. Prewarming means stashing valid `<key>.yml` and `<key>.meta.json` files in that directory before install runs.
- Make policy fail-closed offline. Set `policy.fetch_failure_default: block` in your project `apm.yml`. With this set, network failure or a malformed policy aborts install instead of warning. Combine with `policy.hash` to detect a tampered mirror.
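For the cache-prewarm workaround, the file locations can be computed ahead of time. The sketch below assumes `sha256(repo_ref)[:16]` means the first 16 hex characters of the SHA-256 of the UTF-8 encoded reference, and it does not know the exact `repo_ref` string APM hashes; verify both against a cache directory produced by a real online install before relying on it. The age buckets mirror the 1 hour and 7 day thresholds from the table above; the bucket names and both helper functions are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Tuple

FRESH_SECONDS = 60 * 60               # 1 hour: hardcoded TTL per the text
MAX_STALE_SECONDS = 7 * 24 * 60 * 60  # 7 days: upper bound of the stale window


def cache_paths(project_root: str, repo_ref: str) -> Tuple[Path, Path]:
    """Return the (policy, metadata) cache file paths for a repo reference."""
    # Assumption: hex digest of the UTF-8 bytes, truncated to 16 characters.
    key = hashlib.sha256(repo_ref.encode("utf-8")).hexdigest()[:16]
    cache_dir = Path(project_root) / "apm_modules" / ".policy-cache"
    return cache_dir / f"{key}.yml", cache_dir / f"{key}.meta.json"


def classify_age(age_seconds: float) -> str:
    """Bucket a cache entry by age, following the offline matrix."""
    if age_seconds < FRESH_SECONDS:
        return "fresh"    # cache hit, enforced without a refresh
    if age_seconds <= MAX_STALE_SECONDS:
        return "stale"    # refresh attempted; cached copy used on failure
    return "expired"      # treated like a cache miss when offline
```

A prewarm job could call `cache_paths(".", "<the ref APM uses>")` and copy a vetted policy and its metadata file to the returned locations before `apm install` runs.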
10. Failure semantics
| Outcome | Default behavior | Override to fail-closed | Citation |
|---|---|---|---|
| Network failure (`cache_miss_fetch_fail`) | Fail-OPEN, log warning, install proceeds with no policy | policy.fetch_failure_default: block in apm.yml | policy-reference#95-network-failure-semantics |
| Cached stale (1h - 7d, refresh failed) | Warn and proceed with cached policy | policy.fetch_failure: block set in the cached policy itself | policy-reference#95-network-failure-semantics |
| Malformed YAML (`malformed`) | Fail-OPEN by default | policy.fetch_failure_default: block | policy/parser.py |
| Hash-mismatch (project pin vs fetched) | Always fail-CLOSED | n/a (cannot be relaxed) | policy-reference#95-network-failure-semantics |
| Garbage response | Fail-OPEN by default | policy.fetch_failure_default: block | policy-reference#95-network-failure-semantics |
| `extends:` cycle detected | Fail-CLOSED, raises PolicyInheritanceError | n/a | policy/inheritance.py |
| Cross-host `extends:` rejected | Fail-CLOSED, raises before any fetch | n/a (security mitigation, cannot be relaxed) | policy/discovery.py |
Why fail-open is the default for fetch failures: the design choice is to not break the developer flow on a transient network blip. A developer on a flaky hotel WiFi who runs apm install should not be locked out. The trade-off is that compliance-critical environments must explicitly opt into fail-closed via policy.fetch_failure_default: block. Combine that with policy.hash and a CI environment that is expected to be online, and the result is: any policy that does not fetch cleanly and match the pin aborts the build.
Hash-mismatch is the one outcome that can never be overridden. If your apm.yml pins policy.hash: sha256:... and the fetched policy hashes to something else, install fails closed unconditionally. This is the defense against silent mirror tampering or upstream policy drift you have not approved.
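The fetch-related rows of the table reduce to a small decision function, sketched here in Python. It is a reading aid, not APM's code: the outcome strings `cache_miss_fetch_fail`, `cached_stale`, `malformed`, and `hash_mismatch` come from this guide, while `garbage_response`, the `"warn"` default for unset options, and the helper names are assumptions. The pin check also assumes the hash is taken over the raw fetched bytes, which the guide does not specify.

```python
import hashlib


def matches_pin(policy_bytes: bytes, pin: str) -> bool:
    """Compare fetched policy bytes against a 'sha256:<hex>' pin."""
    algorithm, _, expected = pin.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported pin algorithm: {algorithm!r}")
    # Assumption: the digest covers the raw bytes exactly as fetched.
    return hashlib.sha256(policy_bytes).hexdigest() == expected


def should_abort(
    outcome: str,
    fetch_failure_default: str = "warn",  # project-side setting in apm.yml
    cached_fetch_failure: str = "warn",   # setting inside the cached policy
) -> bool:
    """Return True when install should fail closed for this outcome."""
    if outcome == "hash_mismatch":
        return True  # never relaxable
    if outcome in ("cache_miss_fetch_fail", "malformed", "garbage_response"):
        return fetch_failure_default == "block"
    if outcome == "cached_stale":
        return cached_fetch_failure == "block"
    return False  # clean fetch or fresh cache hit: proceed to enforcement
```

Note that the stale-cache branch consults a different knob from the other fail-open branches, which is easy to miss when reading the table row by row.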
On-call quick reference:
- `cache_miss_fetch_fail` outcome -> network; check egress to api.github.com; verify cache dir writable.
- `hash_mismatch` outcome -> SUSPECTED TAMPER; do not override; investigate org policy commit history.
- `cached_stale` outcome -> normal if recently degraded network; force refresh with `apm policy status --no-cache`.
- `extends rejected` outcome -> cross-host extends; remove non-canonical host from `apm-policy.yml` extends chain.
11. Rolling out without breaking N repos
The phased playbook below assumes you have an existing fleet of repos and need to introduce policy without surprise breakages. Each phase is independently committable and reversible.
graph TD
P1["Phase 1<br/>Ship apm-policy.yml<br/>to org/.github<br/>enforcement: warn<br/>(NOTHING BREAKS)"] --> P2["Phase 2<br/>Add apm audit --ci<br/>to repo CI<br/>via shared template<br/>SARIF -> Code Scanning<br/>(WARN MODE = no exit fail)"]
P2 --> P3["Phase 3<br/>Triage violations<br/>per repo<br/>fix apm.yml<br/>OR PR exception<br/>to org/.github"]
P3 --> P4["Phase 4<br/>Org-wide flip<br/>warn -> block<br/>monitor with<br/>apm policy status --check<br/>before each repo's gate"]
style P1 fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
style P2 fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
style P3 fill:#fff3e0,stroke:#ff9800,stroke-width:2px,color:#000
style P4 fill:#fce4ec,stroke:#c2185b,stroke-width:3px,color:#000
Phase 1 — ship a warn-mode policy. Land apm-policy.yml in <org>/.github with enforcement: warn. Nothing breaks anywhere. Every apm install in the org now discovers the policy, runs the checks, and logs [!] warnings for violations — but proceeds.
Phase 2 — wire audit into CI. Use a shared GitHub Actions template (or composite action) that runs apm audit --ci --policy org -f sarif and uploads the SARIF to GitHub Code Scanning. Violations become visible to repo owners as code-scanning alerts. Be honest with stakeholders here: in warn mode, audit rewrites violations to passed=True so the exit code stays 0. CI does not fail. The visibility is in the SARIF + Code Scanning UI, not in the green/red check. Branch protection cannot enforce yet.
Minimal workflow steps (drop into .github/workflows/apm-audit.yml):

    jobs:
      apm-audit:
        runs-on: ubuntu-latest
        permissions:
          contents: read
          security-events: write
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-python@v5
            with: { python-version: "3.12" }
          - run: pip install apm-cli==X.Y.Z   # REPLACE: pin to current version, see /installation
          - run: apm install
          - run: apm audit --ci -f sarif --output-file apm-audit.sarif
          - uses: github/codeql-action/upload-sarif@v3
            with: { sarif_file: apm-audit.sarif }

For richer customization (matrix builds, monorepo splits, vendored policy paths) see the CI Policy Enforcement guide.
Phase 3 — triage and clean up. Repo owners either fix their apm.yml to comply with the policy, or they open a PR to <org>/.github/apm-policy.yml to add an explicit allow entry with rationale. The PR flow is the policy change-management trail (see section 12).
Phase 4 — flip to block. Once Code Scanning shows the violation backlog is drained, change enforcement: block in <org>/.github/apm-policy.yml. Stage by team if the org is large: a team can adopt block early by setting enforcement: block in its own team-level intermediate policy, leaving the org policy at warn. (Tighten-only merge means the team’s block wins for repos under that team’s extends: chain.) Use apm policy status --check in CI as a pre-flight that explains the effective policy and surfaces what would be blocked, before the gate phase actually blocks it.
Circuit-breaker rollout for large fleets. For 100+ repos, do not flip block org-wide in one commit. Stage: enable block for 10% of repos for 1 week (via team-level extends), monitor SARIF alert volume and on-call pages, expand to 50% for 1 week, then 100%. If SARIF volume spikes or on-call escalations cluster, revert to warn at the org level (one commit) while you triage.
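One way to pick the 10%, 50%, and 100% cohorts reproducibly is to hash each repo name into a stable bucket. The sketch below is an assumption about how a platform team might script this; APM itself has no cohort feature, and `rollout_bucket` and `in_block_cohort` are hypothetical helpers you would run when deciding which repos point at the block-mode intermediate policy.

```python
import hashlib


def rollout_bucket(repo: str) -> int:
    """Map a repo name to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(repo.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def in_block_cohort(repo: str, percent: int) -> bool:
    """True when the repo belongs to the first `percent` percent of buckets."""
    return rollout_bucket(repo) < percent
```

Because the bucket depends only on the repo name, every repo in the 10% cohort is also in the 50% cohort, so expanding the rollout never flips a repo back from block to warn.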
For step-by-step CI YAML and SARIF upload examples beyond the snippet above, see the CI Policy Enforcement guide.
12. Auditing the auditor
The org policy file is the trust root. Protecting it is on you, not on APM.
- CODEOWNERS on `<org>/.github/apm-policy.yml`: restrict to a security team. Every change requires their review.
- Branch protection on `<org>/.github` main: required reviewers, no force push, no direct push to main, dismiss stale approvals on new commits.
- GitHub Ruleset on the org `.github` repo (recommended): requires approval from a specific team for any change to policy files. See GitHub Rulesets.
- Change history is `git log apm-policy.yml`. Rationale lives in commit messages and PR descriptions. Make commit-message rationale a CODEOWNERS-checked review item.
- Policy change cooling period (recommended): every change to `apm-policy.yml` requires a PR with rationale and a 24-72 hour waiting period before merge. This is a process control, not a code control, but it is the single most important thing you can add.
Separation of duties for SOX / SOD-sensitive environments. CODEOWNERS for apm-policy.yml should require approvals from a team distinct from the team authoring the change. Configure GitHub Rulesets on the <org>/.github repo to require reviewers from @org/policy-approvers, where that team is disjoint from @org/policy-authors. The same author cannot self-approve, and the approval team has no commit rights to the policy file directly.
Lint for bypass flags in CI workflows. Add a pre-merge check that fails any PR introducing a policy-bypass flag without an explicit security review label:
    # Pre-merge lint: detect policy-bypass flags in CI workflows
    grep -rEn '(--no-policy|--force|APM_POLICY_DISABLE|--trust-transitive-mcp|--allow-insecure)' .github/workflows/ \
      && { echo "Policy bypass flag detected; requires security review"; exit 1; } || true

--force is included because it bypasses the pre-deploy hidden-Unicode security scan (see section 7); teams may choose to allow it locally for developer ergonomics but it should never appear in CI workflows without security review.
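The grep pattern matches substrings, so it also flags unrelated options such as git push --force-with-lease. If that noise becomes a problem, a token-aware scan is one alternative. The sketch below is an assumption-laden illustration: find_bypass_flags is a hypothetical helper, it only inspects text passed to it, and it does not implement the security-review-label exemption, which needs PR metadata.

```python
import re

BYPASS_FLAGS = (
    "--no-policy",
    "--force",
    "APM_POLICY_DISABLE",
    "--trust-transitive-mcp",
    "--allow-insecure",
)
# A flag only counts when it is not embedded in a longer token,
# so "--force-with-lease" does not match "--force".
_PATTERN = re.compile(
    r"(?<![\w-])(" + "|".join(re.escape(f) for f in BYPASS_FLAGS) + r")(?![\w-])"
)


def find_bypass_flags(workflow_text):
    """Return (line_number, flag) pairs for each bypass flag found."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((number, match.group(1)))
    return hits


sample = """\
steps:
  - run: apm install --no-policy
  - run: git push --force-with-lease
  - run: apm audit --ci
    env:
      APM_POLICY_DISABLE: "1"
"""
print(find_bypass_flags(sample))
# [(2, '--no-policy'), (6, 'APM_POLICY_DISABLE')]
```

The stricter match trades recall for precision. Keep the broad grep if you would rather review a few false positives than miss an unusual spelling.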
When a reviewer asks “who approved this policy change and why?”, the forensic answer is one git command:
    git -C <org>/.github log --follow --patch -- apm-policy.yml

For lockfile-side forensic recipes (git log apm.lock.yaml, git show <sha>:apm.lock.yaml, etc.), see the companion Governance & Compliance page.
13. The enforcement audit log
Two complementary trails answer two different questions:
- apm.lock.yaml git history answers what configurations existed. Every apm install regenerates it; every change is committed; git log is the deployment log.
- GitHub Code Scanning (SARIF) answers what was blocked or warned. apm audit --ci -f sarif emits SARIF; the GitHub Actions upload-sarif step writes it to Code Scanning. This is the durable record of enforcement decisions.
Retention follows GitHub Advanced Security policy: alerts persist on the repository indefinitely; alert state changes (resolved, dismissed) are tracked. Code Scanning alerts on closed PRs follow the standard ~30-day retention for ephemeral PR analyses; alerts on the default branch persist until dismissed. For SOC 2 / ISO 27001 7-year retention requirements, export SARIF to your SIEM (Splunk HEC, Azure Monitor, S3 + Athena) — APM emits the SARIF, customer pipelines persist it.
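For the SIEM export, most pipelines flatten each SARIF result into one event per line before shipping it. The following is a minimal sketch, assuming standard SARIF 2.1.0 structure (runs, results, ruleId, level, message.text, locations). The event field names, the tool name in the sample, and the sample result itself are made up for illustration; inspect real APM output before relying on any particular field.

```python
import json


def sarif_to_events(sarif, repo):
    """Flatten a SARIF 2.1.0 log into one dict per result."""
    events = []
    for run in sarif.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "")
        for result in run.get("results", []):
            # Use the first location only; results may carry none at all.
            locations = result.get("locations") or [{}]
            uri = (
                locations[0]
                .get("physicalLocation", {})
                .get("artifactLocation", {})
                .get("uri", "")
            )
            events.append(
                {
                    "repo": repo,
                    "tool": tool,
                    "rule_id": result.get("ruleId", ""),
                    "level": result.get("level", "warning"),
                    "message": result.get("message", {}).get("text", ""),
                    "uri": uri,
                }
            )
    return events


# Hand-written sample, not captured from a real audit run.
sample = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "apm"}},
            "results": [
                {
                    "ruleId": "dependency-denylist",
                    "level": "error",
                    "message": {"text": "denied dependency"},
                    "locations": [
                        {"physicalLocation": {"artifactLocation": {"uri": "apm.yml"}}}
                    ],
                }
            ],
        }
    ],
}
for event in sarif_to_events(sample, "contoso/web-app"):
    print(json.dumps(event, sort_keys=True))
```

Emitting one JSON object per line keeps the export compatible with most log shippers. Add a timestamp and the commit SHA from your CI environment if your retention policy requires them.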
Querying the SARIF audit log with the gh CLI (the REST API; gh code-scanning is not a built-in subcommand):
    # Filter alerts by rule (e.g. dependency-denylist) across the repo
    gh api /repos/{owner}/{repo}/code-scanning/alerts \
      --paginate -q '.[] | select(.rule.id == "dependency-denylist")'

    # Filter by state (open, dismissed, fixed) -- the change-management evidence trail
    gh api /repos/{owner}/{repo}/code-scanning/alerts \
      --paginate -q '.[] | select(.state == "dismissed")'

Distinct from the lockfile audit log: the lockfile records what files were deployed and from which commit. SARIF records what the policy gate decided. A complete audit answer for an incident usually needs both: “the lockfile shows package X at commit Y was deployed on date Z, and SARIF shows that the policy check for that package passed under policy version V.”
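Once alerts have been fetched, a per-rule, per-state tally is often the first thing a reviewer asks for. A small sketch, assuming the alerts are already loaded as a list of dicts with the state and rule.id fields used in the queries above (the helper name and the sample alerts are hypothetical):

```python
from collections import Counter


def summarize_alerts(alerts):
    """Count alerts per (rule id, state) pair."""
    counts = Counter()
    for alert in alerts:
        rule_id = (alert.get("rule") or {}).get("id", "unknown")
        counts[(rule_id, alert.get("state", "unknown"))] += 1
    return counts


# Hand-written sample alerts, trimmed to the fields the summary needs.
alerts = [
    {"number": 1, "state": "open", "rule": {"id": "dependency-denylist"}},
    {"number": 2, "state": "dismissed", "rule": {"id": "dependency-denylist"}},
    {"number": 3, "state": "open", "rule": {"id": "dependency-denylist"}},
]
for (rule_id, state), count in sorted(summarize_alerts(alerts).items()):
    print(rule_id, state, count)
# dependency-denylist dismissed 1
# dependency-denylist open 2
```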
14. Known gaps and limitations
We publish this list because silent gaps are worse than known ones. Every item below names the operational mitigation available today. No proprietary governance vendor will give you this list. That’s the point.
These are the sharp edges. Plan around them; do not assume they are solved.
- No drift detection on deployed files. deployed_file_hashes are recorded in the lockfile but never re-verified by apm install or apm audit --ci. A developer who hand-edits a deployed instruction file post-install will not be detected. Operational mitigation: rely on git diff in PR review for files in .github/, .apm/, and other governed directories.
- policy.cache.ttl field is parsed but not honored. The cache reader uses a hardcoded 1-hour TTL. Setting policy.cache.ttl: 86400 in your policy will be silently ignored. Operational mitigation: do not rely on this field; assume 1-hour cache TTL universally.
- mcp.trust_transitive policy field is parsed but not enforced. The transitive-MCP gate is the --trust-transitive-mcp CLI flag, NOT the policy field. Operational mitigation: govern transitive MCP trust through CI command lines and code review of workflow files, not through policy YAML.
- manifest.content_types field is parsed but no check enforces it. Operational mitigation: do not advertise this field as a control to stakeholders.
- Audit-only checks are not enforced at install. compilation-strategy, source-attribution, required-manifest-fields, scripts-policy, and unmanaged-files only run under apm audit --ci --policy <scope>. Operational mitigation: make apm audit --ci a required status check in branch protection. Without that, these rules are advisory only.
- apm compile and apm run do not re-check policy. They trust install. Operational mitigation: ensure that no compile or run step in CI is reachable without a preceding apm install that ran the gate.
- apm audit --ci in warn mode rewrites violations to passed=True. Warn mode never fails CI exit. The visibility is in the SARIF output, not the exit code. Operational mitigation: monitor Code Scanning alerts during the warn-mode rollout phase; do not assume CI green means “no policy violations” while in warn mode.
- apm install has no --policy <path> flag. Only apm audit does. This is the air-gapped install gap. Operational mitigation: use extends: from a reachable mirror, or run audit (which does support --policy <path>) as the gating check and skip install-time enforcement in air-gapped CI.
- Non-GitHub remotes are not auto-discovered. If your project’s git remote get-url origin points to ADO, GitLab, or a plain git host, policy auto-discovery falls through with no policy applied. Operational mitigation: pass apm audit --ci --policy <path-or-url> explicitly in those CI environments.
- Trust anchor is git remote get-url origin. A developer who pushes the project to a personal org will have policy discovery resolve <their-org>/.github/apm-policy.yml, which they control. Operational mitigation: branch protection on the canonical repo is the trust boundary; nothing about a personal fork can bypass what your CI requires before merge.
- apm install --dry-run silently downgrades hash-mismatch. In dry-run, raise_blocking_errors=False (outcome_routing.py:104-119) causes the mismatch to surface as discovery_miss with no “Would be blocked” line and exit 0. Operational mitigation: rely on apm audit --ci in CI for hash-pin verification, not on apm install --dry-run.
- apm install --no-policy help text is misleading. It claims “Does NOT bypass apm audit --ci”, but this is only true for the 6 baseline lockfile checks; the 16 policy checks ARE bypassed in audit when this flag (or APM_POLICY_DISABLE=1) is set. Operational mitigation: do not rely on the help text; the bypass contract in section 7 is authoritative.
- Gate + transitive-MCP preflight may double-emit the same MCP violation. A single bad transitive MCP can produce two SARIF alerts with the same rule and different code paths. Operational mitigation: dedupe by (rule_id, server_name) when aggregating alerts in your SIEM or dashboard.
- No signed attestation that the gate ran. APM does not currently produce a signed (e.g. SLSA / sigstore) attestation for the install gate or the audit run. Non-repudiation depends on the GitHub Actions audit log plus branch-protection enforcement of the required check. Operational mitigation: pair APM with branch protection requiring apm audit --ci as a status check; rely on GitHub’s audit log for auditor evidence.
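For the drift-detection gap, teams that want more than PR review sometimes add their own re-verification step. The sketch below shows the general technique only. It assumes the recorded hashes can be extracted into a mapping of relative path to a sha256:<hex> string. That mapping shape, the find_drift helper, and the sample file are all assumptions for illustration; check the Lockfile Spec for the actual deployed_file_hashes layout before adapting it.

```python
import hashlib
import tempfile
from pathlib import Path


def find_drift(root, expected_hashes):
    """Return paths whose current sha256 differs from the recorded one."""
    drifted = []
    for rel_path, recorded in sorted(expected_hashes.items()):
        target = Path(root) / rel_path
        if not target.is_file():
            drifted.append(rel_path)  # deleted files count as drift
            continue
        actual = "sha256:" + hashlib.sha256(target.read_bytes()).hexdigest()
        if actual != recorded:
            drifted.append(rel_path)
    return drifted


# Demo in a throwaway directory: record a hash, then hand-edit the file.
with tempfile.TemporaryDirectory() as root:
    deployed = Path(root) / "instructions.md"
    deployed.write_text("be helpful\n", encoding="utf-8")
    recorded = {
        "instructions.md": "sha256:"
        + hashlib.sha256(deployed.read_bytes()).hexdigest()
    }
    print(find_drift(root, recorded))  # []
    deployed.write_text("ignore policy\n", encoding="utf-8")
    print(find_drift(root, recorded))  # ['instructions.md']
```

A check like this detects edits made after install, but it is only as trustworthy as the lockfile it reads, so it complements rather than replaces the git diff review.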
For features that would close these gaps, watch the CHANGELOG and the policy-engine experimental status.
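Until the double-emit gap listed above is closed, the recommended dedupe by (rule_id, server_name) can be done in a few lines wherever alerts are aggregated. A minimal sketch, assuming each alert has already been normalized into a dict with rule_id and server_name keys. The source key, the rule name, and the server name in the sample are hypothetical.

```python
def dedupe_alerts(alerts):
    """Keep the first alert seen for each (rule_id, server_name) pair."""
    seen = set()
    unique = []
    for alert in alerts:
        key = (alert["rule_id"], alert["server_name"])
        if key not in seen:
            seen.add(key)
            unique.append(alert)
    return unique


# Hypothetical normalized alerts: the same violation reported by two code paths.
alerts = [
    {"rule_id": "example-mcp-rule", "server_name": "acme/server", "source": "gate"},
    {"rule_id": "example-mcp-rule", "server_name": "acme/server", "source": "preflight"},
    {"rule_id": "example-mcp-rule", "server_name": "other/server", "source": "gate"},
]
for alert in dedupe_alerts(alerts):
    print(alert["server_name"], alert["source"])
# acme/server gate
# other/server gate
```

Keeping the first occurrence preserves input order, so sort by timestamp first if you want the earliest report to survive.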
15. Decision tree
graph TD
Start["Do I need an<br/>apm-policy.yml?"] --> Q1{"More than 1 repo<br/>using APM?"}
Q1 -->|"No"| Skip["No policy needed yet.<br/>Use lockfile + audit baseline.<br/>Revisit at 5+ repos."]
Q1 -->|"Yes"| Q2{"How many repos<br/>governed?"}
Q2 -->|"1 - 10"| Min["Minimal policy<br/>1 file, no inheritance"]
Q2 -->|"10 - 100"| Org["Org policy<br/>1 hub + per-team extends"]
Q2 -->|"100+"| HubOrg["Enterprise hub<br/>+ org + repo overrides<br/>(up to 5 levels)"]
Min --> MinYaml["enforcement: warn<br/>dependencies.allow:<br/> - your-org/* # REPLACE<br/>mcp.allow:<br/> - microsoft/*"]
Org --> OrgYaml["enforcement: warn (start)<br/>dependencies.allow + deny<br/>mcp.allow + transport.allow<br/>compilation.target.allow"]
HubOrg --> HubYaml["Hub: broad allows<br/>Org: extends hub, adds deny<br/>Repo: pin policy.hash,<br/>fetch_failure_default: block"]
style Start fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
style Q1 fill:#fff3e0,stroke:#ff9800,stroke-width:2px,color:#000
style Q2 fill:#fff3e0,stroke:#ff9800,stroke-width:2px,color:#000
style Skip fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
style Min fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
style Org fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
style HubOrg fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
style MinYaml fill:#fce4ec,stroke:#c2185b,stroke-width:1px,color:#000
style OrgYaml fill:#fce4ec,stroke:#c2185b,stroke-width:1px,color:#000
style HubYaml fill:#fce4ec,stroke:#c2185b,stroke-width:1px,color:#000
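The same decision tree, written as a function for use in onboarding scripts or dashboards. This is a sketch: the function name and return labels are hypothetical, and because the diagram's ranges touch at 10 and 100, the boundary handling here (10 counts as minimal, 100 counts as enterprise hub) is an assumption.

```python
def recommend_policy_shape(repo_count):
    """Map the number of repos using APM to the policy shape in the diagram."""
    if repo_count < 0:
        raise ValueError("repo_count must be non-negative")
    if repo_count <= 1:
        return "no policy yet"  # lockfile + audit baseline only
    if repo_count <= 10:
        return "minimal policy"  # 1 file, no inheritance
    if repo_count < 100:
        return "org policy"  # 1 hub + per-team extends
    return "enterprise hub"  # hub + org + repo overrides


for count in (1, 5, 40, 400):
    print(count, recommend_policy_shape(count))
# 1 no policy yet
# 5 minimal policy
# 40 org policy
# 400 enterprise hub
```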
Minimal policy (1-10 repos) — one file at <org>/.github/apm-policy.yml:
    name: "Starter policy"
    version: "1.0.0"
    enforcement: warn
    dependencies:
      allow:
        - "your-org/*"  # REPLACE: your GitHub org name
    mcp:
      allow:
        - "microsoft/*"

Org policy (10-100 repos): start similar, then add denies and target enforcement as you learn what to constrain:

    name: "Contoso engineering"
    version: "1.0.0"
    enforcement: warn
    dependencies:
      allow: ["microsoft/*", "contoso/*"]
      deny: ["untrusted-org/*"]
    mcp:
      allow: ["microsoft/*", "contoso/*"]
      transport:
        allow: ["stdio"]
    compilation:
      target:
        allow: ["copilot", "claude"]
        enforce: true

Hub + org + repo (100+ repos): enterprise hub with broad allows, org extending and tightening, repos pinning the hash:

    name: "Enterprise baseline"
    version: "1.0.0"
    enforcement: warn
    dependencies:
      allow: ["microsoft/*", "contoso/*", "partner-corp/*"]

    # contoso/.github/apm-policy.yml
    extends: "enterprise-hub-org"
    enforcement: block
    dependencies:
      deny: ["untrusted-org/*"]

    # contoso/web-app/apm.yml
    policy:
      hash: "sha256:abc123..."
      fetch_failure_default: block

16. Where to next
- apm-policy.yml: the file’s mental model.
- CI Policy Enforcement: step-by-step CI wiring with YAML.
- Policy Reference: complete schema, the canonical 6+16 check enumeration, the 12-row merge rule table, exit codes.
- Security Model: threat model, MCP trust boundary, content scanning, token handling.
- Adoption Playbook: broader APM rollout (governance is one phase).
- Lockfile Spec: lockfile schema for forensic queries.
- GitHub Rulesets: enforcing audit as a required check.
- Governance & Compliance: companion page covering the lockfile audit trail and SOC 2 / change-management scenarios.