ADR-009: Remove Azure AI Foundry¶
Status: Accepted Date: March 2026 Deciders: Project contributors
Context¶
The PoC originally used Azure AI Foundry to host agent definitions and generate Foundry Agent IDs. These IDs were bridged into SPIFFE identities (spiffe://aim.microsoft.com/foundry/<agent-name>/aid/<asst_...>) and used throughout the enforcement stack.
Over time it became clear that Foundry provided zero enforcement value. All three real security layers — mTLS (transport), RBAC (application), and OAuth2 (token exchange) — work entirely through SPIFFE identities and Entra Agent Identity. Foundry was an extra dependency that:
- Added infrastructure complexity (Cognitive Services account, model deployment, SDK dependencies)
- Introduced deployment fragility (role propagation races on
agents/write, soft-deleted account conflicts, API version sensitivity) - Created confusion about what was actually enforcing security (the proxy + SPIRE, not Foundry)
- Required a
create-foundry-agents.pyscript with its own retry logic and error handling
Decision¶
Remove Azure AI Foundry entirely. Adopt the Entra Agent Identity Blueprint SPIFFE ID format as the canonical identity format.
SPIFFE ID format change:
Old: spiffe://aim.microsoft.com/foundry/<agent-name>/aid/<asst_...>
New: spiffe://aim.microsoft.com/ests/bp/<blueprint-oid>/aid/<agent-oid>
The ests namespace reflects ESTS (Entra STS) as the identity authority. The Blueprint OID is shared across all agents; the Agent OID is per-agent.
Key Changes¶
- Deleted
infra/modules/foundry.bicepandscripts/create-foundry-agents.py - Replaced
FoundryAgentIDwithNamefield in RBACCallerPolicy - Rewrote
EnrichFromEnv()to use Name-based env var lookup (SPIFFE_PREFIX_<NAME>,ENTRA_ID_<NAME>) - Removed
X-SPIFFE-Foundry-Agent-IDheader from proxy chain - Updated RBAC policy YAML to v3.0 format with
namefield - Removed Foundry agent creation from
deploy.sh(Step 2.5) - Updated portal UI and server to remove all Foundry references
- Updated all test SPIFFE IDs to new format
- Cleaned up
teardown.sh,add/remove-demo-agent.shscripts
Net result: -597 lines of code.
Rationale¶
-
Foundry adds no security enforcement. The SPIFFE proxy enforces mTLS and RBAC using SPIFFE IDs from SPIRE SVIDs. Foundry Agent IDs were simply embedded in the SPIFFE path — they didn't participate in any cryptographic verification or authorization decision.
-
Entra Agent Identity is the real identity anchor. The Entra Agent Identity Blueprint provides the authoritative identity for agents. Bridging through Foundry was an unnecessary indirection — the SPIFFE ID should map directly to Entra identities.
-
Simpler deployment. Removing Foundry eliminates an entire infrastructure module (Cognitive Services account + model deployment), a Python script with retry logic, and multiple gotchas (soft-deleted accounts, API version sensitivity, role propagation delays on
agents/writevsagents/read). -
Cleaner demo story. "Entra Agent ID → SPIFFE → mTLS + RBAC" is a direct, compelling chain. "Entra Agent ID → Foundry Agent ID → SPIFFE → mTLS + RBAC" was confusing for stakeholders.
Consequences¶
- Positive: Simpler infra, faster deploys, clearer identity chain, fewer failure modes
- Positive: RBAC policy is now name-based rather than Foundry-ID-based, making it human-readable
- Negative: Portal UI needed updates (some Foundry references in display logic remain as bugs to fix)
- Negative: Any scripts or docs referencing the old SPIFFE format need updating
Follow-Up Fixes Required After Removal¶
The removal was a large change (22 files) and surfaced several follow-up bugs:
create-entra-agent-ids.pystill had Foundry references — leftover imports and variable names (commit3d4abf0)deploy.shused undefinedSPIRE_SERVER_IP— should beSPIRE_SERVER_FQDN(commite54836c)deploy.shread wrong azd env variable names —ENTRA_BP_OIDvsENTRA_BLUEPRINT_OBJECT_ID(commitad7a47c)az vm run-command invokecalls had no timeout — could hang indefinitely on macOS (commit3f8592c)
Alternatives Considered¶
- Keep Foundry but make it optional: Rejected. Optional complexity is still complexity. If it's not enforcing anything, it shouldn't be in the critical path.
- Use Foundry Agent IDs as the canonical identity: Rejected. Entra Agent Identity is the Microsoft-endorsed identity primitive for agents. Foundry IDs are implementation details of the AI platform, not identity anchors.
Related¶
- ADR-007: Container Apps, not AKS (mentions Foundry hosting)
docs/runbooks/hard-won-learnings.md— Learnings #10-14 are Foundry-specific (now historical)