
Establish Red Teaming Requirement for New Agent Deployments

Implementation Effort: Low – Establishing the requirement is a policy and process task; the underlying tooling (AI Red Teaming Agent) is already provisioned.
User Impact: Low – Affects development and security review workflows; end users are not impacted.

Overview

Once the AI Red Teaming Agent is configured in Azure AI Foundry, the organization needs a policy that requires every new agent to pass a red teaming scan before it is approved for production deployment. Without this requirement, red teaming remains an optional activity that some teams perform and others skip, leaving security validation coverage inconsistent across the agent fleet.

The requirement should define which attack scenarios must be run against each new agent, what severity thresholds constitute a pass or fail, and who reviews the results. Not every agent needs the same depth of testing: an internal FAQ agent that answers questions from a static knowledge base has a different risk profile than a customer-facing agent that can invoke financial transaction APIs. The requirement should therefore include a risk-tiering framework that maps agent risk categories to testing intensity, so high-risk agents receive more exhaustive testing while lower-risk agents undergo a baseline scan.
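One way to make the risk-tiering framework concrete is to record it as data that both development and security review teams can read. The following is a minimal sketch, not a Foundry API: the tier names, scenario labels, severity levels, reviewer names, and the `classify_agent` rule are all hypothetical placeholders that an organization would replace with its own policy.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class RiskTier(Enum):
    """Hypothetical risk tiers; real tiers come from the organization's policy."""
    BASELINE = "baseline"
    ELEVATED = "elevated"
    HIGH = "high"


class Severity(IntEnum):
    """Hypothetical severity scale, ordered so that comparisons work."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class ScanRequirement:
    attack_scenarios: tuple[str, ...]  # scenarios that must be run
    blocking_severity: Severity        # lowest severity that fails the scan
    reviewer: str                      # who reviews the results


# Illustrative mapping of risk tier to testing intensity (assumed values).
REQUIREMENTS: dict[RiskTier, ScanRequirement] = {
    RiskTier.BASELINE: ScanRequirement(
        attack_scenarios=("prompt_injection",),
        blocking_severity=Severity.HIGH,
        reviewer="security-review",
    ),
    RiskTier.ELEVATED: ScanRequirement(
        attack_scenarios=("prompt_injection", "jailbreak"),
        blocking_severity=Severity.MEDIUM,
        reviewer="security-review",
    ),
    RiskTier.HIGH: ScanRequirement(
        attack_scenarios=("prompt_injection", "jailbreak", "tool_misuse"),
        blocking_severity=Severity.LOW,
        reviewer="security-review-lead",
    ),
}


def classify_agent(*, customer_facing: bool, invokes_external_apis: bool) -> RiskTier:
    """Assign a tier from two example risk factors (an assumed, simplified rule)."""
    if customer_facing and invokes_external_apis:
        return RiskTier.HIGH
    if customer_facing or invokes_external_apis:
        return RiskTier.ELEVATED
    return RiskTier.BASELINE
```

Under these assumed rules, the internal FAQ agent described above would be classified as `RiskTier.BASELINE`, while the customer-facing agent that invokes financial transaction APIs would be classified as `RiskTier.HIGH` and would have to run the full scenario list with the strictest threshold.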

The process should integrate into the organization's existing agent deployment workflow. Before an agent moves from staging to production, the development team must submit red teaming scan results to the security review team. If the scan reveals vulnerabilities above the defined severity threshold, the agent is blocked from production until the team remediates the findings and reruns the scan. This creates a security gate that prevents agents with known adversarial vulnerabilities from reaching production users.
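The gate decision itself can be expressed as a small check over the scan findings. The sketch below is illustrative only and assumes a hypothetical `Finding` record and severity scale rather than the actual output format of the AI Red Teaming Agent. It also assumes the threshold is the lowest severity that blocks deployment, so a finding at or above the threshold fails the gate.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale, ordered so that comparisons work."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of one red teaming scan finding."""
    scenario: str
    severity: Severity


def evaluate_gate(
    findings: list[Finding], blocking_severity: Severity
) -> tuple[bool, list[Finding]]:
    """Return (approved, blocking_findings) for a staging-to-production gate.

    The agent is approved only if no finding is at or above the blocking
    severity; otherwise the blocking findings are returned for remediation.
    """
    blocking = [f for f in findings if f.severity >= blocking_severity]
    return (not blocking, blocking)


if __name__ == "__main__":
    scan = [
        Finding("prompt_injection", Severity.LOW),
        Finding("tool_misuse", Severity.HIGH),
    ]
    approved, blocking = evaluate_gate(scan, Severity.MEDIUM)
    print("approved" if approved else f"blocked by {len(blocking)} finding(s)")
```

After the team remediates the blocking findings and reruns the scan, the same check is applied to the new results, so approval always rests on the most recent scan evidence.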

This task supports Verify Explicitly by requiring empirical evidence of agent security before granting production approval, rather than approving based on design documentation or developer assurances alone. It supports Assume Breach by proactively identifying the attack paths that threat actors would use against the agent, and remediating them before the agent is exposed to real users and real adversarial traffic. Organizations that do not establish this requirement deploy agents to production without testing their resilience to adversarial inputs, which means the first red team test the agent faces is from an actual threat actor.

Microsoft Foundry Control Plane reinforces this requirement at the fleet level by providing the Compliance pane, where administrators can define and enforce guardrail policies that include red teaming as a deployment gate. The Compliance pane surfaces noncompliant agents that have not passed required scans and enables bulk remediation, so the requirement is enforced consistently across projects rather than relying on individual teams to self-enforce.

Reference