Require Content Safety SDK Integration for All Agent Inputs Across Hosting Platforms

Implementation Effort: Low – Establishing the requirement is a policy and documentation task; SDK integration per agent is lightweight with available client libraries.
User Impact: Low – Applies to agent developers; end users experience no direct change.

Overview

Every agent deployed in the organization must integrate the Azure AI Content Safety SDK to validate all inputs before passing them to the language model. This applies whether the agent is built on Azure AI Foundry, Semantic Kernel, LangChain, custom orchestration, or any third-party framework, and whether it is hosted in Azure, on-premises, or in another cloud. Azure AI Content Safety is a standalone API service, not a Foundry-specific feature; any agent that can make an HTTPS call can use it. It provides Prompt Shields for detecting direct prompt injection attempts in user input, and Prompt Shields for Documents for detecting indirect injection attacks embedded in uploaded files or retrieved content. This SDK-level integration complements network-level prompt inspection via Global Secure Access. The network layer catches threats at the perimeter for traffic routed through Global Secure Access, while the SDK layer catches threats at the application layer for agents that communicate via internal service-to-service calls, custom orchestration paths, or model endpoints outside the Global Secure Access perimeter. Together, the two layers close the gaps that either layer alone would leave.
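Because the service is reachable over plain HTTPS, the input check can be sketched without any framework-specific dependency. The following Python example is a minimal illustration, not a reference implementation. It assumes the Prompt Shields REST operation at /contentsafety/text:shieldPrompt, the API version 2024-09-01, the Ocp-Apim-Subscription-Key header, and a response containing userPromptAnalysis and documentsAnalysis entries with an attackDetected flag; verify each of these against the current Azure AI Content Safety documentation. The function names (build_shield_request, attack_detected, check_inputs) are hypothetical, and the fail-closed behavior on errors is a design assumption rather than a stated requirement.

```python
import json
import urllib.request

# Assumptions: confirm the path and API version against current service docs.
API_VERSION = "2024-09-01"
SHIELD_PATH = "/contentsafety/text:shieldPrompt"


def build_shield_request(endpoint, api_key, user_prompt, documents=()):
    """Build (but do not send) the HTTPS request for a Prompt Shields check."""
    url = f"{endpoint.rstrip('/')}{SHIELD_PATH}?api-version={API_VERSION}"
    body = json.dumps({"userPrompt": user_prompt, "documents": list(documents)})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,  # assumed key-based auth
        },
        method="POST",
    )


def attack_detected(result):
    """Return True if the user prompt or any document was flagged."""
    user_hit = result.get("userPromptAnalysis", {}).get("attackDetected", False)
    doc_hit = any(
        doc.get("attackDetected", False)
        for doc in result.get("documentsAnalysis", [])
    )
    return bool(user_hit or doc_hit)


def _send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def check_inputs(endpoint, api_key, user_prompt, documents=(), send=_send):
    """Return True only if the inputs may be forwarded to the model.

    Fails closed: if the service cannot be reached or returns an
    unreadable body, the inputs are treated as unsafe (an assumption
    of this sketch, not a documented requirement).
    """
    request = build_shield_request(endpoint, api_key, user_prompt, documents)
    try:
        result = send(request)
    except (OSError, ValueError):
        return False
    return not attack_detected(result)
```

An agent would call check_inputs with the user message and any retrieved or uploaded document text, and invoke the model only when it returns True. The send parameter exists so the network call can be replaced in tests or routed through an organization's own HTTP client.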

This task supports Assume Breach by providing an independent detection layer that operates outside the agent's own reasoning. Even if a threat actor crafts a sophisticated injection through direct input or a malicious document planted in a data source, the SDK-level check can intercept it before it reaches the model. It supports Verify Explicitly by requiring every agent to validate inputs against a dedicated detection service before acting on them, rather than trusting that the agent's own logic will reject adversarial content. Organizations that do not mandate this integration leave prompt injection detection to individual development teams, some of which will implement it and some of which will not. That inconsistency creates gaps that threat actors exploit by targeting the least-protected agents in the fleet.

Reference