Ship agents to production without losing sleep¶
Policy enforcement, identity, sandboxing, and SRE for autonomous AI agents.
One pip install, any framework.
The problem¶
Your AI agents call tools, browse the web, query databases, and delegate to other agents. Once deployed, they make decisions autonomously. You need answers to three questions:
1. Is this action allowed? An agent with access to send_email and query_database should not be able to drop_table. OAuth scopes and IAM roles control which services an agent can reach, not what it does once connected.
2. Which agent did this? In a multi-agent system, five agents might share a single API key. When something goes wrong, "an agent did it" is not an incident response.
3. Can you prove what happened? Auditors and regulators need tamper-evident records of every decision: what policy was active, what the agent requested, and why it was allowed or denied.
Govern any agent in 2 lines¶
Wrap any tool function with govern(). Policy enforcement, audit logging, and denial handling are automatic.
That's it. The wrapped tool (called safe_tool here) evaluates your YAML policy on every call, logs the decision, and raises GovernanceDenied if the action is blocked. It works with LangChain, CrewAI, OpenAI Agents, AutoGen, Google ADK, and any other framework.
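The wrapper behavior described above (check the policy, log the decision, then run or refuse) can be pictured with a small stand-in. This is a sketch, not the toolkit's implementation or its real signature: the `govern` below takes a hypothetical `is_allowed` callback in place of a YAML policy, and `query_database`, `drop_table`, and `no_drops` are illustrative names.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("governance-sketch")


class GovernanceDenied(Exception):
    """Stand-in for the exception raised when policy blocks a call."""


def govern(tool: Callable[..., Any],
           is_allowed: Callable[[str, dict], bool]) -> Callable[..., Any]:
    """Hypothetical stand-in for govern(): check policy, log, then run or refuse."""

    @functools.wraps(tool)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        allowed = is_allowed(tool.__name__, kwargs)
        # Every decision is logged, whether the call is allowed or denied.
        logger.info("tool=%s allowed=%s", tool.__name__, allowed)
        if not allowed:
            raise GovernanceDenied(f"{tool.__name__} blocked by policy")
        return tool(*args, **kwargs)

    return wrapper


def query_database(sql: str) -> str:
    return f"rows for: {sql}"


def drop_table(name: str) -> str:
    return f"dropped {name}"


def no_drops(tool_name: str, arguments: dict) -> bool:
    """Illustrative policy: refuse any tool whose name starts with 'drop'."""
    return not tool_name.startswith("drop")


safe_tool = govern(query_database, no_drops)
safe_drop = govern(drop_table, no_drops)
```

Calling `safe_tool(sql="SELECT 1")` runs the underlying function, while `safe_drop(name="users")` raises `GovernanceDenied` before `drop_table` ever executes. The agent framework only sees an ordinary callable, which is why this pattern is framework-agnostic.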
# policy.yaml
apiVersion: governance.toolkit/v1
name: production-policy
default_action: allow
rules:
  - name: block-destructive
    condition: "action.type in ['drop', 'delete', 'truncate']"
    action: deny
    description: "Destructive operations require human approval"
  - name: require-approval-for-send
    condition: "action.type == 'send_email'"
    action: require_approval
    approvers: ["security-team"]
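To make the rule semantics concrete, here is a rough sketch of how a policy of this shape could be evaluated. It rests on two assumptions that this page does not confirm: rules are checked in order with the first match winning, and `default_action` applies when nothing matches. The condition strings are hand-translated into Python predicates because the toolkit's expression evaluator is not shown here. `POLICY`, `matches`, and `evaluate` are illustrative names.

```python
from typing import Optional, Tuple

# The policy above, hand-translated. Each "condition" string becomes an
# equivalent Python predicate under the hypothetical "matches" key.
POLICY = {
    "default_action": "allow",
    "rules": [
        {
            "name": "block-destructive",
            "matches": lambda action: action["type"] in ("drop", "delete", "truncate"),
            "action": "deny",
        },
        {
            "name": "require-approval-for-send",
            "matches": lambda action: action["type"] == "send_email",
            "action": "require_approval",
        },
    ],
}


def evaluate(policy: dict, action: dict) -> Tuple[str, Optional[str]]:
    """Return (decision, matching_rule_name); first matching rule wins (assumed)."""
    for rule in policy["rules"]:
        if rule["matches"](action):
            return rule["action"], rule["name"]
    # No rule matched: fall back to the policy-wide default.
    return policy["default_action"], None
```

Under these assumptions, `evaluate(POLICY, {"type": "drop"})` yields a deny attributed to `block-destructive`, a `send_email` action is routed to approval, and anything else falls through to `allow`. Returning the rule name alongside the decision is what lets an audit record say why a call was allowed or denied.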
How it works¶
flowchart LR
A["🤖 Agent"] -->|govern| PE
subgraph GK [" Agent Governance Toolkit "]
direction LR
PE["Policy Engine<br>YAML · OPA · Cedar"]
ID["Identity<br>SPIFFE · DID · mTLS"]
AL["Audit Log<br>Tamper-evident"]
PE --> ID --> AL
end
AL -->|Allowed| T["Tool executes"]
PE -->|Denied| D["GovernanceDenied"]
Every layer is optional. Start with govern() and add layers as your risk profile grows. Most teams run policy enforcement + audit logging and never need the full stack.
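The diagram labels the audit log "tamper-evident" without saying how. One common technique for that property is a hash chain, where each entry's hash covers the previous entry's hash, so editing any earlier record invalidates everything after it. The sketch below illustrates that general technique only. Whether the toolkit uses it is an assumption, and `AuditLog`, `append`, and `verify` are hypothetical names.

```python
import hashlib
import json


def _digest(prev_hash: str, record: dict) -> str:
    """Hash a record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


class AuditLog:
    """Illustrative hash-chained log; not the toolkit's audit implementation."""

    GENESIS = "0" * 64  # fixed starting value for the chain

    def __init__(self) -> None:
        self.entries = []

    def append(self, record: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append({"record": record, "hash": _digest(prev, record)})

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks it."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["hash"] != _digest(prev, entry["record"]):
                return False
            prev = entry["hash"]
        return True


log = AuditLog()
log.append({"agent": "agent-1", "tool": "query_database", "decision": "allow"})
log.append({"agent": "agent-2", "tool": "drop_table", "decision": "deny"})
```

After these two appends `log.verify()` returns `True`. Changing a stored decision afterwards, for example flipping the first record to `"deny"`, makes `verify()` return `False`. A production design would also need to protect the latest hash itself, for instance by signing it or storing it externally.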
Packages¶
Language SDKs¶
| SDK | Install |
|---|---|
| 🐍 Python | pip install agent-governance-toolkit[full] |
| 📘 TypeScript | npm install @microsoft/agent-governance-sdk |
| 🔷 .NET | dotnet add package Microsoft.AgentGovernance |
| 🦀 Rust | cargo add agent-governance |
| 🐹 Go | go get github.com/microsoft/agent-governance-toolkit/agent-governance-golang |
Framework Integrations¶
Works with any agent framework: LangChain, CrewAI, AutoGen, Google ADK, OpenAI Agents, LlamaIndex, Haystack, Mastra, MCP, A2A, and more. See the full list.
Examples¶
| Example | Framework | What it demonstrates |
|---|---|---|
| openai-agents-governed | OpenAI Agents SDK | Policy-gated tool calls with trust tiers |
| crewai-governed | CrewAI | Multi-agent governance with role-based policies |
| smolagents-governed | HuggingFace smolagents | Lightweight agent governance |
| maf-integration | MAF | Microsoft Agent Framework integration |
| mcp-trust-verified-server | MCP | Trust-verified MCP server implementation |
Specifications¶
Every major component has a formal specification written with RFC 2119 requirement keywords (MUST, SHOULD, MAY) and backed by conformance tests.
25 Architecture Decision Records document the reasoning behind key design choices.
Standards Compliance¶
| Standard | Coverage |
|---|---|
| OWASP Agentic AI Top 10 | All ASI risk categories mapped with deterministic controls |
| NIST AI RMF 1.0 | Full GOVERN, MAP, MEASURE, MANAGE alignment |
| EU AI Act | Compliance mapping with automated evidence |
| SOC 2 | Control mapping with audit trail export |