
Testing Guide — How to Use the Agent Governance Toolkit

This guide walks you through using the Agent Governance Toolkit as an external customer would, from zero to a fully governed AI agent. No prior knowledge required.

What Is This Toolkit?

The Agent Governance Toolkit is not a portal or a UI. It's a Python library that sits between your AI agent and the tools it uses (APIs, databases, files). Think of it as a security middleware — every action your agent tries to take gets checked against your policies before it's allowed to execute.

Your Agent → [Governance Toolkit] → Tool/API/Database
              Policy check: ALLOW or DENY
              Audit log: who did what, when
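The flow above can be sketched in a few lines of plain Python. This is only an illustration of the interception pattern, not the toolkit's API: `ToyGovernor`, `Decision`, and the two sample tools are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class Decision:
    allowed: bool
    reason: str


@dataclass
class ToyGovernor:
    """Hypothetical stand-in for a governance layer; not the toolkit's API."""
    allowed_actions: set[str]
    audit_log: list[dict] = field(default_factory=list)

    def call(self, agent_id: str, action: str, tool: Callable[..., Any], **params: Any):
        # 1. The policy check happens before the tool is touched.
        if action in self.allowed_actions:
            decision = Decision(True, "action is on the allowlist")
        else:
            decision = Decision(False, "action is not on the allowlist")
        # 2. Every decision is recorded: who did what, when.
        self.audit_log.append({
            "agent": agent_id,
            "action": action,
            "allowed": decision.allowed,
            "time": datetime.now(timezone.utc).isoformat(),
        })
        # 3. Only allowed actions reach the real tool.
        result = tool(**params) if decision.allowed else None
        return decision, result


def search_web(query: str) -> str:
    return f"results for {query!r}"


def delete_file(path: str) -> str:
    return f"deleted {path}"


gov = ToyGovernor(allowed_actions={"search_web"})
ok, result = gov.call("demo-agent", "search_web", search_web, query="AI news")
denied, nothing = gov.call("demo-agent", "delete_file", delete_file, path="/tmp/x")
print(ok.allowed, result)
print(denied.allowed, nothing)
print(len(gov.audit_log))
```

Note that the denied call never invokes `delete_file`, yet it still produces an audit entry. That is the property to look for when you test the real toolkit.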

Prerequisites

  • Python 3.10+ installed
  • An API key for any LLM provider (OpenAI, Azure OpenAI, or Google Gemini)
  • 10 minutes

Path 1: Run the Live Demo (Fastest — 5 minutes)

This is the best way to see what the toolkit does. The demo runs 4 governance scenarios with a real LLM and shows you policy enforcement in action.

Step 1: Clone and install

git clone https://github.com/microsoft/agent-governance-toolkit.git
cd agent-governance-toolkit
pip install -e "agent-os[dev]"
pip install -e "agent-mesh"
pip install -e "agent-compliance"
pip install httpx pyyaml

Step 2: Set your LLM API key

Pick ONE of these (the demo auto-detects which one you have):

# Option A: OpenAI
export OPENAI_API_KEY="sk-..."

# Option B: Azure OpenAI
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"

# Option C: Google Gemini
export GOOGLE_API_KEY="..."

On Windows PowerShell, use $env:OPENAI_API_KEY = "sk-..." instead of export.
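If the demo reports that no backend was detected, it helps to check what your shell actually exports. The sketch below shows one way such auto-detection could work. The variable names come from this guide, but `detect_backend`, the backend labels, and the priority order are assumptions made for illustration, not the demo's actual logic.

```python
import os
from typing import Mapping, Optional

# Variable names come from the guide; the priority order is an assumption.
BACKEND_ENV_VARS = {
    "openai": ("OPENAI_API_KEY",),
    "azure_openai": ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"),
    "gemini": ("GOOGLE_API_KEY",),
}


def detect_backend(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first backend whose required variables are all non-empty."""
    for backend, required in BACKEND_ENV_VARS.items():
        if all(env.get(name, "").strip() for name in required):
            return backend
    return None


# Passing a dict instead of os.environ makes the behaviour easy to inspect.
print(detect_backend({"GOOGLE_API_KEY": "abc"}))
print(detect_backend({"AZURE_OPENAI_API_KEY": "abc"}))  # endpoint missing
print(detect_backend())  # whatever your current shell exports
```

In this sketch, Azure OpenAI counts as configured only when both the key and the endpoint are set, which mirrors the two `export` lines in Option B.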

Step 3: Run the demo

python examples/demos/maf_governance_demo.py

You'll see 4 scenarios:

| Scenario | What Happens | What Governance Does |
|---|---|---|
| 1. Policy Enforcement | Agent tries to use tools | Policies allow/deny based on YAML rules |
| 2. Capability Sandboxing | Agent tries a blocked tool | Governance blocks it before execution |
| 3. Rogue Agent Detection | Agent behaves suspiciously | Behavioral scoring flags the anomaly |
| 4. Content Filtering | Agent gets a dangerous prompt | Governance blocks it pre-LLM |

Step 4: Run adversarial attacks

python examples/demos/maf_governance_demo.py --include-attacks

This adds a 5th scenario that tries to break governance:

  • Prompt injection ("ignore all previous instructions...")
  • Tool alias bypass (calling a blocked tool by another name)
  • Trust score manipulation
  • SQL injection

You'll see each attack attempt and whether governance blocked it.
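To make the tool alias bypass concrete, here is a minimal sketch of one possible defence: normalize every tool name to a canonical form before checking it against the blocklist. The alias table and function names are hypothetical and are not taken from the toolkit.

```python
BLOCKED_TOOLS = {"execute_shell"}

# Hypothetical alias table: every known alternate name maps to one canonical name.
ALIASES = {
    "shell": "execute_shell",
    "run_shell": "execute_shell",
    "bash": "execute_shell",
}


def canonical_name(tool_name: str) -> str:
    """Normalize case, whitespace and separators, then resolve known aliases."""
    normalized = tool_name.strip().lower().replace("-", "_")
    return ALIASES.get(normalized, normalized)


def is_blocked(tool_name: str) -> bool:
    return canonical_name(tool_name) in BLOCKED_TOOLS


for name in ["execute_shell", "Execute-Shell", " bash ", "search_web"]:
    print(repr(name), is_blocked(name))
```

An alias table only catches names someone thought of in advance. An allowlist that denies everything not explicitly permitted does not have that gap, so it is worth checking how the demo behaves with a tool name it has never seen.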

What to look for

  • ⚠ Storage: IN-MEMORY ONLY warning at startup (expected — demo doesn't persist)
  • ⚠ Policy: SAMPLE CONFIG warning (expected — demo uses sample policies)
  • BLOCKED messages when governance denies an action
  • Audit entries count at the end showing all governance decisions were logged

Path 2: Govern Your Own Agent (15 minutes)

This shows how a real customer would add governance to their existing agent code.

Step 1: Install

pip install "agent-governance-toolkit[full]"

Step 2: Create a policy file

Save this as my-policy.yaml:

version: "1.0"
name: my-first-policy
description: Basic governance for testing

kernel:
  mode: strict

policies:
  - name: tool-restrictions
    description: Control which tools agents can use
    deny:
      - patterns:
          - "execute_shell"
          - "run_command"
          - "delete_.*"
    allow:
      - action: "search_web"
      - action: "read_file"
      - action: "write_report"
    limits:
      - action: tool_call
        max_per_session: 20

audit:
  enabled: true
  level: all
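To reason about what this policy should allow before you run it, the sketch below evaluates the same rules in plain Python. It rests on three assumptions this guide does not confirm: deny patterns are regular expressions matched against the whole action name, deny rules take precedence over allow rules, and `mode: strict` means anything not explicitly allowed is denied. `ToyEvaluator` is a hypothetical name.

```python
import re
from collections import Counter

# The policy from my-policy.yaml, written as a dict so the sketch needs no YAML parser.
POLICY = {
    "deny_patterns": ["execute_shell", "run_command", "delete_.*"],
    "allow_actions": ["search_web", "read_file", "write_report"],
    "max_tool_calls_per_session": 20,
}


class ToyEvaluator:
    """Illustrative only; the toolkit's real evaluation order may differ."""

    def __init__(self, policy: dict):
        self.deny = [re.compile(p) for p in policy["deny_patterns"]]
        self.allow = set(policy["allow_actions"])
        self.limit = policy["max_tool_calls_per_session"]
        self.calls = Counter()

    def check(self, session_id: str, action: str) -> tuple[bool, str]:
        # Assumption: deny rules win over allow rules.
        if any(p.fullmatch(action) for p in self.deny):
            return False, "matched a deny pattern"
        # Assumption: strict mode denies anything not explicitly allowed.
        if action not in self.allow:
            return False, "not on the allow list"
        if self.calls[session_id] >= self.limit:
            return False, "session call limit reached"
        self.calls[session_id] += 1
        return True, "allowed"


ev = ToyEvaluator(POLICY)
print(ev.check("s1", "search_web"))
print(ev.check("s1", "delete_user"))
print(ev.check("s1", "send_email"))
```

Comparing these predictions with the toolkit's real decisions is a quick way to find out whether your reading of the policy matches its actual semantics.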

Step 3: Use the toolkit in your code

import asyncio

from agent_os import StatelessKernel, ExecutionContext, Policy

# Create the governance kernel
kernel = StatelessKernel()

# Define what this agent is allowed to do
ctx = ExecutionContext(
    agent_id="my-test-agent",
    policies=[
        Policy.read_only(),                     # No write operations
        Policy.rate_limit(100, "1m"),           # Max 100 calls/minute
        Policy.require_approval(
            actions=["send_email", "deploy_*"],  # These need human OK
            min_approvals=1,
        ),
    ],
)


async def main():
    # This is where your agent's tool call gets intercepted
    result = await kernel.execute(
        action="search_web",
        params={"query": "latest AI news"},
        context=ctx,
    )
    print(f"Allowed: {result.allowed}")
    print(f"Reason: {result.reason}")

    # Try a blocked action
    result2 = await kernel.execute(
        action="execute_shell",
        params={"command": "rm -rf /"},
        context=ctx,
    )
    print(f"Allowed: {result2.allowed}")  # False
    print(f"Reason: {result2.reason}")    # "Blocked by policy"


# kernel.execute() is awaited, so it has to run inside an event loop
asyncio.run(main())
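The `Policy.rate_limit(100, "1m")` rule above caps the agent at 100 calls per minute. This guide does not say how the toolkit counts calls, so the sketch below assumes a sliding window, which is one common approach. `SlidingWindowLimiter` is a hypothetical class, and the limit is reduced to 3 calls to keep the output short.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds interval."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self.timestamps = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop calls that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_calls:
            return False
        self.timestamps.append(now)
        return True


# A fake clock makes the behaviour easy to follow without sleeping.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=3, window_seconds=60, clock=lambda: fake_now[0])

first_four = [limiter.allow() for _ in range(4)]
fake_now[0] = 61.0  # move past the 60-second window
after_window = limiter.allow()
print(first_four, after_window)
```

When you test the real rate limit, send slightly more calls than the limit in quick succession and confirm that the extra calls are denied and show up in the audit log.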

Step 4: Add to a LangChain/CrewAI/ADK agent

See the quickstart examples:

  • examples/quickstart/langchain_governed.py
  • examples/quickstart/crewai_governed.py
  • examples/quickstart/google_adk_governed.py
  • examples/quickstart/openai_agents_governed.py
  • examples/quickstart/autogen_governed.py

Each is a single file that creates a governed agent in the respective framework.


Path 3: Test the SQL Policy (Security Focus)

This path exercises the SQL security policies, which were recently hardened.

Step 1: Install

pip install -e "agent-os[dev]"

Step 2: Run the SQL policy tests

cd agent-os
python -m pytest modules/control-plane/tests/test_sql_policy.py -v

You should see 40 tests passing, covering:

  • DROP, TRUNCATE, ALTER blocked
  • GRANT, REVOKE blocked (privilege escalation)
  • CREATE USER blocked (backdoor creation)
  • EXEC xp_cmdshell blocked (OS command execution)
  • UPDATE without WHERE blocked (mass data modification)
  • SELECT, INSERT, UPDATE with WHERE allowed
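The rules covered by those tests can be approximated with a simple keyword check. This is a deliberately rough sketch, not the toolkit's implementation: `is_sql_allowed` is a hypothetical function. It splits on semicolons, which can wrongly reject a string literal containing `;`, and it does not parse SQL, so a `WHERE` inside a string literal would fool it.

```python
import re

# Statement types the guide lists as blocked outright.
BLOCKED_PATTERNS = [
    r"^\s*DROP\b",
    r"^\s*TRUNCATE\b",
    r"^\s*ALTER\b",
    r"^\s*GRANT\b",
    r"^\s*REVOKE\b",
    r"^\s*CREATE\s+USER\b",
    r"^\s*EXEC(UTE)?\s+xp_cmdshell\b",
]
BLOCKED = [re.compile(p, re.IGNORECASE) for p in BLOCKED_PATTERNS]
UPDATE = re.compile(r"^\s*UPDATE\b", re.IGNORECASE)
WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def _statement_allowed(statement: str) -> bool:
    if any(p.search(statement) for p in BLOCKED):
        return False
    if UPDATE.search(statement) and not WHERE.search(statement):
        return False  # mass update without a WHERE clause
    return True


def is_sql_allowed(sql: str) -> bool:
    """Rough keyword check; a real policy should parse the SQL instead."""
    statements = [s for s in sql.split(";") if s.strip()]
    return all(_statement_allowed(s) for s in statements)


for query in [
    "SELECT * FROM users",
    "UPDATE users SET active = 0",
    "UPDATE users SET active = 0 WHERE id = 7",
    "SELECT 1; DROP TABLE users",
]:
    print(is_sql_allowed(query), "<-", query)
```

The last query shows why stacked statements matter: a check that looks only at the first keyword would let the `DROP` through.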

Step 3: Test with custom YAML config

from agent_control_plane.policy_engine import create_policies_from_config

# Load the strict SQL policy (SELECT only)
rules = create_policies_from_config("examples/policies/sql-strict.yaml")

# Or the balanced default
rules = create_policies_from_config("examples/policies/sql-safety.yaml")

# Or the read-only policy
rules = create_policies_from_config("examples/policies/sql-readonly.yaml")

Path 4: Run the Full Test Suite

# Run all 6,100+ tests
cd agent-os && python -m pytest tests/ -q
cd ../agent-mesh && python -m pytest tests/ -q
cd ../agent-hypervisor && python -m pytest tests/ -q
cd ../agent-sre && python -m pytest tests/ -q
cd ../agent-compliance && python -m pytest tests/ -q

What to Test (Test Matrix)

| Area | How to Test | Expected Result |
|---|---|---|
| Install | pip install "agent-governance-toolkit[full]" | Installs cleanly, no errors |
| Demo | python examples/demos/maf_governance_demo.py | 4 scenarios run, blocked actions shown |
| Adversarial | python examples/demos/maf_governance_demo.py --include-attacks | All 4 attacks blocked |
| Policy loading | Load YAML from examples/policies/ | Policies parse without errors |
| SQL safety | Run test_sql_policy.py | 40 tests pass |
| Framework integration | Run any examples/quickstart/*.py | Governed agent works end-to-end |
| Audit trail | Check evaluator.get_audit_log() after actions | All decisions logged with timestamps |
| Trust identity | from agentmesh import AgentIdentity | Ed25519 DID generated |
| CLI | agent-governance --help | CLI shows available commands |

Common Issues

| Problem | Solution |
|---|---|
| ModuleNotFoundError: No module named 'yaml' | pip install pyyaml |
| ModuleNotFoundError: No module named 'cryptography' | pip install cryptography |
| Demo shows "no backend detected" | Set one of the API key env vars (see Path 1, Step 2) |
| DeprecationWarning: create_default_policies() | Expected; use create_policies_from_config() instead |
| Tests fail with import errors | Run pip install -e ".[dev]" in the package directory |

Architecture Summary

┌─────────────────────────────────────────────┐
│            Your Agent (any framework)        │
│  LangChain / CrewAI / ADK / AutoGen / etc.  │
└──────────────────┬──────────────────────────┘
                   │ Tool call
┌─────────────────────────────────────────────┐
│         Agent Governance Toolkit             │
│                                              │
│  1. Identity Check (Ed25519 DID)            │
│  2. Policy Evaluation (YAML/OPA/Cedar)      │
│  3. Action: ALLOW / DENY / ESCALATE         │
│  4. Audit Log (tamper-proof)                │
└──────────────────┬──────────────────────────┘
                   │ If ALLOWED
┌─────────────────────────────────────────────┐
│         Tool / API / Database                │
└─────────────────────────────────────────────┘
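Step 4 in the diagram mentions a tamper-proof audit log. This guide does not describe how the toolkit achieves that, so the sketch below shows one common technique as an assumption: a hash chain, where each entry stores the hash of the previous one so that any later modification is detectable. All function and field names here are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(entry: dict) -> str:
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log: list, agent_id: str, action: str, decision: str) -> None:
    body = {
        "agent": agent_id,
        "action": action,
        "decision": decision,
        "time": datetime.now(timezone.utc).isoformat(),
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    log.append({**body, "hash": _digest(body)})


def verify_chain(log: list) -> bool:
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


log = []
append_entry(log, "my-test-agent", "search_web", "ALLOW")
append_entry(log, "my-test-agent", "execute_shell", "DENY")
print(verify_chain(log))   # chain is intact

log[0]["decision"] = "DENY"  # simulate someone editing history
print(verify_chain(log))   # the edit is detected
```

A chain like this makes tampering evident rather than impossible: anyone who can rewrite the whole log can also recompute every hash, so the latest hash would need to be stored somewhere the writer cannot modify.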

Questions?

  • GitHub Discussions: https://github.com/microsoft/agent-governance-toolkit/discussions
  • Issues: https://github.com/microsoft/agent-governance-toolkit/issues