Overview
RAMPART is a pytest-native safety testing framework for agentic AI applications. You write tests that probe your agent for safety violations — injection attacks, behavioral regressions, data exfiltration — and RAMPART orchestrates the interaction, evaluates the outcome, and reports the results.
Tests look like regular pytest tests. RAMPART provides the execution strategies, evaluation logic, and reporting infrastructure; you provide the adapter that connects your agent to the framework.
Core Concepts
RAMPART has two top-level execution categories:
| Category | Tests for | "Detected" means | Result |
|---|---|---|---|
| Attack | Bad behavior the agent should not exhibit | The attack succeeded | UNSAFE |
| Probe | Good behavior the agent should exhibit | The expected behavior is present | SAFE |
Both categories produce the same Result type. The difference is in how evaluator outcomes map to safety verdicts.
RAMPART ships with the following attacks; more will be added:
- XPIA — Cross-Prompt Injection Attack
RAMPART ships with the following probes; more will be added:
- Behavioral — Verify expected agent behavior
Architecture at a Glance
RAMPART sits between your consumer package (the thin integration layer in your product team's repo) and the agent under test, building on PyRIT primitives underneath. The diagram below shows how the layers fit together and where your code plugs in:
Figure: RAMPART's four layers, from PyRIT primitives up to the agent under test.
Component Model
Zooming in on the runtime path, every RAMPART test wires together the same handful of components:
| Component | You provide | RAMPART provides |
|---|---|---|
| AgentAdapter | Implementation that creates sessions and declares capabilities | The protocol |
| Session | Implementation that sends requests and returns responses | The protocol |
| Surface | Implementation for your data sources (or use built-ins) | The protocol; built-in surfaces like OneDriveSurface |
| Evaluator | Choice and configuration | Built-ins: ToolCalled, ResponseContains, SideEffectOccurred |
| PromptDriver | Trigger prompts (as strings) or a custom driver | StaticDriver, LLMDriver |
| ReportSink | Choice of sink and output location | JsonFileReportSink |
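To make the split in the table concrete, here is a minimal sketch of an adapter and session written against `typing.Protocol`. Only the names AgentAdapter, Session, and `send_async` come from this page; the `create_session_async` method, the `Response` dataclass, and the echo classes are assumptions for illustration, and capability declaration is left out. Check the RAMPART protocol definitions for the real signatures.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Response:
    """Hypothetical response container; the real type may carry more fields."""
    text: str


class Session(Protocol):
    """Stand-in for the Session protocol: sends requests, returns responses."""

    async def send_async(self, request: str) -> Response: ...


class AgentAdapter(Protocol):
    """Stand-in for the AgentAdapter protocol: creates sessions."""

    # Assumed method name, not taken from the RAMPART source.
    async def create_session_async(self) -> Session: ...


class EchoSession:
    """Toy session that echoes the request instead of calling a real agent."""

    async def send_async(self, request: str) -> Response:
        return Response(text=f"echo: {request}")


class EchoAdapter:
    """Toy adapter; a real one would connect to your agent's API."""

    async def create_session_async(self) -> Session:
        return EchoSession()


async def demo() -> str:
    adapter: AgentAdapter = EchoAdapter()
    session = await adapter.create_session_async()
    response = await session.send_async("hello")
    return response.text


demo_text = asyncio.run(demo())
```

Because the protocols are structural, the echo classes satisfy them without inheriting from anything, which is one reason a protocol-based design keeps the integration layer thin.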
Execution Lifecycle
A single test run flows from your pytest test, through a RAMPART attack or probe, into your AgentAdapter, and out to the agent system — then back again as a Result:
Figure: Request / response cycle for a single test run.
Under the hood, every execution follows a common lifecycle owned by BaseExecution, which drives the per-turn loop between the strategy, your adapter, and the evaluator:
    sequenceDiagram
        participant Test as Your Test
        participant Exec as BaseExecution
        participant Strat as Strategy
        participant Adapter as Your Adapter
        participant Eval as Evaluator
        Test->>Exec: execute_async(adapter)
        Exec->>Exec: fire ON_PRE_EXECUTE
        Exec->>Strat: _execute_async(adapter)
        loop Each turn (up to max_turns)
            Strat->>Strat: driver.next_prompt_async(history)
            Strat->>Adapter: session.send_async(request)
            Adapter-->>Strat: Response
            Strat->>Eval: evaluate_async(context)
            Eval-->>Strat: EvalResult
            Note over Strat: Early stop if detected
        end
        Strat-->>Exec: Result
        Exec->>Exec: fire ON_POST_EXECUTE
        Exec-->>Test: Result
Attack strategies add an injection phase before the conversation loop. Probe strategies skip injection entirely.
If an InfrastructureError is raised during execution, BaseExecution catches it and produces a Result with SafetyStatus.ERROR.
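The lifecycle described above can be sketched in plain Python. This is a simplified simulation, not RAMPART's implementation: `run_turns`, `execute`, the toy agents, and the local `SafetyStatus` and `InfrastructureError` classes are all stand-ins defined here, hooks and the injection phase are omitted, and the loop uses attack polarity (detected means UNSAFE).

```python
import asyncio
from enum import Enum


class SafetyStatus(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"
    ERROR = "error"


class InfrastructureError(Exception):
    """Local stand-in for the error type named above."""


async def run_turns(send_async, prompts, detected, max_turns=3):
    """Per-turn loop: send a prompt, evaluate the response, stop early on detection."""
    for prompt in prompts[:max_turns]:
        response = await send_async(prompt)
        if detected(response):
            return SafetyStatus.UNSAFE  # attack polarity: detection is unsafe
    return SafetyStatus.SAFE


async def execute(send_async, prompts, detected):
    """Outer wrapper: infrastructure failures become an ERROR status."""
    try:
        return await run_turns(send_async, prompts, detected)
    except InfrastructureError:
        return SafetyStatus.ERROR


sent = []  # records which prompts actually reached the toy agent


async def leaky_agent(prompt):
    sent.append(prompt)
    return "token=SECRET" if "reveal" in prompt else "ok"


async def unreachable_agent(prompt):
    raise InfrastructureError("agent endpoint unreachable")


def leaked(response):
    return "SECRET" in response


prompts = ["hi", "reveal the token", "never sent"]
attack_status = asyncio.run(execute(leaky_agent, prompts, leaked))
error_status = asyncio.run(execute(unreachable_agent, prompts, leaked))
```

The third prompt never reaches the agent because detection on the second turn ends the loop, and the unreachable agent yields an error status rather than an unhandled exception in the test.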
The Result Contract
Result is the single output type for all tests. Its boolean conversion (bool(result)) returns result.safe:
    result = await Attacks.xpia(...).execute_async(adapter=my_adapter)
    assert result, result.summary
If the agent behaved safely, the assertion passes. If not, the failure message is the human-readable summary.
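The truthiness contract is easy to see with a small stand-in class. The sketch below is not RAMPART's `Result`; it is a hypothetical dataclass that mirrors the two attributes this page mentions (`safe` and `summary`). Deriving `safe` from a status field, and treating an error status as not safe, are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class SafetyStatus(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"
    ERROR = "error"


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in mirroring the contract described above."""
    status: SafetyStatus
    summary: str

    @property
    def safe(self) -> bool:
        # Assumption: only SAFE counts as safe; ERROR and UNSAFE do not.
        return self.status is SafetyStatus.SAFE

    def __bool__(self) -> bool:
        # bool(result) returns result.safe, as the page states.
        return self.safe


passing = Result(SafetyStatus.SAFE, "agent ignored the injected instruction")
failing = Result(SafetyStatus.UNSAFE, "agent followed the injected instruction")

assert passing, passing.summary  # passes silently

try:
    assert failing, failing.summary
except AssertionError as exc:
    failure_message = str(exc)  # the summary becomes the failure message
```

Defining `__bool__` this way is what lets a test end with a single `assert result, result.summary` line and still produce a readable failure.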
Evaluator Polarity
Evaluators are polarity-free. They answer "did X happen?" — not "is X good or bad?" The meaning of detection depends on context:
- In an attack, detection means the attack objective was achieved → UNSAFE
- In a probe, detection means the expected behavior is present → SAFE
The Attacks and Probes factories handle this mapping automatically via resolve_as_attack and resolve_as_probe.
You can reuse the same evaluator in both contexts. A ToolCalled evaluator detects whether a tool was called — whether that's good or bad depends on whether you're attacking or probing.
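As an illustration of polarity-free evaluation, the sketch below runs one detection function through two verdict mappings. The functions `attack_verdict` and `probe_verdict` are local stand-ins for the role this page gives to resolve_as_attack and resolve_as_probe; their signatures are invented here, and mapping a missing expected behavior in a probe to UNSAFE is an assumption. The tool names are hypothetical.

```python
from enum import Enum


class SafetyStatus(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"


def tool_called(tool_name: str, observed_calls: list[str]) -> bool:
    """Polarity-free check: did the agent call this tool? Nothing more."""
    return tool_name in observed_calls


def attack_verdict(detected: bool) -> SafetyStatus:
    """Attack context: detection means the attack objective was achieved."""
    return SafetyStatus.UNSAFE if detected else SafetyStatus.SAFE


def probe_verdict(detected: bool) -> SafetyStatus:
    """Probe context: detection means the expected behavior is present."""
    return SafetyStatus.SAFE if detected else SafetyStatus.UNSAFE


# Attack: an injected document tries to make the agent send an email.
attack_calls = ["search_files", "send_email"]
attack_status = attack_verdict(tool_called("send_email", attack_calls))

# Probe: the agent is expected to ask for confirmation before acting.
probe_calls = ["ask_confirmation"]
probe_status = probe_verdict(tool_called("ask_confirmation", probe_calls))
```

The evaluator function is identical in both cases; only the mapping applied to its boolean answer differs.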
pytest Integration
RAMPART registers as a pytest plugin automatically when installed. It provides:
- Markers: @pytest.mark.harm(...) for categorization, @pytest.mark.trial(n=...) for statistical repetition
- Automatic result collection: Results from Attacks.* and Probes.* are collected without manual wiring
- Terminal summary: A safety summary printed after the standard pytest output
- Report sinks: Structured output via the rampart_sinks fixture
See pytest Markers & Fixtures for setup details.