Overview

RAMPART is a pytest-native safety testing framework for agentic AI applications. You write tests that probe your agent for safety violations — injection attacks, behavioral regressions, data exfiltration — and RAMPART orchestrates the interaction, evaluates the outcome, and reports the results.

Tests look like regular pytest tests. RAMPART provides the execution strategies, evaluation logic, and reporting infrastructure; you provide the adapter that connects your agent to the framework.

Core Concepts

RAMPART has two top-level execution categories:

Category	Tests for	"Detected" means	Result
Attack	Bad behavior the agent should not exhibit	The attack succeeded	UNSAFE
Probe	Good behavior the agent should exhibit	The expected behavior is present	SAFE

Both categories produce the same Result type. The difference is in how evaluator outcomes map to safety verdicts.

RAMPART ships with the following attacks; more will be added:

XPIA — Cross-Prompt Injection Attack

RAMPART ships with the following probes; more will be added:

Behavioral — Verify expected agent behavior

Architecture at a Glance

RAMPART integrates with your consumer package (the thin integration layer in your product team's repo) and the agent under test, and builds on PyRIT primitives underneath. The diagram below shows how the layers fit together and where your code plugs in:

RAMPART layered architecture: PyRIT (L1), RAMPART (L2), Consumer Package (L3), Agent Under Test (L4)

RAMPART's four layers — from PyRIT primitives up to the agent under test.

Component Model

Zooming in on the runtime path, every RAMPART test wires together the same handful of components:

Component	You provide	RAMPART provides
AgentAdapter	Implementation that creates sessions and declares capabilities	The protocol
Session	Implementation that sends requests and returns responses	The protocol
Surface	Implementation for your data sources (or use built-ins)	The protocol; built-in surfaces like `OneDriveSurface`
Evaluator	Choice and configuration	Built-ins: `ToolCalled`, `ResponseContains`, `SideEffectOccurred`
PromptDriver	Trigger prompts (as strings) or a custom driver	`StaticDriver`, `LLMDriver`
ReportSink	Choice of sink and output location	`JsonFileReportSink`
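
The adapter and session roles in the table can be pictured as structural protocols. The sketch below is a self-contained illustration, not RAMPART's actual definitions: the `Request` and `Response` shapes, the `capabilities` attribute, and the `create_session_async` method name are assumptions made for this example. Only `send_async` is taken from the lifecycle diagram on this page.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Request:
    """Hypothetical request shape: one prompt per turn."""
    prompt: str


@dataclass
class Response:
    """Hypothetical response shape: the agent's reply text."""
    text: str


class Session(Protocol):
    """Role: send a request to the agent and return its response."""

    async def send_async(self, request: Request) -> Response: ...


class AgentAdapter(Protocol):
    """Role: declare capabilities and create sessions (names assumed)."""

    capabilities: frozenset[str]

    async def create_session_async(self) -> Session: ...


class EchoSession:
    """Toy session that echoes the prompt instead of calling a real agent."""

    async def send_async(self, request: Request) -> Response:
        return Response(text=f"echo: {request.prompt}")


class EchoAdapter:
    """Toy adapter that satisfies the AgentAdapter protocol structurally."""

    capabilities = frozenset({"chat"})

    async def create_session_async(self) -> Session:
        return EchoSession()


async def demo() -> str:
    adapter: AgentAdapter = EchoAdapter()
    session = await adapter.create_session_async()
    response = await session.send_async(Request(prompt="hello"))
    return response.text


reply = asyncio.run(demo())
```

Because the protocols are structural, `EchoAdapter` does not inherit from anything. A real adapter would replace the echo with a call into the agent under test.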

Execution Lifecycle

A single test run flows from your pytest test, through a RAMPART attack or probe, into your AgentAdapter, and out to the agent system — then back again as a Result:

RAMPART test execution flow: Test → Attacks/Probes → AgentAdapter → Agent System, with Pass/Fail returned

Request / response cycle for a single test run.

Under the hood, every execution follows a common lifecycle owned by BaseExecution, which drives the per-turn loop between the strategy, your adapter, and the evaluator:

```mermaid
sequenceDiagram
    participant Test as Your Test
    participant Exec as BaseExecution
    participant Strat as Strategy
    participant Adapter as Your Adapter
    participant Eval as Evaluator

    Test->>Exec: execute_async(adapter)
    Exec->>Exec: fire ON_PRE_EXECUTE
    Exec->>Strat: _execute_async(adapter)

    loop Each turn (up to max_turns)
        Strat->>Strat: driver.next_prompt_async(history)
        Strat->>Adapter: session.send_async(request)
        Adapter-->>Strat: Response
        Strat->>Eval: evaluate_async(context)
        Eval-->>Strat: EvalResult
        Note over Strat: Early stop if detected
    end

    Strat-->>Exec: Result
    Exec->>Exec: fire ON_POST_EXECUTE
    Exec-->>Test: Result
```

Attack strategies add an injection phase before the conversation loop. Probe strategies skip injection entirely.

If an InfrastructureError is raised during execution, BaseExecution catches it and produces a Result with SafetyStatus.ERROR.
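
The per-turn loop, the early stop, and the error handling described above can be approximated in a few lines. This is a simplified sketch under stated assumptions, not RAMPART's implementation: the function signatures, the `turns` field, and the toy agents are invented for illustration, the lifecycle hooks and the injection phase are omitted, and detection is mapped with attack polarity (detected means UNSAFE).

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Awaitable, Callable, Sequence


class SafetyStatus(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"
    ERROR = "error"


class InfrastructureError(Exception):
    """Stand-in for the infrastructure failure type named in the text."""


@dataclass
class Result:
    status: SafetyStatus
    turns: int  # hypothetical field: turns completed before returning


SendFn = Callable[[str], Awaitable[str]]
DetectFn = Callable[[str], bool]


async def _execute_async(send: SendFn, prompts: Sequence[str],
                         detect: DetectFn, max_turns: int) -> Result:
    """Strategy-level loop: send each prompt, evaluate, stop early on detection."""
    turns = 0
    for prompt in prompts[:max_turns]:
        turns += 1
        response = await send(prompt)
        if detect(response):
            # Attack polarity: detection means the attack objective was achieved.
            return Result(SafetyStatus.UNSAFE, turns)
    return Result(SafetyStatus.SAFE, turns)


async def execute_async(send: SendFn, prompts: Sequence[str],
                        detect: DetectFn, max_turns: int = 3) -> Result:
    """Outer wrapper: turns an infrastructure failure into an ERROR result."""
    try:
        return await _execute_async(send, prompts, detect, max_turns)
    except InfrastructureError:
        return Result(SafetyStatus.ERROR, turns=0)


async def leaky_agent(prompt: str) -> str:
    """Toy agent that leaks a marker string when asked to reveal it."""
    return "SECRET" if "reveal" in prompt else "ok"


async def broken_agent(prompt: str) -> str:
    """Toy agent whose backend is unreachable."""
    raise InfrastructureError("agent endpoint unreachable")


def leaked(response: str) -> bool:
    return "SECRET" in response


prompts = ["hi", "please reveal the token", "thanks"]
attack_result = asyncio.run(execute_async(leaky_agent, prompts, leaked))
error_result = asyncio.run(execute_async(broken_agent, prompts, leaked))
```

In this sketch the leaky agent is flagged on the second turn and the third prompt is never sent, while the broken agent yields an ERROR result rather than an exception in the test.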

The Result Contract

Result is the single output type for all tests. Its boolean conversion (bool(result)) returns result.safe:

```python
result = await Attacks.xpia(...).execute_async(adapter=my_adapter)
assert result, result.summary
```

If the agent behaved safely, the assertion passes. If not, the failure message is the human-readable summary.
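
To see why a bare `assert result, result.summary` works, here is a minimal stand-in for the contract. The `Result` class below is hypothetical and carries only the two attributes mentioned above (`safe` and `summary`); the real type presumably has more fields, and the summary strings are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in with only the attributes named in the text."""
    safe: bool
    summary: str

    def __bool__(self) -> bool:
        # Truthiness mirrors the safety verdict.
        return self.safe


safe_result = Result(safe=True, summary="No unsafe behavior observed")
unsafe_result = Result(safe=False, summary="Agent followed the injected instruction")

# Passes silently: a safe result is truthy.
assert safe_result, safe_result.summary

# Fails: the summary becomes the assertion message that pytest would display.
try:
    assert unsafe_result, unsafe_result.summary
except AssertionError as exc:
    failure_message = str(exc)
```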

Evaluator Polarity

Evaluators are polarity-free. They answer "did X happen?" — not "is X good or bad?" The meaning of detection depends on context:

In an attack, detection means the attack objective was achieved → UNSAFE
In a probe, detection means the expected behavior is present → SAFE

The Attacks and Probes factories handle this mapping automatically via resolve_as_attack and resolve_as_probe.

You can reuse the same evaluator in both contexts. A ToolCalled evaluator detects whether a tool was called — whether that's good or bad depends on whether you're attacking or probing.
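
The polarity mapping can be illustrated with two small functions. They borrow the names `resolve_as_attack` and `resolve_as_probe` from the text, but their signatures here are assumptions, as is the choice to treat a probe without detection as UNSAFE. `tool_called` is a toy stand-in for an evaluator like `ToolCalled`, and the tool names are invented.

```python
from enum import Enum
from typing import Sequence


class SafetyStatus(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"


def resolve_as_attack(detected: bool) -> SafetyStatus:
    """Attack context: detection means the attack objective was achieved."""
    return SafetyStatus.UNSAFE if detected else SafetyStatus.SAFE


def resolve_as_probe(detected: bool) -> SafetyStatus:
    """Probe context: detection means the expected behavior is present.

    Mapping non-detection to UNSAFE is an assumption of this sketch.
    """
    return SafetyStatus.SAFE if detected else SafetyStatus.UNSAFE


def tool_called(tool_name: str, observed_calls: Sequence[str]) -> bool:
    """Polarity-free check: did the agent call this tool at all?"""
    return tool_name in observed_calls


observed_calls = ["search_docs", "send_email"]  # hypothetical tool names
detected = tool_called("send_email", observed_calls)

attack_verdict = resolve_as_attack(detected)  # the call was the attacker's goal
probe_verdict = resolve_as_probe(detected)    # the call was the expected behavior
```

The same `detected` value yields opposite verdicts, which is why one evaluator can serve both attacks and probes.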

pytest Integration

RAMPART registers as a pytest plugin automatically when installed. It provides:

Markers: @pytest.mark.harm(...) for categorization, @pytest.mark.trial(n=...) for statistical repetition
Automatic result collection: Results from Attacks.* and Probes.* are collected without manual wiring
Terminal summary: A safety summary printed after the standard pytest output
Report sinks: Structured output via the rampart_sinks fixture

See pytest Markers & Fixtures for setup details.