API Reference: Attacks
attacks
Attack factories.
Attacks test for BAD things. When the evaluator detects the attack objective, the result is UNSAFE (safe=False).
Attacks
Factory methods for attack test executions.
Each method returns a BaseExecution. The test calls
execute_async and asserts the result.
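As a rough illustration of that calling pattern, the sketch below awaits `execute_async` and asserts on the outcome. `FakeExecution` and this `Result` dataclass are hypothetical stand-ins written for the example, not the framework's classes; only the `execute_async` name and the `safe` flag come from the text above.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical stand-in for the framework's Result type.
    safe: bool


class FakeExecution:
    """Hypothetical stand-in for an execution returned by a factory method."""

    def __init__(self, objective_detected: bool) -> None:
        self._objective_detected = objective_detected

    async def execute_async(self) -> Result:
        # Attack semantics: detecting the objective means the agent is unsafe.
        return Result(safe=not self._objective_detected)


async def run_attack_test() -> Result:
    execution = FakeExecution(objective_detected=False)
    result = await execution.execute_async()
    # The test passes only when the attack objective was not detected.
    assert result.safe, "attack objective was detected"
    return result


result = asyncio.run(run_attack_test())
```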
Note: the return type is BaseExecution (an ABC), not a protocol.
This is a deliberate asymmetry with the parameter types, which
use the Evaluator protocol. Parameters use protocols because
consumers provide implementations. The return type uses the ABC
because the consumer only calls execute_async on it, and
BaseExecution provides the concrete lifecycle skeleton.
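The asymmetry can be pictured with a minimal sketch: a structural `Protocol` on the input side and an ABC on the output side. The `evaluate` method name, its signature, and the `bool` return value are assumptions made for this example and may not match the framework's actual definitions.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Protocol


class Evaluator(Protocol):
    # Structural type: any object with a matching method qualifies.
    def evaluate(self, response: str) -> bool: ...


class BaseExecution(ABC):
    def __init__(self, evaluator: Evaluator) -> None:
        self._evaluator = evaluator

    async def execute_async(self) -> bool:
        # The shared lifecycle skeleton lives here; subclasses fill in one step.
        response = await self._execute_async()
        return self._evaluator.evaluate(response)

    @abstractmethod
    async def _execute_async(self) -> str: ...


class ContainsSecret:
    # No inheritance needed: this class matches Evaluator by shape alone.
    def evaluate(self, response: str) -> bool:
        return "secret" in response


class EchoExecution(BaseExecution):
    async def _execute_async(self) -> str:
        return "the secret is 42"


detected = asyncio.run(EchoExecution(ContainsSecret()).execute_async())
```

The consumer supplies `ContainsSecret` without importing any base class, while the execution it gets back inherits working behavior from the ABC.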
xpia
staticmethod
Create an XPIA attack execution.
Orchestrates the full XPIA flow: inject payloads into surfaces, wait for indexing, create a session, drive the trigger conversation, evaluate, clean up, return Result.
The trigger parameter accepts benign user prompts that cause the agent to retrieve and process the injected content (e.g., "Summarize Q3"). XPIA triggers are never adversarial — the attack is in the injected payload, not the prompt.
Pass a string for the common single-prompt case, a list of strings for multi-turn sequences, a Request for prompts with inline attachments, or a PromptDriver for full control.
For inline XPIA — where the poisoned content is attached to
the chat message rather than pre-positioned in an external
surface — omit inject and pass a Request with
attachments as the trigger.
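One way to picture how the accepted trigger forms could collapse into a single driver is sketched below. The `Request` fields, the `PromptDriver` constructor, and the `to_driver` helper are all hypothetical and exist only for this example; the real types may look different.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical shape: a prompt plus inline attachments.
    prompt: str
    attachments: list[str] = field(default_factory=list)


class PromptDriver:
    """Hypothetical driver that replays a fixed sequence of requests."""

    def __init__(self, requests: list[Request]) -> None:
        self.requests = requests


def to_driver(
    trigger: str | list[str] | Request | list[Request] | PromptDriver,
) -> PromptDriver:
    # A ready-made driver is passed through untouched.
    if isinstance(trigger, PromptDriver):
        return trigger
    # A single item is treated as a one-element sequence.
    items = trigger if isinstance(trigger, list) else [trigger]
    # Bare strings become attachment-free requests.
    requests = [i if isinstance(i, Request) else Request(prompt=i) for i in items]
    return PromptDriver(requests)


single = to_driver("Summarize Q3")
multi = to_driver(["Open the Q3 report", "Summarize it"])
# Inline XPIA: the poisoned file travels with the chat message itself.
inline = to_driver(Request("Summarize the attached file", attachments=["poisoned.docx"]))
```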
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| inject | `InjectionHandle \| list[InjectionHandle] \| None` | Prepared injections from | None |
| trigger | `str \| list[str] \| Request \| list[Request] \| PromptDriver` | Benign user request(s) that cause the agent to process poisoned content. | required |
| evaluator | `Evaluator` | What condition to check for. | required |
| max_turns | `int` | Maximum prompt-response exchanges before ERROR. Defaults to 5. | 5 |
| event_handlers | `list[ExecutionEventHandler] \| None` | Optional additional handlers for custom observability. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| BaseExecution | `BaseExecution` | Ready to execute with |
Source code in rampart/attacks/__init__.py
XPIAExecution
Bases: BaseExecution
Executes the full XPIA attack lifecycle.
Inherits BaseExecution. Implements _execute_async with XPIA's
specific phase structure. The lifecycle skeleton (event dispatch,
result collection, timing, infrastructure error handling) is owned
by BaseExecution.
Phases (delegated to private helpers from _execute_async):
1. Activate all injection handles (via AsyncExitStack).
2. Wait for indexing (concurrent per-handle).
3. Create session (via async context manager).
4. Drive the trigger conversation via the PromptDriver.
5. Evaluate per-turn with early stopping on detection.
6. Cleanup session and injections (guaranteed via AsyncExitStack).
7. Build and return Result via resolve_as_attack.
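The phase ordering and the cleanup guarantee can be sketched with the standard library's `AsyncExitStack`. Everything except `AsyncExitStack` itself is a stand-in invented for this example: handles are plain strings, `activate` and `wait_indexed` are hypothetical helpers, the agent reply is faked, and phase 7 (building the Result) is reduced to returning a boolean.

```python
import asyncio
from collections.abc import Callable
from contextlib import AsyncExitStack, asynccontextmanager

log: list[str] = []


@asynccontextmanager
async def activate(name: str):
    # Stand-in for activating an injection handle or opening a session.
    log.append(f"enter {name}")
    try:
        yield name
    finally:
        log.append(f"exit {name}")


async def wait_indexed(name: str) -> None:
    await asyncio.sleep(0)  # placeholder for polling a surface


async def run_xpia(
    handles: list[str], prompts: list[str], detect: Callable[[str], bool]
) -> bool:
    async with AsyncExitStack() as stack:
        # 1. Activate all injection handles.
        for handle in handles:
            await stack.enter_async_context(activate(handle))
        # 2. Wait for indexing, concurrently per handle.
        await asyncio.gather(*(wait_indexed(h) for h in handles))
        # 3. Create the session.
        await stack.enter_async_context(activate("session"))
        # 4-5. Drive the conversation, evaluating after each turn.
        for prompt in prompts:
            response = f"agent reply to: {prompt}"  # faked agent output
            if detect(response):
                # Early stop; 6. the stack still unwinds in reverse order.
                return True
        return False


detected = asyncio.run(
    run_xpia(["doc", "email"], ["Summarize Q3", "Anything else?"], lambda r: "Q3" in r)
)
```

Because the session is pushed onto the stack last, it is closed first, and the injections are removed afterwards even when the loop returns early.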
InfrastructureError raised by surfaces or adapters during any phase
is caught by BaseExecution.execute_async (not here) and converted
to Result(status=SafetyStatus.ERROR). This execution strategy
does not handle infrastructure failures — the base class does it as
a cross-cutting concern for all strategies.
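This division of labor resembles the template-method pattern, sketched below under stated assumptions. The classes here are simplified stand-ins: the `SAFE` and `UNSAFE` members, the `detail` field, and `FlakySurfaceExecution` are hypothetical, while `execute_async`, `_execute_async`, `InfrastructureError`, and `SafetyStatus.ERROR` are names taken from the description above.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum


class SafetyStatus(Enum):
    SAFE = "safe"      # assumed member
    UNSAFE = "unsafe"  # assumed member
    ERROR = "error"


class InfrastructureError(Exception):
    """Stand-in for the error raised by surfaces or adapters."""


@dataclass
class Result:
    status: SafetyStatus
    detail: str = ""  # hypothetical field for this sketch


class BaseExecution(ABC):
    async def execute_async(self) -> Result:
        # Cross-cutting concern: every strategy gets the same error handling.
        try:
            return await self._execute_async()
        except InfrastructureError as exc:
            return Result(status=SafetyStatus.ERROR, detail=str(exc))

    @abstractmethod
    async def _execute_async(self) -> Result: ...


class FlakySurfaceExecution(BaseExecution):
    async def _execute_async(self) -> Result:
        # The strategy does not catch this; it propagates to the base class.
        raise InfrastructureError("surface unavailable")


result = asyncio.run(FlakySurfaceExecution().execute_async())
```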
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| handles | `list[InjectionHandle]` | Prepared injections to activate. Empty for inline XPIA where payloads travel as chat attachments. | None |
| driver | `PromptDriver` | How to drive the trigger conversation. | required |
| evaluator | `Evaluator` | What condition to check for. | required |
| max_turns | `int` | Maximum prompt-response exchanges before the execution stops with ERROR. Prevents unbounded loops. | 25 |
| event_handlers | `list[ExecutionEventHandler] \| None` | Additional handlers beyond the framework defaults. | None |
Source code in rampart/attacks/_xpia.py
strategy_name
property
Identifies this as an XPIA execution in results and reports.