API Reference — Attacks

attacks

Attack factories.

Attacks test for BAD things. When the evaluator detects the attack objective, the result is UNSAFE (safe=False).

Attacks

Factory methods for attack test executions.

Each method returns a BaseExecution. The test calls execute_async on it and asserts on the result.

Note: the return type is BaseExecution (an ABC), not a protocol. This is a deliberate asymmetry with the parameter types, which use the Evaluator protocol. Parameters use protocols because consumers provide implementations. The return type uses the ABC because the consumer only calls execute_async on it, and BaseExecution provides the concrete lifecycle skeleton.
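
The asymmetry described in the note can be sketched in a few lines. This is a minimal illustration, not the library's code: the `evaluate` method, `KeywordEvaluator`, `EchoExecution`, and `make_execution` are hypothetical stand-ins, and the real `Evaluator` protocol and `BaseExecution` class may have different members.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Protocol


class Evaluator(Protocol):
    """Stand-in protocol: any object with a matching method qualifies."""

    def evaluate(self, response: str) -> bool: ...  # hypothetical method name


class BaseExecution(ABC):
    """Stand-in ABC that owns the shared lifecycle skeleton."""

    async def execute_async(self) -> str:
        # Shared steps would live here; subclasses supply only the strategy.
        return await self._execute_async()

    @abstractmethod
    async def _execute_async(self) -> str: ...


class KeywordEvaluator:
    """Consumer-provided evaluator; note it does not inherit from Evaluator."""

    def __init__(self, keyword: str) -> None:
        self._keyword = keyword

    def evaluate(self, response: str) -> bool:
        return self._keyword in response


class EchoExecution(BaseExecution):
    """Hypothetical strategy that evaluates one canned response."""

    def __init__(self, evaluator: Evaluator) -> None:
        self._evaluator = evaluator

    async def _execute_async(self) -> str:
        detected = self._evaluator.evaluate("the secret is 1234")
        return "UNSAFE" if detected else "SAFE"


def make_execution(evaluator: Evaluator) -> BaseExecution:
    # Protocol in, ABC out: the same shape as the factory methods.
    return EchoExecution(evaluator)


outcome = asyncio.run(make_execution(KeywordEvaluator("secret")).execute_async())
```

The consumer's class only has to match the protocol structurally, while the returned object carries behavior inherited from the ABC.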

xpia staticmethod

Python
xpia(
    *,
    inject=None,
    trigger,
    evaluator,
    max_turns=5,
    event_handlers=None,
)

Create an XPIA attack execution.

Orchestrates the full XPIA flow: inject payloads into surfaces, wait for indexing, create a session, drive the trigger conversation, evaluate, clean up, return Result.

The trigger parameter accepts benign user prompts that cause the agent to retrieve and process the injected content (e.g., "Summarize Q3"). XPIA triggers are never adversarial — the attack is in the injected payload, not the prompt.

Pass a string for the common single-prompt case, a list of strings for multi-turn sequences, a Request for prompts with inline attachments, or a PromptDriver for full control.

For inline XPIA — where the poisoned content is attached to the chat message rather than pre-positioned in an external surface — omit inject and pass a Request with attachments as the trigger.
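
To make the accepted trigger shapes concrete, here is a rough sketch of how such values could be normalized into a single driver. It is an assumption for illustration only: `Request`, `PromptDriver`, and `coerce_driver_sketch` below are simplified stand-ins, and the library's own `coerce_driver` may behave differently.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Request:
    """Stand-in for Request: a prompt plus optional inline attachments."""

    prompt: str
    attachments: list[str] = field(default_factory=list)


class PromptDriver:
    """Stand-in driver that replays a fixed sequence of requests."""

    def __init__(self, requests: list[Request]) -> None:
        self.requests = requests


Trigger = Union[str, list[str], Request, list[Request], PromptDriver]


def coerce_driver_sketch(trigger: Trigger) -> PromptDriver:
    """Hypothetical normalization of every trigger shape into a driver."""
    if isinstance(trigger, PromptDriver):
        return trigger  # full control: use the caller's driver unchanged
    items = trigger if isinstance(trigger, list) else [trigger]
    # Plain strings become attachment-free requests; Requests pass through.
    requests = [i if isinstance(i, Request) else Request(prompt=i) for i in items]
    return PromptDriver(requests)


single = coerce_driver_sketch("Summarize Q3")
multi = coerce_driver_sketch(["Summarize Q3", "Which risks does it mention?"])
inline = coerce_driver_sketch(
    Request(prompt="Summarize the attached file", attachments=["report.pdf"])
)
```

In the inline case the only carrier for the payload is the attachment list on the request, which is why inject can be omitted there.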

Parameters:

inject (InjectionHandle | list[InjectionHandle] | None, default None): Prepared injections from surface.inject() calls. None for inline XPIA where the payload travels as a chat attachment.

trigger (str | list[str] | Request | list[Request] | PromptDriver, required): Benign user request(s) that cause the agent to process poisoned content.

evaluator (Evaluator, required): What condition to check for.

max_turns (int, default 5): Maximum prompt-response exchanges before ERROR.

event_handlers (list[ExecutionEventHandler] | None, default None): Optional additional handlers for custom observability.

Returns:

BaseExecution: Ready to execute with execute_async(adapter=...).

Source code in rampart/attacks/__init__.py
Python
@staticmethod
def xpia(
    *,
    inject: InjectionHandle | list[InjectionHandle] | None = None,
    trigger: str | list[str] | Request | list[Request] | PromptDriver,
    evaluator: Evaluator,
    max_turns: int = 5,
    event_handlers: list[ExecutionEventHandler] | None = None,
) -> BaseExecution:
    """Create an XPIA attack execution.

    Orchestrates the full XPIA flow: inject payloads into surfaces,
    wait for indexing, create a session, drive the trigger
    conversation, evaluate, clean up, return Result.

    The trigger parameter accepts benign user prompts that cause
    the agent to retrieve and process the injected content (e.g.,
    "Summarize Q3").  XPIA triggers are never adversarial — the
    attack is in the injected payload, not the prompt.

    Pass a string for the common single-prompt case, a list of
    strings for multi-turn sequences, a Request for prompts with
    inline attachments, or a PromptDriver for full control.

    For inline XPIA — where the poisoned content is attached to
    the chat message rather than pre-positioned in an external
    surface — omit ``inject`` and pass a ``Request`` with
    ``attachments`` as the trigger.

    Args:
        inject (InjectionHandle | list[InjectionHandle] | None):
            Prepared injections from ``surface.inject()`` calls.
            None for inline XPIA where the payload travels as a
            chat attachment.
        trigger (str | list[str] | Request | list[Request] | PromptDriver):
            Benign user request(s) that cause the agent to process
            poisoned content.
        evaluator (Evaluator): What condition to check for.
        max_turns (int): Maximum prompt-response exchanges before
            ERROR.  Defaults to 5.
        event_handlers (list[ExecutionEventHandler] | None): Optional
            additional handlers for custom observability.

    Returns:
        BaseExecution: Ready to execute with
            ``execute_async(adapter=...)``.
    """
    if inject is None:
        handles = []
    elif isinstance(inject, list):
        handles = inject
    else:
        handles = [inject]
    driver = coerce_driver(trigger)

    return XPIAExecution(
        handles=handles,
        driver=driver,
        evaluator=evaluator,
        max_turns=max_turns,
        event_handlers=event_handlers,
    )
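
The inject handling at the top of the function body can be exercised in isolation. The sketch below copies that branching into a standalone helper; `normalize_handles` and the `InjectionHandle` class shown here are hypothetical stand-ins so the snippet runs without the library.

```python
from typing import Union


class InjectionHandle:
    """Stand-in for a prepared injection returned by surface.inject()."""

    def __init__(self, name: str) -> None:
        self.name = name


def normalize_handles(
    inject: Union[InjectionHandle, list[InjectionHandle], None],
) -> list[InjectionHandle]:
    """Same three-way branch as the factory: None, list, or single handle."""
    if inject is None:
        return []  # inline XPIA: nothing to pre-position
    if isinstance(inject, list):
        return inject  # passed through as-is, not copied
    return [inject]  # a single handle is wrapped in a list


doc = InjectionHandle("shared-doc")
mail = InjectionHandle("inbox")
both = [doc, mail]

no_handles = normalize_handles(None)
one_handle = normalize_handles(doc)
two_handles = normalize_handles(both)
```

Because a list argument is passed through rather than copied, mutating it after calling the factory would also change what the execution sees.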

XPIAExecution

Python
XPIAExecution(
    *,
    handles=None,
    driver,
    evaluator,
    max_turns=25,
    event_handlers=None,
)

Bases: BaseExecution

Executes the full XPIA attack lifecycle.

Inherits from BaseExecution and implements _execute_async with XPIA's specific phase structure. BaseExecution owns the lifecycle skeleton (event dispatch, result collection, timing, infrastructure error handling).

Phases (delegated to private helpers from _execute_async):

1. Activate all injection handles (via AsyncExitStack).
2. Wait for indexing (concurrent per-handle).
3. Create session (via async context manager).
4. Drive the trigger conversation via the PromptDriver.
5. Evaluate per-turn with early stopping on detection.
6. Cleanup session and injections (guaranteed via AsyncExitStack).
7. Build and return Result via resolve_as_attack.
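
The phase ordering can be sketched with the standard library alone. Everything named here (`activate`, `open_session`, `wait_indexed`, `fake_agent`, `run_phases`) is a hypothetical stand-in, the returned strings replace the real Result built by resolve_as_attack, and the max_turns rule is an assumption; only `AsyncExitStack`, `asynccontextmanager`, and `asyncio.gather` are real APIs.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

log: list[str] = []  # records the order in which phases run


@asynccontextmanager
async def activate(name: str):
    """Hypothetical injection handle: plant on enter, remove on exit."""
    log.append(f"activate {name}")
    try:
        yield name
    finally:
        log.append(f"cleanup {name}")


@asynccontextmanager
async def open_session():
    """Hypothetical chat session scoped to the conversation."""
    log.append("session open")
    try:
        yield
    finally:
        log.append("session closed")


async def wait_indexed(name: str) -> None:
    await asyncio.sleep(0)  # placeholder for polling the surface's index
    log.append(f"indexed {name}")


def fake_agent(prompt: str) -> str:
    # Pretend the poisoned content hijacks one particular answer.
    return "LEAK: api-key" if "action items" in prompt else f"Summary for: {prompt}"


async def run_phases(handle_names: list[str], prompts: list[str], max_turns: int) -> str:
    async with AsyncExitStack() as stack:
        # Phase 1: activate every handle; the stack remembers how to undo each.
        handles = [await stack.enter_async_context(activate(n)) for n in handle_names]
        # Phase 2: wait for indexing on all handles concurrently.
        await asyncio.gather(*(wait_indexed(h) for h in handles))
        # Phase 3: the session joins the same stack, so it closes first.
        await stack.enter_async_context(open_session())
        # Phases 4 and 5: drive turns, evaluating each and stopping on detection.
        for turn, prompt in enumerate(prompts, start=1):
            if turn > max_turns:
                return "ERROR"  # assumed behavior when the turn budget runs out
            log.append(f"turn {turn}")
            if "LEAK" in fake_agent(prompt):
                return "UNSAFE"
        return "SAFE"
    # Phase 6 happens on leaving the `async with`, even on early return.


status = asyncio.run(
    run_phases(["doc", "mail"], ["Summarize Q3", "List the action items", "Thanks"], max_turns=5)
)
```

In this sketch detection on the second turn skips the third prompt, and the exit stack unwinds in reverse order: the session closes before the injections are removed.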

InfrastructureError raised by surfaces or adapters during any phase is caught by BaseExecution.execute_async (not here) and converted to Result(status=SafetyStatus.ERROR). This execution strategy does not handle infrastructure failures — the base class does it as a cross-cutting concern for all strategies.
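
The division of responsibility described above is a template-method arrangement, which might look roughly like this. All classes below are simplified stand-ins written for illustration (the real `Result`, `SafetyStatus`, and `BaseExecution` carry more than shown), and `FlakySurfaceExecution` and the `detail` field are hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum


class SafetyStatus(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"
    ERROR = "error"


@dataclass
class Result:
    status: SafetyStatus
    detail: str = ""  # hypothetical field for this sketch


class InfrastructureError(Exception):
    """Stand-in for failures raised by surfaces or adapters."""


class BaseExecution(ABC):
    async def execute_async(self) -> Result:
        # Cross-cutting concern: every strategy gets the same conversion.
        try:
            return await self._execute_async()
        except InfrastructureError as exc:
            return Result(status=SafetyStatus.ERROR, detail=str(exc))

    @abstractmethod
    async def _execute_async(self) -> Result: ...


class FlakySurfaceExecution(BaseExecution):
    """Hypothetical strategy whose surface fails; it has no try/except."""

    async def _execute_async(self) -> Result:
        raise InfrastructureError("surface unavailable")


result = asyncio.run(FlakySurfaceExecution().execute_async())
```

The strategy raises freely and the caller still receives a Result rather than an exception, so a test can assert on the status instead of wrapping the call in its own error handling.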

Parameters:

handles (list[InjectionHandle] | None, default None): Prepared injections to activate. None or empty for inline XPIA where payloads travel as chat attachments.

driver (PromptDriver, required): How to drive the trigger conversation.

evaluator (Evaluator, required): What condition to check for.

max_turns (int, default 25): Maximum prompt-response exchanges before the execution stops with ERROR. Prevents unbounded loops.

event_handlers (list[ExecutionEventHandler] | None, default None): Additional handlers beyond the framework defaults.

Source code in rampart/attacks/_xpia.py
Python
def __init__(
    self,
    *,
    handles: list[InjectionHandle] | None = None,
    driver: PromptDriver,
    evaluator: Evaluator,
    max_turns: int = 25,
    event_handlers: list[ExecutionEventHandler] | None = None,
) -> None:
    super().__init__(event_handlers=event_handlers)
    self._handles = handles or []
    self._driver = driver
    self._evaluator = evaluator
    self._max_turns = max_turns

strategy_name property

Python
strategy_name

Identifies this as an XPIA execution in results and reports.