5. File Converters

File converters transform text into file outputs such as PDF and Word (.docx) documents. They are useful for packaging prompts into distributable formats.

Overview

This notebook covers:

  • PDFConverter: Convert text to PDF documents with templates or direct generation

  • WordDocConverter: Convert text to Word (.docx) documents with optional placeholder injection

PDFConverter

The PDFConverter produces PDF documents from text and supports three modes:

  1. Template-Based PDF Generation: Use YAML templates to render dynamic content into PDFs

  2. Direct Prompt PDF Generation: Convert plain text strings into PDFs without templates

  3. Modify Existing PDFs: Inject text into existing PDF documents

Template-Based PDF Generation

This mode populates placeholders in a YAML-based template and converts the rendered content into a PDF.

import pathlib

from pyrit.common.path import CONVERTER_SEED_PROMPT_PATH
from pyrit.executor.attack import (
    AttackConverterConfig,
    PromptSendingAttack,
)
from pyrit.models import SeedPrompt
from pyrit.output import output_attack_async
from pyrit.prompt_converter import PDFConverter
from pyrit.prompt_normalizer import PromptConverterConfiguration
from pyrit.prompt_target import TextTarget
from pyrit.setup import IN_MEMORY, initialize_pyrit_async

await initialize_pyrit_async(memory_db_type=IN_MEMORY)  # type: ignore

prompt_data = {
    "hiring_manager_name": "Jane Doe",
    "current_role": "AI Engineer",
    "company": "CyberDefense Inc.",
    "red_teaming_reason": "to creatively identify security vulnerabilities while enjoying free coffee",
    "applicant_name": "John Smith",
}

# Load the YAML template for the PDF generation
template_path = pathlib.Path(CONVERTER_SEED_PROMPT_PATH) / "pdf_converters" / "red_teaming_application_template.yaml"
if not template_path.exists():
    raise FileNotFoundError(f"Template file not found: {template_path}")

# Load the SeedPrompt from the YAML file
prompt_template = SeedPrompt.from_yaml_file(template_path)

# Initialize target
prompt_target = TextTarget()

# Initialize the PDFConverter
pdf_converter = PromptConverterConfiguration.from_converters(
    converters=[
        PDFConverter(
            prompt_template=prompt_template,
            font_type="Arial",
            font_size=12,
            page_width=210,
            page_height=297,
        )
    ]
)

converter_config = AttackConverterConfig(
    request_converters=pdf_converter,
)

# The attack objective must be a string, so pass the placeholder values as a stringified dict
prompt = str(prompt_data)

# Initialize the attack
attack = PromptSendingAttack(
    objective_target=prompt_target,
    attack_converter_config=converter_config,
)

result = await attack.execute_async(objective=prompt)  # type: ignore
await output_attack_async(result)
Found default environment files: ['./.pyrit/.env', './.pyrit/.env.local']
Loaded environment file: ./.pyrit/.env
Loaded environment file: ./.pyrit/.env.local
No new upgrade operations detected.
TextTarget: user: ./dbdata/prompt-memory-entries/binaries/1779402999644083.pdf

════════════════════════════════════════════════════════════════════════════════════════════════════
                                  ❓ ATTACK RESULT: UNDETERMINED ❓                                   
════════════════════════════════════════════════════════════════════════════════════════════════════

 Attack Summary 
────────────────────────────────────────────────────────────────────────────────────────────────────
  📋 Basic Information
    • Objective: {'hiring_manager_name': 'Jane Doe', 'current_role': 'AI Engineer', 'company': 'CyberDefense Inc.', 'red_teaming_reason': 'to creatively identify security vulnerabilities while enjoying free coffee', 'applicant_name': 'John Smith'}
    • Attack Type: PromptSendingAttack
    • Conversation ID: 9edbd5d0-2bde-4432-b4a7-91b9688f4377

  ⚡ Execution Metrics
    • Turns Executed: 1
    • Execution Time: 45ms

  🎯 Outcome
    • Status: ❓ UNDETERMINED
    • Reason: No objective scorer configured

 Conversation History with Objective Target 
────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
🔹 Turn 1 - USER
────────────────────────────────────────────────────────────────────────────────────────────────────
   Original:
  {'hiring_manager_name': 'Jane Doe', 'current_role': 'AI Engineer', 'company': 'CyberDefense Inc.',
      'red_teaming_reason': 'to creatively identify security vulnerabilities while enjoying free
      coffee', 'applicant_name': 'John Smith'}

   Converted:
  ./dbdata/prompt-
      memory-entries\binaries\1779402999644083.pdf

────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
                            Report generated at: 2026-05-21 22:36:39 UTC                            
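
The example above passes `str(prompt_data)` as the objective, so the placeholder values travel through the attack as text rather than as a Python dictionary. The sketch below illustrates one way such a string could be parsed back into a dictionary and used to fill a template. It uses only the standard library: `string.Template` and its `$name` placeholder syntax are stand-ins chosen for illustration, and neither is claimed to be the syntax of PyRIT's YAML templates or the converter's internal implementation.

```python
import ast
from string import Template

prompt_data = {
    "hiring_manager_name": "Jane Doe",
    "applicant_name": "John Smith",
}

# Serialize the dictionary, as the notebook does with str(prompt_data).
prompt = str(prompt_data)

# Hypothetical receiving side: parse the text back into a dictionary.
# ast.literal_eval only accepts Python literals, so it does not run arbitrary code.
parsed = ast.literal_eval(prompt)

# Stand-in template; the real template lives in a YAML file.
template = Template("Dear $hiring_manager_name,\nRegards, $applicant_name")
rendered = template.substitute(parsed)
print(rendered)
```

A practical consequence of this round trip is that every value has to survive being written as a Python literal, which holds for plain strings and numbers such as those in `prompt_data`.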

Direct Prompt PDF Generation (No Template)

This mode converts plain text strings directly into PDFs without using templates.

# Define a simple string prompt (no templates)
prompt = "This is a simple test string for PDF generation. No templates here!"

# Initialize the PDFConverter without a template
pdf_converter = PromptConverterConfiguration.from_converters(
    converters=[
        PDFConverter(
            prompt_template=None,  # No template provided
            font_type="Arial",
            font_size=12,
            page_width=210,
            page_height=297,
        )
    ]
)

converter_config = AttackConverterConfig(
    request_converters=pdf_converter,
)

# Initialize the attack
attack = PromptSendingAttack(
    objective_target=prompt_target,
    attack_converter_config=converter_config,
)

result = await attack.execute_async(objective=prompt)  # type: ignore
await output_attack_async(result)
TextTarget: user: ./dbdata/prompt-memory-entries/binaries/1779402999698469.pdf

════════════════════════════════════════════════════════════════════════════════════════════════════
                                  ❓ ATTACK RESULT: UNDETERMINED ❓                                   
════════════════════════════════════════════════════════════════════════════════════════════════════

 Attack Summary 
────────────────────────────────────────────────────────────────────────────────────────────────────
  📋 Basic Information
    • Objective: This is a simple test string for PDF generation. No templates here!
    • Attack Type: PromptSendingAttack
    • Conversation ID: 3d86d651-30f2-4d2d-8585-a130ba3dc077

  ⚡ Execution Metrics
    • Turns Executed: 1
    • Execution Time: 23ms

  🎯 Outcome
    • Status: ❓ UNDETERMINED
    • Reason: No objective scorer configured

 Conversation History with Objective Target 
────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
🔹 Turn 1 - USER
────────────────────────────────────────────────────────────────────────────────────────────────────
   Original:
  This is a simple test string for PDF generation. No templates here!

   Converted:
  ./dbdata/prompt-
      memory-entries\binaries\1779402999698469.pdf

────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
                            Report generated at: 2026-05-21 22:36:39 UTC                            
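
The values `page_width=210` and `page_height=297` used in these examples match the dimensions of an A4 sheet in millimetres. Assuming the converter interprets them as millimetres (an assumption, since this notebook does not state the unit), the following sketch converts them to PDF points, the unit in which PDF page coordinates are normally expressed. The helper name `mm_to_points` is hypothetical.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # one PDF point is 1/72 of an inch


def mm_to_points(mm: float) -> float:
    """Convert a length in millimetres to PDF points."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


page_width_pt = mm_to_points(210)
page_height_pt = mm_to_points(297)
print(f"{page_width_pt:.1f} x {page_height_pt:.1f} pt")  # 595.3 x 841.9 pt
```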

Modifying Existing PDFs with Injection Items

The PDFConverter can also inject text into existing PDF documents at specified locations.

import tempfile
from pathlib import Path

import requests

# Download a sample PDF
url = (
    "https://raw.githubusercontent.com/microsoft/PyRIT/main/pyrit/datasets/prompt_converters/pdf_converters/fake_CV.pdf"
)

with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp_file:
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # Fail early instead of saving an error page as a PDF
    tmp_file.write(response.content)

cv_pdf_path = Path(tmp_file.name)

# Define injection items
injection_items = [
    {
        "page": 0,
        "x": 50,
        "y": 700,
        "text": "Injected Text",
        "font_size": 12,
        "font": "Helvetica",
        "font_color": (255, 0, 0),
    },  # Red text
    {
        "page": 1,
        "x": 100,
        "y": 600,
        "text": "Confidential",
        "font_size": 10,
        "font": "Helvetica",
        "font_color": (0, 0, 255),
    },  # Blue text
]

# Initialize the PDFConverter with the existing PDF and injection items
pdf_converter = PromptConverterConfiguration.from_converters(
    converters=[
        PDFConverter(
            prompt_template=None,
            font_type="Arial",
            font_size=12,
            page_width=210,
            page_height=297,
            existing_pdf=cv_pdf_path,  # Provide the existing PDF
            injection_items=injection_items,  # Provide the injection items
        )
    ]
)

converter_config = AttackConverterConfig(
    request_converters=pdf_converter,
)

# Initialize the attack
attack = PromptSendingAttack(
    objective_target=prompt_target,
    attack_converter_config=converter_config,
)

result = await attack.execute_async(objective="Modify existing PDF")  # type: ignore
await output_attack_async(result)
[15:36:40][10][ai-red-team][INFO][Processing page 0 with 2 injection items.]
[15:36:40][13][ai-red-team][INFO][Processing page 1 with 2 injection items.]
[15:36:40][15][ai-red-team][INFO][Processing page 2 with 2 injection items.]
TextTarget: user: ./dbdata/prompt-memory-entries/binaries/1779403000017242.pdf

════════════════════════════════════════════════════════════════════════════════════════════════════
                                  ❓ ATTACK RESULT: UNDETERMINED ❓                                   
════════════════════════════════════════════════════════════════════════════════════════════════════

 Attack Summary 
────────────────────────────────────────────────────────────────────────────────────────────────────
  📋 Basic Information
    • Objective: Modify existing PDF
    • Attack Type: PromptSendingAttack
    • Conversation ID: 7f9101d9-93b6-49e3-88ae-1bebaf44a0af

  ⚡ Execution Metrics
    • Turns Executed: 1
    • Execution Time: 31ms

  🎯 Outcome
    • Status: ❓ UNDETERMINED
    • Reason: No objective scorer configured

 Conversation History with Objective Target 
────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
🔹 Turn 1 - USER
────────────────────────────────────────────────────────────────────────────────────────────────────
   Original:
  Modify existing PDF

   Converted:
  ./dbdata/prompt-
      memory-entries\binaries\1779403000017242.pdf

────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
                            Report generated at: 2026-05-21 22:36:40 UTC                            
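
Each injection item in the example is a plain dictionary with a zero-based `page` index, `x` and `y` coordinates, the `text` to insert, and optional font settings. Because a typo in one of these dictionaries is easy to miss, a small pre-flight check can be useful. The sketch below is a hypothetical helper, not part of PyRIT: the choice of required keys and the 0 to 255 colour range are assumptions inferred from the example above.

```python
from typing import Any

# Assumed minimum set of keys, inferred from the example items.
REQUIRED_KEYS = {"page", "x", "y", "text"}


def check_injection_item(item: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the item looks fine."""
    problems = []
    missing = REQUIRED_KEYS - item.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    page = item.get("page")
    if not isinstance(page, int) or page < 0:
        problems.append("page must be a non-negative integer")
    color = item.get("font_color", (0, 0, 0))
    if len(color) != 3 or any(not 0 <= c <= 255 for c in color):
        problems.append("font_color must be three values in 0..255")
    return problems


good_item = {"page": 0, "x": 50, "y": 700, "text": "Injected Text", "font_color": (255, 0, 0)}
bad_item = {"page": -1, "x": 0, "y": 0, "text": "t", "font_color": (300, 0, 0)}
print(check_injection_item(good_item))
print(check_injection_item(bad_item))
```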

WordDocConverter

The WordDocConverter generates Word (.docx) documents from text. It supports two main modes:

  1. New Document Generation: Convert plain text strings into Word documents

  2. Placeholder-Based Injection: Replace a placeholder in an existing .docx document with rendered content

Placeholders must be fully contained within a single run in the document. If a placeholder spans multiple runs (e.g., due to mixed formatting), it will not be replaced.
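
To see why a placeholder split across runs is not replaced, it helps to model a paragraph as a list of run texts. The sketch below is a simplified, standard-library-only model: the function `replace_in_single_run` is hypothetical and is not the converter's implementation, but it reproduces the constraint described above, where each run is searched on its own.

```python
def replace_in_single_run(runs: list[str], placeholder: str, value: str) -> bool:
    """Replace the placeholder in the first run that fully contains it.

    Returns True if a replacement happened, False otherwise.
    """
    for i, run_text in enumerate(runs):
        if placeholder in run_text:
            runs[i] = run_text.replace(placeholder, value)
            return True
    return False


placeholder = "{{INJECTION_PLACEHOLDER}}"

# The whole placeholder sits in one run, so it is replaced.
plain = ["I am applying for the {{INJECTION_PLACEHOLDER}} position."]
plain_replaced = replace_in_single_run(plain, placeholder, "AI Red Team Engineer")

# Mixed formatting split the placeholder across runs, so no single run contains it.
split = ["I am applying for the {{INJECTION_", "PLACEHOLDER}}", " position."]
split_replaced = replace_in_single_run(split, placeholder, "AI Red Team Engineer")

print(plain_replaced, plain)
print(split_replaced, split)
```

In practice, typing the placeholder in one go with uniform formatting is usually enough to keep it within a single run.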

Direct Word Document Generation (No Template)

This mode converts plain text strings directly into .docx files.

from pyrit.prompt_converter import WordDocConverter

word_doc_converter = PromptConverterConfiguration.from_converters(converters=[WordDocConverter()])

converter_config = AttackConverterConfig(
    request_converters=word_doc_converter,
)

attack = PromptSendingAttack(
    objective_target=prompt_target,
    attack_converter_config=converter_config,
)

result = await attack.execute_async(objective="This is a simple test string for Word document generation.")  # type: ignore
await output_attack_async(result)
TextTarget: user: ./dbdata/prompt-memory-entries/binaries/1779403000064406.docx

════════════════════════════════════════════════════════════════════════════════════════════════════
                                  ❓ ATTACK RESULT: UNDETERMINED ❓                                   
════════════════════════════════════════════════════════════════════════════════════════════════════

 Attack Summary 
────────────────────────────────────────────────────────────────────────────────────────────────────
  📋 Basic Information
    • Objective: This is a simple test string for Word document generation.
    • Attack Type: PromptSendingAttack
    • Conversation ID: 172b3107-93fa-4443-b155-3d94f41e79ea

  ⚡ Execution Metrics
    • Turns Executed: 1
    • Execution Time: 38ms

  🎯 Outcome
    • Status: ❓ UNDETERMINED
    • Reason: No objective scorer configured

 Conversation History with Objective Target 
────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
🔹 Turn 1 - USER
────────────────────────────────────────────────────────────────────────────────────────────────────
   Original:
  This is a simple test string for Word document generation.

   Converted:
  ./dbdata/prompt-
      memory-entries\binaries\1779403000064406.docx

────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
                            Report generated at: 2026-05-21 22:36:40 UTC                            

Placeholder-Based Injection into Existing Word Documents

This mode replaces a literal placeholder string in an existing .docx file with the prompt content.

import tempfile
from pathlib import Path

from docx import Document  # type: ignore[import-untyped]

# Create a sample Word document with a placeholder
with tempfile.NamedTemporaryFile(delete=False, suffix=".docx") as tmp_file:
    doc = Document()
    doc.add_paragraph("Dear Hiring Manager,")
    doc.add_paragraph("I am writing to apply for the {{INJECTION_PLACEHOLDER}} position.")
    doc.add_paragraph("Sincerely, Applicant")
    doc.save(tmp_file.name)
    template_docx_path = Path(tmp_file.name)

word_doc_converter = PromptConverterConfiguration.from_converters(
    converters=[
        WordDocConverter(
            existing_docx=template_docx_path,
            placeholder="{{INJECTION_PLACEHOLDER}}",
        )
    ]
)

converter_config = AttackConverterConfig(
    request_converters=word_doc_converter,
)

attack = PromptSendingAttack(
    objective_target=prompt_target,
    attack_converter_config=converter_config,
)

result = await attack.execute_async(objective="AI Red Team Engineer")  # type: ignore
await output_attack_async(result)
TextTarget: user: ./dbdata/prompt-memory-entries/binaries/1779403000154543.docx

════════════════════════════════════════════════════════════════════════════════════════════════════
                                  ❓ ATTACK RESULT: UNDETERMINED ❓                                   
════════════════════════════════════════════════════════════════════════════════════════════════════

 Attack Summary 
────────────────────────────────────────────────────────────────────────────────────────────────────
  📋 Basic Information
    • Objective: AI Red Team Engineer
    • Attack Type: PromptSendingAttack
    • Conversation ID: f68f05e7-bb47-4a9d-93b1-aae35b0ba497

  ⚡ Execution Metrics
    • Turns Executed: 1
    • Execution Time: 54ms

  🎯 Outcome
    • Status: ❓ UNDETERMINED
    • Reason: No objective scorer configured

 Conversation History with Objective Target 
────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
🔹 Turn 1 - USER
────────────────────────────────────────────────────────────────────────────────────────────────────
   Original:
  AI Red Team Engineer

   Converted:
  ./dbdata/prompt-
      memory-entries\binaries\1779403000154543.docx

────────────────────────────────────────────────────────────────────────────────────────────────────

────────────────────────────────────────────────────────────────────────────────────────────────────
                            Report generated at: 2026-05-21 22:36:40 UTC