This notebook shows how to generate GCG (Zou et al., 2023) suffixes using Azure Machine Learning (AML). The process consists of three main steps:
Connect to an Azure Machine Learning (AML) workspace.
Create AML Environment with the Python dependencies.
Submit a training job to AML.
Connect to Azure Machine Learning Workspace
The workspace is the top-level resource for Azure Machine Learning (AML), providing a centralized place to work with all the artifacts you create when using AML. In this section, we will connect to the workspace in which the job will be run.
To connect to a workspace, we need three identifiers: a subscription ID, a resource group, and a workspace name. We pass these to the MLClient from azure.ai.ml to get a handle to the required AML workspace. This tutorial authenticates with AzureCliCredential, so sign in with az login before running it.
import os
from pyrit.setup.initialization import _load_environment_files
_load_environment_files(env_files=None)
subscription_id = os.environ.get("AZURE_ML_SUBSCRIPTION_ID")
resource_group = os.environ.get("AZURE_ML_RESOURCE_GROUP")
workspace = os.environ.get("AZURE_ML_WORKSPACE_NAME")
print(workspace)
Found default environment files: ['./.pyrit/.env', './.pyrit/.env.local']
Loaded environment file: ./.pyrit/.env
Loaded environment file: ./.pyrit/.env.local
gcg-romanlutz
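Because os.environ.get returns None for an unset variable, a missing setting only surfaces later when the client is used. A minimal sketch of an early check follows; the helper name missing_settings is hypothetical and not part of PyRIT or the Azure SDK, and only the three variable names come from the cell above.

```python
import os
from typing import Mapping

# Variable names taken from the cell above.
REQUIRED_VARS = (
    "AZURE_ML_SUBSCRIPTION_ID",
    "AZURE_ML_RESOURCE_GROUP",
    "AZURE_ML_WORKSPACE_NAME",
)


def missing_settings(env: Mapping[str, str], names=REQUIRED_VARS) -> list[str]:
    """Return the names that are absent or empty in env (hypothetical helper)."""
    return [name for name in names if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print(f"Missing workspace settings: {', '.join(missing)}")
else:
    print("All workspace settings are present.")
```

Running this right after loading the environment files makes a typo in a .env file visible before any Azure call is attempted.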
The Azure ML SDK emits a fair amount of telemetry to stderr that looks
alarming but is benign: every operation logs an ActivityCompleted: ... HowEnded=Failure line for any expected UserError (such as
create_or_update finding the environment already at the latest version),
and every preview / experimental class prints a one-line warning. Quiet
all of it so the rest of the notebook output stays focused on what
actually matters.
import logging
import warnings
logging.getLogger("azure.ai.ml").setLevel(logging.ERROR)
warnings.filterwarnings("ignore", module=r"azure\.ai\.ml.*")
from azure.ai.ml import MLClient
from azure.identity import AzureCliCredential
ml_client = MLClient(AzureCliCredential(), subscription_id, resource_group, workspace)
Class DeploymentTemplateOperations: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
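The logger level set above stays in effect for the rest of the session. If you would rather silence the SDK only around specific calls, one option is a small context manager. This is a sketch using only the standard library; quiet_logger is a hypothetical name, and whether it catches every SDK message depends on which loggers the SDK actually writes to.

```python
import logging
from contextlib import contextmanager


@contextmanager
def quiet_logger(name: str, level: int = logging.ERROR):
    """Temporarily raise a logger's threshold, restoring the old level on exit."""
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        logger.setLevel(previous)


with quiet_logger("azure.ai.ml") as sdk_logger:
    # WARNING is below ERROR, so this record is dropped inside the block.
    sdk_logger.warning("suppressed inside the block")
    level_inside = sdk_logger.level
```

Wrapping only the noisy calls keeps warnings visible everywhere else, at the cost of a little extra indentation.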
Create AML Environment
To install the dependencies needed to run GCG, we create an AML environment from a
Dockerfile. The Dockerfile uses
an NVIDIA CUDA base image with Python 3.11 and installs PyRIT with the gcg extra.
from pathlib import Path
from azure.ai.ml.entities import BuildContext, Environment
from pyrit.common.path import HOME_PATH
# Configure the AML environment. The build context is the repo root so the Dockerfile
# can COPY pyproject.toml and pyrit/ for pip install -e ".[gcg]"
env_docker_context = Environment(
    build=BuildContext(
        path=Path(HOME_PATH),
        dockerfile_path="pyrit/auxiliary_attacks/gcg/src/Dockerfile",
    ),
    name="pyrit-gcg",
    description="PyRIT GCG environment: CUDA 12.1 + Python 3.11 + pip install -e .[gcg]",
    tags={"Owner": os.environ.get("USER", "unknown")},
)
ml_client.environments.create_or_update(env_docker_context)
ActivityCompleted: Activity=Datastore.ListSecrets, HowEnded=Failure, Duration=731.02 [ms], Exception=HttpResponseError, ErrorCategory=UserError, ErrorMessage=(UserError) No secrets for credentials of type None.
Code: UserError
Message: No secrets for credentials of type None.
Additional Information:Type: ComponentName
Info: {
"value": "managementfrontend"
}Type: Correlation
Info: {
"value": {
"operation": "d83f8c4d225dee5d56c301c18e298f59",
"request": "c537217eb2b56149"
}
}Type: Environment
Info: {
"value": "westus3"
}Type: Location
Info: {
"value": "westus3"
}Type: Time
Info: {
"value": "2026-05-09T12:49:18.18528+00:00"
}
ActivityCompleted: Activity=Environment.CreateOrUpdate, HowEnded=Failure, Duration=33839.37 [ms], Exception=ResourceExistsError, ErrorCategory=UserError, ErrorMessage=(UserError) Environment pyrit-gcg with version 10 is already registered and cannot be changed.
Code: UserError
Message: Environment pyrit-gcg with version 10 is already registered and cannot be changed.
ActivityCompleted: Activity=Datastore.ListSecrets, HowEnded=Failure, Duration=348.1 [ms], Exception=HttpResponseError, ErrorCategory=UserError, ErrorMessage=(UserError) No secrets for credentials of type None.
Code: UserError
Message: No secrets for credentials of type None.
Additional Information:Type: ComponentName
Info: {
"value": "managementfrontend"
}Type: Correlation
Info: {
"value": {
"operation": "66a3d036ffde9abfa617b61d00bd6214",
"request": "139566989f2c3f74"
}
}Type: Environment
Info: {
"value": "westus3"
}Type: Location
Info: {
"value": "westus3"
}Type: Time
Info: {
"value": "2026-05-09T12:49:49.3263735+00:00"
}
Environment({'arm_type': 'environment_version', 'latest_version': None, 'image': None, 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'pyrit-gcg', 'description': 'PyRIT GCG environment: CUDA 12.1 + Python 3.11 + pip install -e .[gcg]', 'tags': {'Owner': 'unknown'}, 'properties': {'azureml.labels': 'latest'}, 'print_as_yaml': False, 'id': '/subscriptions/db1ba766-2ca3-42c6-a19a-0f0d43134a8c/resourceGroups/gcg-romanlutz/providers/Microsoft.MachineLearningServices/workspaces/gcg-romanlutz/environments/pyrit-gcg/versions/11', 'Resource__source_path': '', 'base_path': './git/PyRIT-wt-gcg-refactor/doc/code/auxiliary_attacks', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x000001310E5816A0>, 'serialize': <msrest.serialization.Serializer object at 0x00000131121C2E40>, 'version': '11', 'conda_file': None, 'build': <azure.ai.ml.entities._assets.environment.BuildContext object at 0x000001310FECE210>, 'inference_config': None, 'os_type': 'Linux', 'conda_file_path': None, 'path': None, 'datastore': None, 'upload_hash': None, 'translated_conda_file': None})
Submit Training Job to AML
Finally, we configure the command to run the GCG algorithm. The entry point is
pyrit.auxiliary_attacks.gcg.experiments.run,
invoked as a module so the uploaded code snapshot takes priority over the
Docker-installed package (Python’s -m flag puts the cwd at the front of sys.path).
We also have to specify a GPU compute target. In our experience, a GPU instance with at least 24 GB of VRAM is required (e.g., Standard_NC24ads_A100_v4).
Depending on the compute instance you use, you may still encounter “out of memory” errors.
In that case, we recommend training on a smaller model or lowering n_train_data or batch_size.
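The effect of -m on sys.path described above can be checked in isolation, without Azure ML or PyRIT. The sketch below writes a throwaway module (probe_module is a made-up name) into a temporary directory, runs it with python -m from that directory, and compares the first sys.path entry with the working directory. It assumes PYTHONSAFEPATH is not in effect, since that setting (Python 3.11+) turns this behavior off.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def resolved_first_path_entry(workdir: Path) -> str:
    """Run a throwaway module via `python -m` from workdir; return its sys.path[0]."""
    (workdir / "probe_module.py").write_text(
        "import os, sys\n"
        # An empty first entry also means "current directory".
        "print(os.path.realpath(sys.path[0] or os.getcwd()))\n"
    )
    # PYTHONSAFEPATH disables the cwd entry, so drop it for this demo.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, "-m", "probe_module"],
        cwd=workdir,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    cwd_is_first = resolved_first_path_entry(workdir) == os.path.realpath(workdir)

print(f"Working directory is first on sys.path: {cwd_is_first}")
```

This is why a package in the uploaded snapshot shadows a same-named package installed in the image: imports are resolved against the working directory before site-packages.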
from azure.ai.ml import Output, command
job = command(
    code=Path(HOME_PATH),
    command=(
        "python -m pyrit.auxiliary_attacks.gcg.experiments.run"
        " --model_name llama_2"
        " --setup single"
        " --n_train_data 5"
        " --n_test_data 0"
        " --n_steps 5"
        " --batch_size 64"
        " --output_dir ${{outputs.results}}"
    ),
    inputs={},
    outputs={"results": Output(type="uri_folder")},
    environment=f"{env_docker_context.name}:{env_docker_context.version}",
    environment_variables={"HUGGINGFACE_TOKEN": os.environ["HUGGINGFACE_TOKEN"]},
    compute="gcg-gpu-a100",
    display_name="gcg_suffix_generation",
    description="Generate adversarial suffixes using GCG on Llama-2.",
    tags={"Owner": os.environ.get("USER", "unknown")},
)
returned_job = ml_client.create_or_update(job)
print(f"Job: {returned_job.name}")
print(f"Status: {returned_job.status}")
print(f"Studio URL: {returned_job.studio_url}")
Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
pathOnCompute is not a known attribute of class <class 'azure.ai.ml._restclient.v2023_04_01_preview.models._models_py3.UriFolderJobOutput'> and will be ignored
Job: stoic_parcel_6clfs67hp9
Status: Starting
Studio URL: https://ml.azure.com/runs/stoic_parcel_6clfs67hp9?wsid=/subscriptions/db1ba766-2ca3-42c6-a19a-0f0d43134a8c/resourcegroups/gcg-romanlutz/workspaces/gcg-romanlutz&tid=72f988bf-86f1-41af-91ab-2d7cd011db47
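If a job runs out of memory, resubmitting with a smaller n_train_data or batch_size is easier when the command string is built from parameters rather than edited by hand. A minimal sketch, assuming only the flags used in the cell above; build_gcg_command is a hypothetical helper, and the reduced values (2 and 32) are arbitrary illustrations, not tuned recommendations.

```python
def build_gcg_command(
    model_name: str = "llama_2",
    setup: str = "single",
    n_train_data: int = 5,
    n_test_data: int = 0,
    n_steps: int = 5,
    batch_size: int = 64,
) -> str:
    """Assemble the job's shell command from keyword arguments (hypothetical helper)."""
    flags = {
        "model_name": model_name,
        "setup": setup,
        "n_train_data": n_train_data,
        "n_test_data": n_test_data,
        "n_steps": n_steps,
        "batch_size": batch_size,
    }
    parts = ["python -m pyrit.auxiliary_attacks.gcg.experiments.run"]
    parts += [f"--{name} {value}" for name, value in flags.items()]
    # Plain string on purpose: AML substitutes this placeholder at run time.
    parts.append("--output_dir ${{outputs.results}}")
    return " ".join(parts)


default_command = build_gcg_command()
smaller_command = build_gcg_command(n_train_data=2, batch_size=32)
print(smaller_command)
```

The returned string can be passed as the command argument of command(...) in place of the literal used above.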
Wait for the Job to Complete and Inspect the Generated Suffix
The next cell polls the job until it reaches a terminal state (~20-30
minutes for the small 5-step baseline above), then downloads the named
results output and prints the final suffix. The runner writes its
result file as individual_behaviors_<model>_gcg_<timestamp>.json into
the directory Azure ML mounted for the results output, so it ends up
under <download_dir>/named-outputs/results/ once we download. The
controls array in that file contains one entry per training step, and
the last entry is the final adversarial suffix that, appended to the user
prompt, was optimized to elicit the target response.
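The polling step described above can also be written as a reusable function with an upper bound on the wait, which the notebook's own loop does not have. The sketch below is independent of Azure ML: wait_for_terminal_state is a hypothetical helper, the one-hour default timeout is an arbitrary assumption, and the status source is injected as a callable so the logic can be exercised with a stand-in.

```python
import time
from typing import Callable

# Same set of states the notebook's loop treats as terminal.
TERMINAL_STATES = {"Completed", "Failed", "Canceled", "CancelRequested"}


def wait_for_terminal_state(
    get_status: Callable[[], str],
    poll_seconds: float = 60.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it returns a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    last_status = None
    while True:
        status = get_status()
        if status != last_status:
            # Only print when the status changes, to keep the output short.
            print(f"Job status: {status}", flush=True)
            last_status = status
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in state {status!r} after {timeout_seconds} s")
        sleep(poll_seconds)


# Stand-in for `lambda: ml_client.jobs.get(returned_job.name).status`.
fake_statuses = iter(["Queued", "Running", "Running", "Completed"])
final_status = wait_for_terminal_state(lambda: next(fake_statuses), poll_seconds=0.0)
```

Injecting the status callable keeps the function testable offline; in the notebook it would wrap the ml_client.jobs.get call.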
import json
import tempfile
import time
from pathlib import Path
_TERMINAL_STATES = {"Completed", "Failed", "Canceled", "CancelRequested"}
last_status = None
while True:
    current_status = ml_client.jobs.get(returned_job.name).status
    if current_status != last_status:
        print(f"Job status: {current_status}", flush=True)
        last_status = current_status
    if current_status in _TERMINAL_STATES:
        break
    time.sleep(60)
assert current_status == "Completed", f"Job did not complete successfully: {current_status}"
download_dir = Path(tempfile.mkdtemp(prefix="gcg-aml-"))
ml_client.jobs.download(name=returned_job.name, download_path=str(download_dir), all=True)
result_files = list(download_dir.rglob("individual_behaviors_*_gcg_*.json"))
if not result_files:
    print(f"No GCG result file found under {download_dir}. Files captured:")
    for p in sorted(download_dir.rglob("*")):
        if p.is_file():
            print(f" {p.relative_to(download_dir)}")
    raise FileNotFoundError("Result JSON not in downloaded artifacts")
result_file = result_files[0]
with open(result_file) as f:
    log = json.load(f)
final_suffix = log["controls"][-1] if log["controls"] else None
final_loss = log["losses"][-1] if log["losses"] else None
print(f"Result file: {result_file.name}")
print(f"Steps run: {len(log['controls'])}")
print(f"Final loss: {final_loss}")
print(f"Generated suffix: {final_suffix!r}")
Job status: Preparing
Job status: Queued
Job status: Running
Job status: Completed
Downloading artifact azureml://subscriptions/db1ba766-2ca3-42c6-a19a-0f0d43134a8c/resourcegroups/gcg-romanlutz/workspaces/gcg-romanlutz/datastores/workspaceblobstore/paths/azureml/stoic_parcel_6clfs67hp9/results/ to ./AppData/Local/Temp/gcg-aml-e4x7zwr5/named-outputs/results
ActivityCompleted: Activity=Datastore.ListSecrets, HowEnded=Failure, Duration=350.53 [ms], Exception=HttpResponseError, ErrorCategory=UserError, ErrorMessage=(UserError) No secrets for credentials of type None.
Code: UserError
Message: No secrets for credentials of type None.
Additional Information:Type: ComponentName
Info: {
"value": "managementfrontend"
}Type: Correlation
Info: {
"value": {
"operation": "c6de26a5821f768ace8e59f5b75601d7",
"request": "f84ff3aa7b1f0557"
}
}Type: Environment
Info: {
"value": "westus3"
}Type: Location
Info: {
"value": "westus3"
}Type: Time
Info: {
"value": "2026-05-09T13:18:13.3919233+00:00"
}
Downloading artifact azureml://datastores/workspaceartifactstore/ExperimentRun/dcid.stoic_parcel_6clfs67hp9 to ./AppData/Local/Temp/gcg-aml-e4x7zwr5/artifacts
Result file: individual_behaviors_llama_2_gcg_20260509-131606.json
Steps run: 30
Final loss: 0.87353515625
Generated suffix: '! ! }; Kingdom ! ! ! Move ! ! [former [ [ [ [ [ [ [ ['
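The last entry of controls is not necessarily the one with the lowest loss. As a rough sketch, the result dictionary can be summarized to report both the final and the lowest-loss suffix. This assumes that losses[i] corresponds to controls[i], which the notebook does not confirm, so the helper only pairs them when the two lists have the same length. summarize_gcg_log is a hypothetical name, and the example values are synthetic.

```python
def summarize_gcg_log(log: dict) -> dict:
    """Summarize a loaded result dict with 'controls' and 'losses' lists."""
    controls = log.get("controls", [])
    losses = log.get("losses", [])
    summary = {
        "steps": len(controls),
        "final_suffix": controls[-1] if controls else None,
        "final_loss": losses[-1] if losses else None,
        "best_suffix": None,
        "best_loss": None,
    }
    # Assumption: losses[i] belongs to controls[i]; only pair them when lengths match.
    if controls and len(controls) == len(losses):
        best_index = min(range(len(losses)), key=losses.__getitem__)
        summary["best_suffix"] = controls[best_index]
        summary["best_loss"] = losses[best_index]
    return summary


# Synthetic values for illustration only, not real job output.
example_log = {"controls": ["! ! a", "! b a", "c b a"], "losses": [2.5, 1.0, 1.5]}
summary = summarize_gcg_log(example_log)
print(summary)
```

With the notebook's variables, calling summarize_gcg_log(log) after json.load would give the same final suffix and loss as the cell above, plus the lowest-loss pair when the assumption holds.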
- Zou, A., Wang, Z., Carlini, N., Nasr, M., Kolter, J. Z., & Fredrikson, M. (2023). Universal and Transferable Adversarial Attacks on Aligned Language Models. arXiv Preprint arXiv:2307.15043. https://arxiv.org/abs/2307.15043