Skip to content

OpenTelemetry Trace Export

waza can emit OpenTelemetry traces for every eval run so you can analyze agent behavior in the same observability stack you use for the rest of your applications. The integration is opt-in — no telemetry leaves your machine unless you pass --otel-exporter.

Each run produces a tree of spans rooted at the eval invocation. Tool and model calls captured by the engine become child spans on the turn that produced them.

graph TD
Eval["eval (spec.id, model)"] --> Task1["task (task.id)"]
Eval --> Task2["task (task.id)"]
Task1 --> Turn1["turn (kind=initial)"]
Turn1 --> Tool1["tool_call (gen_ai.tool.name)"]
Turn1 --> Model1["model_call (gen_ai.request.model)"]
Task2 --> Turn2["turn (kind=initial)"]
Turn2 --> Tool2["tool_call"]

Attributes follow the OpenTelemetry GenAI semantic conventions (gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.tool.name, gen_ai.tool.call.id, …) so backends that already understand GenAI traces render them out of the box.

FlagDescription
--otel-exporterExporter to use: otlp, stdout, or file. Omit (default) to keep telemetry off.
--otel-endpointOTLP/HTTP endpoint (e.g. localhost:4318 or https://collector.example.com). Only used with --otel-exporter=otlp.
--otel-headersComma-separated key=value pairs added to OTLP requests (useful for auth).
--otel-fileOutput path for newline-delimited span JSON when --otel-exporter=file.
--otel-include-payloadsInclude prompts, tool arguments, tool results, and completions in spans. Off by default — only sha256 + byte length are emitted.

The fastest way to see waza traces is the standard OTel Collector container listening on the default OTLP/HTTP port.

docker-compose.yaml
services:
otel-collector:
image: otel/opentelemetry-collector-contrib:latest
command: ["--config=/etc/otelcol/config.yaml"]
volumes:
- ./otel-config.yaml:/etc/otelcol/config.yaml
ports:
- "4318:4318" # OTLP HTTP
otel-config.yaml
receivers:
otlp:
protocols:
http:
exporters:
debug:
verbosity: detailed
service:
pipelines:
traces:
receivers: [otlp]
exporters: [debug]

Bring it up, then run waza pointed at it:

Terminal window
docker compose up -d
waza run ./evals/my-skill/eval.yaml \
--otel-exporter otlp \
--otel-endpoint localhost:4318

Spans appear in the collector’s stdout. Swap the debug exporter for otlphttp, jaeger, tempo, azuremonitor, or any other backend.

For quick inspection without a collector:

Terminal window
# Stream spans as JSON to stderr
waza run eval.yaml --otel-exporter stdout
# Write spans to a file (newline-delimited JSON)
waza run eval.yaml --otel-exporter file --otel-file traces.json

By default waza strips prompt, tool-argument, tool-result, and completion content from spans — only the SHA-256 digest and byte length are recorded under per-slot keys so multiple redacted payloads on the same span (e.g. prompt + completion, or tool arguments + result) do not collide:

SlotHash keyLength key
Promptwaza.prompt.sha256waza.prompt.length
Completionwaza.completion.sha256waza.completion.length
Tool argumentswaza.tool.arguments.sha256waza.tool.arguments.length
Tool resultwaza.tool.result.sha256waza.tool.result.length

This keeps user prompts and tool outputs out of your trace backend.

To capture full payloads (e.g. for debugging in a private collector), pass:

Terminal window
waza run eval.yaml --otel-exporter otlp --otel-include-payloads

When enabled, the following attributes carry full content instead of the hash/length surrogates:

  • gen_ai.prompt — user/system prompt sent to the model
  • gen_ai.completion — model completion text
  • gen_ai.tool.arguments — JSON arguments passed to a tool
  • gen_ai.tool.result — tool execution output

Some collectors require API keys. Use --otel-headers:

Terminal window
waza run eval.yaml \
--otel-exporter otlp \
--otel-endpoint https://collector.example.com \
--otel-headers "x-api-key=$OTEL_API_KEY,x-team=skills"
  • The exporter is best-effort: failures to flush at shutdown log a warning but do not fail the run.
  • When --otel-exporter is unset, the OTel SDK is not initialized — there is zero overhead for users who do not opt in.
  • waza calls SetGlobal on the configured tracer provider so engines or libraries that look up the global provider (for trace ID propagation into the Copilot SDK or HTTP clients) attach to the same trace.