Utility References
ID
agentlightning.utils.id.generate_id(length)
Generate a random ID of the given length.
Parameters:
- length (int) – The length of the ID to generate.
Returns:
- str – A random ID of the given length.
Metrics
agentlightning.utils.metrics.MetricsBackend
Abstract base class for metrics backends.
has_prometheus()
Check if the backend has prometheus support.
inc_counter(name, amount=1.0, labels=None)
async
Increments a registered counter.
Parameters:
- name (str) – Metric name (must be registered as a counter).
- amount (float, default: 1.0) – Increment amount.
- labels (Optional[LabelDict], default: None) – Label values.
Raises:
- ValueError – If the metric is not registered, has the wrong type, or label keys do not match the registered label names.
observe_histogram(name, value, labels=None)
async
Records an observation for a registered histogram.
Parameters:
- name (str) – Metric name (must be registered as a histogram).
- value (float) – Observed value.
- labels (Optional[LabelDict], default: None) – Label values.
Raises:
- ValueError – If the metric is not registered, has the wrong type, or label keys do not match the registered label names.
register_counter(name, label_names=None, group_level=None)
Registers a counter metric.
Parameters:
- name (str) – Metric name.
- label_names (Optional[Sequence[str]], default: None) – List of label names. Order determines the truncation priority for group-level logging.
- group_level (Optional[int], default: None) – Optional per-metric grouping depth for backends that support label grouping (Console). Global backend settings take precedence when provided.
Raises:
- ValueError – If the metric is already registered with a different type or label set.
register_histogram(name, label_names=None, buckets=None, group_level=None)
Registers a histogram metric.
Parameters:
- name (str) – Metric name.
- label_names (Optional[Sequence[str]], default: None) – List of label names. Order determines the truncation priority for group-level logging.
- buckets (Optional[Sequence[float]], default: None) – Bucket boundaries (exclusive upper bounds). If None, the backend may choose defaults.
- group_level (Optional[int], default: None) – Optional per-metric grouping depth for backends that support label grouping (Console). Global backend settings take precedence when provided.
Raises:
- ValueError – If the metric is already registered with a different type or label set.
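The register-then-update contract described for MetricsBackend can be illustrated with a toy in-memory stand-in. `ToyCounterBackend` and its `total` helper are hypothetical names invented for this sketch; it mirrors only the documented counter signatures and ValueError conditions and does not subclass or import the real backend.

```python
import asyncio
from typing import Dict, Optional, Sequence, Tuple

LabelDict = Dict[str, str]  # assumption: labels map label names to string values


class ToyCounterBackend:
    """Hypothetical stand-in that mimics the documented counter contract."""

    def __init__(self) -> None:
        self._label_names: Dict[str, Tuple[str, ...]] = {}
        self._values: Dict[Tuple[str, Tuple[str, ...]], float] = {}

    def register_counter(self, name: str, label_names: Optional[Sequence[str]] = None) -> None:
        names = tuple(label_names or ())
        existing = self._label_names.get(name)
        if existing is not None and existing != names:
            raise ValueError(f"{name} is already registered with a different label set")
        self._label_names[name] = names

    async def inc_counter(self, name: str, amount: float = 1.0, labels: Optional[LabelDict] = None) -> None:
        if name not in self._label_names:
            raise ValueError(f"{name} is not registered")
        labels = labels or {}
        if set(labels) != set(self._label_names[name]):
            raise ValueError(f"label keys for {name} do not match the registered label names")
        # Order label values by the registered label names so the key is stable.
        key = (name, tuple(labels[k] for k in self._label_names[name]))
        self._values[key] = self._values.get(key, 0.0) + amount

    def total(self, name: str) -> float:
        """Sum a counter across all label combinations."""
        return sum(v for (metric, _), v in self._values.items() if metric == name)


async def main() -> float:
    backend = ToyCounterBackend()
    backend.register_counter("requests_total", label_names=["method"])
    await backend.inc_counter("requests_total", labels={"method": "GET"})
    await backend.inc_counter("requests_total", amount=2.0, labels={"method": "GET"})
    return backend.total("requests_total")


print(asyncio.run(main()))
```

Registration is synchronous while updates are coroutines, matching the signatures listed above.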
agentlightning.utils.metrics.ConsoleMetricsBackend
Bases: MetricsBackend
Console backend with sliding-window aggregations and label grouping.
This backend:
- Requires explicit metric registration.
- Stores timestamped events per (metric_name, labels) key.
- Computes rate and percentiles (P50, P95, P99) over a sliding time window.
- Uses a single global logging decision: when logging is triggered, it logs all metric groups, not just the one being updated.
Rate is always per second.
Label grouping: When logging, label dictionaries are truncated to the first group_level label pairs (following the registered label order), and metrics with identical truncated labels are aggregated together. For example:
labels = {"method": "GET", "path": "/", "status": "200"}
group_level = 2  # aggregated labels: {"method": "GET", "path": "/"}
If group_level is None or < 1, all label combinations for a metric are merged into a single log entry (equivalent to grouping by zero labels).
Individual counters or histograms can set their own group_level during registration; those values apply only when the backend-level group_level is unset, allowing selective overrides.
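The truncation rule described above can be sketched in a few lines. The helpers `truncate_labels` and `group_counts` are hypothetical names for this illustration and are not part of the documented API; the sketch assumes every event carries all registered labels.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

GroupKey = Tuple[Tuple[str, str], ...]


def truncate_labels(labels: Dict[str, str], label_names: Sequence[str], group_level: Optional[int]) -> GroupKey:
    """Keep the first `group_level` labels in registered order; none if unset or < 1."""
    if group_level is None or group_level < 1:
        return ()
    return tuple((name, labels[name]) for name in label_names[:group_level])


def group_counts(
    events: List[Tuple[Dict[str, str], float]],
    label_names: Sequence[str],
    group_level: Optional[int],
) -> Dict[GroupKey, float]:
    """Sum event amounts whose truncated labels are identical."""
    grouped: Dict[GroupKey, float] = defaultdict(float)
    for labels, amount in events:
        grouped[truncate_labels(labels, label_names, group_level)] += amount
    return dict(grouped)


names = ["method", "path", "status"]
events = [
    ({"method": "GET", "path": "/", "status": "200"}, 1.0),
    ({"method": "GET", "path": "/", "status": "500"}, 1.0),
    ({"method": "POST", "path": "/", "status": "200"}, 1.0),
]
print(group_counts(events, names, group_level=2))
print(group_counts(events, names, group_level=None))
```

With `group_level=2` the two GET events share a group because they differ only in `status`; with `group_level=None` everything collapses into a single group keyed by the empty tuple.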
Thread-safety: Runtime updates and snapshotting use two aiologic locks: one for mutating shared state and another that serializes the global logging decision and snapshot capture so that other tasks can continue writing. Metric registration is assumed to happen during initialization, so it is intentionally left lock-free to avoid blocking writes unnecessarily.
__init__(window_seconds=60.0, log_interval_seconds=10.0, group_level=None)
Initializes ConsoleMetricsBackend.
Parameters:
- window_seconds (Optional[float], default: 60.0) – Sliding window size (in seconds) used when computing rate and percentiles. If None, all in-memory events are used.
- log_interval_seconds (float, default: 10.0) – Minimum time (in seconds) between log bursts. When the interval elapses, the next metric event triggers a snapshot and logging of all metrics.
- group_level (Optional[int], default: None) – Label grouping depth. When logging, only the first group_level labels (following registered order) are retained and metric events sharing those labels are aggregated. If None or < 1, all label combinations collapse into a single group per metric.
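To make the `window_seconds` parameter concrete, here is a minimal sketch of a per-second rate and percentiles computed over a sliding window. The function names, the nearest-rank percentile method, and the inclusive window boundary are all assumptions made for illustration; the reference does not state how the backend computes these values internally.

```python
import math
from typing import List, Optional, Sequence, Tuple

Event = Tuple[float, float]  # (timestamp in seconds, observed value)


def events_in_window(events: Sequence[Event], now: float, window_seconds: Optional[float]) -> List[Event]:
    """Keep events no older than `window_seconds`; keep everything if it is None."""
    if window_seconds is None:
        return list(events)
    cutoff = now - window_seconds
    return [event for event in events if event[0] >= cutoff]


def rate_per_second(events: Sequence[Event], now: float, window_seconds: float) -> float:
    """Number of events in the window divided by the window length."""
    return len(events_in_window(events, now, window_seconds)) / window_seconds


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    if not values:
        raise ValueError("no observations in the window")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# One event per second for 100 seconds, with values 1..100.
events = [(float(t), float(t + 1)) for t in range(100)]
recent = [value for _, value in events_in_window(events, now=100.0, window_seconds=60.0)]
print(rate_per_second(events, now=100.0, window_seconds=60.0))
print(percentile(recent, 50), percentile(recent, 95), percentile(recent, 99))
```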
inc_counter(name, amount=1.0, labels=None)
async
Increments a registered counter metric.
See base class for behavior and error conditions.
observe_histogram(name, value, labels=None)
async
Records an observation for a registered histogram metric.
See base class for behavior and error conditions.
register_counter(name, label_names=None, group_level=None)
Registers a counter metric.
See base class for argument documentation.
register_histogram(name, label_names=None, buckets=None, group_level=None)
Registers a histogram metric.
See base class for argument documentation.
agentlightning.utils.metrics.PrometheusMetricsBackend
Bases: MetricsBackend
Metrics backend that forwards events to prometheus_client.
All metrics must be registered before use. This backend does not compute any aggregations; it only updates Prometheus metrics.
Thread-safety: Registration is protected by a lock. Metric updates assume metrics are registered during initialization and then remain stable.
Due to the nature of Prometheus, this backend is only suitable for recording high-volume metrics. Low-volume metrics might be lost if the event has only appeared once.
__init__()
Initializes PrometheusMetricsBackend.
Raises:
- ImportError – If prometheus_client is not installed.
has_prometheus()
Check if the backend has prometheus support.
inc_counter(name, amount=1.0, labels=None)
async
Increments a registered Prometheus counter.
observe_histogram(name, value, labels=None)
async
Records an observation for a registered Prometheus histogram.
register_counter(name, label_names=None, group_level=None)
Registers a Prometheus counter metric.
register_histogram(name, label_names=None, buckets=None, group_level=None)
Registers a Prometheus histogram metric.
agentlightning.utils.metrics.MultiMetricsBackend
Bases: MetricsBackend
Metrics backend that forwards calls to multiple underlying backends.
__init__(backends)
Initializes MultiMetricsBackend.
Parameters:
- backends (Sequence[MetricsBackend]) – Sequence of underlying backends.
Raises:
- ValueError – If no backends are provided.
has_prometheus()
Check if the backend has prometheus support.
inc_counter(name, amount=1.0, labels=None)
async
Increments a counter metric in all underlying backends.
observe_histogram(name, value, labels=None)
async
Records a histogram observation in all underlying backends.
register_counter(name, label_names=None, group_level=None)
Registers a counter metric in all underlying backends.
register_histogram(name, label_names=None, buckets=None, group_level=None)
Registers a histogram metric in all underlying backends.
agentlightning.utils.metrics.setup_multiprocess_prometheus()
Set up prometheus multiprocessing directory if not already configured.
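As a rough illustration of "set up if not already configured", the sketch below creates a directory and exports it through the `PROMETHEUS_MULTIPROC_DIR` environment variable, which is the variable `prometheus_client` reads in multiprocess mode. The helper name `ensure_multiproc_dir` and the use of a temporary directory are assumptions; the real function may choose the location differently.

```python
import os
import tempfile

# Environment variable read by prometheus_client in multiprocess mode.
ENV_VAR = "PROMETHEUS_MULTIPROC_DIR"


def ensure_multiproc_dir() -> str:
    """Return the configured directory, creating and exporting one if unset."""
    existing = os.environ.get(ENV_VAR)
    if existing:
        # Already configured: only make sure the directory exists.
        os.makedirs(existing, exist_ok=True)
        return existing
    path = tempfile.mkdtemp(prefix="prometheus_multiproc_")
    os.environ[ENV_VAR] = path
    return path


print(ensure_multiproc_dir())
```

Calling the helper a second time returns the same path, so it is safe to invoke from several entry points.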
agentlightning.utils.metrics.get_prometheus_registry()
Get the appropriate prometheus registry based on multiprocessing configuration.
agentlightning.utils.metrics.shutdown_metrics(server=None, worker=None, *args, **kwargs)
Shutdown prometheus metrics.
Server Launcher
agentlightning.utils.server_launcher.PythonServerLauncher
Unified launcher for FastAPI, using uvicorn or gunicorn per mode/worker count.
See PythonServerLauncherArgs for configuration options.
Parameters:
- app (FastAPI) – The FastAPI app to launch.
- args (PythonServerLauncherArgs) – The configuration for the server.
- serve_context (Optional[AsyncContextManager[Any]], default: None) – An optional context manager to apply around the server startup.
access_endpoint
property
Return a loopback-friendly URL so health checks succeed even when binding to 0.0.0.0.
endpoint
property
Return the externally advertised host:port pair regardless of accessibility.
health_url
property
Build the absolute health-check endpoint from args, if one is configured.
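The difference between the three properties can be sketched with plain string helpers. Everything here is an assumption made for illustration: the function names, the `http` scheme, and the idea that the health check is expressed as a path appended to the loopback-friendly URL. The real properties read their values from the launcher's args.

```python
from typing import Optional


def advertised_endpoint(host: str, port: int) -> str:
    """URL built from the advertised host, whether or not it is reachable locally."""
    return f"http://{host}:{port}"


def loopback_friendly_endpoint(host: str, port: int) -> str:
    """Swap the wildcard bind address for loopback so local checks can connect."""
    # 0.0.0.0 means "all interfaces" when binding; it is not a reliable client target.
    reachable = "127.0.0.1" if host == "0.0.0.0" else host
    return f"http://{reachable}:{port}"


def build_health_url(base_url: str, healthcheck_path: Optional[str]) -> Optional[str]:
    """Join the base URL and health-check path, or return None if none is configured."""
    if not healthcheck_path:
        return None
    return base_url.rstrip("/") + "/" + healthcheck_path.lstrip("/")


base = loopback_friendly_endpoint("0.0.0.0", 8000)
print(advertised_endpoint("0.0.0.0", 8000))
print(base)
print(build_health_url(base, "/health"))
```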
__getstate__()
Control pickling to prevent server state from being sent to subprocesses.
__init__(app, args, serve_context=None)
Initialize the launcher with the FastAPI app, configuration, and optional serve context.
is_running()
Return True if the server has been started and not yet stopped.
reload()
async
Restart the server by stopping it if necessary and invoking start again.
run_forever()
async
Start the server and block the caller until it exits, respecting the configured mode.
start()
async
Starts the server according to launch_mode and n_workers.
stop()
async
Stop the server using the inverse of whatever launch mode was used to start it.
agentlightning.utils.server_launcher.PythonServerLauncherArgs
dataclass
access_host = None
class-attribute
instance-attribute
The hostname or IP address to advertise to the client. If not provided, the server will use the default outbound IPv4 address for this machine.
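One common way to find a machine's default outbound IPv4 address is to connect a UDP socket and read back the local address the operating system selected. The sketch below shows that technique; it is not confirmed to be what the launcher does, and the helper name, the probe address, and the loopback fallback are assumptions.

```python
import socket


def default_outbound_ipv4(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Best-effort guess at the local address used for outbound traffic."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # Connecting a UDP socket sends no packets; it only selects a route.
            sock.connect((probe_host, probe_port))
            return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, an offline machine): fall back to loopback.
        return "127.0.0.1"


print(default_outbound_ipv4())
```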
access_log = False
class-attribute
instance-attribute
Whether to turn on access logs.
healthcheck_url = None
class-attribute
instance-attribute
The health check URL to use. If not provided, the server will not be checked for healthiness after starting.
host = None
class-attribute
instance-attribute
The hostname or IP address to bind the server to.
kill_unhealthy_server = True
class-attribute
instance-attribute
Whether to kill the server if it is not healthy after startup.
This setting is ignored when launch_mode is not asyncio.
launch_mode = 'asyncio'
class-attribute
instance-attribute
The launch mode. asyncio (the default) runs the server in the current thread.
thread runs the server in a separate thread. mp runs the server in a separate process.
log_level = logging.INFO
class-attribute
instance-attribute
The log level to use.
n_workers = 1
class-attribute
instance-attribute
The number of workers to run in the server. Only applicable for mp mode.
When n_workers > 1, the server will be run using Gunicorn.
port = None
class-attribute
instance-attribute
The TCP port to listen on. If not provided, the server will use a random available port.
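A standard way to obtain a random available port is to bind to port 0 and let the operating system choose. The helper below is a sketch of that technique under the assumption that the launcher does something similar; `pick_free_port` is a hypothetical name.

```python
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "any free port"
        return sock.getsockname()[1]


print(pick_free_port())
```

The port is released when the socket closes, so another process could claim it before the server binds. Code that depends on the port should tolerate a bind failure and retry.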
process_join_timeout = 10.0
class-attribute
instance-attribute
The timeout to wait for the process to join.
startup_timeout = 60.0
class-attribute
instance-attribute
The timeout to wait for the server to start up.
thread_join_timeout = 10.0
class-attribute
instance-attribute
The timeout to wait for the thread to join.
timeout_keep_alive = 30
class-attribute
instance-attribute
The timeout to keep the connection alive.
agentlightning.utils.server_launcher.LaunchMode = Literal['asyncio', 'thread', 'mp']
module-attribute
The launch mode for the server.
OpenTelemetry
agentlightning.utils.otel.full_qualified_name(obj)
agentlightning.utils.otel.get_tracer_provider(inspect=True)
Get the OpenTelemetry tracer provider configured for Agent Lightning.
Parameters:
- inspect (bool, default: True) – Whether to inspect the tracer provider and log its configuration. When it's on, make sure you also set the logger level to DEBUG to see the logs.
agentlightning.utils.otel.get_tracer(use_active_span_processor=True)
Resolve the OpenTelemetry tracer configured for Agent Lightning.
Parameters:
- use_active_span_processor (bool, default: True) – Whether to use the active span processor.
Returns:
- Tracer – OpenTelemetry tracer tagged with the agentlightning instrumentation name.
Raises:
- RuntimeError – If OpenTelemetry was not initialized before calling this helper.
agentlightning.utils.otel.make_tag_attributes(tags)
agentlightning.utils.otel.extract_tags_from_attributes(attributes)
Extract tag attributes from flattened span attributes.
Parameters:
- attributes (Dict[str, Any]) – A dictionary of flattened span attributes.
agentlightning.utils.otel.make_link_attributes(links)
agentlightning.utils.otel.query_linked_spans(spans, links)
Query spans that are linked by the given link attributes.
Parameters:
- spans (Sequence[T_SpanLike]) – A sequence of spans to search.
- links (List[LinkPydanticModel]) – A list of link attributes to match.
Returns:
- List[T_SpanLike] – A list of spans that match the given link attributes.
agentlightning.utils.otel.extract_links_from_attributes(attributes)
Extract link attributes from flattened span attributes.
Parameters:
- attributes (Dict[str, Any]) – A dictionary of flattened span attributes.
agentlightning.utils.otel.filter_attributes(attributes, prefix)
Filter attributes that start with the given prefix.
The attribute must start with prefix. or be exactly prefix to be included.
Parameters:
- attributes (Dict[str, Any]) – A dictionary of span attributes.
- prefix (str) – The prefix to filter by.
Returns:
- Dict[str, Any] – A dictionary of attributes that start with the given prefix.
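The matching rule above (a key is kept if it equals the prefix or starts with the prefix followed by a dot) can be sketched as a dictionary comprehension. `sketch_filter_attributes` is a hypothetical reimplementation written only from that description.

```python
from typing import Any, Dict


def sketch_filter_attributes(attributes: Dict[str, Any], prefix: str) -> Dict[str, Any]:
    """Keep keys equal to `prefix` or starting with `prefix` plus a dot."""
    dotted = prefix + "."
    return {key: value for key, value in attributes.items() if key == prefix or key.startswith(dotted)}


attrs = {"tag": 0, "tag.env": "prod", "tags.other": 1, "link.0.key": "x"}
print(sketch_filter_attributes(attrs, "tag"))
```

Requiring the dot keeps `tags.other` out of the result even though it shares the leading characters `tag`.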
agentlightning.utils.otel.filter_and_unflatten_attributes(attributes, prefix)
Filter attributes that start with the given prefix and unflatten them. The prefix will be removed during unflattening.
Parameters:
- attributes (Dict[str, Any]) – A dictionary of span attributes.
- prefix (str) – The prefix to filter by.
Returns:
- Union[Dict[str, Any], List[Any]] – A nested dictionary or list of attributes that start with the given prefix.
agentlightning.utils.otel.flatten_attributes(nested_data, *, expand_leaf_lists=False)
Flatten a nested dictionary or list into a flat dictionary with dotted keys.
This function recursively traverses dictionaries and lists, producing a flat key-value mapping where nested paths are represented via dot-separated keys. Lists are indexed numerically.
Example:
>>> flatten_attributes({"a": {"b": 1, "c": [2, 3]}}, expand_leaf_lists=True)
{"a.b": 1, "a.c.0": 2, "a.c.1": 3}
Parameters:
- nested_data (Union[Dict[str, Any], List[Any]]) – A nested structure composed of dictionaries, lists, or primitive values.
- expand_leaf_lists (bool, default: False) – Whether to expand lists composed only of primitive values. When False (the default), lists of str/int/float/bool are treated as leaf values and stored without enumerating their indices.
Returns:
- Dict[str, Any] – A flat dictionary mapping dotted-string paths to primitive values.
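The recursive traversal and the effect of `expand_leaf_lists` can be sketched as below. This is a reimplementation from the description only, under the name `sketch_flatten`; how the real function treats edge cases such as empty lists is not stated in the reference and is not claimed here.

```python
from typing import Any, Dict, List, Union

_PRIMITIVES = (str, int, float, bool)


def _is_primitive_list(value: Any) -> bool:
    return isinstance(value, list) and all(isinstance(item, _PRIMITIVES) for item in value)


def sketch_flatten(nested: Union[Dict[str, Any], List[Any]], *, expand_leaf_lists: bool = False) -> Dict[str, Any]:
    """Flatten nested dicts and lists into a dict with dot-separated keys."""
    flat: Dict[str, Any] = {}

    def visit(node: Any, path: str) -> None:
        if isinstance(node, dict):
            for key, child in node.items():
                visit(child, f"{path}.{key}" if path else str(key))
        elif isinstance(node, list) and (expand_leaf_lists or not _is_primitive_list(node)):
            # Lists are indexed numerically, as in the documented example.
            for index, child in enumerate(node):
                visit(child, f"{path}.{index}" if path else str(index))
        else:
            flat[path] = node  # primitive value, or a primitive list kept as a leaf

    visit(nested, "")
    return flat


data = {"a": {"b": 1, "c": [2, 3]}}
print(sketch_flatten(data, expand_leaf_lists=True))
print(sketch_flatten(data))
```

With the default `expand_leaf_lists=False`, the list `[2, 3]` stays intact under the key `a.c`; lists that contain dictionaries are always expanded.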
agentlightning.utils.otel.unflatten_attributes(flat_data)
Reconstruct a nested dictionary/list structure from a flat dictionary.
Keys are dot-separated paths. Segments that are digit strings will only become list indices if all keys in that dict form a consecutive 0..n-1 range. Otherwise they remain dict keys.
Example:
>>> unflatten_attributes({"a.b": 1, "a.c.0": 2, "a.c.1": 3})
{"a": {"b": 1, "c": [2, 3]}}
Parameters:
- flat_data (Dict[str, Any]) – A dictionary whose keys are dot-separated paths and whose values are primitive data elements.
Returns:
- Union[Dict[str, Any], List[Any]] – A nested dictionary (and lists where appropriate) corresponding to the flattened structure.
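The rule that digit segments become list indices only when they form a consecutive 0..n-1 range can be sketched in two passes: build nested dictionaries first, then convert qualifying dictionaries to lists. `sketch_unflatten` is a hypothetical reimplementation from the description; it does not handle conflicting keys such as `a` and `a.b` appearing together.

```python
from typing import Any, Dict, List, Union


def _is_index(key: str) -> bool:
    """True for canonical non-negative integers such as "0" or "12" (not "01")."""
    return key.isascii() and key.isdigit() and str(int(key)) == key


def _listify(node: Any) -> Any:
    if not isinstance(node, dict):
        return node
    converted = {key: _listify(child) for key, child in node.items()}
    if converted and all(_is_index(key) for key in converted):
        indices = sorted(int(key) for key in converted)
        if indices == list(range(len(indices))):
            # Keys form exactly 0..n-1, so this level becomes a list.
            return [converted[str(index)] for index in indices]
    return converted


def sketch_unflatten(flat: Dict[str, Any]) -> Union[Dict[str, Any], List[Any]]:
    """Rebuild nested dicts and lists from dot-separated keys."""
    root: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = root
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return _listify(root)


print(sketch_unflatten({"a.b": 1, "a.c.0": 2, "a.c.1": 3}))
print(sketch_unflatten({"a.0": "x", "a.2": "y"}))
```

In the second call the indices 0 and 2 leave a gap, so `a` stays a dictionary with string keys instead of becoming a list.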
agentlightning.utils.otel.sanitize_attribute_value(object, force=True)
Sanitize an attribute value to be a valid OpenTelemetry attribute value.
agentlightning.utils.otel.sanitize_attributes(attributes, force=True)
Sanitize a dictionary of attributes so that it is a valid set of OpenTelemetry attributes.
Parameters:
- attributes (Dict[str, Any]) – A dictionary of attributes to sanitize.
- force (bool, default: True) – Whether to force sanitization even when the value is not JSON serializable.
agentlightning.utils.otel.sanitize_list_attribute_sanity(maybe_list)
Try to sanitize a list so that it becomes a valid OpenTelemetry attribute value.
Raises an error if the list contains more than one type of primitive value.
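OpenTelemetry array attributes must be homogeneous: every element has to be the same primitive type (string, boolean, integer, or float). The sketch below shows one way to enforce that. The function name and the choice of `TypeError` and `ValueError` are assumptions, since the reference does not say which exception the real helper raises.

```python
from typing import Any, List, Sequence

_PRIMITIVE_TYPES = (str, bool, int, float)


def check_homogeneous(values: Sequence[Any]) -> List[Any]:
    """Return the values as a list if they all share one primitive type."""
    # type() is used instead of isinstance() so that bool and int count as different types.
    kinds = {type(value) for value in values}
    unsupported = sorted(kind.__name__ for kind in kinds if kind not in _PRIMITIVE_TYPES)
    if unsupported:
        raise TypeError(f"unsupported element types: {unsupported}")
    if len(kinds) > 1:
        raise ValueError(f"mixed element types: {sorted(kind.__name__ for kind in kinds)}")
    return list(values)


print(check_homogeneous([1, 2, 3]))
print(check_homogeneous(("a", "b")))
```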
agentlightning.utils.otel.check_attributes_sanity(attributes)
Check whether a dictionary of attributes is a valid set of OpenTelemetry attributes.
agentlightning.utils.otel.format_exception_attributes(exception)
Format an exception into a dictionary of attributes.
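A plausible shape for such a dictionary follows the OpenTelemetry semantic conventions for exceptions, which define the keys `exception.type`, `exception.message`, and `exception.stacktrace`. Whether the real helper emits exactly these keys is an assumption; `sketch_exception_attributes` is a hypothetical name.

```python
import traceback
from typing import Dict


def sketch_exception_attributes(exc: BaseException) -> Dict[str, str]:
    """Describe an exception using OpenTelemetry-style attribute keys."""
    stacktrace = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return {
        "exception.type": type(exc).__name__,
        "exception.message": str(exc),
        "exception.stacktrace": stacktrace,
    }


try:
    raise ValueError("bad input")
except ValueError as error:
    attributes = sketch_exception_attributes(error)

print(attributes["exception.type"], attributes["exception.message"])
```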
OTLP
agentlightning.utils.otlp.handle_otlp_export(request, request_message_cls, response_message_cls, message_callback, signal_name)
async
Generic handler for /v1/traces, /v1/metrics, /v1/logs.
Convert the OTLP Protobuf request to a JSON-like object.
agentlightning.utils.otlp.spans_from_proto(request, sequence_id_bulk_issuer)
async
Parse an OTLP proto payload into List[Span].
A store is needed here for generating a sequence ID for each span.
System Snapshot
agentlightning.utils.system_snapshot.system_snapshot(include_gpu=False)
Capture a snapshot of the system's hardware and software information.
Parameters:
- include_gpu (bool, default: False) – Whether to include GPU information.
Returns:
- Dict[str, Any] – A dictionary containing the system's hardware and software information.
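To give a sense of what such a dictionary might contain, the sketch below gathers a few fields from the standard library. The key names and the selection of fields are assumptions, not the real return schema, and GPU probing is left out because it typically needs vendor tooling.

```python
import os
import platform
from typing import Any, Dict


def sketch_system_snapshot() -> Dict[str, Any]:
    """Collect basic OS, CPU, and Python details (key names are illustrative)."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
        "cpu_count": os.cpu_count(),
    }


snapshot = sketch_system_snapshot()
for key, value in snapshot.items():
    print(f"{key}: {value}")
```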