Agent Lightning Core
Client Side
agentlightning.litagent
LitAgent
Bases: Generic[T]
Base class for the training and validation logic of an agent.
Developers should subclass this class and implement the rollout methods to define the agent's behavior for a single task. The agent's logic is completely decoupled from the server communication and training infrastructure.
Source code in agentlightning/litagent/litagent.py
runner
property
Convenient shortcut for self.get_runner().
tracer
property
Convenient shortcut for self.get_tracer().
trainer
property
Convenient shortcut for self.get_trainer().
__init__(*, trained_agents=None)
Initialize the LitAgent.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
trained_agents | Optional[str] | Optional string representing the trained agents. This can be used to track which agents have been trained by this instance. Deprecated. Configure | None |
Source code in agentlightning/litagent/litagent.py
get_runner()
Get the runner for this agent.
Returns:
Type | Description |
---|---|
BaseRunner[T] | The runner instance associated with this agent. |
Source code in agentlightning/litagent/litagent.py
get_tracer()
Get the tracer for this agent.
Returns:
Type | Description |
---|---|
BaseTracer | The BaseTracer instance associated with this agent. |
get_trainer()
Get the trainer for this agent.
Returns:
Type | Description |
---|---|
Trainer | The Trainer instance associated with this agent. |
Source code in agentlightning/litagent/litagent.py
is_async()
Check if the agent implements asynchronous rollout methods. Override this method for customized async detection logic.
Returns:
Type | Description |
---|---|
bool | True if the agent has custom async rollout methods, False otherwise. |
Source code in agentlightning/litagent/litagent.py
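One plausible way to implement this kind of detection is to check whether a subclass has replaced the base class's async rollout method with its own coroutine function. The sketch below is a minimal illustration under that assumption. `BaseAgent`, `SyncAgent`, and `AsyncAgent` are hypothetical stand-ins, not the library's classes, and the actual LitAgent logic may differ.

```python
import inspect


class BaseAgent:
    """Hypothetical stand-in for an agent base class (not the library's LitAgent)."""

    def training_rollout(self, task):
        raise NotImplementedError

    async def training_rollout_async(self, task):
        raise NotImplementedError

    def is_async(self) -> bool:
        # Assumption: an agent counts as async when a subclass overrides
        # the async rollout method with its own coroutine function.
        method = type(self).training_rollout_async
        overridden = method is not BaseAgent.training_rollout_async
        return overridden and inspect.iscoroutinefunction(method)


class SyncAgent(BaseAgent):
    def training_rollout(self, task):
        return 1.0


class AsyncAgent(BaseAgent):
    async def training_rollout_async(self, task):
        return 1.0
```

A subclass with unusual needs (for example, one that wraps another agent) could override `is_async` directly instead of relying on this inspection.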
on_rollout_end(task, rollout, runner, tracer)
Hook called after a rollout completes.
Deprecated in favor of on_rollout_end in the Hook interface.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | Task | The :class: | required |
rollout | RolloutV2 | The resulting :class: | required |
runner | BaseRunner[T] | The :class: | required |
tracer | BaseTracer | The tracer instance associated with the runner. | required |
Subclasses can override this method for cleanup or additional logging. By default, this is a no-op.
Source code in agentlightning/litagent/litagent.py
on_rollout_start(task, runner, tracer)
Hook called immediately before a rollout begins.
Deprecated in favor of on_rollout_start in the Hook interface.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | Task | The :class: | required |
runner | BaseRunner[T] | The :class: | required |
tracer | BaseTracer | The tracer instance associated with the runner. | required |
Subclasses can override this method to implement custom logic such as logging, metric collection, or resource setup. By default, this is a no-op.
Source code in agentlightning/litagent/litagent.py
rollout(task, resources, rollout)
Main entry point for executing a rollout.
This method determines whether to call the synchronous or asynchronous rollout method based on the agent's implementation.
If you don't want to implement the training rollout and the validation rollout separately, you can implement just rollout, which works for both.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | T | The task object received from the server, containing the input data and metadata. | required |
resources | NamedResources | A dictionary of named resources (e.g., LLMs, prompt templates) for the agent to use. | required |
rollout | RolloutV2 | The full rollout object; please avoid modifying it directly. Most agents should only use | required |
Returns:
Type | Description |
---|---|
RolloutRawResultV2 | The result of the rollout, which can be one of: |
Source code in agentlightning/litagent/litagent.py
rollout_async(task, resources, rollout)
async
Asynchronous version of the main rollout method.
This method determines whether to call the synchronous or asynchronous rollout method based on the agent's implementation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | T | The task object received from the server, containing the input data and metadata. | required |
resources | NamedResources | A dictionary of named resources (e.g., LLMs, prompt templates) for the agent to use. | required |
rollout | RolloutV2 | The full rollout object; please avoid modifying it directly. Most agents should only use | required |
Returns:
Type | Description |
---|---|
RolloutRawResultV2 | The result of the rollout, which can be one of: |
Source code in agentlightning/litagent/litagent.py
set_runner(runner)
Set the runner for this agent.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
runner | BaseRunner[T] | The runner instance that will handle the execution of rollouts. | required |
set_trainer(trainer)
Set the trainer for this agent.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
trainer | Trainer | The Trainer instance that will handle training and validation. | required |
training_rollout(task, resources, rollout)
Defines the agent's behavior for a single training task.
This method should contain the logic for how the agent processes an input, uses the provided resources (like LLMs or prompts), and produces a result.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | T | The task object received from the server, containing the input data and metadata. | required |
resources | NamedResources | A dictionary of named resources (e.g., LLMs, prompt templates) for the agent to use. | required |
rollout | RolloutV2 | The full rollout object; please avoid modifying it directly. | required |
Source code in agentlightning/litagent/litagent.py
training_rollout_async(task, resources, rollout)
async
Asynchronous version of training_rollout.
This method should be implemented by agents that perform asynchronous operations (e.g., non-blocking I/O, concurrent API calls).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | T | The task object received from the server. | required |
resources | NamedResources | A dictionary of named resources for the agent to use. | required |
rollout | RolloutV2 | The full rollout object; avoid modifying it. | required |
Returns:
Type | Description |
---|---|
RolloutRawResultV2 | The result of the asynchronous training rollout. See possible return types. |
Source code in agentlightning/litagent/litagent.py
validation_rollout(task, resources, rollout)
Defines the agent's behavior for a single validation task.
By default, this method redirects to training_rollout. Override it if the agent should behave differently during validation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | T | The task object received from the server, containing the input data and metadata. | required |
resources | NamedResources | A dictionary of named resources for the agent to use. | required |
rollout | RolloutV2 | The full rollout object; avoid modifying it. | required |
Returns:
Type | Description |
---|---|
RolloutRawResultV2 | The result of the validation rollout. See possible return types. |
Source code in agentlightning/litagent/litagent.py
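The default redirection from validation to training behavior can be pictured as a plain method fallback. The following is a minimal sketch of that pattern only. `BaseAgent`, `EchoAgent`, and `StrictAgent` are hypothetical stand-ins with a simplified one-argument signature, not the library's classes.

```python
class BaseAgent:
    """Hypothetical stand-in showing the validation-to-training fallback."""

    def training_rollout(self, task):
        raise NotImplementedError

    def validation_rollout(self, task):
        # Default: validation reuses the training behavior.
        return self.training_rollout(task)


class EchoAgent(BaseAgent):
    # Only the training rollout is implemented; validation falls back to it.
    def training_rollout(self, task):
        return f"train:{task}"


class StrictAgent(EchoAgent):
    # Overrides validation to behave differently from training.
    def validation_rollout(self, task):
        return f"val:{task}"
```

With this layout, an agent that needs no special validation behavior implements a single method, and one that does need it overrides only the validation path.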
validation_rollout_async(task, resources, rollout)
async
Asynchronous version of validation_rollout.
By default, this method redirects to training_rollout_async. Override it for different asynchronous validation behavior.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task | T | The task object received from the server. | required |
resources | NamedResources | A dictionary of named resources for the agent to use. | required |
rollout | RolloutV2 | The full rollout object; avoid modifying it. | required |
Returns:
Type | Description |
---|---|
RolloutRawResultV2 | The result of the asynchronous validation rollout. See possible return types. |
Source code in agentlightning/litagent/litagent.py
is_v0_1_rollout_api(func)
Check if the rollout API is v0.1 by inspecting the function signature to see whether it has a rollout_id parameter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | Callable[..., Any] | The function to check. | required |
Source code in agentlightning/litagent/litagent.py
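A signature check of this kind can be written with the standard library's `inspect.signature`. The sketch below shows the general technique; `has_rollout_id_param`, `old_style`, and `new_style` are hypothetical names chosen for illustration, and the library's own implementation may differ in detail.

```python
import inspect
from typing import Any, Callable


def has_rollout_id_param(func: Callable[..., Any]) -> bool:
    """Return True if ``func`` declares a parameter named ``rollout_id``."""
    return "rollout_id" in inspect.signature(func).parameters


def old_style(task, rollout_id):
    """Hypothetical v0.1-style rollout function."""
    return None


def new_style(task, resources, rollout):
    """Hypothetical newer-style rollout function."""
    return None
```

Checking by parameter name keeps the detection independent of argument order, at the cost of depending on a naming convention.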
llm_rollout(func=None, *, strip_proxy=True)
Create a FunctionalLitAgent from a function that takes (task, llm[, rollout]).
This decorator allows you to define an agent using a simple function instead of creating a full LitAgent subclass. The returned FunctionalLitAgent instance is callable, preserving the original function's behavior.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | LlmRolloutFunc[T] \| None | A function that defines the agent's behavior. Can be: sync: (task, llm) -> result; sync with rollout: (task, llm, rollout) -> result; async: async (task, llm) -> result; async with rollout: async (task, llm, rollout) -> result | None |
strip_proxy | bool | Whether to strip the ProxyLLM resource into an LLM resource. Defaults to True. | True |
Returns:
Type | Description |
---|---|
FunctionalLitAgent[T] \| Callable[[LlmRolloutFunc[T]], FunctionalLitAgent[T]] | A callable FunctionalLitAgent instance that preserves the original function's type hints and behavior while providing all agent functionality. |
Example

    @llm_rollout
    def my_agent(task, llm):
        # Agent logic here
        return response

    @llm_rollout(strip_proxy=False)
    def my_agent_no_strip(task, llm):
        # Agent logic here
        return response

    # Function is still callable with original behavior
    result = my_agent(task, llm)

    # Agent methods are also available
    result = my_agent.rollout(task, resources, rollout)
Source code in agentlightning/litagent/decorator.py
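The signature `llm_rollout(func=None, *, strip_proxy=True)` follows a common Python pattern for decorators that work both bare and with keyword arguments. The sketch below illustrates that pattern only. `FunctionalAgent` and `llm_rollout_sketch` are hypothetical stand-ins, and nothing here reproduces the library's resource handling.

```python
class FunctionalAgent:
    """Hypothetical callable wrapper standing in for a function-based agent."""

    def __init__(self, func, strip_proxy=True):
        self._func = func
        self.strip_proxy = strip_proxy
        self.__name__ = getattr(func, "__name__", "agent")

    def __call__(self, *args, **kwargs):
        # Calling the agent behaves like calling the original function.
        return self._func(*args, **kwargs)


def llm_rollout_sketch(func=None, *, strip_proxy=True):
    """Decorator usable as @llm_rollout_sketch or @llm_rollout_sketch(...)."""

    def wrap(f):
        return FunctionalAgent(f, strip_proxy=strip_proxy)

    if func is None:
        # Called with arguments: return the real decorator.
        return wrap
    # Called bare: ``func`` is the decorated function itself.
    return wrap(func)


@llm_rollout_sketch
def bare_agent(task, llm):
    return f"{llm}:{task}"


@llm_rollout_sketch(strip_proxy=False)
def configured_agent(task, llm):
    return f"{llm}:{task}"
```

Making `strip_proxy` keyword-only is what lets the decorator tell the two call styles apart: a positional argument can only be the decorated function.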
prompt_rollout(func=None)
Create a FunctionalLitAgent from a function that takes (task, prompt_template[, rollout]).
This decorator is designed for agents that work with tunable prompt templates. It enables a workflow where algorithms manage and optimize the prompt template, while agents consume the template to perform rollouts. This is particularly useful for prompt optimization scenarios.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | PromptRolloutFunc[T] \| None | A function that defines the agent's behavior. Can be: sync: (task, prompt_template) -> result; sync with rollout: (task, prompt_template, rollout) -> result; async: async (task, prompt_template) -> result; async with rollout: async (task, prompt_template, rollout) -> result | None |
Returns:
Type | Description |
---|---|
FunctionalLitAgent[T] \| Callable[[PromptRolloutFunc[T]], FunctionalLitAgent[T]] | A callable FunctionalLitAgent instance that preserves the original function's type hints and behavior while providing all agent functionality. |
Example

    @prompt_rollout
    def my_agent(task, prompt_template):
        # Use the prompt template to generate a response
        messages = prompt_template.format(task=task.input)
        # ... perform rollout with the formatted prompt
        return response

    # Function is still callable with original behavior
    result = my_agent(task, prompt_template)

    # Agent methods are also available
    result = my_agent.rollout(task, resources, rollout)
Source code in agentlightning/litagent/decorator.py
rollout(func)
Create a LitAgent from a function, automatically detecting the appropriate type.
This function inspects the provided callable and creates the appropriate agent type based on its signature. It supports both LLM-based and prompt-template-based agents. The returned agent instance is callable, preserving the original function's behavior and type hints.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | Union[LlmRolloutFunc[T], PromptRolloutFunc[T], Callable[..., Any]] | A function that defines the agent's behavior. Supported signatures: (task, llm[, rollout]) for LLM-based agents; (task, prompt_template[, rollout]) for prompt-template-based agents | required |
Returns:
Type | Description |
---|---|
FunctionalLitAgent[T] | A callable FunctionalLitAgent instance that preserves the original function's type hints and behavior while providing all agent functionality. |
Example

    from openai import OpenAI

    # LLM-based agent
    @rollout
    def my_llm_agent(task, llm):
        client = OpenAI(base_url=llm.endpoint)
        response = client.chat.completions.create(
            model=llm.model,
            messages=[{"role": "user", "content": task.input}],
        )
        return response

    # Prompt-template-based agent
    @rollout
    def my_prompt_agent(task, prompt_template):
        messages = prompt_template.format(task=task.input)
        # ... perform rollout with the formatted prompt
        return response

    # Function is still callable with original behavior
    result = my_llm_agent(task, llm)

    # Agent methods are also available
    result = my_llm_agent.rollout(task, resources, rollout)
Raises:
Type | Description |
---|---|
NotImplementedError | If the function signature doesn't match any known patterns. |
Source code in agentlightning/litagent/decorator.py
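Signature-based dispatch like this can be approximated by looking at the declared parameter names. The sketch below assumes, purely for illustration, that the second parameter's name (`llm` or `prompt_template`) selects the agent kind. `detect_agent_kind` is a hypothetical helper, and the library may use different criteria.

```python
import inspect
from typing import Any, Callable


def detect_agent_kind(func: Callable[..., Any]) -> str:
    """Classify a rollout function by the name of its second parameter."""
    params = list(inspect.signature(func).parameters)
    if len(params) >= 2 and params[1] == "llm":
        return "llm"
    if len(params) >= 2 and params[1] == "prompt_template":
        return "prompt_template"
    # Mirrors the documented behavior of raising on unknown signatures.
    raise NotImplementedError(
        f"Unsupported rollout signature: ({', '.join(params)})"
    )


def llm_agent(task, llm):
    return None


def prompt_agent(task, prompt_template, rollout):
    return None
```

Failing loudly on an unrecognized signature surfaces a misnamed parameter at decoration time rather than during a rollout.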
agentlightning.client
Legacy client for interacting with a legacy Agent Lightning server.
AgentLightningClient
Client for interacting with a version-aware Agent Lightning Server.
This client handles polling for tasks, fetching specific versions of resources (like model configurations), and posting completed rollouts back to the server. It provides both synchronous and asynchronous methods for these operations and includes a cache for resources.
Source code in agentlightning/client.py
__init__(endpoint, poll_interval=5.0, timeout=10.0)
Initializes the AgentLightningClient.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
endpoint | str | The root URL of the Agent Lightning server. | required |
poll_interval | float | The interval in seconds to wait between polling for new tasks. | 5.0 |
timeout | float | The timeout in seconds for HTTP requests. | 10.0 |
Source code in agentlightning/client.py
get_latest_resources()
Fetches the latest available resources from the server synchronously.
Returns:
Type | Description |
---|---|
Optional[ResourcesUpdate] | A ResourcesUpdate object containing the latest resources. |
Source code in agentlightning/client.py
get_latest_resources_async()
async
Fetches the latest available resources from the server asynchronously.
Returns:
Type | Description |
---|---|
Optional[ResourcesUpdate] | A ResourcesUpdate object containing the latest resources. |
Source code in agentlightning/client.py
get_resources_by_id(resource_id)
Fetches a specific version of resources by its ID synchronously, using a cache.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
resource_id | str | The ID of the resources to fetch, usually from a Task's metadata. | required |
Returns:
Type | Description |
---|---|
Optional[ResourcesUpdate] | A ResourcesUpdate object containing the versioned resources, or None if not found. |
Source code in agentlightning/client.py
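Caching versioned resources by ID is safe when a given ID always refers to the same content, so repeated lookups can skip the network. The sketch below is a minimal illustration of that idea. `CachedResourceFetcher` and `fake_fetch` are hypothetical, plain dicts stand in for ResourcesUpdate objects, and not caching missing IDs is an assumption rather than documented client behavior.

```python
from typing import Callable, Dict, List, Optional


class CachedResourceFetcher:
    """Hypothetical cache wrapper around a fetch-by-ID function."""

    def __init__(self, fetch: Callable[[str], Optional[dict]]):
        self._fetch = fetch
        self._cache: Dict[str, dict] = {}

    def get_by_id(self, resource_id: str) -> Optional[dict]:
        if resource_id in self._cache:
            return self._cache[resource_id]
        result = self._fetch(resource_id)
        if result is not None:
            # Assumption: only successful lookups are cached.
            self._cache[resource_id] = result
        return result


calls: List[str] = []


def fake_fetch(resource_id: str) -> Optional[dict]:
    """Stands in for an HTTP request; records each call it receives."""
    calls.append(resource_id)
    return {"resources_id": resource_id} if resource_id == "v1" else None


fetcher = CachedResourceFetcher(fake_fetch)
first = fetcher.get_by_id("v1")
second = fetcher.get_by_id("v1")  # served from the cache
missing = fetcher.get_by_id("v2")
```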
get_resources_by_id_async(resource_id)
async
Fetches a specific version of resources by its ID asynchronously, using a cache.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
resource_id | str | The ID of the resources to fetch, usually from a Task's metadata. | required |
Returns:
Type | Description |
---|---|
Optional[ResourcesUpdate] | A ResourcesUpdate object containing the versioned resources, or None if not found. |
Source code in agentlightning/client.py
poll_next_task()
Polls the server synchronously for the next task until one is available.
Returns:
Type | Description |
---|---|
Optional[Task] | A Task object containing the task details, including the required |
Source code in agentlightning/client.py
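Polling until a task is available amounts to a fetch-and-sleep loop governed by the poll interval. The sketch below shows that loop in isolation. `poll_until_available` is a hypothetical helper, the injectable `sleep` parameter exists only to keep the demo fast, and error handling that a real client would need is omitted.

```python
import time
from typing import Any, Callable, List, Optional


def poll_until_available(
    fetch: Callable[[], Optional[Any]],
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call ``fetch`` repeatedly until it returns something other than None."""
    while True:
        task = fetch()
        if task is not None:
            return task
        # Nothing available yet: wait before asking again.
        sleep(poll_interval)


# Demo: the first two polls find nothing, the third returns a task.
responses = iter([None, None, {"rollout_id": "r1"}])
waits: List[float] = []
task = poll_until_available(lambda: next(responses), 0.5, sleep=waits.append)
```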
poll_next_task_async()
async
Polls the server asynchronously for the next task until one is available.
Returns:
Type | Description |
---|---|
Optional[Task] | A Task object containing the task details. |
Source code in agentlightning/client.py
post_rollout(rollout)
Posts a completed rollout to the server synchronously.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rollout | Rollout | A Rollout object containing the results of a task. | required |
Returns:
Type | Description |
---|---|
Optional[Dict[str, Any]] | The server's JSON response as a dictionary. |
Source code in agentlightning/client.py
post_rollout_async(rollout)
async
Posts a completed rollout to the server asynchronously.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rollout | Rollout | A Rollout object containing the results of a task. | required |
Returns:
Type | Description |
---|---|
Optional[Dict[str, Any]] | The server's JSON response as a dictionary. |
Source code in agentlightning/client.py
DevTaskLoader
Bases: AgentLightningClient
A local task manager for development that provides sample tasks and resources.
This client mocks the server APIs by maintaining a local queue of tasks and resources within the same process. It's designed for development, testing, and scenarios where a full Agent Lightning server is not needed.
The DevTaskLoader overrides the polling and resource fetching methods to return data from local collections instead of making HTTP requests to a remote server.
Source code in agentlightning/client.py
rollouts
property
Return rollouts that have been posted back to the loader.
__init__(tasks, resources, **kwargs)
Initializes the DevTaskLoader with pre-defined tasks and resources.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tasks | Union[List[TaskInput], List[Task]] | Either a List of TaskInput objects or a List of Task objects. | required |
resources | Union[NamedResources, ResourcesUpdate] | Either a NamedResources or a ResourcesUpdate object. | required |
**kwargs | Any | Additional arguments passed to the parent AgentLightningClient. | {} |
Source code in agentlightning/client.py
poll_next_task()
Returns the next task from the local queue.
If tasks are TaskInput objects, assembles them into Task objects. If tasks are already Task objects, returns them directly.
Returns:
Type | Description |
---|---|
Optional[Task] | The next Task object from the local task list. |
Source code in agentlightning/client.py
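The idea of mocking the server with in-process collections can be sketched with a queue for tasks and a list for posted rollouts. The class below is a hypothetical stand-in, not DevTaskLoader itself. It uses plain strings and dicts instead of Task and Rollout objects, and returning None once the queue is empty is an assumption about exhaustion behavior.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Optional


class LocalTaskLoader:
    """Hypothetical in-process replacement for a task server."""

    def __init__(self, tasks: List[Any], resources: Dict[str, Any]):
        self._queue: Deque[Any] = deque(tasks)
        self._resources = resources
        self._rollouts: List[Any] = []

    def poll_next_task(self) -> Optional[Any]:
        # Serve from the local queue instead of making an HTTP request.
        return self._queue.popleft() if self._queue else None

    def get_latest_resources(self) -> Dict[str, Any]:
        return self._resources

    def post_rollout(self, rollout: Any) -> Dict[str, Any]:
        # Keep posted rollouts so tests can inspect them afterwards.
        self._rollouts.append(rollout)
        return {"status": "ok"}

    @property
    def rollouts(self) -> List[Any]:
        return list(self._rollouts)


loader = LocalTaskLoader(["task-1", "task-2"], {"prompt": "Answer: {question}"})
polled = [loader.poll_next_task() for _ in range(3)]
reply = loader.post_rollout({"task": "task-1", "reward": 1.0})
```

Because everything lives in one process, a test can drive an agent through the loader and then assert directly on `loader.rollouts`.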
agentlightning.runner
AgentRunner
Bases: BaseRunner[Any]
Manages the agent's execution loop and integrates with AgentOps.
This class orchestrates the interaction between the agent (LitAgent) and the server (AgentLightningClient). It handles polling for tasks, executing the agent's logic, and reporting results back to the server. If enabled, it will also automatically trace each rollout using AgentOps.
Attributes:
Name | Type | Description |
---|---|---|
agent | | The |
client | | The |
tracer | | The tracer instance for this runner/worker. |
worker_id | | An optional identifier for the worker process. |
max_tasks | | The maximum number of tasks to process before stopping. |
Source code in agentlightning/runner/legacy.py
iter()
Executes the synchronous polling and rollout loop.
Source code in agentlightning/runner/legacy.py
iter_async()
async
Executes the asynchronous polling and rollout loop.
Source code in agentlightning/runner/legacy.py
run()
Polls for one task and runs a single rollout synchronously.
Source code in agentlightning/runner/legacy.py
run_async()
async
Polls for one task and runs a single rollout asynchronously.
Source code in agentlightning/runner/legacy.py
AgentRunnerV2
Bases: BaseRunner[T_task]
Runner implementation for executing agent tasks with distributed support.
This runner manages the complete lifecycle of agent rollout execution, including task polling, resource management, tracing, and hooks. It supports both continuous iteration over tasks from the store and single-step execution.
Attributes:
Name | Type | Description |
---|---|---|
worker_id | Optional[int] | The unique identifier for this worker process. |
Source code in agentlightning/runner/agent.py
__init__(tracer, max_rollouts=None, poll_interval=5.0)
Initialize the agent runner.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tracer | BaseTracer | The tracer instance for recording execution traces and spans. | required |
max_rollouts | Optional[int] | Maximum number of tasks to process in iter() mode. If None, the runner will continue indefinitely until interrupted. | None |
poll_interval | float | Time in seconds to wait between polling attempts when no tasks are available in the store. | 5.0 |
Source code in agentlightning/runner/agent.py
get_agent()
Get the agent instance.
Returns:
Type | Description |
---|---|
LitAgent[T_task] | The LitAgent instance managed by this runner. |
Raises:
Type | Description |
---|---|
ValueError | If the agent has not been initialized via init(). |
Source code in agentlightning/runner/agent.py
get_store()
Get the store instance.
Returns:
Type | Description |
---|---|
LightningStore | The LightningStore instance for this worker. |
Raises:
Type | Description |
---|---|
ValueError | If the store has not been initialized via init_worker(). |
Source code in agentlightning/runner/agent.py
get_worker_id()
Get the formatted worker ID string.
Returns:
Type | Description |
---|---|
str | A formatted string like "Worker-0" if initialized, or "Worker-Unknown" if the worker ID has not been set. |
Source code in agentlightning/runner/agent.py
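The formatting described above is simple, but it has one trap: worker ID 0 is a valid ID that is falsy in Python, so the check has to compare against None explicitly. The sketch below illustrates this. `format_worker_id` is a hypothetical free function, not the runner's method.

```python
from typing import Optional


def format_worker_id(worker_id: Optional[int]) -> str:
    """Format a worker ID for log messages."""
    # Compare against None rather than relying on truthiness,
    # otherwise worker 0 would be reported as unknown.
    if worker_id is None:
        return "Worker-Unknown"
    return f"Worker-{worker_id}"
```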
init(agent, *, hooks=None, **kwargs)
Initialize the runner with the agent.
This sets up the agent-runner relationship, registers hooks, and initializes the tracer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent | LitAgent[T_task] | The LitAgent instance to be managed by this runner. | required |
hooks | Optional[Sequence[Hook]] | Optional sequence of Hook objects to be called at various lifecycle stages (on_trace_start, on_trace_end, on_rollout_start, on_rollout_end). | None |
**kwargs | Any | Additional initialization arguments (currently unused). | {} |
Source code in agentlightning/runner/agent.py
init_worker(worker_id, store, **kwargs)
Initialize the runner for each worker with worker_id and store.
This method is called once per worker in a distributed setup to provide the worker with its ID and store connection.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
worker_id | int | Unique identifier for this worker process. | required |
store | LightningStore | The LightningStore instance for task coordination and data persistence. | required |
**kwargs | Any | Additional worker-specific initialization arguments (currently unused). | {} |
Source code in agentlightning/runner/agent.py
iter(*, event=None)
async
Run the runner, continuously iterating over tasks in the store.
This method polls the store for new rollouts and executes them until:
- The event is set (if provided)
- The max_rollouts limit is reached (if configured)
- No more tasks are available
All exceptions during rollout execution are caught and logged but not propagated, allowing the runner to continue processing subsequent tasks.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
event | Optional[Event] | Optional Event object to signal the runner to stop. The runner will check this event periodically and stop gracefully when set. | None |
Source code in agentlightning/runner/agent.py
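The stopping conditions and the catch-and-log error policy described for iter() can be sketched as a small async loop. This is an illustration under stated assumptions, not the runner's implementation. `iter_sketch`, `next_task`, and `run_rollout` are hypothetical, failed rollouts are assumed to count toward the limit, and the stop event is assumed to expose `is_set()` like `threading.Event`.

```python
import asyncio
import logging
import threading
from typing import Any, Awaitable, Callable, List, Optional, Tuple

logger = logging.getLogger("runner_sketch")


async def iter_sketch(
    next_task: Callable[[], Awaitable[Optional[Any]]],
    run_rollout: Callable[[Any], Awaitable[None]],
    *,
    max_rollouts: Optional[int] = None,
    event: Optional[threading.Event] = None,
) -> int:
    """Process tasks until stopped, limited, or out of work."""
    completed = 0
    while event is None or not event.is_set():
        if max_rollouts is not None and completed >= max_rollouts:
            break
        task = await next_task()
        if task is None:
            break
        try:
            await run_rollout(task)
        except Exception:
            # Log and continue so one bad task does not stop the worker.
            logger.exception("Rollout failed for task %r; continuing", task)
        completed += 1
    return completed


async def _demo() -> Tuple[int, List[str]]:
    queue = ["a", "boom", "b", "c"]
    seen: List[str] = []

    async def next_task() -> Optional[str]:
        return queue.pop(0) if queue else None

    async def run_rollout(task: str) -> None:
        if task == "boom":
            raise RuntimeError("simulated failure")
        seen.append(task)

    count = await iter_sketch(next_task, run_rollout, max_rollouts=3)
    return count, seen


processed, succeeded = asyncio.run(_demo())
```

Swallowing exceptions suits a long-running worker loop; a single-step call such as step() is the better place to let errors reach the caller.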
step(input, *, resources=None, mode=None, event=None)
async
Execute a single task directly, bypassing the task queue.
This method creates a new rollout for the given input and executes it immediately. Unlike iter(), exceptions are propagated to the caller.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | T_task | The task input to be processed by the agent. | required |
resources | Optional[NamedResources] | Optional named resources to be used for this specific task. If provided, a new resources entry will be created in the store. If not provided, the latest resources from the store will be used. | None |
mode | Optional[RolloutMode] | Optional rollout mode ("train" or "validation"). If not provided, the agent's default mode will be used. | None |
event | Optional[Event] | Optional Event object to signal interruption (currently unused but included for interface consistency). | None |
Raises:
Type | Description |
---|---|
Exception | Any exception that occurs during rollout execution will be re-raised to the caller. |
Source code in agentlightning/runner/agent.py
teardown(*args, **kwargs)
Tear down the runner and clean up all resources.
This method resets all internal state including the agent, store, hooks, and worker ID, and calls the tracer's teardown method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | Any | Additional teardown arguments (currently unused). | () |
**kwargs | Any | Additional teardown keyword arguments (currently unused). | {} |
Source code in agentlightning/runner/agent.py
teardown_worker(worker_id, *args, **kwargs)
Tear down the runner for a specific worker.
This method cleans up worker-specific resources and resets the worker ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
worker_id | int | The unique identifier of the worker being torn down. | required |
*args | Any | Additional teardown arguments (currently unused). | () |
**kwargs | Any | Additional teardown keyword arguments (currently unused). | {} |
Source code in agentlightning/runner/agent.py
BaseRunner
Bases: ParallelWorkerBase, Generic[T_task]
Base class for all runners.
This abstract base class defines the interface that all runner implementations must follow. Runners are responsible for executing agent tasks, managing the execution lifecycle, and coordinating with the store.
Source code in agentlightning/runner/base.py
init(agent, **kwargs)
¶
Initialize the runner with the agent.
This method is called once during setup to configure the runner with the agent it will execute.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent | LitAgent[T_task] | The LitAgent instance to be managed by this runner. | required |
**kwargs | Any | Additional initialization arguments specific to the runner implementation. | {} |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses. |
Source code in agentlightning/runner/base.py
init_worker(worker_id, store, **kwargs)
¶
Initialize the runner for each worker with worker_id and store.
This method is called once per worker process in a distributed setup. It provides the worker with its unique ID and the store instance for task coordination.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
worker_id | int | Unique identifier for this worker process. | required |
store | LightningStore | The LightningStore instance for task coordination and data persistence. | required |
**kwargs | Any | Additional worker-specific initialization arguments. | {} |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses. |
Source code in agentlightning/runner/base.py
iter(*, event=None)
async
¶
Run the runner, continuously iterating over tasks in the store.
This method runs in a loop, polling the store for new tasks and executing them until interrupted by the event or when no more tasks are available.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
event | Optional[Event] | Optional Event object that can be used to signal the runner to stop gracefully. When set, the runner should finish its current task and exit the iteration loop. | None |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses. |
Source code in agentlightning/runner/base.py
run(*args, **kwargs)
¶
Undefined method - use iter() or step() instead.
This method is intentionally not implemented as the execution behavior should be defined through iter() for continuous execution or step() for single-task execution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | Any | Unused positional arguments. | () |
**kwargs | Any | Unused keyword arguments. | {} |
Raises:
Type | Description |
---|---|
RuntimeError | Always raised to indicate this method should not be used. |
Source code in agentlightning/runner/base.py
run_context(*, agent, store, hooks=None)
¶
Context manager that initializes the runner on entry and tears it down on exit, so that you can debug the runner without a trainer environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent
|
LitAgent[T_task]
|
The LitAgent instance to be managed by this runner. It should be the same agent that is to be run within the context. |
required |
store
|
LightningStore
|
The LightningStore instance for task coordination and data persistence.
If you don't have one, you can easily create one with |
required |
hooks
|
Optional[Sequence[Hook]]
|
Optional sequence of Hook instances to be used by the runner. Only some runners support hooks. |
None
|
Source code in agentlightning/runner/base.py
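The init/teardown pairing that a context manager of this kind wraps can be sketched as follows. This is a minimal stand-alone illustration, not the library's implementation: `ToyRunner`, its `calls` log, and the fixed `worker_id=0` are all assumptions made for the example.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List


class ToyRunner:
    """Hypothetical stand-in for a runner; it only records lifecycle calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def init(self, agent: Any, **kwargs: Any) -> None:
        self.calls.append("init")

    def init_worker(self, worker_id: int, store: Any, **kwargs: Any) -> None:
        self.calls.append(f"init_worker({worker_id})")

    def teardown_worker(self, worker_id: int) -> None:
        self.calls.append(f"teardown_worker({worker_id})")

    def teardown(self) -> None:
        self.calls.append("teardown")

    @contextmanager
    def run_context(self, *, agent: Any, store: Any) -> Iterator["ToyRunner"]:
        # Set up in lifecycle order, then always tear down in reverse order.
        self.init(agent)
        self.init_worker(0, store)  # worker_id=0 is an assumption for debugging
        try:
            yield self
        finally:
            self.teardown_worker(0)
            self.teardown()


runner = ToyRunner()
with runner.run_context(agent=object(), store=object()) as r:
    r.calls.append("step")  # placeholder for the work done inside the context
print(runner.calls)
```

The `try`/`finally` ensures teardown still runs when the body of the `with` block raises, which is the main reason to prefer a context manager over calling the four methods by hand.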
step(input, *, resources=None, mode=None, event=None)
async
¶
Execute a single task with the given input.
This method provides fine-grained control for executing individual tasks directly, bypassing the store's task queue.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | T_task | The task input to be processed by the agent. | required |
resources | Optional[NamedResources] | Optional named resources to be used for this specific task. If not provided, the latest resources from the store will be used. | None |
mode | Optional[RolloutMode] | Optional rollout mode (e.g., "train", "test"). If not provided, the default mode will be used. | None |
event | Optional[Event] | Optional Event object to signal interruption. When set, the runner may abort the current execution. | None |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses. |
Source code in agentlightning/runner/base.py
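The difference between `iter()` (drain a queue until told to stop) and `step()` (run one input directly) can be illustrated with a toy runner. The sketch below is hypothetical: `ToyQueueRunner`, its in-memory `pending` queue, and the use of `threading.Event` as the stop signal are assumptions, and the "rollout" is just an upper-casing placeholder.

```python
import asyncio
import threading
from collections import deque
from typing import Deque, List, Optional


class ToyQueueRunner:
    """Hypothetical runner; `pending` stands in for the store's task queue."""

    def __init__(self, tasks: List[str]) -> None:
        self.pending: Deque[str] = deque(tasks)
        self.done: List[str] = []

    async def step(self, task_input: str) -> str:
        # Execute one task directly, bypassing the queue.
        await asyncio.sleep(0)  # placeholder for the agent's rollout
        result = task_input.upper()
        self.done.append(result)
        return result

    async def iter(self, *, event: Optional[threading.Event] = None) -> None:
        # Keep pulling tasks until the queue is empty or the event is set.
        while self.pending:
            if event is not None and event.is_set():
                break
            await self.step(self.pending.popleft())

    def run(self, *args: object, **kwargs: object) -> None:
        # Mirrors the documented behavior: run() is intentionally unusable.
        raise RuntimeError("Use iter() or step() instead of run().")


async def demo() -> List[str]:
    runner = ToyQueueRunner(["a", "b"])
    await runner.iter()      # processes the queued tasks
    await runner.step("c")   # processes one ad-hoc input
    return runner.done


print(asyncio.run(demo()))
```

Checking the event only between tasks matches the documented contract for `iter()`, where the runner finishes its current task before exiting the loop.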
teardown(*args, **kwargs)
¶
Clean up runner resources and reset state.
This method is called once during shutdown to clean up any resources allocated during initialization and reset the runner state.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | Any | Additional teardown arguments. | () |
**kwargs | Any | Additional teardown keyword arguments. | {} |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses. |
Source code in agentlightning/runner/base.py
teardown_worker(worker_id, *args, **kwargs)
¶
Clean up worker-specific resources.
This method is called once per worker during shutdown to clean up any resources specific to that worker.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
worker_id | int | The unique identifier of the worker being torn down. | required |
*args | Any | Additional teardown arguments. | () |
**kwargs | Any | Additional teardown keyword arguments. | {} |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses. |
Source code in agentlightning/runner/base.py
agentlightning.trainer
¶
Trainer
¶
Bases: ParallelWorkerBase
Orchestrates the distributed execution of agent rollouts.
The Trainer is responsible for launching one or more worker processes that run the agent's execution loop. It manages multiprocessing, handles graceful shutdown, and serves as the main entry point for running a client-side agent fleet.
Attributes:
Name | Type | Description |
---|---|---|
algorithm |
An instance of |
|
store |
An instance of |
|
runner |
An instance of |
|
initial_resources |
An instance of |
|
n_runners |
Number of agent runners to run in parallel. |
|
max_rollouts |
Maximum number of rollouts to process per runner. If None, workers run until no more rollouts are available. |
|
strategy |
An instance of |
|
tracer |
A tracer instance, or a string pointing to the class full name or a dictionary with a 'type' key
that specifies the class full name and other initialization parameters.
If None, a default |
|
hooks |
A sequence of |
|
adapter |
An instance of |
|
llm_proxy |
An instance of |
|
n_workers |
Number of agent workers to run in parallel. Deprecated in favor of |
|
max_tasks |
Maximum number of tasks to process per runner. Deprecated in favor of |
|
daemon |
Whether worker processes should be daemons. Daemon processes
are terminated automatically when the main process exits. Deprecated.
Only have effect with |
|
triplet_exporter |
An instance of |
|
dev |
None
|
If True, rollouts are run against the dev endpoint provided in |
Source code in agentlightning/trainer/trainer.py
client()
¶
Returns the AgentLightningClient instance.
dev(agent, train_dataset=None, *, val_dataset=None)
¶
Dry run the training loop with a FastAlgorithm and the real runner.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent | LitAgent[T_co] | The LitAgent instance to be trained on. | required |
train_dataset | Optional[Dataset[T_co]] | The dataset to train on. | None |
val_dataset | Optional[Dataset[T_co]] | The dataset to validate on. | None |
Source code in agentlightning/trainer/trainer.py
fit(agent, train_data, *, val_data=None, dev_data=None, dev_backend=None)
¶
Train the agent using the provided data.
Each data argument can be a string URL of an agent-lightning server, an AgentLightningClient instance connected to a server (or mock server), or a dataset. If no algorithm was provided when instantiating the trainer, the data must be a URL or client that connects to a server. Otherwise, a dataset is also allowed and will be passed to the algorithm.
If the algorithm is instantiated and there is no URL/client provided, the algorithm will be responsible for creating a client that will connect to itself. It can also create a mock client if the algorithm does not require a server.
Source code in agentlightning/trainer/trainer.py
fit_v2(agent, train_dataset=None, *, val_dataset=None)
¶
Run the training loop using the configured strategy, store, and runner.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent | LitAgent[T_co] | The LitAgent instance to be trained on. | required |
train_dataset | Optional[Dataset[T_co]] | The dataset to train on. | None |
val_dataset | Optional[Dataset[T_co]] | The dataset to validate on. | None |
Source code in agentlightning/trainer/trainer.py
kill_orphaned_processes()
staticmethod
¶
Kill any orphaned processes that may have been left behind by previous runs. This is useful for cleaning up after crashes or unexpected exits.
Source code in agentlightning/trainer/trainer.py
agentlightning.tracer
¶
AgentOpsTracer
¶
Bases: BaseTracer
Traces agent execution using AgentOps.
This tracer provides functionality to capture execution details using the AgentOps library. It manages the AgentOps client initialization, server setup, and integration with the OpenTelemetry tracing ecosystem.
Attributes:
Name | Type | Description |
---|---|---|
agentops_managed |
Whether to automatically manage |
|
instrument_managed |
Whether to automatically manage instrumentation. When set to false, you will manage the instrumentation yourself and the tracer might not work as expected. |
|
daemon |
Whether the AgentOps server runs as a daemon process.
Only applicable if |
Source code in agentlightning/tracer/agentops.py
get_langchain_callback_handler(tags=None)
¶
Get the Langchain callback handler for integrating with Langchain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tags | List[str] \| None | Optional list of tags to apply to the Langchain callback handler. | None |
Returns:
Type | Description |
---|---|
LangchainCallbackHandler | An instance of the Langchain callback handler. |
Source code in agentlightning/tracer/agentops.py
get_last_trace()
¶
Retrieves the raw list of captured spans from the most recent trace.
Returns:
Type | Description |
---|---|
List[ReadableSpan]
|
A list of OpenTelemetry |
Source code in agentlightning/tracer/agentops.py
trace_context(name=None, *, store=None, rollout_id=None, attempt_id=None)
¶
Starts a new tracing context. This should be used as a context manager.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | Optional[str] | Optional name for the tracing context. | None |
store | Optional[LightningStore] | Optional store to add the spans to. | None |
rollout_id | Optional[str] | Optional rollout ID to add the spans to. | None |
attempt_id | Optional[str] | Optional attempt ID to add the spans to. | None |
Yields:
Type | Description |
---|---|
LightningSpanProcessor | The LightningSpanProcessor instance to collect spans. |
Source code in agentlightning/tracer/agentops.py
BaseTracer
¶
Bases: ParallelWorkerBase
An abstract base class for tracers.
This class defines a standard interface for tracing code execution, capturing the resulting spans, and providing them for analysis. It is designed to be backend-agnostic, allowing for different implementations (e.g., for AgentOps, OpenTelemetry, Docker, etc.).
The primary interaction pattern is through the trace_context context manager, which ensures that traces are properly started and captured, even in the case of exceptions.
A typical workflow:
tracer = YourTracerImplementation()
try:
    with tracer.trace_context(name="my_traced_task"):
        # ... code to be traced ...
        run_my_agent_logic()
except Exception as e:
    print(f"An error occurred: {e}")
# Retrieve the trace data after the context block
spans: list[ReadableSpan] = tracer.get_last_trace()
# Process the trace data only if any spans were captured
if spans:
    rl_triplets = TraceTripletAdapter().adapt(spans)
    # ... do something with the triplets
Source code in agentlightning/tracer/base.py
get_last_trace()
¶
Retrieves the raw list of captured spans from the most recent trace.
Returns:
Type | Description |
---|---|
List[ReadableSpan]
|
A list of OpenTelemetry |
trace_context(name=None, *, store=None, rollout_id=None, attempt_id=None)
¶
Starts a new tracing context. This should be used as a context manager.
The implementation should handle the setup and teardown of tracing for the enclosed code block. It must ensure that any spans generated within the with block are collected and made available via get_last_trace.
If a store is provided, the spans will be added to the store when tracing.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | Optional[str] | The name for the root span of this trace context. | None |
store | Optional[LightningStore] | The store to add the spans to. | None |
rollout_id | Optional[str] | The rollout ID to add the spans to. | None |
attempt_id | Optional[str] | The attempt ID to add the spans to. | None |
Source code in agentlightning/tracer/base.py
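The contract described above (spans from the `with` block stay available through `get_last_trace`, even when the block raises) can be sketched with a toy tracer. Everything here is an assumption for illustration: `ToyTracer` is hypothetical and spans are plain dicts rather than OpenTelemetry `ReadableSpan` objects.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List, Optional


class ToyTracer:
    """Hypothetical tracer; spans are plain dicts, not ReadableSpan objects."""

    def __init__(self) -> None:
        self._last_trace: List[Dict[str, str]] = []

    @contextmanager
    def trace_context(self, name: Optional[str] = None) -> Iterator[List[Dict[str, str]]]:
        spans: List[Dict[str, str]] = [{"name": name or "root"}]
        try:
            yield spans  # the traced block appends its own spans here
        finally:
            # Runs on success and on exceptions, so the trace is never lost.
            self._last_trace = spans

    def get_last_trace(self) -> List[Dict[str, str]]:
        return self._last_trace


tracer = ToyTracer()
try:
    with tracer.trace_context(name="my_traced_task") as spans:
        spans.append({"name": "llm_call"})
        raise ValueError("agent failed")
except ValueError:
    pass

print([s["name"] for s in tracer.get_last_trace()])
```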
trace_run(func, *args, **kwargs)
¶
A convenience wrapper to trace the execution of a single synchronous function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | Callable[..., Any] | The synchronous function to execute and trace. | required |
*args | Any | Positional arguments to pass to the function. | () |
**kwargs | Any | Keyword arguments to pass to the function. | {} |
Returns:
Type | Description |
---|---|
Any | The return value of the function. |
Source code in agentlightning/tracer/base.py
trace_run_async(func, *args, **kwargs)
async
¶
A convenience wrapper to trace the execution of a single asynchronous function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | Callable[..., Awaitable[Any]] | The asynchronous function to execute and trace. | required |
*args | Any | Positional arguments to pass to the function. | () |
**kwargs | Any | Keyword arguments to pass to the function. | {} |
Returns:
Type | Description |
---|---|
Any | The return value of the function. |
Source code in agentlightning/tracer/base.py
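Both convenience wrappers amount to "enter the trace context, call the function, return its result". A minimal sketch, assuming a hypothetical `ToyTracer` whose context only records a name, and assuming the context is named after the wrapped function (the real naming may differ):

```python
import asyncio
from contextlib import contextmanager
from typing import Any, Awaitable, Callable, Iterator, List, Optional


class ToyTracer:
    """Hypothetical tracer that records only the names of traced calls."""

    def __init__(self) -> None:
        self.traced: List[str] = []

    @contextmanager
    def trace_context(self, name: Optional[str] = None) -> Iterator[None]:
        self.traced.append(name or "unnamed")
        yield

    def trace_run(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Synchronous wrapper: trace, call, and return the function's result.
        with self.trace_context(name=func.__name__):
            return func(*args, **kwargs)

    async def trace_run_async(
        self, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any
    ) -> Any:
        # Asynchronous wrapper: the await happens inside the trace context.
        with self.trace_context(name=func.__name__):
            return await func(*args, **kwargs)


def add(a: int, b: int) -> int:
    return a + b


async def add_async(a: int, b: int) -> int:
    await asyncio.sleep(0)
    return a + b


tracer = ToyTracer()
sync_result = tracer.trace_run(add, 1, b=2)
async_result = asyncio.run(tracer.trace_run_async(add_async, 3, 4))
print(sync_result, async_result, tracer.traced)
```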
OtelTracer
¶
Bases: BaseTracer
Tracer that provides a basic OpenTelemetry tracer provider.
This tracer can collect signals such as rewards, but it does not provide other function instrumentation, such as for openai.chat.completion.
Source code in agentlightning/tracer/otel.py
get_last_trace()
¶
Retrieves the raw list of captured spans from the most recent trace.
Returns:
Type | Description |
---|---|
List[ReadableSpan]
|
A list of OpenTelemetry |
Source code in agentlightning/tracer/otel.py
trace_context(name=None, *, store=None, rollout_id=None, attempt_id=None)
¶
Starts a new tracing context. This should be used as a context manager.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | Optional[str] | Optional name for the tracing context. | None |
store | Optional[LightningStore] | Optional store to add the spans to. | None |
rollout_id | Optional[str] | Optional rollout ID to add the spans to. | None |
attempt_id | Optional[str] | Optional attempt ID to add the spans to. | None |
Yields:
Type | Description |
---|---|
LightningSpanProcessor | The LightningSpanProcessor instance to collect spans. |
Source code in agentlightning/tracer/otel.py
agentlightning.reward
¶
emit_reward(reward)
¶
Record a new reward as a new span.
Source code in agentlightning/emitter/reward.py
find_final_reward(spans)
¶
Get the last reward value from a list of spans.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
spans | Sequence[SpanLike] | A list of spans (either ReadableSpan or Span). | required |
Returns:
Type | Description |
---|---|
Optional[float] | The reward value from the last reward span, or None if not found. |
Source code in agentlightning/emitter/reward.py
find_reward_spans(spans)
¶
Find all reward spans in the given list of spans.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
spans | Sequence[SpanLike] | A list of spans (either ReadableSpan or Span). | required |
Returns:
Type | Description |
---|---|
List[SpanLike] | A list of spans whose name matches the reward span name. |
Source code in agentlightning/emitter/reward.py
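The two lookups above reduce to "filter spans by name" and "read the value from the last match". The sketch below uses plain dicts as spans; the span name `"reward"` and the attribute key `"reward"` are assumptions for illustration, not the library's actual constants.

```python
from typing import Any, Dict, List, Optional, Sequence

REWARD_SPAN_NAME = "reward"  # assumed span name
REWARD_KEY = "reward"        # assumed attribute key

ToySpan = Dict[str, Any]


def find_reward_spans(spans: Sequence[ToySpan]) -> List[ToySpan]:
    # Keep only spans whose name matches the reward span name.
    return [s for s in spans if s["name"] == REWARD_SPAN_NAME]


def find_final_reward(spans: Sequence[ToySpan]) -> Optional[float]:
    # The final reward is the value carried by the last reward span, if any.
    reward_spans = find_reward_spans(spans)
    if not reward_spans:
        return None
    return float(reward_spans[-1]["attributes"][REWARD_KEY])


trace = [
    {"name": "llm_call", "attributes": {}},
    {"name": "reward", "attributes": {"reward": 0.2}},
    {"name": "tool_call", "attributes": {}},
    {"name": "reward", "attributes": {"reward": 1.0}},
]
print(len(find_reward_spans(trace)), find_final_reward(trace))
```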
get_reward_value(span)
¶
Get the reward value from a span.
Source code in agentlightning/emitter/reward.py
is_reward_span(span)
¶
reward(fn)
¶
A decorator to wrap a function that computes rewards. It will automatically handle the input and output of the function.
Source code in agentlightning/emitter/reward.py
Server Side¶
agentlightning.server
¶
Legacy server for the Agent Lightning framework. Deprecated in favor of agentlightning.store.
AgentLightningServer
¶
The main SDK class for developers to control the Agent Lightning Server.
This class manages the server lifecycle, task queueing, resources updates, and retrieval of results, providing a simple interface for the optimization logic.
Source code in agentlightning/server.py
__init__(host='127.0.0.1', port=8000, task_timeout_seconds=300.0)
¶
Initializes the server controller.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
host | str | The host to bind the server to. | '127.0.0.1' |
port | int | The port to bind the server to. | 8000 |
task_timeout_seconds | float | Time in seconds after which a claimed task is considered stale and requeued. | 300.0 |
Source code in agentlightning/server.py
get_completed_rollout(rollout_id)
async
¶
Retrieves a specific completed rollout by its ID.
Source code in agentlightning/server.py
poll_completed_rollout(rollout_id, timeout=None)
async
¶
Polls for a completed rollout by its ID, waiting up to timeout seconds.
Source code in agentlightning/server.py
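Polling with a timeout generally follows a check, sleep, re-check loop bounded by a deadline. A minimal sketch under stated assumptions: `poll_completed`, the `completed` dict, the `interval` parameter, and returning `None` on timeout are hypothetical choices, not the server's confirmed behavior.

```python
import asyncio
import time
from typing import Dict, Optional, Tuple


async def poll_completed(
    completed: Dict[str, str],
    rollout_id: str,
    timeout: Optional[float] = None,
    interval: float = 0.01,
) -> Optional[str]:
    # A timeout of None means "wait indefinitely" in this sketch.
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if rollout_id in completed:
            return completed.pop(rollout_id)  # hand over and remove the result
        if deadline is not None and time.monotonic() >= deadline:
            return None  # assumed: report a timeout as None
        await asyncio.sleep(interval)


async def demo() -> Tuple[Optional[str], Optional[str]]:
    completed: Dict[str, str] = {}

    async def finish_later() -> None:
        await asyncio.sleep(0.02)
        completed["r1"] = "done"

    task = asyncio.create_task(finish_later())
    found = await poll_completed(completed, "r1", timeout=1.0)
    missing = await poll_completed(completed, "r2", timeout=0.05)
    await task
    return found, missing


print(asyncio.run(demo()))
```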
queue_task(sample, mode=None, resources_id=None, metadata=None)
async
¶
Adds a task to the queue for a client to process.
Source code in agentlightning/server.py
retrieve_completed_rollouts()
async
¶
Retrieves all available completed trajectories and clears the internal store.
Source code in agentlightning/server.py
run_forever()
async
¶
Runs the server indefinitely until stopped. This is useful when async start and stop methods do not work.
start()
async
¶
Starts the FastAPI server in the background.
stop()
async
¶
Gracefully stops the running FastAPI server.
Source code in agentlightning/server.py
update_resources(resources)
async
¶
Updates the resources, creating a new version and setting it as the latest.
Source code in agentlightning/server.py
ServerDataStore
¶
A centralized, thread-safe, async, in-memory data store for the server's state. This holds the task queue, versioned resources, and completed rollouts.
Source code in agentlightning/server.py
add_task(sample, mode=None, resources_id=None, metadata=None)
async
¶
Adds a new task to the queue with specific metadata and returns its unique ID.
Source code in agentlightning/server.py
get_latest_resources()
async
¶
Safely retrieves the latest version of named resources.
Source code in agentlightning/server.py
get_next_task()
async
¶
Retrieves the next task from the queue without blocking. Returns None if the queue is empty.
Source code in agentlightning/server.py
get_processing_tasks()
¶
get_resources_by_id(resources_id)
async
¶
Safely retrieves a specific version of named resources by its ID.
Source code in agentlightning/server.py
requeue_task(task)
async
¶
Requeues a task that has timed out and removes it from processing.
Source code in agentlightning/server.py
retrieve_completed_rollouts()
async
¶
Retrieves all completed rollouts and clears the store.
Source code in agentlightning/server.py
retrieve_rollout(rollout_id)
async
¶
Safely retrieves a single rollout by its ID, removing it from the store.
Source code in agentlightning/server.py
store_rollout(rollout)
async
¶
Safely stores a completed rollout from a client.
Source code in agentlightning/server.py
update_resources(update)
async
¶
Safely stores a new version of named resources and sets it as the latest.
Source code in agentlightning/server.py
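The queue behavior described for this store (non-blocking claims, and requeueing tasks whose claim has gone stale) can be sketched with a single `asyncio.Lock` guarding the state. `ToyDataStore` and `requeue_stale` are hypothetical names, and sweeping by claim age is an assumption about how a timeout such as `task_timeout_seconds` might be applied.

```python
import asyncio
import time
import uuid
from collections import deque
from typing import Any, Deque, Dict, Optional, Tuple


class ToyDataStore:
    """Hypothetical in-memory store guarded by one asyncio.Lock."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._queue: Deque[Tuple[str, Any]] = deque()
        self._processing: Dict[str, Tuple[Any, float]] = {}

    async def add_task(self, sample: Any) -> str:
        rollout_id = f"rollout-{uuid.uuid4()}"
        async with self._lock:
            self._queue.append((rollout_id, sample))
        return rollout_id

    async def get_next_task(self) -> Optional[Tuple[str, Any]]:
        async with self._lock:
            if not self._queue:
                return None  # non-blocking: report an empty queue immediately
            rollout_id, sample = self._queue.popleft()
            self._processing[rollout_id] = (sample, time.monotonic())
            return rollout_id, sample

    async def requeue_stale(self, timeout_seconds: float) -> int:
        # Move every task claimed at least `timeout_seconds` ago back to the queue.
        now = time.monotonic()
        async with self._lock:
            stale = [
                rid
                for rid, (_, claimed_at) in self._processing.items()
                if now - claimed_at >= timeout_seconds
            ]
            for rid in stale:
                sample, _ = self._processing.pop(rid)
                self._queue.append((rid, sample))
        return len(stale)


async def demo() -> Tuple[bool, bool, int, bool]:
    store = ToyDataStore()  # created inside the running event loop
    rollout_id = await store.add_task({"question": "2+2"})
    claimed = await store.get_next_task()
    empty = await store.get_next_task()
    requeued = await store.requeue_stale(timeout_seconds=0.0)
    again = await store.get_next_task()
    return claimed[0] == rollout_id, empty is None, requeued, again[0] == rollout_id


print(asyncio.run(demo()))
```

Keeping the rollout ID when requeueing means a late result from the original worker and a result from the retry refer to the same task, which is one plausible reason to requeue rather than re-add.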
Utilities¶
agentlightning.config
¶
This file is not carefully reviewed. It might contain unintentional bugs and issues. Please always review the parsed construction arguments before using them.
lightning_cli(*classes)
¶
Parses command-line arguments to configure and instantiate provided CliConfigurable classes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*classes | Type[CliConfigurable] | One or more classes that inherit from CliConfigurable. Each class's init parameters will be exposed as command-line arguments. | () |
Returns:
Type | Description |
---|---|
CliConfigurable \| Tuple[CliConfigurable, ...] | A tuple of instantiated objects, corresponding to the input classes in order. |
Source code in agentlightning/config.py
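Exposing a class's `__init__` parameters as command-line arguments can be done by walking its signature. The sketch below is a simplified illustration of that idea, not the behavior of `lightning_cli`: `ToyTrainer`, `toy_cli`, and the `--kebab-case` flag naming are assumptions, and only parameters with simple annotated types and defaults are handled.

```python
import argparse
import inspect
from typing import Any, List, Optional


class ToyTrainer:
    """Hypothetical configurable class; not the library's Trainer."""

    def __init__(self, n_runners: int = 1, learning_rate: float = 0.1) -> None:
        self.n_runners = n_runners
        self.learning_rate = learning_rate


def toy_cli(cls: type, argv: Optional[List[str]] = None) -> Any:
    parser = argparse.ArgumentParser()
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        # The annotation doubles as the argparse converter (int, float, str).
        parser.add_argument(
            f"--{name.replace('_', '-')}",  # flag naming is an assumption
            type=param.annotation,
            default=param.default,
        )
    namespace = parser.parse_args(argv)
    return cls(**vars(namespace))


trainer = toy_cli(ToyTrainer, ["--n-runners", "4"])
print(trainer.n_runners, trainer.learning_rate)
```

In line with the warning at the top of this module, printing `vars(namespace)` before constructing the object is a cheap way to review the parsed arguments.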
nullable_float(value)
¶
Converts specific string values (case-insensitive) to None, otherwise returns the float.
Source code in agentlightning/config.py
nullable_int(value)
¶
Converts specific string values (case-insensitive) to None, otherwise returns the integer.
Source code in agentlightning/config.py
nullable_str(value)
¶
Converts specific string values (case-insensitive) to None, otherwise returns the string.
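Converters of this kind are typically passed as the `type=` callable of an `argparse` argument so that a flag can be explicitly set to `None`. A minimal sketch: the set of strings treated as null here (`NULL_STRINGS`) is an assumption, since the accepted spellings are not listed above.

```python
import argparse
from typing import Optional

NULL_STRINGS = {"none", "null"}  # assumed spellings; the real set may differ


def nullable_int(value: str) -> Optional[int]:
    # Case-insensitive match against the null spellings, else parse as int.
    if value.strip().lower() in NULL_STRINGS:
        return None
    return int(value)


def nullable_float(value: str) -> Optional[float]:
    if value.strip().lower() in NULL_STRINGS:
        return None
    return float(value)


parser = argparse.ArgumentParser()
parser.add_argument("--max-rollouts", type=nullable_int, default=10)
parser.add_argument("--temperature", type=nullable_float, default=None)

args = parser.parse_args(["--max-rollouts", "None", "--temperature", "0.5"])
print(args.max_rollouts, args.temperature)
```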
agentlightning.types
¶
NamedResources = Dict[str, ResourceUnion]
module-attribute
¶
A dictionary-like class to hold named resources.
Example
resources: NamedResources = {
    'main_llm': LLM(
        endpoint="http://localhost:8080",
        model="llama3",
        sampling_parameters={'temperature': 0.7, 'max_tokens': 100}
    ),
    'system_prompt': PromptTemplate(
        template="You are a helpful assistant.",
        engine='f-string'
    )
}
Attempt
¶
Bases: BaseModel
An attempt to execute a rollout. A rollout can have multiple attempts if retries are needed.
Source code in agentlightning/types/core.py
AttemptedRollout
¶
Bases: RolloutV2
A rollout along with its active attempt.
Source code in agentlightning/types/core.py
Dataset
¶
Bases: Protocol, Generic[T_co]
The general interface for a dataset.
It's currently implemented as a protocol, having a similar interface to torch.utils.data.Dataset. You don't have to inherit from this class; you can use a simple list if you want to.
Source code in agentlightning/types/core.py
Event
¶
Bases: BaseModel
Corresponding to opentelemetry.trace.Event
Source code in agentlightning/types/tracer.py
GenericResponse
¶
Bases: BaseModel
A generic response message that can be used for various purposes.
Source code in agentlightning/types/core.py
Hook
¶
Bases: ParallelWorkerBase
Base class for defining hooks in the agent runner's lifecycle.
Source code in agentlightning/types/core.py
on_rollout_end(*, agent, runner, rollout, spans)
async
¶
Hook called after a rollout attempt completes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent
|
LitAgent[Any]
|
The :class: |
required |
runner
|
BaseRunner[Any]
|
The :class: |
required |
rollout
|
RolloutV2
|
The :class: |
required |
spans
|
Union[List[ReadableSpan], List[Span]]
|
The spans that have been added to the store. |
required |
Subclasses can override this method for cleanup or additional logging. By default, this is a no-op.
Source code in agentlightning/types/core.py
on_rollout_start(*, agent, runner, rollout)
async
¶
Hook called immediately before a rollout attempt begins.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent
|
LitAgent[Any]
|
The :class: |
required |
runner
|
BaseRunner[Any]
|
The :class: |
required |
rollout
|
RolloutV2
|
The :class: |
required |
Subclasses can override this method to implement custom logic such as logging, metric collection, or resource setup. By default, this is a no-op.
Source code in agentlightning/types/core.py
on_trace_end(*, agent, runner, tracer, rollout)
async
¶
Hook called immediately after the rollout completes but before the tracer exits the trace context.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent
|
LitAgent[Any]
|
The :class: |
required |
runner
|
BaseRunner[Any]
|
The :class: |
required |
tracer
|
BaseTracer
|
The :class: |
required |
rollout
|
RolloutV2
|
The :class: |
required |
Subclasses can override this method to implement custom logic such as logging, metric collection, or resource cleanup. By default, this is a no-op.
Source code in agentlightning/types/core.py
on_trace_start(*, agent, runner, tracer, rollout)
async
¶
Hook called immediately after the tracer enters the trace context but before the rollout begins.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent
|
LitAgent[Any]
|
The :class: |
required |
runner
|
BaseRunner[Any]
|
The :class: |
required |
tracer
|
BaseTracer
|
The :class: |
required |
rollout
|
RolloutV2
|
The :class: |
required |
Subclasses can override this method to implement custom logic such as logging, metric collection, or resource setup. By default, this is a no-op.
Source code in agentlightning/types/core.py
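Read together, the four hook descriptions imply a nesting: the rollout-level hooks wrap the trace-level hooks, which wrap the rollout itself. The sketch below shows that inferred order with a hypothetical `LoggingHook` and a toy `run_attempt` driver; the reduced keyword arguments and the exact call sites are assumptions.

```python
import asyncio
from typing import List


class LoggingHook:
    """Hypothetical hook that records which callbacks fired, in order."""

    def __init__(self) -> None:
        self.events: List[str] = []

    async def on_rollout_start(self, *, rollout: str) -> None:
        self.events.append("on_rollout_start")

    async def on_trace_start(self, *, rollout: str) -> None:
        self.events.append("on_trace_start")

    async def on_trace_end(self, *, rollout: str) -> None:
        self.events.append("on_trace_end")

    async def on_rollout_end(self, *, rollout: str, spans: List[str]) -> None:
        self.events.append(f"on_rollout_end({len(spans)} spans)")


async def run_attempt(hook: LoggingHook, rollout: str) -> List[str]:
    await hook.on_rollout_start(rollout=rollout)  # before the attempt begins
    # (the tracer would enter its trace context here)
    await hook.on_trace_start(rollout=rollout)    # inside the context, before the rollout
    spans = [f"span for {rollout}"]               # placeholder for the actual rollout
    await hook.on_trace_end(rollout=rollout)      # after the rollout, before the context exits
    # (the tracer would exit its trace context here)
    await hook.on_rollout_end(rollout=rollout, spans=spans)
    return hook.events


print(asyncio.run(run_attempt(LoggingHook(), "rollout-1")))
```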
LLM
¶
Bases: Resource
Provide an LLM endpoint and model name as a resource.
Attributes:
Name | Type | Description |
---|---|---|
endpoint | str | The URL of the LLM API endpoint. |
model | str | The identifier for the model to be used (e.g., 'gpt-4o'). |
sampling_parameters | SamplingParameters | A dictionary of hyperparameters for model inference, such as temperature, top_p, etc. |
Source code in agentlightning/types/resources.py
get_base_url(*args, **kwargs)
¶
The base_url to put into openai.OpenAI.
Users are encouraged to use base_url to get the LLM endpoint instead of accessing endpoint directly.
Link
¶
Bases: BaseModel
Corresponding to opentelemetry.trace.Link
Source code in agentlightning/types/tracer.py
ParallelWorkerBase
¶
Base class for objects that can be parallelized across multiple worker processes.
This class defines the standard lifecycle for parallel processing:
Main Process
- init() - Initialize the object in the main process
- spawn workers and call init_worker() in each worker
- run() - Execute the main workload in parallel across workers
- teardown_worker() - Clean up resources in each worker
- teardown() - Final cleanup in the main process
Subclasses should implement the run() method and optionally override the lifecycle methods for custom initialization and cleanup behavior.
Source code in agentlightning/types/core.py
PromptTemplate
¶
Bases: Resource
A prompt template as a resource.
Attributes:
Name | Type | Description |
---|---|---|
template |
str
|
The template string. The format depends on the engine. |
engine |
Literal['jinja', 'f-string', 'poml']
|
The templating engine to use for rendering the prompt. Users may supply their own custom engines, but algorithms can only operate well on a subset of them. |
Source code in agentlightning/types/resources.py
format(**kwargs)
¶
Format the prompt template with the given kwargs.
Source code in agentlightning/types/resources.py
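As a rough illustration of engine-dependent formatting, the sketch below handles only the 'f-string' engine and assumes it behaves like Python's str.format. PromptTemplateSketch is a hypothetical stand-in, not the real class, and the other engines are deliberately left unimplemented.

```python
from dataclasses import dataclass
from typing import Any, Literal


@dataclass
class PromptTemplateSketch:
    """Hypothetical stand-in for the PromptTemplate resource."""

    template: str
    engine: Literal["jinja", "f-string", "poml"]

    def format(self, **kwargs: Any) -> str:
        if self.engine == "f-string":
            # Assumption: 'f-string' templates use str.format placeholders.
            return self.template.format(**kwargs)
        # 'jinja' and 'poml' need third-party renderers and are omitted here.
        raise NotImplementedError(f"engine {self.engine!r} is not sketched")


prompt = PromptTemplateSketch("Answer the question: {question}", "f-string")
rendered = prompt.format(question="What is 2 + 2?")
```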
ProxyLLM
¶
Bases: LLM
Proxy LLM resource that is tailored by llm_proxy.LLMProxy.
Source code in agentlightning/types/resources.py
__getattribute__(name)
¶
Override to emit a warning when endpoint is accessed directly.
Source code in agentlightning/types/resources.py
model_post_init(__context)
¶
Mark initialization as complete after Pydantic finishes setup.
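The interplay between the attribute hook and the post-init flag can be sketched in plain Python. This is not the ProxyLLM implementation: GuardedEndpointSketch, _post_init, the _initialized flag and the warning text are hypothetical, and the real class relies on Pydantic's model_post_init rather than a hand-written constructor.

```python
import warnings
from typing import Any


class GuardedEndpointSketch:
    """Hypothetical stand-in that warns on direct `endpoint` reads."""

    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint
        self._initialized = False
        self._post_init()

    def _post_init(self) -> None:
        # Plays the role of model_post_init: from here on, direct reads warn.
        self._initialized = True

    def __getattribute__(self, name: str) -> Any:
        if name == "endpoint":
            # Read the flag via object.__getattribute__ to avoid recursion.
            state = object.__getattribute__(self, "__dict__")
            if state.get("_initialized", False):
                warnings.warn(
                    "Accessing endpoint directly; prefer get_base_url().",
                    UserWarning,
                    stacklevel=2,
                )
        return object.__getattribute__(self, name)

    def get_base_url(self) -> str:
        # Internal access bypasses the hook, so no warning is emitted.
        return object.__getattribute__(self, "endpoint")
```

The flag matters because attribute reads can also happen while the object is being constructed; deferring the warning until setup finishes keeps that phase quiet.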
with_attempted_rollout(rollout)
¶
Bake the rollout and attempt id into the endpoint.
Source code in agentlightning/types/resources.py
Resource
¶
Bases: BaseModel
Corresponding to opentelemetry.sdk.resources.Resource
Source code in agentlightning/types/tracer.py
ResourcesUpdate
¶
Bases: BaseModel
A resource update message to be sent from the server to clients.
This message contains a dictionary of resources that clients should use for subsequent tasks. It is used to update the resources available to clients dynamically.
Source code in agentlightning/types/resources.py
Rollout
¶
Bases: BaseModel
The standard reporting object from client to server.
Source code in agentlightning/types/core.py
RolloutConfig
¶
Bases: BaseModel
Configurations for rollout execution.
Source code in agentlightning/types/core.py
SpanAttributeNames
¶
SpanContext
¶
Bases: BaseModel
Corresponding to opentelemetry.trace.SpanContext
Source code in agentlightning/types/tracer.py
SpanNames
¶
Bases: str, Enum
Standard span name values for AgentLightning.
Currently reward, message, object and exception spans are supported. We will add more spans related to error handling in the future.
Source code in agentlightning/types/tracer.py
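A brief sketch of what deriving from both str and Enum buys: members compare equal to plain strings, so span names read from a trace can be matched without conversion. SpanNamesSketch, its member names and its string values are hypothetical; the actual values used by the library are not shown in this section.

```python
from enum import Enum


class SpanNamesSketch(str, Enum):
    """Hypothetical stand-in; the real member values may differ."""

    REWARD = "reward"
    MESSAGE = "message"
    OBJECT = "object"
    EXCEPTION = "exception"


def is_reward_span(span_name: str) -> bool:
    # Because of the str mixin, a plain string compares equal to the member.
    return span_name == SpanNamesSketch.REWARD
```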
Task
¶
Bases: BaseModel
A task (rollout request) to be processed by the client agent.
Source code in agentlightning/types/core.py
TraceStatus
¶
Bases: BaseModel
Corresponding to opentelemetry.trace.Status
Source code in agentlightning/types/tracer.py
Triplet
¶
Bases: BaseModel
A standard structure for a single turn in a trajectory.