LLM Agents
An agent is a special kind of tool that uses an inline prompt and tools to solve a task.
Usage
We want to build a script that investigates the most recent run failures in a GitHub repository that uses GitHub Actions. To do so, we will probably need the following agents:
- agent_github, to query the GitHub API,
- agent_git, to compute git diffs and determine which changes broke the build,
- agent_fs, to read or search files.
script({ tools: ["agent_fs", "agent_git", "agent_github", ...], ...})
Each of these agents can call an LLM with its own specific set of tools to accomplish a task.
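To make the idea concrete, here is a conceptual sketch of an agent as "a tool that runs its own LLM conversation with a restricted tool set". It is plain JavaScript and does not use GenAIScript: makeAgent, the tool map, the message shapes, and the stubbed llm callback are all hypothetical names chosen for illustration, and the real implementation may work quite differently.

```javascript
// Conceptual sketch only: makeAgent, the tool map and the llm callback are
// hypothetical and are NOT GenAIScript APIs.
function makeAgent(name, instructions, tools, llm, maxSteps = 8) {
    return {
        name: `agent_${name}`,
        // Like any tool, the agent takes a query and returns a text answer.
        async run(query) {
            const messages = [
                { role: "system", content: instructions },
                { role: "user", content: query },
            ]
            for (let step = 0; step < maxSteps; step++) {
                // The inner conversation only sees this agent's own tools.
                const reply = await llm(messages, Object.keys(tools))
                if (!reply.toolCall) return reply.content
                const { tool, args } = reply.toolCall
                const handler = tools[tool]
                const result = handler
                    ? await handler(args)
                    : `unknown tool: ${tool}`
                messages.push({ role: "tool", name: tool, content: String(result) })
            }
            return "stopped: too many tool calls"
        },
    }
}

// Stub LLM for illustration: requests one tool call, then answers.
const fakeLlm = async (messages) => {
    const last = messages[messages.length - 1]
    return last.role === "tool"
        ? { content: `last commit: ${last.content}` }
        : { toolCall: { tool: "git_log", args: { count: 1 } } }
}

// A git agent with a single (fake) tool.
const agentGit = makeAgent(
    "git",
    "Answer questions about the git repository.",
    { git_log: async ({ count }) => `abc123 fix build (showing ${count})` },
    fakeLlm
)
```

The point of the design is isolation: the outer conversation only sees agentGit as one tool, while the tool calls and intermediate messages stay inside the agent's own conversation.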
The full script source code is available below:
script({
    tools: [
        "agent_fs",
        "agent_git",
        "agent_github",
        "agent_interpreter",
        "agent_docs",
    ],
    model: "reasoning",
    parameters: {
        jobUrl: { type: "string" }, // URL of the job
        workflow: { type: "string" }, // Workflow name
        failure_run_id: { type: "number" }, // ID of the failed run
        branch: { type: "string" }, // Branch name
    },
})
const {
    workflow = "build.yml",
    failure_run_id,
    branch = await git.branch(),
    jobUrl,
} = env.vars
if (jobUrl) {
    $`1. Extract the run id and job id from the ${jobUrl}`
    $`2. Find the last successful run before the failed run for the same workflow and branch`
} else if (failure_run_id) {
    $`1. Find the failed run ${failure_run_id} of ${workflow} for branch ${branch}
2. Find the last successful run before the failed run for the same workflow and branch`
} else {
    $`0. Find the workflow ${workflow} in the repository
1. Find the latest failed run of ${workflow} for branch ${branch}
2. Find the last successful run before the failed run`
}
$`3. Compare the run job logs between the failed run and the last successful run
4. git diff the failed run commit (head_sha) and the last successful run commit
    - show a diff of the source code that created the problem if possible
5. Analyze all the above information and identify the root cause of the failure
    - generate a patch to fix the problem if possible
6. Generate a detailed report of the failure and the root cause
    - include a list of all HTML urls to the relevant runs, commits, pull requests or issues
    - include diff of code changes
    - include the patch if generated
    - include a summary of the root cause`
defOutputProcessor(async ({ messages }) => {
    await runPrompt((_) => {
        _.$`- Generate a pseudo code summary of the plan implemented in MESSAGES. MESSAGES is a LLM conversation with tools.
- Judge the quality of the plan and suggest 2 improvements.
- Generate a python program that optimizes the plan in code. Assume "llm" is a LLM call.`
        _.def(
            "MESSAGES",
            messages
                .map(
                    // plain template literal, so that join() receives strings
                    (msg) =>
                        `- ${msg.role}: ${msg.content || msg.value || JSON.stringify(msg)}`
                )
                .join("\n")
        )
    })
    return undefined
})
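In the script above, step 1 (extracting the run id and job id from the job URL) is left to the LLM. If you prefer to do that part deterministically, a small helper along these lines could work. This is only a sketch: parseJobUrl is a hypothetical helper, not part of GenAIScript, and it assumes job URLs of the form https://github.com/OWNER/REPO/actions/runs/RUN_ID/job/JOB_ID.

```javascript
// Hypothetical helper (not a GenAIScript API): parse a GitHub Actions job URL.
// Assumed URL layout: https://github.com/OWNER/REPO/actions/runs/RUN_ID/job/JOB_ID
function parseJobUrl(jobUrl) {
    let pathname
    try {
        // pathname excludes the query string and any "#step:..." fragment
        pathname = new URL(jobUrl).pathname
    } catch {
        return undefined // not a valid absolute URL
    }
    const match =
        /^\/([^/]+)\/([^/]+)\/actions\/runs\/(\d+)\/job\/(\d+)\/?$/.exec(pathname)
    if (!match) return undefined // valid URL, but not a job URL
    const [, owner, repo, runId, jobId] = match
    return { owner, repo, runId: Number(runId), jobId: Number(jobId) }
}

// parseJobUrl("https://github.com/OWNER/REPO/actions/runs/123/job/456")
// returns { owner: "OWNER", repo: "REPO", runId: 123, jobId: 456 }
```

The parsed ids could then be interpolated into the prompt instead of the raw URL, which removes one step where the model might make a mistake.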
Multiple instances of the same agent
Some agents, like agent_git, can be instantiated with different parameters, for example to work on different repositories.
script({
    system: [
        "system.agent_git",
        {
            id: "system.agent_git",
            parameters: { repo: "microsoft/jacdac", variant: "jacdac" },
        },
    ],
})
$`Generate a table with the last commits of the jacdac and current git repository?`
In such cases, make sure to provide a variant argument, which is used to generate a unique agent name.
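A quick way to catch a missing variant before running the script is to check the system list for entries that share both the same id and the same variant. The sketch below is a hypothetical helper, not a GenAIScript API. It assumes entries are either a plain id string or an { id, parameters } object as in the example above, and it does not reproduce how GenAIScript actually derives agent names.

```javascript
// Hypothetical helper (not a GenAIScript API): report agent instances that
// would collide because they share the same id and the same variant.
function findAgentCollisions(system) {
    const seen = new Set()
    const collisions = []
    for (const entry of system) {
        // Entries are either a plain id string or an { id, parameters } object.
        const isString = typeof entry === "string"
        const id = isString ? entry : entry.id
        const variant = isString ? "" : (entry.parameters?.variant ?? "")
        const key = variant ? `${id}:${variant}` : id
        if (seen.has(key)) collisions.push(key)
        else seen.add(key)
    }
    return collisions
}

// No collision: the second instance has its own variant.
findAgentCollisions([
    "system.agent_git",
    { id: "system.agent_git", parameters: { repo: "microsoft/jacdac", variant: "jacdac" } },
]) // returns []
```

If the second entry omitted variant, the helper would return ["system.agent_git"], flagging two instances that cannot be told apart.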
To split or not to split
You could try to load all the tools in the same LLM call and run the task as a single LLM conversation. Results may vary.
script({ tools: ["fs", "git", "github", ...], ...})