LLM Agents
An agent is a special kind of tool that uses an inline prompt and tools to solve a task.
We want to build a script that can investigate the most recent run failures in a GitHub repository using GitHub Actions. To do so, we probably will need to the following agents:
- query the GitHub API,
agent_github
- compute some git diff to determine which changes broken the build,
agent_git
- read or search files
agent_fs
script({ tools: ["agent_fs", "agent_git", "agent_github", ...], ...})
Each of these agent is capable of calling an LLM with a specific set of tools to accomplish a task.
The full script source code is available below:
script({ tools: [ "agent_fs", "agent_git", "agent_github", "agent_interpreter", "agent_docs", ], model: "reasoning", parameters: { jobUrl: { type: "string" }, // URL of the job workflow: { type: "string" }, // Workflow name failure_run_id: { type: "number" }, // ID of the failed run branch: { type: "string" }, // Branch name },})
const { workflow = "build.yml", failure_run_id, branch = await git.branch(), jobUrl,} = env.vars
if (jobUrl) { $`1. Extract the run id and job id from the ${jobUrl}` $`2. Find the last successful run before the failed run for the same workflow and branch`} else if (failure_run_id) { $`1. Find the failed run ${failure_run_id} of ${workflow} for branch ${branch} 2. Find the last successful run before the failed run for the same workflow and branch`} else { $`0. Find the worflow ${workflow} in the repository1. Find the latest failed run of ${workflow} for branch ${branch}2. Find the last successful run before the failed run`}$`3. Compare the run job logs between the failed run and the last successful run4. git diff the failed run commit (head_sha) and the last successful run commit - show a diff of the source code that created the problem if possible5. Analyze all the above information and identify the root cause of the failure - generate a patch to fix the problem if possible6. Generate a detailled report of the failure and the root cause - include a list of all HTML urls to the relevant runs, commits, pull requests or issues - include diff of code changes - include the patch if generated - include a summary of the root cause`
defOutputProcessor(async ({ messages }) => { await runPrompt((_) => { _.$`- Generate a pseudo code summary of the plan implemented in MESSAGES. MESSAGES is a LLM conversation with tools. - Judge the quality of the plan and suggest 2 improvements. - Generate a python program that optimizes the plan in code. Assume "llm" is a LLM call.` _.def( "MESSAGES", messages .map( (msg) => _.$`- ${msg.role}: ${msg.content || JSON.stringify(msg)}` ) .join("\n") ) }) return undefined})
Multiple instances of the same agent
Section titled “Multiple instances of the same agent”Some agents, like agent_git
, can be instantiated with different parameters, like working on different repositories.
script({ system: [ "system.agent_git", { id: "system.agent_git", parameters: { repo: "microsoft/jacdac", variant: "jacdac" }, }, ],})
$`Generate a table with the last commits of the jacdac and current git repository?`
In such case, make sure to provide a variant
argument that will be used to generate a unique agent name.
To split or not to split
Section titled “To split or not to split”You could try to load all the tools in the same LLM call and run the task as a single LLM conversation. Results may vary.
script({ tools: ["fs", "git", "github", ...], ...})