GitHub Action Investigator

The following example shows a script that analyzes a job log from a GitHub Action workflow run and tries to determine the root cause of the failure.

Strategy

The script is a hybrid between traditional software and LLM/agent-based software. We start by collecting relevant information to fill the LLM context, then let the agent reason and request more information through tools if needed.

The first part of the script is traditional software that collects the information and prepares the context for the LLM. It uses the following simple strategy:

find information about the failed workflow run, including the commit and the job logs
find the last successful run of the workflow, if any, and gather its commit and job logs
build an LLM context with all the information, including the commit diff and the job log diff
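
The second step of this strategy, picking the last successful run, can be sketched in isolation. This is a minimal sketch and not the script's actual code: the `RunSummary` type and the `findLastSuccess` helper are hypothetical names, and it assumes each run exposes an id, a conclusion, and an ISO 8601 start timestamp.

```typescript
// Hypothetical, simplified shape of a workflow run (assumption for illustration).
interface RunSummary {
  id: number;
  conclusion: string;
  run_started_at: string; // ISO 8601 timestamp
}

// Returns the most recent successful run that started no later than the failed run,
// or undefined when no such run exists.
function findLastSuccess(
  runs: RunSummary[],
  failed: RunSummary,
): RunSummary | undefined {
  const failedStart = new Date(failed.run_started_at).getTime();
  return runs
    .filter((r) => r.conclusion === "success")
    .filter((r) => new Date(r.run_started_at).getTime() <= failedStart)
    .sort(
      (a, b) =>
        new Date(b.run_started_at).getTime() -
        new Date(a.run_started_at).getTime(),
    )[0];
}
```

Sorting by start time makes the helper independent of the order in which the runs were listed, at the cost of an extra pass over the data.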

The information collected in this section is not hallucinated, by design, and is added to the final output using the env.output object.

The second part is an agent that uses the LLM to reason about the information and request more information if needed.

Add the script

Open your GitHub repository and start a new pull request.
Add the following script to your repository as genaisrc/prr.genai.mts.

script({
  title: "GitHub Action Investigator",
  description: "Analyze GitHub Action runs to find the root cause of a failure",
  parameters: {
    /** the user can get the url from the github web
     *  like 14890513008 or https://github.com/microsoft/genaiscript/actions/runs/14890513008
     */
    runId: {
      type: "number",
      description: "Run identifier",
    },
    jobId: {
      type: "number",
      description: "Job identifier",
    },
    runUrl: {
      type: "string",
      description: "Run identifier or URL",
    },
  },
  system: ["system", "system.assistant", "system.annotations", "system.files"],
  flexTokens: 10000,
  cache: "gai",
  model: "small",
  tools: ["agent_fs", "agent_github", "agent_git"],
});
const { dbg, output, vars } = env;

output.heading(2, "Investigator report");
output.heading(3, "Context collection");
const { owner, repo } = await github.info();

let runId: number = vars.runId;
let jobId: number = vars.jobId;
if (isNaN(runId)) {
  const runUrl = vars.runUrl;
  output.itemLink(`run url`, runUrl);

  // Retrieve repository information
  const { runRepo, runOwner, runIdUrl, jobIdUrl } =
    /^https:\/\/github\.com\/(?<runOwner>[\w.-]+)\/(?<runRepo>[\w.-]+)\/actions\/runs\/(?<runIdUrl>\d+)?(\/job\/(?<jobIdUrl>\d+))?/i.exec(
      runUrl,
    )?.groups || {};
  if (!runRepo)
    throw new Error(
      "Url not recognized. Please provide a valid URL https://github.com/<owner>/<repo>/actions/runs/<runId>/...",
    );
  runId = parseInt(runIdUrl);
  dbg(`runId: ${runId}`);

  jobId = parseInt(jobIdUrl);
  dbg(`jobId: ${jobId}`);

  if (runOwner !== owner)
    cancel(`Run owner ${runOwner} does not match the current repository owner ${owner}`);
  if (runRepo !== repo)
    cancel(`Run repository ${runRepo} does not match the current repository ${repo}`);
}

if (isNaN(runId)) throw new Error("You must provide a runId or runUrl");
output.itemValue(`run id`, runId);
// fetch run
const run = await github.workflowRun(runId);
dbg(`run: %O`, run);
const branch = run.head_branch;
dbg(`branch: ${branch}`);

const workflow = await github.workflow(run.workflow_id);
dbg(`workflow: ${workflow.name}`);

// List workflow runs for the specified workflow and branch
const runs = await github.listWorkflowRuns(workflow.id, {
  status: "completed",
  branch,
  count: 100,
});
runs.reverse(); // from newest to oldest

dbg(
  `runs: %O`,
  runs.map(({ id, conclusion, workflow_id, html_url, run_started_at }) => ({
    id,
    conclusion,
    workflow_id,
    html_url,
    run_started_at,
  })),
);

const reversedRuns = runs.filter((r) => new Date(r.run_started_at) <= new Date(run.run_started_at));
if (!reversedRuns.length) cancel("No runs found");
dbg(
  `reversed runs: %O`,
  reversedRuns.map(({ id, conclusion, workflow_id, html_url, run_started_at }) => ({
    id,
    conclusion,
    workflow_id,
    html_url,
    run_started_at,
  })),
);

const firstFailedJobs = await github.listWorkflowJobs(run.id);
const firstFailedJob =
  firstFailedJobs.find(({ conclusion }) => conclusion === "failure") ?? firstFailedJobs[0];
const firstFailureLog = firstFailedJob.content;
if (!firstFailureLog) cancel("No logs found");
output.itemLink(`failed job`, firstFailedJob.html_url);

// resolve the latest successful workflow run
const lastSuccessRun = reversedRuns.find(({ conclusion }) => conclusion === "success");
if (lastSuccessRun)
  output.itemLink(`last successful run #${lastSuccessRun.run_number}`, lastSuccessRun.html_url);
else output.item(`last successful run not found`);

let gitDiffRef: string;
let logRef: string;
let logDiffRef: string;
if (lastSuccessRun) {
  if (lastSuccessRun.head_sha === run.head_sha) {
    console.debug("Last successful run has the same commit as the failed run, skipping git diff");
  } else {
    output.itemLink(
      `diff (${lastSuccessRun.head_sha.slice(0, 7)}...${run.head_sha.slice(0, 7)})`,
      `https://github.com/${owner}/${repo}/compare/${lastSuccessRun.head_sha}...${run.head_sha}`,
    );

    // Execute git diff between the last success and failed run commits
    await git.fetch("origin", lastSuccessRun.head_sha);
    await git.fetch("origin", run.head_sha);
    const gitDiff = await git.diff({
      base: lastSuccessRun.head_sha,
      head: run.head_sha,
      excludedPaths: "**/genaiscript.d.ts",
    });

    if (gitDiff) {
      gitDiffRef = def("GIT_DIFF", gitDiff, {
        language: "diff",
        lineNumbers: true,
        flex: 1,
      });
    }
  }
}

if (!lastSuccessRun) {
  // Define log content if no last successful run is available
  logRef = def("LOG", firstFailureLog, {
    maxTokens: 20000,
    lineNumbers: false,
  });
} else {
  const lastSuccessJobs = await github.listWorkflowJobs(lastSuccessRun.id);
  const lastSuccessJob = lastSuccessJobs.find(({ name }) => firstFailedJob.name === name);
  if (!lastSuccessJob)
    console.debug(`could not find job ${firstFailedJob.name} in last success run`);
  else {
    output.itemLink(`last successful job`, lastSuccessJob.html_url);
    const jobDiff = await github.diffWorkflowJobLogs(firstFailedJob.id, lastSuccessJob.id);
    // Generate a diff of logs between the last success and failed runs
    logDiffRef = def("LOG_DIFF", jobDiff, {
      language: "diff",
      lineNumbers: false,
    });
  }
}

// Instruction for generating a report based on the analysis
$`You are an expert software engineer and you are able to analyze the logs and find the root cause of the failure.

${lastSuccessRun ? `You are analyzing 2 GitHub Action Workflow Runs: a SUCCESS_RUN and a FAILED_RUN.` : ""}

${gitDiffRef ? `- ${gitDiffRef} contains a git diff of the commits of SUCCESS_RUN and FAILED_RUN` : ""}
${logDiffRef ? `- ${logDiffRef} contains a workflow job diff of SUCCESS_RUN and FAILED_RUN` : ""}
${logRef ? `- ${logRef} contains the log of the FAILED_RUN` : ""}

${lastSuccessRun ? `- The SUCCESS_RUN is the last successful workflow run (head_sha: ${lastSuccessRun.head_sha})` : ""}
- The FAILED_RUN is the workflow run that failed (head_sha: ${run.head_sha})

## Task
Analyze the diff in LOG_DIFF and provide a summary of the root cause of the failure.

Show the code that is responsible for the failure.
If you cannot find the root cause, stop.

Investigate potential fixes for the failure.
If you find a solution, generate a diff with suggested fixes. Use a diff format.
If you cannot locate the error, do not generate a diff.

## Instructions

Use 'agent_fs', 'agent_git' and 'agent_github' if you need more information.
Do not invent git or github information.
You have access to the entire source code through the agent_fs tool.
`;

output.heading(2, `AI Analysis`);
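
The URL parsing step in the script above can be tried out on its own. The following is a minimal sketch under stated assumptions: `parseRunUrl` and `ParsedRunUrl` are hypothetical names that do not exist in the script, and the pattern assumes owner and repository names made of letters, digits, underscores, dots, and hyphens.

```typescript
// Hypothetical result shape (assumption for illustration).
interface ParsedRunUrl {
  owner: string;
  repo: string;
  runId: number;
  jobId?: number; // only set when the URL points at a specific job
}

// Extracts owner, repository, run id and optional job id from a run URL.
// Returns undefined when the URL does not look like a GitHub Actions run URL.
function parseRunUrl(url: string): ParsedRunUrl | undefined {
  const m =
    /^https:\/\/github\.com\/(?<owner>[\w.-]+)\/(?<repo>[\w.-]+)\/actions\/runs\/(?<runId>\d+)(?:\/job\/(?<jobId>\d+))?/i.exec(
      url,
    );
  if (!m?.groups) return undefined;
  const { owner, repo, runId, jobId } = m.groups;
  if (!owner || !repo || !runId) return undefined;
  return {
    owner,
    repo,
    runId: parseInt(runId, 10),
    ...(jobId ? { jobId: parseInt(jobId, 10) } : {}),
  };
}
```

For instance, `parseRunUrl("https://github.com/microsoft/genaiscript/actions/runs/14890513008")` yields the owner `microsoft`, the repository `genaiscript`, and the run id `14890513008`, with no job id.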

Automate with GitHub Actions

Using GitHub Actions and GitHub Models, you can automate running the script and creating the comments.

You can enable or disable the agentic part of the script by commenting out the tools line that lists the agent_* tools. A script without an agent has predictable token consumption (it is a single LLM call); an agentic script runs in a loop and consumes more tokens, but it can request more information when it needs it.

name: genai investigator
on:
    workflow_run:
        workflows: ["build", "playwright", "ollama"]
        types:
            - completed
concurrency:
    group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event.workflow_run.event }}-${{ github.event.workflow_run.conclusion }}
    cancel-in-progress: true
permissions:
    contents: read
    actions: read
    pull-requests: write
    models: read
jobs:
    investigate:
        # Only run this job if the workflow run concluded with a failure
        # and was triggered by a pull request event
        if: ${{ github.event.workflow_run.conclusion == 'failure' && github.event.workflow_run.event == 'pull_request' }}
        runs-on: ubuntu-latest
        steps:
            - uses: actions/checkout@v4
              with:
                  submodules: "recursive"
                  fetch-depth: 10
            - uses: pnpm/action-setup@v4
            - uses: actions/setup-node@v4
              with:
                  node-version: "22"
                  cache: pnpm
            - run: pnpm install --frozen-lockfile
            - name: compile
              run: pnpm compile
            - name: genaiscript gai
              run: node packages/cli/dist/src/index.js run gai -p github --pull-request-comment --vars "runId=${{ github.event.workflow_run.id }}" --out-trace $GITHUB_STEP_SUMMARY
              env:
                  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Are we done?

No! This script is far from perfect, and it probably needs better heuristics to build the context specific to your repository. It is a good starting point, but expect to tune those heuristics before it works well on your repository.