TLA+ AI Linter

AI generated translation.

TLA+ est un langage de haut niveau pour modéliser des programmes et des systèmes—en particulier les systèmes concurrents et distribués. Il repose sur l’idée que la meilleure façon de décrire précisément les choses est d’utiliser des mathématiques simples.

TLA+ n’est pas fourni avec un linter ou un formateur traditionnel. Le TLA+ AI Linter est un script GenAI qui utilise des LLMs pour analyser les fichiers TLA+.

Spécifications TLA+

Voici une spécification TLA+ qui modélise une solution fondamentale au problème de détection de terminaison dans les systèmes distribués.

------------------------------- MODULE EWD998PCal -------------------------------
(***************************************************************************)
(* TLA+ specification of an algorithm for distributed termination          *)
(* detection on a ring, due to Shmuel Safra, published as EWD 998:         *)
(* Shmuel Safra's version of termination detection.                        *)
(* https://www.cs.utexas.edu/users/EWD/ewd09xx/EWD998.PDF                  *)
(***************************************************************************)
EXTENDS Integers, Bags, BagsExt

CONSTANT N
ASSUME NAssumption == N \in Nat \ {0} \* At least one node.

Node == 0 .. N-1

Initiator == 0 \* Any node can be the initiator; 0 has just been conveniently choosen to simplify the definition of token initiation.

(********
--algorithm ewd998 {

  variables
    (*
        Although we know the relationship between the counter and network, modeling network as a set of messages would be too cumbersome.
        We have two alternatives for modeling the network: as a bag of messages or as a sequence of messages. Although modeling it as a
        sequence may seem more intuitive, we do not require its ordering properties for our purposes. Therefore, we have decided to use a
        bag to represent the network. It's worth noting that Distributed Plucal refers to this concept as a "channel".
    *)
    network = [n \in Node |-> IF n = Initiator THEN SetToBag({[type|-> "tok", q |-> 0, color |-> "black"]}) ELSE EmptyBag];

  define {
    (*
      The passMsg operator is not implementable -at least not without using extra synchronization- because it atomically reads a message
      from the nic's in-buffer and writes to its out-buffer!
    *)
    passMsg(net, from, oldMsg, to, newMsg) == [ net EXCEPT ![from] = BagRemove(@, oldMsg), ![to] = BagAdd(@, newMsg) ]
    sendMsg(net, to, msg) == [ net EXCEPT ![to] = BagAdd(@, msg) ]
    dropMsg(net, to, msg) == [ net EXCEPT ![to] = BagRemove(@, msg) ]
    pendingMsgs(net, rcv) == DOMAIN net[rcv]
  }

  fair process (node \in Node)
    variables active \in BOOLEAN, color = "black", counter = 0;
  {
l:  while (TRUE) {

      either { \* send some payload message to some other node.
        when active;
        with (to \in Node \ {self}) {
          network := sendMsg(network, to, [type|-> "pl"]);
        };
        counter := counter + 1

      } or { \* receive a payload message. Reactivates the node.
        with (msg \in pendingMsgs(network, self)) {
            when msg.type = "pl";
            counter := counter - 1;
            active := TRUE;
            color := "black";
            network := dropMsg(network, self, msg)
        }

      } or { \* terminate the current node.
        active := FALSE

      } or { \* pass the token to the next node.
        when self # Initiator;
        with (tok \in pendingMsgs(network, self)) {
            when tok.type = "tok" /\ ~active;
            network := passMsg(network, self, tok, self-1, [type|-> "tok", q |-> tok.q + counter, color |-> (IF color = "black" THEN "black" ELSE tok.color)]);
            color := "white";
        }

      } or { \* Initiate token.
        when self = Initiator;
        with (tok \in pendingMsgs(network, self)) {
            when tok.type = "tok" /\ (color = "black" \/ tok.q + counter # 0 \/ tok.color = "black");
            network := passMsg(network, self, tok, N-1, [type|-> "tok", q |-> 0, color |-> "white"]);
            color := "white";
        }
      }
    }
  }
}
********)
\* BEGIN TRANSLATION (chksum(pcal) = "4d658e04" /\ chksum(tla) = "530581e3")
VARIABLE network

(* define statement *)
passMsg(net, from, oldMsg, to, newMsg) == [ net EXCEPT ![from] = BagRemove(@, oldMsg), ![to] = BagAdd(@, newMsg) ]
sendMsg(net, to, msg) == [ net EXCEPT ![to] = BagAdd(@, msg) ]
dropMsg(net, to, msg) == [ net EXCEPT ![to] = BagRemove(@, msg) ]
pendingMsgs(net, rcv) == DOMAIN net[rcv]

VARIABLES active, color, counter

vars == << network, active, color, counter >>

ProcSet == (Node)

Init == (* Global variables *)
        /\ network = [n \in Node |-> IF n = Initiator THEN SetToBag({[type|-> "tok", q |-> 0, color |-> "black"]}) ELSE EmptyBag]
        (* Process node *)
        /\ active \in [Node -> BOOLEAN]
        /\ color = [self \in Node |-> "black"]
        /\ counter = [self \in Node |-> 0]

node(self) == \/ /\ active[self]
                 /\ \E to \in Node \ {self}:
                      network' = sendMsg(network, to, [type|-> "pl"])
                 /\ counter' = [counter EXCEPT ![self] = counter[self] + 1]
                 /\ UNCHANGED <<active, color>>
              \/ /\ \E msg \in pendingMsgs(network, self):
                      /\ msg.type = "pl"
                      /\ counter' = [counter EXCEPT ![self] = counter[self] - 1]
                      /\ active' = [active EXCEPT ![self] = TRUE]
                      /\ color' = [color EXCEPT ![self] = "black"]
                      /\ network' = dropMsg(network, self, msg)
              \/ /\ active' = [active EXCEPT ![self] = FALSE]
                 /\ UNCHANGED <<network, color, counter>>
              \/ /\ self # Initiator
                 /\ \E tok \in pendingMsgs(network, self):
                      /\ tok.type = "tok" /\ ~active[self]
                      /\ network' = passMsg(network, self, tok, self-1, [type|-> "tok", q |-> tok.q + counter[self], color |-> (IF color[self] = "black" THEN "black" ELSE tok.color)])
                      /\ color' = [color EXCEPT ![self] = "white"]
                 /\ UNCHANGED <<active, counter>>
              \/ /\ self = Initiator
                 /\ \E tok \in pendingMsgs(network, self):
                      /\ tok.type = "tok" /\ (color[self] = "black" \/ tok.q + counter[self] # 0 \/ tok.color = "black")
                      /\ network' = passMsg(network, self, tok, N-1, [type|-> "tok", q |-> 0, color |-> "white"])
                      /\ color' = [color EXCEPT ![self] = "white"]
                 /\ UNCHANGED <<active, counter>>

Next == (\E self \in Node: node(self))

Spec == /\ Init /\ [][Next]_vars
        /\ \A self \in Node : WF_vars(node(self))

\* END TRANSLATION

-----------------------------------------------------------------------------

token ==
    LET tpos == CHOOSE i \in Node : \E m \in DOMAIN network[i]: m.type = "tok"
        tok == CHOOSE m \in DOMAIN network[tpos] : m.type = "tok"
    IN [pos |-> tpos, q |-> tok.q, color |-> tok.color]

pending ==
    [n \in Node |-> IF [type|->"pl"] \in DOMAIN network[n] THEN network[n][[type|->"pl"]] ELSE 0]

EWD998 == INSTANCE EWD998

EWD998Spec == EWD998!Init /\ [][EWD998!Next]_EWD998!vars \* Not checking liveness because we cannot easily define fairness for what ewd998 calls system actions.

THEOREM Spec => EWD998Spec

-----------------------------------------------------------------------------

Alias ==
    [
        network |-> network,
        active |-> active,
        color |-> color,
        counter |-> counter,
        token |-> token,
        pending |-> pending
    ]

StateConstraint ==
    \A i \in DOMAIN counter : counter[i] < 3

=============================================================================

Script

Le script GenAI suivant va effectuer le linting de la spécification TLA+ ci-dessus. Plus précisément, il va vérifier si les commentaires en prose dans la spécification sont cohérents avec les définitions TLA+.

// metadata (including model parameters)
// learn more at https://aka.ms/genaiscript
script({ title: "tlAI-Linter", description: "Check if the prose comments and their TLA+ declarations and definitions are syntactically and semantically consistent" })

// use def to emit LLM variables
def("TLA+", env.files.filter(f => f.filename.endsWith(".tla")), {lineNumbers: true})

// use $ to output formatted text to the prompt
$`You are an expert at TLA+/TLAPLUS. Your task is to check if the prose comments and their TLA+ declarations and definitions are syntactically and semantically consistent!!!
Explain any consistencies and inconsistencies you may find.  Report inconsistent and consistent pairs in a single ANNOTATION section.

## TLA+ Syntax Hints
- A formula [A]_v is called a temporal formula, and is shorthand for the formula A \/ v' = v.  In other words, the formula is true if A is true or if the value of v remains unchanged.  Usually v is a tuple of the spec's variables.
- The symbol \`#\` is alternative syntax used for inequality in TLA+; the other symbol is \`/=\".

## TLA+ Semantics Hints
- Do NOT add any invariants or properties to the behavior specification Spec or any of its subformulas.  This would change THEOREM Spec => Inv into THEOREM Spec /\ Inv => Inv, which is vacuously true.
- TLA+ specs are always stuttering insensitive, i.e., the next-state relation is always [A]_v.  In other words, one cannot write a stuttering sensitive specification.

## TLA+ Convention Hints
- The type correctness invariant is typically called TypeOK.
- Users can employ TLA labels as a means to conceptually associate a comment with a sub-formula like a specific disjunct or conjunct of a TLA formula. Even though these labels have no other function, they facilitate referencing particular parts of the formula from a comment.

## Formal and informal math Hints
- Take into account that humans may write informal math that is syntactically different from the formal math, yet semantically equivalent.  For example, humans may write \`N > 3T\` instead of \`N > 3 * T\`.
`

des numéros de ligne sont ajoutés au contenu du fichier pour aider le LLM à localiser précisément les problèmes.

def(
  "TLA+",
  env.files.filter((f) => f.filename.endsWith(".tla")),
  { lineNumbers: true },
);

le script utilise un support intégré pour les annotations afin de générer des avertissements et erreurs analysables. Les annotations sont automatiquement intégrées comme problèmes dans VSCode ou comme erreurs de compilation dans la chaîne CI/CD.

$`Report inconsistent and consistent pairs in a single ANNOTATION section.`;

GPT-4 connaît déjà beaucoup de choses sur la logique et les mathématiques de base. Cependant, le script énumère également des idiomes TLA+ courants qui sont pertinents pour l’analyse d’une spécification.

Github Action

name: tlAI-linter

on:
    pull_request:
        branches:
            - main
            - dev

jobs:
    linting:
        name: tlAI-linter

        runs-on: ubuntu-latest

        env:
            OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
            OPENAI_API_BASE: ${{ secrets.OPENAI_API_BASE }}
            OPENAI_API_TYPE: ${{ secrets.OPENAI_API_TYPE }}

        defaults:
            run:
                shell: bash

        steps:
            - name: Clone repo
              uses: actions/checkout@v4
              with:
                  ## All history for git diff below to succeed.
                  fetch-depth: 0

            - name: Setup NodeJS
              ## https://github.com/actions/setup-node
            - uses: pnpm/action-setup@v4
              uses: actions/setup-node@v4
              with:
                  node-version: "22"

            - name: Run GenAIscript on the TLA+ specs that are added in this pull request.
              ## Identify git diff: $(git diff --name-only HEAD^ | grep '.tla')
              ## Install genaiscript runtime: https://microsoft.github.io/genaiscript/reference/cli/
              ## Output LLM response in SARIF format: https://microsoft.github.io/genaiscript/reference/scripts/annotations/
              run: npx --yes genaiscript run .github/scripts/tlAI-Linter.genai.js $(git diff --name-only HEAD^ | grep '.tla') -oa results.sarif

            - name: Upload SARIF file
              ## https://sarifweb.azurewebsites.net
              ## https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/uploading-a-sarif-file-to-github
              if: success() || failure()
              uses: github/codeql-action/upload-sarif@v3
              with:
                  sarif_file: results.sarif

après avoir cloné le dépôt et installé des dépendances telles que node.js, le script GenAI est exécuté pour analyser les spécifications TLA+ qui ont été ajoutées ou modifiées dans la Pull Request.
La sortie du script, c’est-à-dire les annotations générées par le LLM, est formatée comme un rapport SARIF et téléversée dans la Pull Request.

Résultats

Le linter a généré des annotations pour chaque commentaire en prose dans la spécification, et un commentaire a été identifié comme incohérent par rapport aux définitions TLA+. Un avertissement correspondant est ajouté à la PR.

$The image shows a screenshot of a GitHub pull request page. The pull request is titled "Update EDW998PCal.tla #1" and is from a branch named "lemmy-patch-1" into the "master" branch. A bot identified as "github-advanced-security" has flagged potential problems 25 minutes ago. Below, part of a file is shown with a diff view highlighting changes between two versions. Two lines have been altered, where an "ASSUME" statement's comment has been updated from "At least one node" to "Any number of nodes between zero and infinitely many." There's an alert from the TLA+ Linter indicating that the comment is consistent with the TLA+ declaration `N \in Nat \ {0}` which means N is a natural number excluding zero. Below this, there is a dismissible alert box titled "Check notice" with a "Dismiss alert" button. At the bottom of the screenshot, there's a note that "Some checks were not successful," indicating 1 successful check and 1 failing check. The successful check is by "TLA-linter / TLA-linter (pull_request)" and the failing check is "Code scanning results / genaiscript." There's also a reference to a commit push action to the "master" branch from another branch and a prompt suggesting to add more commits by pushing to the "lemmy-patch-1" branch on "lemmy/Examples".$