TLA+ AI Linter

TLA+ is a high-level language for modeling programs and systems—especially concurrent and distributed ones. It’s based on the idea that the best way to describe things precisely is with simple mathematics.

TLA+ does not come with a traditional linter or formatter. The TLA+ AI Linter is a GenAI script that uses LLMs to lint TLA+ files.

TLA+ specifications

The following is a TLA+ spec that models a seminal solution to the termination detection problem in distributed systems.

------------------------------- MODULE EWD998PCal -------------------------------
(***************************************************************************)
(* TLA+ specification of an algorithm for distributed termination          *)
(* detection on a ring, due to Shmuel Safra, published as EWD 998:         *)
(* Shmuel Safra's version of termination detection.                        *)
(* https://www.cs.utexas.edu/users/EWD/ewd09xx/EWD998.PDF                  *)
(***************************************************************************)
EXTENDS Integers, Bags, BagsExt

CONSTANT N
ASSUME NAssumption == N \in Nat \ {0} \* At least one node.

Node == 0 .. N-1

Initiator == 0 \* Any node can be the initiator; 0 has just been conveniently choosen to simplify the definition of token initiation.

(********
--algorithm ewd998 {

  variables
    (*
        Although we know the relationship between the counter and network, modeling network as a set of messages would be too cumbersome.
        We have two alternatives for modeling the network: as a bag of messages or as a sequence of messages. Although modeling it as a
        sequence may seem more intuitive, we do not require its ordering properties for our purposes. Therefore, we have decided to use a
        bag to represent the network. It's worth noting that Distributed Plucal refers to this concept as a "channel".
    *)
    network = [n \in Node |-> IF n = Initiator THEN SetToBag({[type|-> "tok", q |-> 0, color |-> "black"]}) ELSE EmptyBag];

  define {
    (*
      The passMsg operator is not implementable -at least not without using extra synchronization- because it atomically reads a message
      from the nic's in-buffer and writes to its out-buffer!
    *)
    passMsg(net, from, oldMsg, to, newMsg) == [ net EXCEPT ![from] = BagRemove(@, oldMsg), ![to] = BagAdd(@, newMsg) ]
    sendMsg(net, to, msg) == [ net EXCEPT ![to] = BagAdd(@, msg) ]
    dropMsg(net, to, msg) == [ net EXCEPT ![to] = BagRemove(@, msg) ]
    pendingMsgs(net, rcv) == DOMAIN net[rcv]
  }

  fair process (node \in Node)
    variables active \in BOOLEAN, color = "black", counter = 0;
  {
l:  while (TRUE) {

      either { \* send some payload message to some other node.
        when active;
        with (to \in Node \ {self}) {
          network := sendMsg(network, to, [type|-> "pl"]);
        };
        counter := counter + 1

      } or { \* receive a payload message. Reactivates the node.
        with (msg \in pendingMsgs(network, self)) {
            when msg.type = "pl";
            counter := counter - 1;
            active := TRUE;
            color := "black";
            network := dropMsg(network, self, msg)
        }

      } or { \* terminate the current node.
        active := FALSE

      } or { \* pass the token to the next node.
        when self # Initiator;
        with (tok \in pendingMsgs(network, self)) {
            when tok.type = "tok" /\ ~active;
            network := passMsg(network, self, tok, self-1, [type|-> "tok", q |-> tok.q + counter, color |-> (IF color = "black" THEN "black" ELSE tok.color)]);
            color := "white";
        }

      } or { \* Initiate token.
        when self = Initiator;
        with (tok \in pendingMsgs(network, self)) {
            when tok.type = "tok" /\ (color = "black" \/ tok.q + counter # 0 \/ tok.color = "black");
            network := passMsg(network, self, tok, N-1, [type|-> "tok", q |-> 0, color |-> "white"]);
            color := "white";
        }
      }
    }
  }
}
********)
\* BEGIN TRANSLATION (chksum(pcal) = "4d658e04" /\ chksum(tla) = "530581e3")
VARIABLE network

(* define statement *)
passMsg(net, from, oldMsg, to, newMsg) == [ net EXCEPT ![from] = BagRemove(@, oldMsg), ![to] = BagAdd(@, newMsg) ]
sendMsg(net, to, msg) == [ net EXCEPT ![to] = BagAdd(@, msg) ]
dropMsg(net, to, msg) == [ net EXCEPT ![to] = BagRemove(@, msg) ]
pendingMsgs(net, rcv) == DOMAIN net[rcv]

VARIABLES active, color, counter

vars == << network, active, color, counter >>

ProcSet == (Node)

Init == (* Global variables *)
        /\ network = [n \in Node |-> IF n = Initiator THEN SetToBag({[type|-> "tok", q |-> 0, color |-> "black"]}) ELSE EmptyBag]
        (* Process node *)
        /\ active \in [Node -> BOOLEAN]
        /\ color = [self \in Node |-> "black"]
        /\ counter = [self \in Node |-> 0]

node(self) == \/ /\ active[self]
                 /\ \E to \in Node \ {self}:
                      network' = sendMsg(network, to, [type|-> "pl"])
                 /\ counter' = [counter EXCEPT ![self] = counter[self] + 1]
                 /\ UNCHANGED <<active, color>>
              \/ /\ \E msg \in pendingMsgs(network, self):
                      /\ msg.type = "pl"
                      /\ counter' = [counter EXCEPT ![self] = counter[self] - 1]
                      /\ active' = [active EXCEPT ![self] = TRUE]
                      /\ color' = [color EXCEPT ![self] = "black"]
                      /\ network' = dropMsg(network, self, msg)
              \/ /\ active' = [active EXCEPT ![self] = FALSE]
                 /\ UNCHANGED <<network, color, counter>>
              \/ /\ self # Initiator
                 /\ \E tok \in pendingMsgs(network, self):
                      /\ tok.type = "tok" /\ ~active[self]
                      /\ network' = passMsg(network, self, tok, self-1, [type|-> "tok", q |-> tok.q + counter[self], color |-> (IF color[self] = "black" THEN "black" ELSE tok.color)])
                      /\ color' = [color EXCEPT ![self] = "white"]
                 /\ UNCHANGED <<active, counter>>
              \/ /\ self = Initiator
                 /\ \E tok \in pendingMsgs(network, self):
                      /\ tok.type = "tok" /\ (color[self] = "black" \/ tok.q + counter[self] # 0 \/ tok.color = "black")
                      /\ network' = passMsg(network, self, tok, N-1, [type|-> "tok", q |-> 0, color |-> "white"])
                      /\ color' = [color EXCEPT ![self] = "white"]
                 /\ UNCHANGED <<active, counter>>

Next == (\E self \in Node: node(self))

Spec == /\ Init /\ [][Next]_vars
        /\ \A self \in Node : WF_vars(node(self))

\* END TRANSLATION

-----------------------------------------------------------------------------

token ==
    LET tpos == CHOOSE i \in Node : \E m \in DOMAIN network[i]: m.type = "tok"
        tok == CHOOSE m \in DOMAIN network[tpos] : m.type = "tok"
    IN [pos |-> tpos, q |-> tok.q, color |-> tok.color]

pending ==
    [n \in Node |-> IF [type|->"pl"] \in DOMAIN network[n] THEN network[n][[type|->"pl"]] ELSE 0]

EWD998 == INSTANCE EWD998

EWD998Spec == EWD998!Init /\ [][EWD998!Next]_EWD998!vars \* Not checking liveness because we cannot easily define fairness for what ewd998 calls system actions.

THEOREM Spec => EWD998Spec

-----------------------------------------------------------------------------

Alias ==
    [
        network |-> network,
        active |-> active,
        color |-> color,
        counter |-> counter,
        token |-> token,
        pending |-> pending
    ]

StateConstraint ==
    \A i \in DOMAIN counter : counter[i] < 3

=============================================================================

Script

The following GenAI script will lint the TLA+ spec above. More specifically, it will check if the prose comments in the spec are consistent with the TLA+ definitions.

// metadata (including model parameters)
// learn more at https://aka.ms/genaiscript
script({ title: "tlAI-Linter", description: "Check if the prose comments and their TLA+ declarations and definitions are syntactically and semantically consistent" })

// use def to emit LLM variables
def("TLA+", env.files.filter(f => f.filename.endsWith(".tla")), {lineNumbers: true})

// use $ to output formatted text to the prompt
$`You are an expert at TLA+/TLAPLUS. Your task is to check if the prose comments and their TLA+ declarations and definitions are syntactically and semantically consistent!!!
Explain any consistencies and inconsistencies you may find.  Report inconsistent and consistent pairs in a single ANNOTATION section.

## TLA+ Syntax Hints
- A formula [A]_v is called a temporal formula, and is shorthand for the formula A \/ v' = v.  In other words, the formula is true if A is true or if the value of v remains unchanged.  Usually v is a tuple of the spec's variables.
- The symbol \`#\` is alternative syntax used for inequality in TLA+; the other symbol is \`/=\".

## TLA+ Semantics Hints
- Do NOT add any invariants or properties to the behavior specification Spec or any of its subformulas.  This would change THEOREM Spec => Inv into THEOREM Spec /\ Inv => Inv, which is vacuously true.
- TLA+ specs are always stuttering insensitive, i.e., the next-state relation is always [A]_v.  In other words, one cannot write a stuttering sensitive specification.

## TLA+ Convention Hints
- The type correctness invariant is typically called TypeOK.
- Users can employ TLA labels as a means to conceptually associate a comment with a sub-formula like a specific disjunct or conjunct of a TLA formula. Even though these labels have no other function, they facilitate referencing particular parts of the formula from a comment.

## Formal and informal math Hints
- Take into account that humans may write informal math that is syntactically different from the formal math, yet semantically equivalent.  For example, humans may write \`N > 3T\` instead of \`N > 3 * T\`.
`

line numbers are added to the file content to help the LLM precisely locate the issues.

def("TLA+", env.files.filter(f => f.filename.endsWith(".tla")), {lineNumbers: true})

the script uses a builtin support for annotations to generate parseable warnings and errors. Annotations are automatically integrated as problems in VSCode or as build errors in the CI/CD pipeline.

$`Report inconsistent and consistent pairs in a single ANNOTATION section.`

GPT-4 already knows a lot about logic and basic math. However, the script also lists common TLA+ idioms that are relevant to lint a spec.

Github Action

name: tlAI-linter

on:
    pull_request:
        branches:
            - main
            - dev

jobs:
    linting:
        name: tlAI-linter

        runs-on: ubuntu-latest

        env:
            OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
            OPENAI_API_BASE: ${{ secrets.OPENAI_API_BASE }}
            OPENAI_API_TYPE: ${{ secrets.OPENAI_API_TYPE }}

        defaults:
            run:
                shell: bash

        steps:
            - name: Clone repo
              uses: actions/checkout@v4
              with:
                  ## All history for git diff below to succeed.
                  fetch-depth: 0

            - name: Setup NodeJS
              ## https://github.com/actions/setup-node
              uses: actions/setup-node@v4
              with:
                  node-version: "20"

            - name: Run GenAIscript on the TLA+ specs that are added in this pull request.
              ## Identify git diff: $(git diff --name-only HEAD^ | grep '.tla')
              ## Install genaiscript runtime: https://microsoft.github.io/genaiscript/reference/cli/
              ## Output LLM response in SARIF format: https://microsoft.github.io/genaiscript/reference/scripts/annotations/
              run: npx --yes genaiscript run .github/scripts/tlAI-Linter.genai.js $(git diff --name-only HEAD^ | grep '.tla') -oa results.sarif

            - name: Upload SARIF file
              ## https://sarifweb.azurewebsites.net
              ## https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/uploading-a-sarif-file-to-github
              if: success() || failure()
              uses: github/codeql-action/upload-sarif@v3
              with:
                  sarif_file: results.sarif

after cloning the repository and installing dependencies such as node.js, the GenAI script is run to lint the TLA+ specs that were added or modified in the PR.
the script’s output, i.e., the annotations generated by the LLM, are formatted as a SARIF report and upload to the PR.

Results

The linter generated annotations for each prose comment in the spec, and one comment is found to be inconsistent with the TLA+ definitions. A corresponding warning is added to the PR.

$The image shows a screenshot of a GitHub pull request page. The pull request is titled "Update EDW998PCal.tla #1" and is from a branch named "lemmy-patch-1" into the "master" branch. A bot identified as "github-advanced-security" has flagged potential problems 25 minutes ago. Below, part of a file is shown with a diff view highlighting changes between two versions. Two lines have been altered, where an "ASSUME" statement's comment has been updated from "At least one node" to "Any number of nodes between zero and infinitely many." There's an alert from the TLA+ Linter indicating that the comment is consistent with the TLA+ declaration `N \in Nat \ {0}` which means N is a natural number excluding zero. Below this, there is a dismissible alert box titled "Check notice" with a "Dismiss alert" button. At the bottom of the screenshot, there's a note that "Some checks were not successful," indicating 1 successful check and 1 failing check. The successful check is by "TLA-linter / TLA-linter (pull_request)" and the failing check is "Code scanning results / genaiscript." There's also a reference to a commit push action to the "master" branch from another branch and a prompt suggesting to add more commits by pushing to the "lemmy-patch-1" branch on "lemmy/Examples".$