We just saw a very simple example of compositional testing of a peer-to-peer interface. It’s also possible to use compositional testing at the interface between layers in a layered protocol stack.

As an example, let’s look at the very simple leader election protocol, introduced in this paper in 1979.

In this protocol we have a collection of distributed processes organized in a ring. Each process can send messages to its right neighbor in the ring and receive message from left neighbor. A process has a unique id drawn from some totally ordered set (say, the integers). The purpose of the protocol is to discover which process has the highest id value. This process is elected as the “leader”.

This protocol itself is trivially simple. Each process transmits its own id value. When it receives a value, it retransmits the value, but only if it is greater than the process’ own id value. If a process receives its own id, this value must have traveled all the way around the ring, so the process knows its id is greater than all others and it declares itself leader.

Interface specifications

We’ll start with a service specification for this protocol:

object asgn = {

    function pid(X:node.t) : id.t
    axiom [injectivity] pid(X) = pid(Y) -> X = Y
}

object serv = {

    action elect(v:node.t)

    object spec = {
        before elect {
            assert asgn.pid(v) >= asgn.pid(X)  
        }
    }
}

The abstract type node.t represents a reference to a process in our system (for example, its network address). The asgn object defines an assignment of id values to nodes, represented by the function pid. The fact that the id values are unique is guaranteed by the axiom injectivity.

The service specification serv defines one action elect which is called when a given node is elected leader. Its specification says that only the node with the maximum id value can be elected.

Our protocol consists of a collection of concurrent process layered on top of two services: a network service and a timer service:

Leader Election Ring Figure 1

Let’s consider now the two intermediate interfaces. The specification for the network service is quite simple and comes from IVy’s standard library:

module ip_simple(addr,pkt) = {

    import action recv(dst:addr,v:pkt)
    export action send(src:addr,dst:addr,v:pkt)

    object spec = {
        relation sent(V:pkt, N:addr)
        after init {
            sent(V, N) := false
        }

        before send {
            sent(v,dst) := true
        }
        before recv {
            assert sent(v,dst)
        }
    }
}

The interface is a module with two parameters: the type addr of network addresses, and the type pkt of packets. It has two actions, recv and send. The relation sent tells us which values have been sent to which nodes. Initially, nothing has been sent. The interface provides two actions send and recv which are called, respectively, when a value is sent or received. The src and dst parameters respectively give the source and desitination addresses of the packets.

The specification says that a value is marked as sent when send occurs and a value must be marked sent before it can be received. This describes a network service that can duplicate or re-order messages, but cannot corrupt messages.

The specification of the timer service is also quite simple and comes from the standard libarary:

module timeout_sec = {

    import action timeout

}

The timer service simply calls timeout from time to time, with no specification as to when.

Protocol implementation

Now that we have the specificaitons of all the interfaces, let’s define the protocol itself:

instance net : ip_simple(node.t,id.t)
instance timer(X:node.t) : timeout_sec

object proto = {

    implement timer.timeout(me:node.t) = {
        call trans.send(node.next(me),asgn.pid(me))
    }

    implement net.recv(me:node.t,v:id.t) {
        if v = asgn.pid(me) {       # Found a leader!
            call serv.elect(me)
        }
        else if v > asgn.pid(me)  { # pass message to next node
            call trans.send(node.next(me),v)
        }
    }

}

We create one instance of the network module, and one timer for each node. The protocol implements the interface actions timer.timeout and net.recv. When a node times out, it transmits the node’s id to the next process in the ring, defined by the the action node.next (which we’ll define shortly).

When a node receives an id value v, it checks whether v is its own id according to the id assignment asgn. If so, the process knows it is leader and calls serv.elect. Otherwise, if the received value is greater, it calls net.send to send the value on to the next node in the ring.

With our protocol implemented, we still need to provide concrete types for node and id. If we were formally verifying the protocol, we would probably treat these as abstract datatypes with specified mathematical properties. For testing, though, we just need some concrete types. Here is one possiblity:

object node = {
    type t
    interpret t -> bv[1]

    action next(x:t) returns (y:t) = {
        y := x + 1
    }
}

object id = {
    type t
    interpret t -> bv[8]
}

That is, node addresses are one-bit binary numbers (meaning we have just two nodes, and id’s are 8-bit numbers. Often it’s a good idea to start testing with small datatypes. In the case of the leader election ring, more nodes would increase the amount of time needed to see an election event.

Compositionally testing the leader election protocol

We’re going to divide the system into three isolates, one for each of its major components. Here is what the three isolates will look like:

Leader Election Ring Figure 2

Here are the isolate definitions:

trusted isolate iso_p = proto with serv,node,id,asgn,net,timer
trusted isolate iso_t = timer
trusted isolate iso_n = net with node,id

We added the keyword trusted to these isolates to indicate to IVy that they will be verified informally by testing. Notice that the isolate for proto contains its service specification as well as the specifications of all of the other components. We need to test that proto lives up to its guarantees at all of its interfaces, provided that all of its assumptions hold. On the other hand, net needs only the specifications of node and id (the net object contains its own service specification).

We can now use IVy to compile test environments for each of the isolates, based on the interface specifications. Let’s start with the high-level protocol:

$ ivy_to_cpp isolate=iso_p target=test build=true leader_election_ring.ivy
g++ -I $Z3DIR/include -L $Z3DIR/lib -g -o leader_election_ring leader_election_ring.cpp -lz3

We run the tester:

$ ./leader_election_ring
> timer.timeout(1)
< net.send(1,0,170)
> net.recv(0,170)
< net.send(0,1,170)
> net.recv(0,170)
< net.send(0,1,170)
> net.recv(0,170)
< net.send(0,1,170)
> timer.timeout(0)
< net.send(0,1,149)
> timer.timeout(0)
< net.send(0,1,149)
> timer.timeout(0)
< net.send(0,1,149)
> net.recv(0,170)

...

> net.recv(0,170)
< net.send(0,1,170)
> timer.timeout(0)
< net.send(0,1,149)
> net.recv(1,170)
< serv.elect(1)

In this run, process 1 times out and sends its id 170 to process

Process 0 receives this id and forwards it back to process 1 (meaning process 0’s id must be less than 170). Later, process 0 times out and sends its id 149 to process 1. Eventually, the network decides to deliver to process 1 the message with its own id, causing process 1 to be elected. When this happens, the assertion monitor checkes that in fact process 1 has the highest id.

The important point here is that there is no actual network. The tester is simulating the network by randomly selecting actions consistent with the history of actions at the interface. In practice, this means that the tester stores the relation sent (the state of the network interface specification) and only calls net.recv with packets that have been previously sent. When there are multiple choices, the tester makes an effort to select uniformly. This is a hard problem in general, however. In practice the distribution will be non-uniform. If you look at the calls to net.recv, you can see that there is some duplication of sent packets, as we would expect.

We can see here a major advantage of compositional testing over integration testing. If we used a real network to test our protocol, we might have to wait a long time for a packet duplication to occur (in fact, this might never occur unless we take care to stress the network appropriately). The resilience of our protocol to packet duplication would not be well tested, as it is here. Similarly, our protocol might have unintended dependences on timing. Since timeout events are being generated randomly here, and not based on wall clock time, we can see timings that might occur rarely in the actual system, perhaps only under very heavy load.

Now let’s turn to the network isolate. We compile and run:

$ ivy_to_cpp isolate=iso_n target=test build=true leader_election_ring.ivy
g++ -I $Z3DIR/include -L $Z3DIR/lib -g -o leader_election_ring leader_election_ring.cpp -lz3

./leader_election_ring
> net.send(0,0,168)
< net.recv(0,168)
> net.send(0,0,243)
< net.recv(0,243)
> net.send(0,0,13)
< net.recv(0,13)
> net.send(1,0,108)
> net.send(0,0,84)
> net.send(0,0,78)
< net.recv(0,108)
< net.recv(0,84)
< net.recv(0,78)
> net.send(1,0,17)
< net.recv(0,17)
> net.send(0,1,167)
< net.recv(1,167)
> net.send(1,0,3)
> net.send(0,1,13)
...

The object net is actually just a wrapper around the operating system’s networking API. Therefore, when we test net, we are actually testing the operating system’s networking stack. Fortunately, testing reveals no errors. Notice though, that the tester is generating a wider variety of packets than the protocol could actually generate. In fact, it’s simply generating packet values uniformly at random. In this sense, the network stack is also being better tested than it would be if composed with the actual protocol.

Finally, we can test the timer:

$ ivy_to_cpp isolate=iso_t target=test build=true leader_election_ring.ivy
g++ -I $Z3DIR/include -L $Z3DIR/lib -g -o leader_election_ring leader_election_ring.cpp -lz3

./leader_election_ring
< timer.timeout(0)
< timer.timeout(1)
< timer.timeout(0)
< timer.timeout(1)
< timer.timeout(0)
< timer.timeout(1)
< timer.timeout(0)
< timer.timeout(1)
< timer.timeout(0)
...

Not very interesting, but it shows both timeouts occurring once every second, which is what we expect. Again, we are really testing the operating system here (and also IVy’s run time scheduler, which uses the operating system API).

Running the protocol

Now that we’ve tested the components, we can compile the whole system and run it as a collection of independent processes. Our protocol is an example of a parameterized object. Ivy can decompose it into a collection of processes based on the parameter me. Each value of the parameter becomes an individual process. This transformation is called parameter stripping.

To do this, we use the following declaration:

extract iso_impl(me:node.t) = proto(me),net(me),timer(me),node,id,asgn

This describes a special kind of isolate called an extract. Extracts are intended to be run in production and don’t include the specification monitors. In the extract declaration, me is a symbolic constant that represents the first parameter of of each action and state component in the object. This parameter is “stripped” by replacing it with the symbolic constant me.

Before thinking about this too hard, let’s just run it to see what happens. We compile as usual, with one difference: we use the option target=repl. Instead of generating a test environment, IVy provides a read-eval-print loop as the environment:

$ ivy_to_cpp target=repl isolate=iso_impl build=true leader_election_ring.ivy
g++ -I $Z3DIR/include -L $Z3DIR/lib -g -o leader_election_ring leader_election_ring.cpp -lz3

This produces an executable that takes one parameter me on the command line. Let’s create two terminals A and B to run process 0 and process 1 respectively:

A: $ ./leader_election_ring 0
A: >

B: $ ./leader_election_ring 1
B: >

A: < serve.elect
A: < serve.elect
A: < serve.elect
...

Notice that nothing happens when we start node 0. It is sending packets once per second, but they are being dropped because no port is yet open to receive them. After we start node 1, it forwards node 0’s packets, which causes node 0 to be elected (and to continues to be elected once per second).