# Concrete data

Abstract dataypes allow us to hide the representation of data behind interfaces. To pass information across interfaces, however, we still need a sufficiently rich class of concrete datatypes whose representation is not hidden. This is particularly relevant when communicating across a network. In this case, we can’t efficiently pass a reference to an abstract object – we have to represent the contents of the object in a concrete way.

To facilitate exchange of data across interfaces, Ivy provides a set of standard concrete datatypes.

# Structs

A `struct`

is an aggregate of named components. Structs are declared like this:

```
type point = struct {
x : coord,
y : coord
}
```

This defines a type `point`

whose elements are structures consisting
of members `x`

and `y`

of type `coord`

. Two *destructor* functions `x`

and `y`

are defined that respectively extract the *x* and *y*
coordinates of the point. Thus, if `p`

is a point, then `x(p)`

is the
*x* coordinate of `p`

and `y(p)`

is the *y* coordinate.

The members of a structure can be assigned. For example, to assign `0`

to
the *x* coordinate of point `p`

, we would write:

```
x(p) := 0
```

This mutates not the destructor function `x`

, which is immutable, but
rather the value of `p`

. The value of `y(p)`

remains unchanged by this
assignment.

Members of a struct can also be functions and relations. For example:

```
type point = struct {
coords(X:dimension) : nat
}
```

Our points now have one coordinate value for each dimension defined by
some arbitrary type `dimension`

. Suppose we declare:

```
type dimension = {x,y,z}
```

We could then refer to the `x`

coordinate of a point `p`

as
`coord(p,x)`

, and we could assign `0`

to it like this:

```
coord(p,x) := 0
```

This would be written in many programming languages as something like
`p.coord[x] = 0`

. The type `dimension`

need not be an enumerated type,
however. It could be an uninterpreted type, or an interpreted type
(for example, an integer type) or even another structure type.

Structs can also contain structs. For example:

```
type line = {
begin : point,
end : point
}
```

To set the *x* coordinate of the begin point of a line `l`

, we could write:

```
coord(begin(l),x) := 0
```

This mutates `l`

and not the immutable destructor functions `coord`

and `begin`

.

# Arrays

An array type represents a map from an interval `[0,end)`

of some type
(the index type) to some other type (that value type). Usually the
index type is interpreted as `int`

.

Here is an example of a simple program that tabulates the function
`f(X) = X * X`

from `0`

to `max-1`

:

```
#lang ivy1.7
include collections
type t
instance arr : array(t,t)
action tabulate(max : t) returns (res:arr.t) {
local i:index {
i := 0;
res := arr.create(max,0);
while i < max {
res := arr.set(res,i,i*i);
i := i + 1
}
}
}
```

The `array`

module provides an action `create`

that returns an array
mapping values in the range `[0,end)`

to some given value (in the
example, `0`

). The `set`

action takes a array, an index and a
value. It sets the value of the array at the given index to the value
and returns the resulting array. Keep in mind, because of IVy’s
call-by-value convention, an action can’t have a side effect on its
input parameters. The `set`

action returns a *copy* of its input
array with one value changed.

This program seems inefficient, since it copies the array each time it
modifies one element. In fact it isn’t. The compiler recognizes that
the array can be modified “in place”. Lets run this program to see
what we get. First, we have to give a concrete interpretation for
type `t`

and define an extract:

```
interpret t -> int
export tabulate
extract iso_impl = tabulate, arr
```

Compile and run:

```
$ make array1
ivy_to_cpp target=repl isolate=iso_impl array1.ivy
g++ -g -o array1 array1.cpp
$ ./array1
> tabulate(4)
[0,1,4,9]
```

Here is a part of the interface definition for array types, from the
`collections`

module in the standard library:

```
module array(domain,range) = {
type t
action create(s:domain,y:range) returns (a:t)
action set(a:t,x:domain,y:range) returns (a:t)
action get(a:t,x:domain) returns (y:range)
action size(a:t) returns (s:domain)
action resize(a:t,s:domain,v:range) returns (a:t)
function end(A:t) : domain
function value(A:t,X:domain) : range
object spec = {
before create {
assert 0 <= s
}
after create {
assert end(a) = s & value(a,X) = y
}
before set {
assert 0 <= x & x < end(a)
}
after set {
assert value(a,X) = y if X = x else value(old a,X)
}
before get {
assert 0 <= x & x < end(a)
}
after get {
assert value(a,x) = y
}
after size {
assert s = end(a)
}
after resize {
assert end(a) = s;
assert 0 <= X & X < end(old a) -> value(a,X) = value(old a,X);
assert end(old a) <= X & X < s -> value(a,X) = v
}
...
}
...
}
```

In addition to `create`

and `set`

, array types provide an action `get`

to get the value at a given index, `size`

to get the `end`

index, and `resize`

to change the end index. The `resize`

action takes an addition parameter
that gives the value of the new elements in case the size is increased.

Notice that in the `set`

method, both the input and the output
parameter are called `a`

. This indicates to the compiler that we
intend to modify the array `a`

in place if possible. When writing the
`after`

specification for `set`

, we need a way to refer to the input
value of `a`

. For this, we use `old a`

. We can refer to the original
value of any component that is mutated by an action using `old`

. You
can also see the use of `old`

in the specification of `resize`

. The
expression `old x`

always referes to the value of symbol `x`

at the
time the action was called.

In the next section, we’ll see a more interesting application of arrays.

# Representation functions

Often, we use concrete dataypes as representations of abstract values. This gives us an efficient way of passing these values across interfaces.

As an example, consider using an array as a representation of a set. Here is the start of a module that accomplishes that:

```
#lang ivy1.7
module set(elem) = {
type this
alias t = this
relation contains(X:t,Y:elem)
action emptyset returns(s:t)
action add(s:t,e:elem) returns (s:t)
...
```

Notice something new here: `type this`

. This declares a type with the
same name as the object we are declaring (which won’t be known until
we instantiate this module). For convenience, we create an alias `t`

for this type. This interface of our abstact set type contains a
relation and two actions.

The relation `contains`

acts as a *representation function*. This
tells us what abstract set a given value of type `set`

represents.
The representation function allows us write specifications using the
abstract value that is represented. This means we can can pass around
values of type `set`

as if they were actually abstract sets instead of
arrays.

Now let’s specify the semantics of our two set operations in terms of
the `contains`

relation.

```
module set(elem) = {
...
object spec = {
after emptyset {
assert ~contains(s,X)
}
after add {
assert contains(s,X) <-> (contains(old s,X) | X = e)
}
}
```

The action `emptyset`

returns a representation of the empty set, while
`add`

returns set `s`

with element `e`

added. Notice that the set
parameter of `add`

is both input and output. This indiciates to the
compiler that we wish to avoid copying if possible. In most cases,
this will allowed the compiled code of `add`

to modify the set in
place.

Let’s have a hack at implementing these operations. The implementation
usually provides a concrete data representation, a definition for each
function or relation in the interface, and code for each
action. Typically it instantiates some modules to provide the concrete
representation. Here’s one way we could implement `set`

:

```
include collections
include order
module set(index,elem) = {
...
instance index : unbounded_sequence
instance arr : array(index.t,elem)
destructor repr(X:t) : arr
definition contains(X:t,y:elem) = exists Z. 0 <= Z & Z < repr(X).end & repr(X).value(Z) = y
implement emptyset {
repr(s) := arr.create(0,0)
}
implement add {
if ~contains(s,e) {
repr(s) := arr.resize(repr(s),index.next(arr.end(repr(s))),e)
}
}
isolate iso = this
}
```

This implementation uses an unbounded array as a representation.
The concrete type is usually provide by giving the abstract type a
*private destructor*, in this case the function `repr`

from type `t`

to
the array type. Giving the destructor in this way is equivalent to
writing:

```
type t = struct {
repr : arr
}
```

except the that field `repr`

, being a component of the implementation,
is not visible to users of `set`

(it’s similar to a private member in
C++). We use the function `repr`

to access the internal representation
of a set.

Next, we give the definition of `contains`

in terms of `repr`

. This
definition is also private. We say that `contains(x,y)`

is `true`

if a given `elem`

*y* can be found somewhere in the array representing
set `x`

.

The implementation of `emptyset`

returns an empty array. To add an
element to a set, we test whether the element is already present. If
not, we resize the array to make it one value larger, where the added
value is `e`

, the new element.

Finally, we create an isolate containing our object so it will be verified separately.

Notice that in the implementation of `add`

, we evaluate the predicate
`contains`

. IVy recognizes that `contains`

is executable because the
quantifier in the definition of `contains`

is bounded. This means it
can compile the quantifier into a loop from `0`

to `arr.end(X)`

. If
you try to compile an unbounded quantifier into executable code, IVy
will complain.

Also notice that instead of `arr.end(s) + 1`

, we wrote
`index.next(arr.end(s))`

. That is, we are treating `index`

as an
abstract datatype, assuming only that it provides an order `<`

and an
action `next`

. From a software engineering point of view, this is
probably useless abstraction. However, from a verification point of
view, it is useful – it allows us to verify `set`

without using the
theory or arithmetic.

The implementation is moderately efficient. That is, if the caller
of `add`

writes:

s := set.add(s,e)

the compiler will recognize that the array `s`

can be modified in
place. On the other hand, evaluating `contains`

will still be linear
time in the array size, since it loops over the array. This approach
is only practical for small sets.

To try out our sets, we instantiate a set type:

```
type elem
instance s : set(elem)
```

We export our two actions:

```
export s.emptyset
export s.add
```

Then we verify:

```
$ivy_check arrayset.ivy
ivy_check arrayset.ivy
Checking isolate s.impl.index.iso...
trying ext:s.impl.index.next...
checking consecution...
Checking isolate s.iso...
trying ext:s.add...
checking consecution...
trying ext:s.emptyset...
checking consecution...
OK
```

Notice that even though the object `index`

was instantiated from the
standard library, IVy still verified it. Generally speaking, even
trusted modules have properties that need to be verified when
instantiated in a particular environment.

# Representation invariants

Now let’s try adding an action to our `set`

module that removes an element:

```
module set(index,elem) = {
...
action remove(s:t,e:elem) returns (s:t)
object spec = {
...
after remove {
assert contains(s,X) <-> (contains(old s,X) & X ~= e);
}
}
object impl = {
...
implement remove {
if some (i:index.t) 0 <= i & i < repr(s).end & repr(s).value(i) = e {
var last := repr(s).end.prev;
repr(s) := repr(s).set(i,repr(s).get(last));
repr(s) := repr(s).resize(last,0)
}
}
}
}
```

This implementation of `remove`

scans the array for some index `i`

whose value is `e`

.
If such an `i`

exists, the value `e`

is removed by replacing it with the last value
in the array and then resizing the array to make it one element smaller.

Unfortunately, this implementation doesn’t work: if the input array
contains two copies of the element `e`

, one copy will remain. One
solution to this problem is to use a *representation invariant*. This
is a predicate a predicate that must be true of a value for it to be a
valid representation. The representation invariant is assumed to be
true of input values and must hold of output values.

In the case of our sets, the representation invariant is that no value
occurs twice in the array. It is part of the interface, and like `contains`

, its definition
is private:

```
relation valid(X:t)
```

We have to specify our interface to make the appropriate assumptions and guarantees:

```
object spec = {
after emptyset {
assert ~contains(s,X)
}
before add {
assert valid(s)
}
after add {
assert contains(s,X) <-> (contains(old s,X) | X = e);
assert valid(s)
}
before remove {
assert valid(s)
}
after remove {
assert contains(s,X) <-> (contains(old s,X) & X ~= e);
assert valid(s)
}
}
```

We give the definition of `valid`

in the implementation:

```
object impl = {
...
definition valid(X:t) =
forall Y,Z. (0 <= Y & Y < repr(X).end & 0 <= Z & Z < repr(X).end
& repr(X).value(Y) = repr(X).value(Z) -> Y = Z)
}
```

Now we can verify that our implementation of `remove`

works (try
verifying arrayset2.ivy). There is a disadvantage to
this approach, however. Users of our `set`

module now have to keep
track of the invariant that set values are valid. This could lead to many
“boilerplate” invariant conjectures. One day, IVy will solve this problem
using predicate subtypes.

# Array and loops

For now, it might be best to just implement `remove`

in a way that doesn’t
rely on a representation invariant. We can do this using a `while`

loop.
Here is the implementation we have in mind:

```
action remove(s:t,e:elem) returns (res:t)
object impl = {
...
implement remove {
local i:index.t, end:index.t {
i := 0;
res := emptyset;
end := arr.end(s);
while i < end
{
local f:elem {
f := arr.get(s,i);
if f ~= e {
res := add(res,f)
}
};
i := index.next(i)
}
}
}
}
```

That is, we scan the input array `s`

. When we encounter a value not
equal to the output set `res`

. This is easy enough to write, but Ivy
can’t prove it correct because of the loop. We’re going to need an
inductive invariant.

Before diving in to the proof, let’s consider the general schema of
inductive proofs of loops. We think of the loop as computing an
approximation to a desired function that approaches the correct result
with each iteration. We’ll call this function `step(i)`

, where `i`

is
the loop index. Suppose we want our loop to compute the “or” of the
bits in an array `a`

, that is:

```
function or = exists X. 0 <= X & X < arr.end(a) & arr.value(a,X)
```

Then our step function would be:

```
function step(i) = exists X. 0 <= X & X < i & arr.value(a,X)
```

That is, our approximation is the “or” of all the bits up to (but not
including) the index `i`

.

Now we need to give an *inductive* characterization of `step`

. That
is, we need to know how to compute the value of `step`

at the next
iteration. It looks like this:

```
property 0 <= I & index.succ(I,J) -> step(J) = (step(I) | arr.value(a,I))
```

That is, to compute approximation for the successor of index `I`

, we
“or” `step(I)`

with the bit in the array at index `I`

. Ivy can prove
this property from the definition of `step`

. It just unfolds the definition
of `step`

for `I`

and `J`

and uses its EPR decision procedure.
Notice we use the successor relation `succ`

provided by the index type
to represent the fact that `J`

is next after `I`

. This is because we
can’t call the action `next`

in a property of function definition.

Now we need to know that when we get to the and of the array, we have the desired result. Here is how we write this condition:

```
property I = arr.end(a) -> step(I) = or
```

Again, Ivy can easily prove this from the definitions of `step`

and
`or`

. If you squint at these two properties a bit, you’ll see that
they are very close to a logic program for computing `or`

(in fact,
we could easily recast them in the form of Horn
clauses). A logic
programming system could execute them directly to compute our result.
We have have in mind, however, to do the computation using a loop:

```
i := 0;
res := false;
while i < arr.end(a)
invariant 0 <= i & i <= arr.end(a)
invariant res = step(i)
{
res := res | arr.value(a,i);
i := index.next(i)
}
assert res = or
```

Notice the two invariants decorating the loop. IVy can prove that that
they are inductive and that the assertion is true. This can be done
using only our two properties, without reference to the actual
definition definition of `step`

. This is the general approach for
writing a loop in IVy: first characterize the loop inductively, then
write the loop with two invariants as above.

Now let’s get back to the `remove`

method. We start with the step
function:

```
function remove_step (s:t,e:elem,i:index.t,y:elem)
= (exists Z. 0 <= Z & Z < i & arr.value(s,Z) = y) & y ~= e
```

This is an approximation of the representation function
`contains`

. Here, `s`

is the input set, `e`

is the element to remove,
`i`

is the loop index and `y`

is an element to test for
membership. The function yields true if `y`

is not the removed element
`e`

and if it occurs before index `i`

in the input set. In other
words, this function gives us the output set accumulated up to index
`i`

.

Now we need an inductive charaterization. Basically, we add an element of the array to the output set if it is not the removed element:

```
property I >= 0 & index.succ(I,J) ->
(remove_step(X,E,J,Y) <-> (remove_step(X,E,I,Y) | arr.value(X,I) = Y & Y ~= E))
```

The exit condition tells us when we have completed the computation:

```
property I = arr.end(X) ->
(remove_step(X,E,I,Y) <-> (contains(X,Y) & Y~=E))
```

That is, when we reach the end of the array, we have accumulated all
the elements of the input set except `e`

.

Now here’s the `remove`

action, decorated with invariants:

```
implement remove {
local i:index.t, end:index.t {
i := 0;
res := arr.create(0,0);
end := arr.end(s);
while i < end
invariant 0 <= i & i <= end
invariant contains(res,Y) = remove_step(s,e,i,Y)
{
local f:elem {
f := arr.get(s,i);
if f ~= e {
res := add(res,f)
}
};
i := index.next(i)
}
}
}
```

Notice that the invariant is universally quantified on `Y`

, the test
element. The invariant says that our loop computes exactly what our
inductive characterization computes. Ivy can easily verify that this
version of `remove`

satisfies its specification.

Of course, this might seem like a lot of work for around ten lines of code. This style of proof has a big advantage, though, from the point of view of automating the proof. That is, the places in the Horn clauses where definitions need to be unfolded are clear. We don’t need fragile quantifier instantiation heuristics to make the proof go through.

In general, though, it’s best to avoid this kind of painstaking construction of loops by using higher-level operations on containers.