Getting Started#

This page will be a short guide to setting up a problem with PyBryt, including the process of designing reference implementations based on solutions, create scaffold code for students, and organizing and assessing student implementations. This guide assumes that you have already installed PyBryt.

In this example, suppose that you want students to write a function that performs matrix exponentiation on nested Python lists. You will provide students with a matrix multiplication function that uses Strassen’s algorithm for multipling matrices, and it will be their objective to use that matrix multiplication function to implement the logic for computing matrix powers.

[32]:
import hashlib
import json
import numpy as np
import pybryt
from copy import deepcopy
from pprint import pprint
from strassen import matmul  # the matrix multiplication function

Step 1: Writing Solutions#

Let’s first consider the possible solutions to this problem. The naive solution would be to iteratively or recursively multiply the matrix with itself the requisite number of times:

[13]:
diag = lambda x, n: [[x if i == j else 0 for i in range(n)] for j in range(n)]

def matpow_naive(A, p):
    """
    Raise a matrix to a non-negative power.

    Parameters
    ----------
    A : list[list[int]]
        the matrix
    p : int
        the power

    Returns
    -------
    list[list[int]]
        the exponentiated matrix
    """
    if p == 0:
        return diag(1, len(A))
    B = deepcopy(A)
    for i in range(p - 1):
        B = matmul(B, A)
    return B

However, there is more than one way to implement matrix exponentiation; a faster divide-and-conquer algorithm that can be used to perform matrix exponentiation in logarithmic time:

[14]:
def matpow_dac(A, p):
    """
    Raise a matrix to a non-negative power.

    Parameters
    ----------
    A : list[list[int]]
        the matrix
    p : int
        the power

    Returns
    -------
    list[list[int]]
        the exponentiated matrix
    """
    if p == 0:
        return diag(1, len(A))
    elif p == 1:
        return A

    P = matpow_dac(A, p // 2)
    if p % 2 == 0:
        return matmul(P, P)
    else:
        return matmul(A, matmul(P, P))

Step 2: Testing Implementations#

Now that we have a couple of solutions, we need a way to test them so that PyBryt can generate the annotations. To create testing data, we’ll use Python’s built-in random library. The function generate_rand_matrix defined below creates a square matrix populated with random integers given the size of the matrix n and an RNG seed seed.

[15]:
import random

def generate_rand_matrix(n, seed, dmin=-100, dmax=100):
    """
    Generate a square matrix populated with random data.

    Parameters
    ----------
    n    : int
        the size of the matrix
    seed : int
        a seed to use for the RNG
    dmin : int
        the minimum value to allow when generating data
    dmax : int
        the maximum value to allow when generating data

    Returns
    -------
    list[list[int]]
        the matrix
    """
    rng = random.Random(seed)
    return [[rng.randrange(dmin, dmax) for _ in range(n)] for _ in range(n)]

To test the matrix power functions, we’ll compare the matrices returned by the functions on some randomly-generated matrices to the matrices that NumPy returns. The function test_matpow below implements this logic, and also checks some simple corner cases; it raises an AssertionError if any of the tests fail.

[16]:
def test_matpow(matpow):
    """
    Tests an implementation of matrix exponentiation against NumPy.

    Parameters
    ----------
    matpow : callable[[list[list[int]], int], list[list[int]]]
        the matrix power function

    Raises
    ------
    AssertionError
        if `matpow` returns an incorrect value
    """
    simple_matrix_powers = [
        (diag(0, 5), 0),
        (diag(0, 5), 2),
        (diag(1, 5), 2),
    ]
    for args in simple_matrix_powers:
        mat_pow = matpow(*args)
        assert np.allclose(mat_pow, np.linalg.matrix_power(*args))

    rng = random.Random(42)
    for _ in range(5):
        n, p = rng.randrange(10, 100), rng.randrange(10)
        mat = generate_rand_matrix(n, n, dmin=-10, dmax=10)
        mat_pow = matpow(mat, p)
        assert np.allclose(
            mat_pow, np.linalg.matrix_power(np.array(mat, dtype="int64"), p))

Now that we have a way to test our solutions, let’s do so:

[17]:
test_matpow(matpow_naive)
test_matpow(matpow_dac)

Step 3: Annotating Solutions#

Let’s now turn our tested solutions into reference implementations. We’ll start by annotating the naive solution. In the annotated version below, we create value annotations for the return values and append them to the naive_annots list. We also create a value annotation for each intermediate matrix we encounter in the for loop and use Annotation.before to create temporal annotations asserting the order in which these matrices should appear in the memory footprint.

[25]:
naive_annots = []

def matpow_naive(A, p):
    """
    Raise a matrix to a non-negative power.

    Parameters
    ----------
    A : list[list[int]]
        the matrix
    p : int
        the power

    Returns
    -------
    list[list[int]]
        the exponentiated matrix
    """
    if p == 0:
        d = diag(1, len(A))
        naive_annots.append(pybryt.Value(
            d,
            name="identity",
            success_message="Returned identity when p == 0",
            failure_message="Incorrect return value when p == 0",
        ))
        return d

    B = deepcopy(A)
    partial = pybryt.Value(
        B,
        name="partial-product",
        success_message="Found correct partial products",
        failure_message="Incorrect partial products",
    )

    for i in range(p - 1):
        B = matmul(B, A)
        next_partial = pybryt.Value(
            B,
            name="partial-product",
            success_message="Found",
            failure_message="Incorrect return value when p == 0",
        )
        naive_annots.append(partial.before(next_partial))
        partial = next_partial

    return B

To annotate the divide-and-conquer solution, we again create values checking the return value and all intermediate matrices encountered. However, in this instance, we’re using a Collection with enforce_order=True to gather the annotations for the intermediate matrices and assert their ordering. (This method is slightly easier to conceptualize for the recursive algorithm used here since we can pass the collection to subsequent calls to matpow_dac, as opposed to the iterative algorithm used above.)

[26]:
dac_annots = []

def matpow_dac(A, p, collection=None):
    """
    Raise a matrix to a non-negative power.

    Parameters
    ----------
    A          : list[list[int]]
        the matrix
    p          : int
        the power
    collection : pybryt.Collection
        a collection of annotations to which the intermediate matrices will be added

    Returns
    -------
    list[list[int]]
        the exponentiated matrix
    """
    if p == 0:
        d = diag(1, len(A))
        dac_annots.append(pybryt.Value(
            d,
            name="identity",
            success_message="Returned identity when p == 0",
            failure_message="Incorrect return value when p == 0",
        ))
        return d

    elif p == 1:
        ret = A
        dac_annots.append(pybryt.Value(
            ret,
            name="1st-power",
            success_message="Returned unaltered matrix when p == 1",
            failure_message="Incorrect return value when p == 1",
        ))
        return ret

    should_track = False
    if collection is None:
        collection = pybryt.Collection(
            enforce_order=True,
            success_message="Found the correct sequence of partial powers",
            failure_message="Did not find the correct sequence of partial powers",
        )
        dac_annots.append(collection)
        should_track = True

    P = matpow_dac(A, p // 2, collection=collection)
    collection.add(pybryt.Value(P))

    if p % 2 == 0:
        ret = matmul(P, P)
    else:
        ret = matmul(A, matmul(P, P))

    if should_track:
        dac_annots.append(pybryt.Value(
            ret,
            name="return-value",
            success_message="Returned correct value",
            failure_message="Incorrect return value",
        ))

    return ret

Step 4: Populating and Instantiating Reference Implementations#

With the annotated versions of our solutions ready to go, we’re now able to create our annotations. First, we’ll need to populate the lists of annotations by running our test_matpow function on each of the annotation solutions.

[28]:
naive_annots.clear()
test_matpow(matpow_naive)

dac_annots.clear()
test_matpow(matpow_dac)

Finally, we should add one more annotation to each reference to forbid the import of NumPy; this ensures that students aren’t taking the easy way out in their implementations by using NumPy’s matrix power implementation.

[29]:
naive_annots.append(pybryt.ForbidImport("numpy"))
dac_annots.append(pybryt.ForbidImport("numpy"))

Now, we can insantiate the ReferenceImplementation objects:

[30]:
naive_ref = pybryt.ReferenceImplementation("naive-matpow", naive_annots)
dac_ref = pybryt.ReferenceImplementation("dac-matpow", dac_annots)

It’s at this stage in the game that you’ll probably end up switching notebooks to start the next step in the process (writing the student scaffold), so it let’s save the references to files so that they can be used again later.

[31]:
naive_ref.dump()
dac_ref.dump()

Step 5: Writing the Assignment Scaffold#

Now we need to provide the students with some scaffold code that they can use to implement their own matpow function. The first step is to make some modifications to how we’re testing the solutions so that we don’t set of the annotations that forbid the use of NumPy. To do this, we’ll keep the same seed and overall structure of the test, but instead of comparing the function to NumPy’s implementation, we’ll convert each list to a string and compare the hash of the stringified list against the known hash of the stringified correct list. The cell below generates these hashes and saves them to the file hashes.json.

[50]:
hash_matrix = lambda m: hashlib.sha256(str(m).encode()).hexdigest()

def matpow(A, p):
    """
    Raise a matrix to a non-negative power.

    Parameters
    ----------
    A : list[list[int]]
        the matrix
    p : int
        the power

    Returns
    -------
    list[list[int]]
        the exponentiated matrix
    """
    if p == 0:
        return diag(1, len(A))
    elif p == 1:
        return A

    P = matpow(A, p // 2)
    if p % 2 == 0:
        return matmul(P, P)
    else:
        return matmul(A, matmul(P, P))

hashes = []

simple_matrix_powers = [
    (diag(0, 5), 0),
    (diag(0, 5), 2),
    (diag(1, 5), 2),
]
for args in simple_matrix_powers:
    mat_pow = matpow(*args)
    hashes.append(hash_matrix(mat_pow))

rng = random.Random(42)
for _ in range(5):
    n, p = rng.randrange(10, 100), rng.randrange(10)
    mat = generate_rand_matrix(n, n, dmin=-10, dmax=10)
    mat_pow = matpow(mat, p)
    hashes.append(hash_matrix(mat_pow))

with open("hashes.json", "w+") as f:
    json.dump(hashes, f)

With the hashes in place, we can rewrite our test_matpow function in a way that can be used in the students’ notebooks:

[51]:
def test_matpow(matpow):
    """
    Tests an implementation of matrix exponentiation.

    Parameters
    ----------
    matpow : callable[[list[list[int]], int], list[list[int]]]
        the matrix power function

    Raises
    ------
    AssertionError
        if `matpow` returns an incorrect value
    """
    with open("hashes.json") as f:
        hashes = json.load(f)

    hashes = iter(hashes)
    simple_matrix_powers = [
        (diag(0, 5), 0),
        (diag(0, 5), 2),
        (diag(1, 5), 2),
    ]
    for args in simple_matrix_powers:
        mat_pow = matpow(*args)
        assert hash_matrix(mat_pow) == next(hashes)

    rng = random.Random(42)
    for _ in range(5):
        n, p = rng.randrange(10, 100), rng.randrange(10)
        mat = generate_rand_matrix(n, n, dmin=-10, dmax=10)
        mat_pow = matpow(mat, p)
        assert hash_matrix(mat_pow) == next(hashes)

A very important note: The function that we use to test the students’ implementations of the matpow function must call the function on the same inputs as were used to construct the reference implementation, as the one above does thanks to RNG seeding. Other inputs can be added, but all of the inputs used to construct annotations must be used in the student implementation. This is because PyBryt’s annotations, at their most basic level, work by checking for specific values being used in the students’ code, so if some inputs used to construct the reference are not used to test the students’ implementations, those values will not be present and the reference will never be satisfied.

Let’s verify that this works against the matpow function from above:

[52]:
test_matpow(matpow)

Finally, all that’s left to do is provide the students a place to put their code and connect all the pieces:

def matpow(A, p):
    """
    Raise a matrix to a non-negative power.

    Parameters
    ----------
    A : list[list[int]]
        the matrix
    p : int
        the power

    Returns
    -------
    list[list[int]]
        the exponentiated matrix
    """
    ... # YOUR CODE HERE

test_matpow(matpow)

If you want students to be able to run their implementations against the references on their own, you can use the pybryt.check context manager:

with pybryt.check([naive_ref, dac_ref]):
    test_matpow(matpow)

Conclusion#

And that’s all there is to it! Continue reading the documentation to learn more about the different kinds of annotations PyBryt supports, how to create and manage student implementations, and much more. If you’d like more examples on how to use PyBryt, take a look at the intro and advanced modules on Microsoft Learn, or the demos folder in the GitHub repo.