How Revizor works

Revizor in a nutshell

Revizor is a tool for detecting unexpected microarchitectural leakage in CPUs. Microarchitectural leakage is the information that an attacker could learn by launching a microarchitectural side-channel attack (e.g., Spectre or Meltdown). The expected microarchitectural leakage is the leakage we already know about (i.e., known microarchitectural vulnerabilities); we describe it in the form of a Speculation Contract (see below). Accordingly, the unexpected leakage is any leakage not described by the contract - we call it a contract violation. Revizor's task is to find such violations.

Speculation Contracts

Below is a brief intro to Contracts. You can find a more detailed description in the original paper and in the Background section of the Revizor paper.

Microarchitectural Leakage and Hardware Traces

Consider two programs, an attacker and a victim. The attacker launches a microarchitectural side-channel attack (e.g., a cache side channel) to spy on the victim and learn some of its data. A hardware trace is the sequence of all observations made through this side channel after each instruction during the victim's execution. In other words, a hardware trace is the result of a side-channel attack.

We abstractly represent the hardware trace as the output of a function

HTrace = Attack(Prog, Data, Ctx)

that takes three input parameters: (1) the victim program Prog; (2) the input Data processed by the victim program (i.e., the architectural state, including registers and main memory); (3) the microarchitectural context Ctx in which it executes. The information exposed by a hardware trace depends on the assumed side channel and threat model.

Example: If the threat model includes attacks on a data cache, HTrace is composed of the cache set indexes used by Prog's loads and stores. If it includes attacks on an instruction cache, HTrace contains the addresses of executed instructions.
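
For concreteness, here is a minimal Python sketch of how such a data-cache hardware trace could be modeled. It is an illustration only, not Revizor's implementation; the 64-byte line size and 64-set cache geometry are made-up parameters for the example:

LINE_SIZE = 64   # assumed cache line size in bytes
NUM_SETS = 64    # assumed number of cache sets

def cache_set(address: int) -> int:
    # Map a memory address to the cache set it occupies.
    return (address // LINE_SIZE) % NUM_SETS

def htrace(accessed_addresses: list[int]) -> list[int]:
    # Hardware trace under a data-cache threat model:
    # one observation (the touched cache set) per load/store.
    return [cache_set(addr) for addr in accessed_addresses]

print(htrace([0x110, 0x220]))  # [4, 8]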

A program leaks information via side channels when its hardware traces depend on the inputs (Data): we assume the attacker knows Prog and can manipulate Ctx, hence any difference between hardware traces implies a difference in Data, which effectively exposes information to the attacker.

What's a Speculation Contract?

A speculation contract specifies the information that a CPU can expose during a program's execution under a given threat model. For each instruction in the CPU's ISA (or a subset thereof), the contract describes the information exposed by the instruction (observation clause) and the externally-observable speculation that the instruction may trigger (execution clause). When a contract covers only a subset of the ISA, the leakage of unspecified instructions is undefined.

Example: consider the contract summarized in the next table:

Instruction   Observation Clause   Execution Clause
Load          Expose Address       -
Store         Expose Address       -
Cond. Jump    -                    Mispredict Target
Other         -                    -

We call this contract MEM-COND. Through the observation clauses of loads and stores, the contract prescribes that the addresses of all memory accesses may be exposed (hence MEM). The execution clause of conditional branches describes their misprediction; thus the contract prescribes that branch targets may be mispredicted (hence COND). This way, the contract models a data-cache side channel on a CPU with branch prediction.
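
As an illustration, these clauses can be written down as a small lookup table. The Python encoding below is a sketch for exposition, not Revizor's internal representation:

# Each instruction class maps to its observation and execution clauses;
# None means nothing is exposed / no speculation is triggered.
MEM_COND = {
    "load":      {"observation": "expose address", "execution": None},
    "store":     {"observation": "expose address", "execution": None},
    "cond_jump": {"observation": None, "execution": "mispredict target"},
    "other":     {"observation": None, "execution": None},
}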

A contract trace CTrace contains the sequence of all observations that the contract allows to be exposed after each instruction during a program's execution, including the instructions executed speculatively. Conversely, any information not exposed via CTrace is supposed to be kept secret.

We abstractly represent a contract as a function Contract that maps the program Prog and its input Data to a contract trace CTrace:

πΆπ‘‡π‘Ÿπ‘Žπ‘π‘’ = πΆπ‘œπ‘›π‘‘π‘Ÿπ‘Žπ‘π‘‘(π‘ƒπ‘Ÿπ‘œπ‘”,π·π‘Žπ‘‘π‘Ž)

Example: Consider the following program:

z = array1[x] # base of array1 is 0x100
if y < 10:
    z = array2[y] # base of array2 is 0x200

It is executed with input data={x=0x10, y=0x20}. The MEM-COND contract trace is ctrace=[0x110, 0x220]: the load at line 1 exposes the accessed address during normal execution, and the load at line 3 exposes its address during the speculative execution triggered by the mispredicted branch at line 2.
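
The sketch below reproduces this contract trace in Python. The hard-coded array bases and two-observation structure mirror the example above; the function is illustrative, not Revizor's model:

ARRAY1_BASE = 0x100
ARRAY2_BASE = 0x200

def mem_cond_ctrace(x: int, y: int) -> list[int]:
    # Line 1: the load's observation clause exposes its address.
    ctrace = [ARRAY1_BASE + x]
    # Line 2: the branch's execution clause says its target may be
    # mispredicted, so the load at line 3 is reached either
    # architecturally (y < 10) or speculatively (y >= 10);
    # in both cases its address enters the contract trace.
    ctrace.append(ARRAY2_BASE + y)
    return ctrace

print([hex(v) for v in mem_cond_ctrace(x=0x10, y=0x20)])  # ['0x110', '0x220']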

A CPU complies with a contract when its hardware traces (collected on the actual CPU) leak at most as much information as the contract traces. Formally, we require that whenever any two executions of any program produce the same contract trace (implying that the difference between their inputs is not exposed), the respective hardware traces must also match.

A CPU complies with a Contract if, for all programs Prog, all input pairs (Data, Data'), and all initial microarchitectural states Ctx:

πΆπ‘œπ‘›π‘‘π‘Ÿπ‘Žπ‘π‘‘(π‘ƒπ‘Ÿπ‘œπ‘”,π·π‘Žπ‘‘π‘Ž) = πΆπ‘œπ‘›π‘‘π‘Ÿπ‘Žπ‘π‘‘(π‘ƒπ‘Ÿπ‘œπ‘”,π·π‘Žπ‘‘π‘Žβ€²) -> π΄π‘‘π‘‘π‘Žπ‘π‘˜(π‘ƒπ‘Ÿπ‘œπ‘”,π·π‘Žπ‘‘π‘Ž,𝐢𝑑π‘₯) = π΄π‘‘π‘‘π‘Žπ‘π‘˜(π‘ƒπ‘Ÿπ‘œπ‘”,π·π‘Žπ‘‘π‘Žβ€²,𝐢𝑑π‘₯)

Conversely, a CPU violates a contract if there exists a program Prog, a microarchitectural state Ctx, and two inputs Data and Data' that agree on their contract traces but disagree on the hardware traces. We call the tuple (Prog, Ctx, Data, Data') a contract counterexample. A counterexample witnesses that an adversary can learn more information from the hardware traces than the contract specifies, and it indicates a potential microarchitectural vulnerability that the contract did not account for.

Model-based Relational Testing

To find contract violations, we use the following approach, which we call Model-based Relational Testing (MRT).

The next figure shows the main components of MRT:

[Figure: the main components of MRT]

Test case and input generation. We sample the search space of programs, inputs, and microarchitectural states to find counterexamples. The generated instruction sequences (test cases) are composed of instructions from the ISA subset described by the contract. The test cases and their respective inputs are generated to achieve high diversity and to increase the potential for speculation and leakage.
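
As a highly simplified illustration of this sampling step (Revizor's actual generator is much more careful about program structure; see Revizor Architecture), one could draw random test cases and inputs as follows. The instruction pool and input width are made-up parameters:

import random

ISA_SUBSET = ["load", "store", "cond_jump", "add", "sub"]  # hypothetical pool

def generate_test_case(length: int) -> list[str]:
    # A test case is a random instruction sequence drawn from the
    # ISA subset covered by the contract.
    return [random.choice(ISA_SUBSET) for _ in range(length)]

def generate_inputs(count: int, bits: int = 64) -> list[int]:
    # Inputs are random architectural states, here reduced to single values.
    return [random.getrandbits(bits) for _ in range(count)]

prog = generate_test_case(length=32)
inputs = generate_inputs(count=50)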

Collecting contract traces. We implement an executable Model of the contract to allow automatic collection of contract traces for standard binaries. For this, we modify a functional CPU emulator to implement speculative control flow based on the contract's execution clause and to record traces based on its observation clause.

Collecting hardware traces. We collect hardware traces by executing the test case on the CPU under test and measuring the observable microarchitectural state changes during the execution according to the threat model. The executor employs several methods to achieve consistent and repeatable measurements.

Relational Analysis. Based on the collected contract and hardware traces, we identify contract violations. Namely, we search for pairs of inputs that satisfy the following condition:

ContractTrace1 == ContractTrace2
               and
HardwareTrace1 != HardwareTrace2

This requires relational reasoning:

* We partition inputs into groups, which we call input classes. All inputs within a class have the same contract trace; thus, input classes correspond to the equivalence classes of contract-trace equality. Classes with a single (ineffective) input are discarded.
* For each class, we check whether all inputs within it produce the same hardware trace.

If the check fails on any of the classes, we have found a counterexample that witnesses a contract violation.
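
A minimal sketch of this relational analysis, assuming the contract and hardware traces have already been collected into dictionaries keyed by input (with traces stored as hashable values, e.g., tuples):

from collections import defaultdict

def find_violations(ctraces: dict, htraces: dict) -> list:
    # Partition inputs into classes: same contract trace => same class.
    classes = defaultdict(list)
    for inp, ctrace in ctraces.items():
        classes[ctrace].append(inp)

    violations = []
    for members in classes.values():
        if len(members) < 2:  # discard single-input (ineffective) classes
            continue
        # Within a class, all hardware traces must match; any mismatch
        # is a counterexample witnessing a contract violation.
        first = members[0]
        for other in members[1:]:
            if htraces[first] != htraces[other]:
                violations.append((first, other))
    return violations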

Revizor

Revizor implements the MRT approach for black-box CPUs. The implementation details are described in Revizor Architecture.