About the Leaderboard

The Verus Proof Synthesis Leaderboard tracks progress in using Large Language Models (LLMs) to automatically generate formal correctness proofs for Rust programs verified with Verus.

Our goal is to provide standardized benchmarks and transparent comparisons to accelerate research in LLM-based formal verification.

📊 Benchmarks

Verus-Bench Algorithm-Level

Classic algorithmic verification tasks including sorting, searching, array manipulation, and mathematical proofs.

150
Tasks
4
Sources
~30
Avg LoC
~10
Avg Proof LoC

Sources: CloverBench, MBPP, Diffy, Misc

VeruSAGE-Bench Repository-Level

Complex verification tasks extracted from real-world verified systems including OS kernels, distributed systems, and storage systems.

849
Tasks
8
Projects
~950
Avg LoC
~50
Avg Proof LoC

Projects: Anvil, IronKV, NRKernel, ATMO, and more

VeruSAGE-Bench Source Projects

Anvil (AL)
Distributed Systems
Anvil Advanced (AC)
Distributed Systems
IronKV (IR)
Key-Value Store
Memory Allocator (MA)
Systems
Node Replication (NO)
Distributed Systems
NRKernel (NR)
OS Kernel
ATMO (OS)
Microkernel
Storage (ST)
Storage Systems
Vest (VE)
Serialization

🔬 Proof Synthesis Systems

AutoVerus

AutoVerus is a three-phase proof synthesis system for algorithm-level verification:

📄 Paper: arXiv:2409.13082 (OOPSLA 2025)

VeruSAGE

VeruSAGE is an agentic framework for repository-level verification that uses an observation-reasoning-action loop with specialized tools for navigating large codebases.

📄 Paper: arXiv:2512.18436

📚 What is Verus?

Verus is a tool for verifying the correctness of Rust programs. It uses SMT-based verification to prove that code satisfies its specifications, which are written as pre/post-conditions and invariants in Rust syntax.

Unlike testing, which can only show the presence of bugs, verification can mathematically prove their absence. Verus has been used to verify production systems including IronKV and components of Azure infrastructure.

📖 Citation

If you use our benchmarks or systems in your research, please cite:

@article{autoverus, title={AutoVerus: Automated Proof Generation for Rust Code}, author={Chenyuan Yang and Xuheng Li and Md Rakib Hossain Misu and Jianan Yao and Weidong Cui and Yeyun Gong and Chris Hawblitzel and Shuvendu K. Lahiri and Jacob R. Lorch and Shuai Lu and Fan Yang and Ziqiao Zhou and Shan Lu}, journal={Proceedings of the ACM on Programming Languages}, volume={9}, number={OOPSLA2}, year={2025}, publisher={ACM} }
@misc{verusage, title={VeruSAGE: A Study of Agent-Based Verification for Rust Systems}, author={Chenyuan Yang and Natalie Neamtu and Chris Hawblitzel and Jacob R. Lorch and Shan Lu}, year={2025}, eprint={2512.18436}, archivePrefix={arXiv}, primaryClass={cs.OS} }

🔗 Related Links

📧 Contact

For questions, suggestions, or collaborations, please open an issue on our GitHub repository or contact the authors directly.