BenchPress

Predict any LLM's score on any benchmark.

Try it | Predict your own model

Pick a model. Pick a benchmark.

01 Choose a model
02 Choose a benchmark

Score

Select a model and benchmark above.

Leaderboard

On this benchmark.

Full (incl. predicted) | Observed only

Resources

Paper, code, and data.

Use the code to reproduce the paper's results, or download the score matrix behind the predictor.

Paper code Dataset arXiv

Contribute

Have more scores?

Report benchmark scores for a model. Include the model, benchmark, score, evaluation setting, effort, and source; we will review the provenance before adding the scores to the matrix.
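As a rough illustration of what a complete report contains, the six requested fields could be collected in a record like the one below. This is only a sketch: the class name `ScoreReport`, the field names, the helper `missing_fields`, and all example values are hypothetical and do not describe an official submission schema.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass(frozen=True)
class ScoreReport:
    """One reported score; field names are illustrative, not an official schema."""
    model: str
    benchmark: str
    score: float
    evaluation_setting: str
    effort: str
    source: str


def missing_fields(report: ScoreReport) -> List[str]:
    """Return the names of text fields that were left blank."""
    return [
        name
        for name, value in asdict(report).items()
        if isinstance(value, str) and not value.strip()
    ]


# Placeholder values only; substitute a real model, benchmark, and source.
example = ScoreReport(
    model="example-model",
    benchmark="example-benchmark",
    score=0.0,
    evaluation_setting="zero-shot",
    effort="default",
    source="",
)
print(missing_fields(example))
```

Checking for blank fields before submitting is worthwhile because provenance is reviewed: a report without a source cannot be verified.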

Report scores