Please go to for all dataset descriptions and pointers.

File format

For both tasks, please prepare the test results file in the following TAB-separated (TSV) format: qid<TAB>pid<TAB>rank.

1124703    8766037    1
1124703    8021997    2
1124703    7816201    3
1124703    8296123    4
1124703    8790898    5
1124703    5451590    6
1124703    8021999    7
1124703    8388210    8
1124703    8702520    9
1124703    8790903    10
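
As an illustration of the format above, here is a minimal Python sketch that writes a results file. It is not part of the official tooling: the function name `write_run`, the `max_rank` parameter, and the in-memory `rankings` dictionary are assumptions made for this example, and the two IDs shown are copied from the sample rows.

```python
import csv
import io


def write_run(rankings, out, max_rank=10):
    """Write qid<TAB>pid<TAB>rank rows, one per retrieved item.

    rankings: dict mapping a query id to a list of ids, best first.
    out: any writable text stream (an open file or io.StringIO).
    max_rank: hypothetical cutoff; ranks start at 1.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for qid, pids in rankings.items():
        for rank, pid in enumerate(pids[:max_rank], start=1):
            writer.writerow([qid, pid, rank])


# Example: two results for one query, written to an in-memory buffer.
buf = io.StringIO()
write_run({1124703: [8766037, 8021997]}, buf)
tsv_text = buf.getvalue()
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` in place of the buffer.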

We report MRR@10 for both tasks. Therefore, to minimize the size of your test results file, please feel free to include only the top 10 results per query.
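
For a quick sanity check on the dev set, the metric can be sketched in a few lines of Python. This is not the official evaluation script. It assumes the usual definition of MRR@10 (the reciprocal rank of the first relevant result within the top 10, averaged over all judged queries, with 0 for queries that have no relevant result in the top 10). The names `mrr_at_k`, `run`, and `qrels` are hypothetical, and the IDs and relevance labels in the example are toy values.

```python
def mrr_at_k(run, qrels, k=10):
    """Mean reciprocal rank with cutoff k.

    run: dict mapping a query id to a list of ids, best first.
    qrels: dict mapping a query id to the set of relevant ids.
    Queries missing from run contribute a reciprocal rank of 0.
    """
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        ranked = run.get(qid, [])[:k]
        for rank, pid in enumerate(ranked, start=1):
            if pid in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(qrels)


# Toy example: query 1 has its first relevant hit at rank 2,
# query 2 has no relevant hit in its list.
run = {1: [10, 11, 12], 2: [20, 21]}
qrels = {1: {11}, 2: {99}}
score = mrr_at_k(run, qrels)
```

Because only ranks 1 through 10 can contribute to the score, results beyond the top 10 do not change it.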

Evaluation script

The official evaluation scripts for the two tasks are available at the locations below:

Submission process

Once you have built a model that meets your expectations when evaluated on the dev set, you can submit your test results for official evaluation on the test set. To ensure the integrity of the official test results, we do not release the correct answers for the test set to the public.

To submit your model for official evaluation on the test set, follow the steps corresponding to the appropriate task:

Document ranking

For the document ranking task, we follow a GitHub pull request based submission process. Please find the submission guidelines for the document ranking task here:

Passage ranking

For the passage ranking task, we follow a GitHub pull request based submission process. Please find the submission guidelines for the passage ranking task here:

Terms and Conditions

The MS MARCO and ORCAS datasets are intended for non-commercial research purposes only to promote advancement in the field of artificial intelligence and related areas, and are made available free of charge without extending any license or other intellectual property rights. The datasets are provided “as is” without warranty and usage of the data has risks since we may not own the underlying rights in the documents. We are not liable for any damages related to use of the datasets. Feedback is voluntarily given and can be used as we see fit. By using any of these datasets you are automatically agreeing to abide by these terms and conditions. Upon violation of any of these terms, your rights to use the datasets will end automatically.

Please contact us at if you own any of the documents made available but do not want them in this dataset. We will remove the data accordingly. If you have questions about use of the dataset or any research outputs in your products or services, we encourage you to undertake your own independent legal review. For other questions, please feel free to contact us.


