NeuralPGRank/fiqa-hard-negatives
Source: Hugging Face. Updated: 2024-11-23. Indexed: 2025-11-29.
Download link:
https://hf-mirror.com/datasets/NeuralPGRank/fiqa-hard-negatives
Resource description:
---
license: cc-by-sa-4.0
language:
- en
---
# Dataset Card
## Dataset Details
This dataset contains candidate documents for second-stage re-ranking on fiqa
(test split in [BEIR](https://huggingface.co/BeIR)). The candidates for each query consist of hard negatives mined with
[gtr-t5-xl](https://huggingface.co/sentence-transformers/gtr-t5-xl) as the Stage 1 ranker,
plus the ground-truth documents known to be relevant to that query. The dataset is released with our paper
[Policy-Gradient Training of Language Models for Ranking](https://gao-g.github.io/);
please cite it if you use this dataset.
## Direct Use
You can load the dataset with:
```python
from datasets import load_dataset
dataset = load_dataset("NeuralPGRank/fiqa-hard-negatives")
```
Each example is a dictionary:
```python
>>> dataset['test'][0]
{
"qid" : ..., # query ID
"topk" : {
doc ID: ..., # document ID as the key; None or a score as the value
doc ID: ...,
...
},
}
```
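As a rough illustration of working with the `topk` mapping described above, the following sketch separates the scored candidates from the `None`-valued ones and orders the scored ones. It makes two assumptions that the card does not confirm: that a higher score means a stronger Stage 1 match, and that `None` simply means no score is available. Since the card does not say what `None` signifies, those IDs are returned separately rather than interpreted. The helper name `split_candidates` and the sample record are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def split_candidates(example: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Split one example's candidates into (ranked, unscored) document IDs.

    ranked:   IDs with a numeric value, sorted from highest to lowest score
              (assumption: higher score = stronger Stage 1 match).
    unscored: IDs whose value is None, in their original order.
    """
    topk = example["topk"]
    scored = [(doc_id, score) for doc_id, score in topk.items() if score is not None]
    unscored = [doc_id for doc_id, score in topk.items() if score is None]
    # sorted() is stable, so documents with equal scores keep their original order.
    ranked = [doc_id for doc_id, _ in sorted(scored, key=lambda p: p[1], reverse=True)]
    return ranked, unscored


# Hypothetical record with the same shape as dataset['test'][0].
sample = {"qid": "q1", "topk": {"d1": 0.42, "d2": None, "d3": 0.87}}
ranked, unscored = split_candidates(sample)
print(ranked)    # ['d3', 'd1']
print(unscored)  # ['d2']
```

The same function can be applied to a real record, for example `split_candidates(dataset['test'][0])`, once the dataset is loaded as shown earlier.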
## Citation
```
@inproceedings{Gao2023PolicyGradientTO,
title={Policy-Gradient Training of Language Models for Ranking},
author={Ge Gao and Jonathan D. Chang and Claire Cardie and Kiant{\'e} Brantley and Thorsten Joachims},
booktitle={Conference on Neural Information Processing Systems (Foundation Models for Decision Making Workshop)},
year={2023},
url={https://arxiv.org/pdf/2310.04407}
}
```
## Dataset Card Author and Contact
[Ge Gao](https://gao-g.github.io/)
Provided by:
NeuralPGRank



