NeuralPGRank/fiqa-hard-negatives

Name: NeuralPGRank/fiqa-hard-negatives
Creator: NeuralPGRank
Published: 2024-11-23 15:33:47
License: cc-by-sa-4.0

Source: Hugging Face. Updated 2024-11-23; indexed 2025-11-29.

Download link:

https://hf-mirror.com/datasets/NeuralPGRank/fiqa-hard-negatives


Description:

---
license: cc-by-sa-4.0
language:
- en
---

# Dataset Card

## Dataset Details

This dataset contains candidate documents for second-stage re-ranking on fiqa (test split in [BEIR](https://huggingface.co/BeIR)). The candidates consist of hard negatives mined with [gtr-t5-xl](https://huggingface.co/sentence-transformers/gtr-t5-xl) as the Stage 1 ranker, plus ground-truth documents that are known to be relevant to the query. This dataset is released with our paper [Policy-Gradient Training of Language Models for Ranking](https://gao-g.github.io/); please cite it if you use this dataset.

## Direct Use

You can load the dataset with:

```python
from datasets import load_dataset

dataset = load_dataset("NeuralPGRank/fiqa-hard-negatives")
```

Each example is a dictionary:

```python
>>> dataset['test'][0]
{
    "qid" : ...,    # query ID
    "topk" : {
        doc ID: ...,    # document ID as the key; None or a score as the value
        doc ID: ...,
        ...
    },
}
```

## Citation

```
@inproceedings{Gao2023PolicyGradientTO,
  title={Policy-Gradient Training of Language Models for Ranking},
  author={Ge Gao and Jonathan D. Chang and Claire Cardie and Kiant{\'e} Brantley and Thorsten Joachims},
  booktitle={Conference on Neural Information Processing Systems (Foundation Models for Decision Making Workshop)},
  year={2023},
  url={https://arxiv.org/pdf/2310.04407}
}
```

## Dataset Card Author and Contact

[Ge Gao](https://gao-g.github.io/)
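As a follow-up to the Direct Use section, here is a minimal sketch of how one example's `topk` mapping might be unpacked before re-ranking. It relies only on the schema described in the card (`qid`, plus `topk` mapping document IDs to `None` or a score). The helper name `split_candidates` and the toy example are hypothetical, and the card does not say what a `None` value signifies, so the sketch simply keeps those document IDs in a separate list rather than interpreting them.

```python
from typing import List, Tuple


def split_candidates(example: dict) -> Tuple[List[Tuple[str, float]], List[str]]:
    """Split one example's `topk` mapping into scored and unscored doc IDs.

    Hypothetical helper: the dataset card only states that each value is
    either None or a score, so no meaning is assigned to None here.
    """
    scored: List[Tuple[str, float]] = []
    unscored: List[str] = []
    for doc_id, score in example["topk"].items():
        if score is None:
            unscored.append(doc_id)
        else:
            scored.append((doc_id, float(score)))
    # Assumption: a higher score means a better Stage 1 rank.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored, unscored


# Toy example shaped like the schema in the card (values are made up).
toy_example = {"qid": "q1", "topk": {"d1": 0.2, "d2": None, "d3": 0.9}}
scored, unscored = split_candidates(toy_example)
print(scored)
print(unscored)
```

With the real data, the same function could be applied to `dataset['test'][0]` after loading. Whether unscored IDs should be passed to the re-ranker depends on how the `None` values were produced, which is worth checking against the paper.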

Provider:

NeuralPGRank
