
NeuralPGRank/fiqa-hard-negatives

Source: Hugging Face. Updated 2024-11-23; indexed 2025-11-29.
Download link:
https://hf-mirror.com/datasets/NeuralPGRank/fiqa-hard-negatives
Resource description:
---
license: cc-by-sa-4.0
language:
- en
---

# Dataset Card

## Dataset Details

This dataset contains a set of candidate documents for second-stage re-ranking on fiqa (test split in [BEIR](https://huggingface.co/BeIR)). The candidate documents consist of hard negatives mined with [gtr-t5-xl](https://huggingface.co/sentence-transformers/gtr-t5-xl) as the Stage 1 ranker, together with ground-truth documents that are known to be relevant to the query. This is a release from our paper [Policy-Gradient Training of Language Models for Ranking](https://gao-g.github.io/), so please cite it if you use this dataset.

## Direct Use

You can load the dataset with:

```python
from datasets import load_dataset

dataset = load_dataset("NeuralPGRank/fiqa-hard-negatives")
```

Each example is a dictionary:

```python
>>> dataset['test'][0]
{
    "qid" : ...,    # query ID
    "topk" : {
        doc ID: ...,    # document ID as the key; None or a score as the value
        doc ID: ...,
        ...
    },
}
```

## Citation

```
@inproceedings{Gao2023PolicyGradientTO,
  title={Policy-Gradient Training of Language Models for Ranking},
  author={Ge Gao and Jonathan D. Chang and Claire Cardie and Kiant{\'e} Brantley and Thorsten Joachims},
  booktitle={Conference on Neural Information Processing Systems (Foundation Models for Decision Making Workshop)},
  year={2023},
  url={https://arxiv.org/pdf/2310.04407}
}
```

## Dataset Card Author and Contact

[Ge Gao](https://gao-g.github.io/)
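As a supplement to the example structure shown under Direct Use, here is a minimal sketch of turning one example into a candidate list for a second-stage re-ranker. The helper name `candidates_for_reranking`, the toy IDs, and the toy scores are hypothetical. The card does not say what a `None` value means or whether a higher score is better, so the ordering here (scored entries first, highest score first, then `None` entries) is an assumption.

```python
from typing import Dict, List, Optional, Tuple


def candidates_for_reranking(example: Dict) -> Tuple[str, List[str]]:
    """Return the query ID and its candidate document IDs.

    Assumption: entries with a score are ordered highest score first;
    entries whose value is None follow in their original order.
    """
    topk: Dict[str, Optional[float]] = example["topk"]
    scored = [(doc_id, s) for doc_id, s in topk.items() if s is not None]
    unscored = [doc_id for doc_id, s in topk.items() if s is None]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return example["qid"], [doc_id for doc_id, _ in scored] + unscored


# Toy example with made-up IDs and scores (not taken from the dataset).
toy_example = {"qid": "q1", "topk": {"d1": 0.2, "d2": None, "d3": 0.9}}
qid, doc_ids = candidates_for_reranking(toy_example)
print(qid, doc_ids)
```

With the real data, the same helper could be applied to `dataset['test'][0]` after loading, provided the loaded example has the `qid` and `topk` fields described in the card.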
Provider:
NeuralPGRank