PAQ
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/facebookresearch/paq
Resource overview:
PAQ is a dataset of 65 million automatically generated question-answer pairs, built to improve the performance of open-domain question answering models. The pairs are derived from Wikipedia and are grouped into subsets according to the version of the answer extractor and question generator used to produce them.



