HotpotQA

Opencsg, updated 2024-03-21, indexed 2024-06-22

Download link:

https://www.opencsg.com/datasets/OpenDataLab/HotpotQA

Resource overview:

HotpotQA is a question answering dataset built on English Wikipedia. It contains approximately 113K crowdsourced questions, each constructed so that answering it requires the introductory paragraphs of two Wikipedia articles. Every question comes with its two gold paragraphs and a list of sentences from those paragraphs that crowdworkers identified as the supporting facts needed to answer it. HotpotQA covers a range of reasoning strategies, including questions involving an entity that is missing from the question, intersection questions ("What satisfies both property A and property B?"), and comparison questions, in which two entities are compared by a common attribute.

In the few-document distractor setting, the QA model is given 10 paragraphs that are guaranteed to contain the gold paragraphs; in the open-domain fullwiki setting, the model is given only the question and the whole of Wikipedia. Models are evaluated on answer accuracy and on explainability. Answer accuracy is the overlap between the predicted and gold answers, measured by exact match (EM) and unigram F1. Explainability is how well the predicted supporting-fact sentences match the human-annotated ones (Supporting Fact EM/F1). The dataset also reports a joint metric that encourages systems to perform well on both tasks.

Source: Answering Complex Open-Domain Questions via Iterative Query Generation
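The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the official evaluation script: the function names are hypothetical, the SQuAD-style normalization (lowercasing, stripping punctuation and English articles) is an assumption, and representing supporting facts as `(title, sentence_index)` pairs is likewise an assumption. The joint metric is shown as one plausible formulation, multiplying the answer and supporting-fact precision and recall before taking F1.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def prf(num_correct, num_predicted, num_gold):
    """Precision, recall and F1 from raw counts (all zero if nothing matches)."""
    if num_correct == 0:
        return 0.0, 0.0, 0.0
    precision = num_correct / num_predicted
    recall = num_correct / num_gold
    return precision, recall, 2 * precision * recall / (precision + recall)


def answer_scores(prediction, gold):
    """Return (EM, precision, recall, F1) over normalized unigrams."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    em = float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    return (em, *prf(overlap, len(pred_tokens), len(gold_tokens)))


def support_scores(predicted, gold):
    """Return (EM, precision, recall, F1) over sets of (title, sentence_index)."""
    pred_set, gold_set = set(predicted), set(gold)
    em = float(pred_set == gold_set)
    return (em, *prf(len(pred_set & gold_set), len(pred_set), len(gold_set)))


def joint_scores(answer, support):
    """Assumed joint metric: multiply EMs, and multiply P and R before F1."""
    a_em, a_p, a_r, _ = answer
    s_em, s_p, s_r, _ = support
    p, r = a_p * s_p, a_r * s_r
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return a_em * s_em, f1


# Made-up example: the answer is right, but one extra sentence is predicted.
ans = answer_scores("The Eiffel Tower", "Eiffel Tower")
sup = support_scores([("A", 0), ("B", 1), ("B", 2)], [("A", 0), ("B", 1)])
joint_em, joint_f1 = joint_scores(ans, sup)
print(ans, sup, joint_em, joint_f1)
```

Under this formulation a system gets a joint EM of 1 only when both the answer and the full set of supporting facts are exactly right, which is what pushes systems to do well on both tasks at once.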

Created:

2024-03-21

Collected summary

Dataset introduction

Background and challenges

Background overview

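HotpotQA is a question answering dataset built on English Wikipedia, containing about 113K questions that require multi-hop reasoning. Each question comes with two gold paragraphs and supporting-fact sentences. The dataset covers several reasoning strategies and provides evaluation metrics for both answer accuracy and explainability.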

The content above was collected and summarized by 遇见数据集.
