aialt/RetrievalQA
Source: Hugging Face. Updated 2024-05-28; indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/aialt/RetrievalQA
Resource description:
---
license: mit
task_categories:
- question-answering
language:
- en
size_categories:
- 1K<n<10K
viewer: false
---
## Dataset Summary
**RetrievalQA** is a short-form open-domain question answering (QA) dataset comprising 2,785 questions covering new world and long-tail knowledge. It contains 1,271 questions needing external knowledge retrieval and 1,514 questions that most LLMs can answer with internal parametric knowledge.
RetrievalQA enables us to evaluate the effectiveness of **adaptive retrieval-augmented generation (RAG)** approaches, an aspect predominantly overlooked in prior studies and recent RAG evaluation systems, which focus only on task performance, the relevance of the retrieved context, or the faithfulness of answers.
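As a rough illustration of what evaluating adaptive retrieval can mean, the sketch below scores a system's retrieve/skip decisions against the `param_knowledge_answerable` flag described under Dataset Structure. The function name `retrieval_decision_accuracy`, the `decisions` mapping, and the toy records (`q1`, `q2`) are hypothetical and exist only for illustration; this is one simple way to quantify the idea, not necessarily the metric used in the paper.

```python
def retrieval_decision_accuracy(records, decisions):
    """Fraction of questions whose retrieve/skip decision matches the label.

    records:   list of dicts with `question_id` and `param_knowledge_answerable`
    decisions: dict mapping question_id -> bool (True = the system retrieved)
    """
    if not records:
        raise ValueError("records must not be empty")
    correct = 0
    for record in records:
        # A flag of 0 means the question needs external retrieval.
        should_retrieve = record["param_knowledge_answerable"] == 0
        if decisions[record["question_id"]] == should_retrieve:
            correct += 1
    return correct / len(records)


# Toy records with made-up IDs (not taken from the dataset).
toy_records = [
    {"question_id": "q1", "param_knowledge_answerable": 0},
    {"question_id": "q2", "param_knowledge_answerable": 1},
]
# A system that always retrieves gets only the first decision right.
always_retrieve = {"q1": True, "q2": True}
print(retrieval_decision_accuracy(toy_records, always_retrieve))  # 0.5
```

A system that always retrieves or never retrieves is penalized on one of the two subsets, which is why both kinds of question are needed to assess adaptive behavior.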
## Dataset Sources
- **Repository:** https://github.com/hyintell/RetrievalQA
- **Paper:** https://arxiv.org/abs/2402.16457
## Dataset Structure
Here is an example of a data instance:
```json
{
  "data_source": "realtimeqa",
  "question_id": "realtimeqa_20231013_1",
  "question": "What percentage of couples are 'sleep divorced', according to new research?",
  "ground_truth": ["15%"],
  "context": [
    {
      "title": "Do We Sleep Longer When We Share a Bed?",
      "text": "1.4% of respondents have started a sleep divorce, or sleeping separately from their partner, and maintained it in the past year. Adults who have ..."
    }, ...
  ],
  "param_knowledge_answerable": 0
}
```
where:
- `data_source`: the dataset the question originates from
- `question_id`: an identifier for the question
- `question`: the question
- `ground_truth`: a list of possible answers
- `context`: a list of dictionaries of retrieved relevant evidence. Note that the `title` of the document might be empty.
- `param_knowledge_answerable`: 0 indicates the question needs external retrieval; 1 indicates the question can be answered using an LLM's parametric knowledge
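The fields above can be handled with plain Python dictionaries. The following is a minimal sketch, assuming each instance has already been loaded as a `dict` in the format shown; the helper names `split_by_retrieval_need` and `format_context` and the `max_docs` parameter are hypothetical, not part of the dataset or its repository.

```python
# One instance in the documented format, copied from the example above.
example = {
    "data_source": "realtimeqa",
    "question_id": "realtimeqa_20231013_1",
    "question": "What percentage of couples are 'sleep divorced', according to new research?",
    "ground_truth": ["15%"],
    "context": [
        {
            "title": "Do We Sleep Longer When We Share a Bed?",
            "text": "1.4% of respondents have started a sleep divorce, or sleeping "
                    "separately from their partner, and maintained it in the past "
                    "year. Adults who have ...",
        }
    ],
    "param_knowledge_answerable": 0,
}


def split_by_retrieval_need(records):
    """Return (needs_retrieval, answerable) lists based on the flag."""
    needs_retrieval = [r for r in records if r["param_knowledge_answerable"] == 0]
    answerable = [r for r in records if r["param_knowledge_answerable"] == 1]
    return needs_retrieval, answerable


def format_context(record, max_docs=3):
    """Join up to `max_docs` evidence passages; a title may be empty."""
    parts = []
    for doc in record["context"][:max_docs]:
        title = doc.get("title") or ""
        text = doc["text"]
        # Only prepend the title when the document actually has one.
        parts.append(f"{title}\n{text}" if title else text)
    return "\n\n".join(parts)


needs_retrieval, answerable = split_by_retrieval_need([example])
print(len(needs_retrieval), len(answerable))  # 1 0
print(format_context(example).splitlines()[0])  # Do We Sleep Longer When We Share a Bed?
```

Handling the empty-title case explicitly avoids a stray blank line at the start of a passage when it is placed into a prompt.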
## Citation
```bibtex
@misc{zhang2024retrievalqa,
  title={RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering},
  author={Zihan Zhang and Meng Fang and Ling Chen},
  year={2024},
  eprint={2402.16457},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
Provider:
aialt



