RUC-NLPIR/FlashRAG_datasets
Source: Hugging Face | Updated: 2025-05-06 | Indexed: 2024-07-22
Download link:
https://hf-mirror.com/datasets/RUC-NLPIR/FlashRAG_datasets
Resource overview:
FlashRAG is a Python toolkit for Retrieval-Augmented Generation (RAG) research that provides 32 pre-processed benchmark RAG datasets and 13 state-of-the-art RAG algorithms. The datasets cover task categories such as question answering, summarization, and text generation, and all of them have been pre-processed into a single unified format for ease of use. Each dataset includes training, development, and test sets; the size of each split is listed in the README. The README also gives the path and format of the document corpus used for retrieval.
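As a minimal sketch of what working with a unified record format can look like, the snippet below parses question-answering records from JSON Lines text. The field names (`id`, `question`, `golden_answers`), the helper `iter_records`, and the two sample records are assumptions made for illustration and are not taken from this page; the README is the authority on the actual schema.

```python
import json
from typing import Iterator

# Hypothetical sample: two records in an ASSUMED JSON Lines layout.
# The real field names and file layout are documented in the README.
SAMPLE_JSONL = """\
{"id": "train_0", "question": "who wrote hamlet", "golden_answers": ["William Shakespeare"]}
{"id": "train_1", "question": "capital of france", "golden_answers": ["Paris"]}
"""


def iter_records(jsonl_text: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of JSON Lines text."""
    for line in jsonl_text.splitlines():
        line = line.strip()
        if line:
            yield json.loads(line)


records = list(iter_records(SAMPLE_JSONL))
for record in records:
    # Every record exposes the same keys, so downstream code needs no
    # per-dataset special cases.
    print(record["id"], "|", record["question"], "|", record["golden_answers"])
```

To read a real split, the same function could be applied to the contents of a downloaded file. Alternatively, the Hugging Face `datasets` library's `load_dataset` function accepts a repository ID such as `RUC-NLPIR/FlashRAG_datasets`; the subset names to pass alongside it are not listed on this page.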
Provided by:
RUC-NLPIR
Original information summary
FlashRAG Datasets
Basic information
- License: CC BY-SA 4.0
- Task categories:
  - Question answering
  - Summarization
  - Text-to-text generation
- Language: English
- Dataset size: 1M < n < 10M
Dataset name
- Name: FlashRAG Datasets



