yue1045/rag-mini-wikipedia
Source: Hugging Face. Updated: 2026-04-28. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/yue1045/rag-mini-wikipedia
Description:
This is a small open-domain question-answering dataset based on Wikipedia, suitable for question-answering and sentence-similarity tasks. It has two configurations: a text corpus (text-corpus) and question-answer pairs (question-answer). Together they support applications such as information retrieval, RAG (Retrieval-Augmented Generation), and DPR (Dense Passage Retrieval). The dataset is derived from the Kaggle questionanswer-dataset; the subset was generated with generate.py and is small (fewer than 1K samples).
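As a rough illustration of how the two configurations fit together in a retrieval setup, here is a minimal sketch. The rows are made-up stand-ins, not data from the dataset, and the field names (`id`, `passage`, `question`, `answer`) are assumptions, since the listing does not give the schema. The scorer is plain token overlap, swapped in for a real dense retriever such as DPR to keep the example self-contained.

```python
import re
from collections import Counter

# Toy stand-ins for the two configurations (made-up rows, NOT from the dataset).
# Field names are assumptions; the listing does not state the real schema.
text_corpus = [
    {"id": 0, "passage": "Abraham Lincoln was the 16th President of the United States."},
    {"id": 1, "passage": "The kangaroo is a marsupial native to Australia."},
    {"id": 2, "passage": "Copenhagen is the capital of Denmark."},
]
question_answer = [
    {"question": "Which animal is a marsupial native to Australia?", "answer": "kangaroo"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, corpus, k=1):
    """Return the ids of the k passages sharing the most tokens with the question."""
    q_tokens = Counter(tokenize(question))
    scored = []
    for row in corpus:
        p_tokens = Counter(tokenize(row["passage"]))
        # Multiset intersection counts tokens shared by question and passage.
        overlap = sum((q_tokens & p_tokens).values())
        scored.append((overlap, row["id"]))
    # Highest overlap first; ties broken by lower id for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [pid for _, pid in scored[:k]]


def answer_in_top_passage(qa, corpus):
    """Check whether the gold answer string appears in the top retrieved passage."""
    top_id = retrieve(qa["question"], corpus, k=1)[0]
    passage = next(row["passage"] for row in corpus if row["id"] == top_id)
    return qa["answer"].lower() in passage.lower()
```

In a real pipeline, the text-corpus rows would be indexed once and each question from question-answer would be run against that index. The retrieved passage would then be passed to a generator, or checked against the gold answer as `answer_in_top_passage` does.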
Provider:
yue1045



