
Graph-COM/GraphKV

Hugging Face | Updated 2025-06-30 | Indexed 2026-01-03
Download link:
https://hf-mirror.com/datasets/Graph-COM/GraphKV
Resource description:
---
license: apache-2.0
---

# Datahub for Graph-KV

This directory contains processed datasets for retrieval-augmented generation (RAG) and Arxiv-QA tasks, used in the paper [Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models](http://www.arxiv.org/pdf/2506.07334).

It is organized into three main folders: `rag`, `arxiv`, and `results`.

---

## 📁 `rag/`

This folder includes preprocessed data for several commonly used RAG datasets. Each subdirectory corresponds to a different dataset split or benchmark:

- `2wiki_dev`: Processed subset from the [2Wiki](https://arxiv.org/abs/2011.01060) dataset (dev split)
- `hqa_eval`: [HotpotQA](https://arxiv.org/abs/1809.09600) evaluation set
- `morehopqa`: [Morehop-QA](https://arxiv.org/abs/2406.13397) dataset
- `multihopqa`: [Multihop-RAG](https://arxiv.org/abs/2401.15391) QA benchmark
- `nq_eval`: [Narrative-QA](https://arxiv.org/abs/1712.07040) evaluation set
- `tqa_eval`: [TriviaQA](https://arxiv.org/abs/1705.03551) evaluation set

All files are preprocessed and ready for use in RAG-based models.

---

## 📁 `arxiv/`

This folder contains the processed version of the proposed ArxivQA dataset. It is organized into the following subdirectories:

- `questions/`: Contains the questions used for QA tasks.
- `answers/`: Contains the corresponding answers.
- `text_data/`: Raw text extracted from the ArXiv PDFs.
- `pdfs/`: Original PDF files.

For each QA sample:

- The `0` folder is the central (main) paper.
- The `1` folder is the reference paper cited by the central paper.

---

## 📁 `results`

This folder stores the computed results of the RAG task and the Arxiv-QA task.

---
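The folder layout described above can be checked against a local copy with a short script. This is a minimal sketch, not part of the dataset's tooling: the helper names `expected_dirs` and `missing_dirs` are hypothetical, and it assumes the local copy mirrors the repository layout exactly as the README lists it.

```python
from pathlib import Path

# Folder names copied from the README; anything else here is an assumption.
RAG_SUBSETS = ["2wiki_dev", "hqa_eval", "morehopqa", "multihopqa", "nq_eval", "tqa_eval"]
ARXIV_SUBDIRS = ["questions", "answers", "text_data", "pdfs"]


def expected_dirs(root):
    """Return the directories the README says a local copy should contain."""
    root = Path(root)
    dirs = [root / "rag" / name for name in RAG_SUBSETS]
    dirs += [root / "arxiv" / name for name in ARXIV_SUBDIRS]
    dirs.append(root / "results")
    return dirs


def missing_dirs(root):
    """Return the expected directories (relative to root) that are not present."""
    root = Path(root)
    return [p.relative_to(root).as_posix() for p in expected_dirs(root) if not p.is_dir()]


if __name__ == "__main__":
    # "GraphKV" is a placeholder path for wherever the dataset was downloaded.
    for name in missing_dirs("GraphKV"):
        print("missing:", name)
```

A local copy could be obtained first with, for example, `huggingface_hub.snapshot_download(repo_id="Graph-COM/GraphKV", repo_type="dataset")`; the check itself only inspects directory names and does not read any file contents.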
Provider:
Graph-COM