
CRaQAn (Coreference Resolution in Question-Answering)

Source: arXiv. Updated 2023-11-28. Indexed 2024-06-21.
Download link:
https://huggingface.co/datasets/Edge-Pyxos/CRaQAn_v1
Description:
CRaQAn (Coreference Resolution in Question-Answering) was developed jointly by Edge Analytics and Pyxos to target coreference resolution in question answering (QA). It contains 261 human-reviewed QA samples, each of which requires resolving a coreference across sentences. The source passages come mainly from Wikipedia articles on modern United States law, chosen for their complex coreference relations. The samples were generated automatically with GPT-4 using a Recursive Criticism and Improvement (RCI) loop. The dataset is intended mainly for testing and evaluating information retrieval strategies in QA systems, particularly how well they handle coreference resolution over long documents.
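
The description mentions a Recursive Criticism and Improvement (RCI) loop without spelling out its shape. The following is a minimal sketch of the general pattern only, not the CRaQAn authors' pipeline: the function name `rci_loop`, its three callables, and the toy stand-ins are all hypothetical, and in a real setup each callable would wrap a language-model call (GPT-4 in the authors' case) rather than string rules.

```python
from typing import Callable, Optional, Tuple


def rci_loop(
    passage: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], Optional[str]],
    improve: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Generate a draft, then alternate critique and improvement.

    Returns the final draft and the number of improvement rounds applied.
    All three callables are hypothetical placeholders for model calls.
    """
    draft = generate(passage)
    for rounds_used in range(max_rounds):
        feedback = critique(passage, draft)
        if feedback is None:  # the critic has no objections, so stop early
            return draft, rounds_used
        draft = improve(passage, draft, feedback)
    return draft, max_rounds


# Toy stand-ins (assumptions for illustration, no model involved).
def toy_generate(passage: str) -> str:
    return "Who signed it?"


def toy_critique(passage: str, draft: str) -> Optional[str]:
    # Flag a question that still contains an unresolved pronoun.
    return "Unresolved pronoun in question." if " it" in draft else None


def toy_improve(passage: str, draft: str, feedback: str) -> str:
    return draft.replace(" it", " the Act")


final_question, rounds = rci_loop(
    "The Act was passed in 1990. The President signed it.",
    toy_generate,
    toy_critique,
    toy_improve,
)
```

The `max_rounds` cap is a design choice in this sketch: it bounds cost when the critic never approves a draft, at the price of sometimes returning a draft that still has open feedback.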
Provider:
Edge Analytics and Pyxos
Created:
2023-11-28