
COMPLEXTEMPQA

Source: arXiv | Updated: 2024-06-07 | Indexed: 2024-06-21
Download link:
https://github.com/DataScienceUIBK/ComplexTempQA
Resource overview:
COMPLEXTEMPQA is a large-scale temporal question answering dataset created at the University of Innsbruck, containing over 100 million question-answer pairs. Built from Wikipedia and Wikidata, it covers a wide range of topics spanning two decades and introduces a taxonomy that sorts questions into three types: attribute, comparison, and counting. Its questions are highly complex, requiring reasoning skills such as cross-temporal comparison, temporal aggregation, and multi-hop reasoning. Each question also carries detailed metadata, including the specific time range it covers, which supports comprehensive evaluation and improvement of the temporal reasoning capabilities of large language models (LLMs). COMPLEXTEMPQA serves both as a testbed for developing complex AI models and as a foundation for research in question answering, information retrieval, and language understanding.
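
As a rough illustration of how the three-way taxonomy and the time-range metadata might be used together when evaluating a model, the sketch below filters questions by type and by overlap with a time window. The record layout (`question`, `answer`, `qtype`, `start_year`, `end_year`), the `TempQuestion` class, the `select` helper, and the placeholder rows are all hypothetical and are not the dataset's actual schema; the repository linked above should be consulted for the real field names.

```python
from dataclasses import dataclass
from enum import Enum


class QType(Enum):
    """The three question categories named in the description."""
    ATTRIBUTE = "attribute"
    COMPARISON = "comparison"
    COUNTING = "counting"


@dataclass(frozen=True)
class TempQuestion:
    """Hypothetical record layout; the real dataset fields may differ."""
    question: str
    answer: str
    qtype: QType
    start_year: int  # first year covered by the question (assumed field)
    end_year: int    # last year covered by the question (assumed field)


def select(questions, qtype, window_start, window_end):
    """Return questions of the given type whose time range overlaps the window."""
    return [
        q for q in questions
        if q.qtype is qtype
        and q.start_year <= window_end
        and q.end_year >= window_start
    ]


# Placeholder rows for illustration only; not taken from the dataset.
SAMPLE = [
    TempQuestion("placeholder question A", "placeholder answer A", QType.ATTRIBUTE, 1990, 1995),
    TempQuestion("placeholder question B", "placeholder answer B", QType.COUNTING, 2000, 2010),
    TempQuestion("placeholder question C", "placeholder answer C", QType.COUNTING, 2015, 2020),
]

counting_2005_2016 = select(SAMPLE, QType.COUNTING, 2005, 2016)
```

Slicing an evaluation set this way would let accuracy be reported per question type and per period, which is the kind of fine-grained temporal analysis the metadata is described as enabling.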
Provided by:
University of Innsbruck
Created:
2024-06-07