COMPLEXTEMPQA
Source: arXiv | Updated: 2024-06-07 | Indexed: 2024-06-21
Download link:
https://github.com/DataScienceUIBK/ComplexTempQA
Resource description:
COMPLEXTEMPQA is a large-scale temporal question answering dataset created at the University of Innsbruck, containing over 100 million question-answer pairs. Built from Wikipedia and Wikidata, it covers a wide range of topics spanning two decades and introduces a unique taxonomy that sorts questions into three types: attribute, comparison, and counting. Its questions are highly complex and require advanced reasoning skills such as comparison across time periods, temporal aggregation, and multi-hop reasoning. Each question carries detailed metadata, including the specific time range it refers to, which supports comprehensive evaluation and improvement of the temporal reasoning capabilities of large language models (LLMs). COMPLEXTEMPQA serves both as a testbed for developing complex AI models and as a foundational resource for research in question answering, information retrieval, and language understanding.
Provider:
University of Innsbruck
Created:
2024-06-07



