MINTAKA
Source: arXiv | Updated: 2022-10-04 | Indexed: 2024-06-21
Download link:
https://github.com/amazon-research/mintaka
Overview:
MINTAKA is a large-scale, complex, natural, multilingual question answering (QA) dataset created by the Amazon Alexa AI team. It comprises 20,000 English question-answer pairs, each translated into 8 other languages (including Arabic, French, and German), for a total of 180,000 samples (20,000 pairs in each of 9 languages). The dataset covers 8 complex question types, such as superlative, intersection, and multi-hop questions, all written naturally by crowdworkers. MINTAKA is linked to a knowledge graph: crowdworkers annotated the question and answer text with Wikidata entities. The dataset is intended to address the weaknesses of existing QA models on complex questions, particularly in multilingual settings.
Provided by:
Amazon Alexa AI
Created:
2022-10-04



