ArabicaQA
Source: arXiv | Updated: 2024-03-27 | Indexed: 2024-06-21
Download link:
https://github.com/DataScienceUIBK/ArabicaQA
Description:
ArabicaQA, created by the University of Innsbruck, is the first large-scale dataset for Arabic machine reading comprehension and open-domain question answering. It contains 89,095 answerable questions and 3,701 unanswerable questions, written by crowdworkers to simulate real-world information-seeking scenarios. The dataset was built through a rigorous annotation process to ensure the quality of both the questions and their answers. ArabicaQA is intended to address resource and research gaps in Arabic natural language processing, particularly in machine reading comprehension and open-domain question answering.
Provider:
University of Innsbruck
Created:
2024-03-27



