PQuAD
Source: arXiv · Updated: 2022-02-13 · Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2202.06219v1
Description:
PQuAD is a Persian reading comprehension dataset created at Amirkabir University of Technology. It is built from Persian Wikipedia articles and contains 80,000 manually annotated questions with their answers. The dataset covers a wide range of topics and is intended as a benchmark for Persian machine reading comprehension research. Quality was ensured through careful selection of Wikipedia articles and a multi-stage manual annotation process. Its applications include Persian reading comprehension and the development of question answering systems; it aims to address the scarcity of Persian language resources and to advance Persian natural language processing (NLP).
Provided by:
Amirkabir University of Technology (Tehran Polytechnic)
Created:
2022-02-13



