SberQuAD
Source: arXiv | Updated: 2020-05-03 | Indexed: 2024-06-21
Download link:
https://github.com/sberbank-ai/data-science-journey-2017
Description:
SberQuAD is a large-scale Russian reading comprehension dataset analogous to the English Stanford SQuAD. Created by Sberbank, it contains 50,364 training instances for machine reading comprehension. The dataset was built through crowdsourcing: paragraphs were extracted from Wikipedia pages, and crowd workers wrote the questions and their answers. SberQuAD is used mainly for research on Russian reading comprehension and question answering systems, and aims to support natural language understanding for Russian.
Provided by:
Saint Petersburg State University
Created:
2019-12-20



