Chinese-Mongolian bilingual legal field question and answer corpus dataset

Name: Chinese-Mongolian bilingual legal field question and answer corpus dataset
Creator: Science Data Bank
Published: 2025-04-27 22:13:48
License: No description available

DataCite Commons | Updated: 2025-04-27 | Indexed: 2025-04-16

Download link:

https://www.scidb.cn/detail?dataSetId=35804ce8c72247f6ba167f1463a54550

Description:

With the development of large-model technology, intelligent question answering is increasingly used in work and daily life. However, because data resources are limited, intelligent question-answering systems for low-resource languages such as Mongolian cannot yet meet users' application needs. Starting from an existing Chinese question-and-answer (Q&A) corpus, this study constructs 50,000 Chinese-Mongolian bilingual Q&A pairs with corresponding classification labels through three steps: rule-based screening, Chinese-to-Mongolian translation, and manual correction. The dataset provides researchers with Q&A samples for training and evaluating intelligent question-answering systems, as well as for tasks such as machine translation and text classification. Manual evaluation verifies that 92% of the corpus conforms to Chinese-Mongolian bilingual Q&A in the legal field. The dataset is therefore valuable for promoting research on Chinese-Mongolian multilingual intelligent question answering.
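
The description names three downstream uses (question answering, machine translation, text classification) but does not document the file layout. The sketch below shows one way a single bilingual pair could be modelled and reshaped for each task. The class `QAPair`, its field names, and the helper functions are hypothetical and are not taken from the dataset; the sample values are placeholders, not real corpus text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class QAPair:
    """One bilingual record. All field names are assumptions for illustration."""
    question_zh: str
    answer_zh: str
    question_mn: str
    answer_mn: str
    label: str  # classification label attached to the pair


def to_qa_example(pair: QAPair, lang: str) -> Tuple[str, str]:
    """Return (question, answer) in one language: 'zh' or 'mn'."""
    if lang == "zh":
        return pair.question_zh, pair.answer_zh
    if lang == "mn":
        return pair.question_mn, pair.answer_mn
    raise ValueError(f"unsupported language: {lang!r}")


def to_translation_pairs(pair: QAPair) -> List[Tuple[str, str]]:
    """Return (Chinese, Mongolian) sentence pairs for machine translation."""
    return [
        (pair.question_zh, pair.question_mn),
        (pair.answer_zh, pair.answer_mn),
    ]


def to_classification_example(pair: QAPair, lang: str = "zh") -> Tuple[str, str]:
    """Return (question text, label) for text classification."""
    question, _ = to_qa_example(pair, lang)
    return question, pair.label


# Placeholder record; the strings stand in for real corpus text.
sample = QAPair(
    question_zh="<Chinese question>",
    answer_zh="<Chinese answer>",
    question_mn="<Mongolian question>",
    answer_mn="<Mongolian answer>",
    label="<label>",
)
```

Keeping both languages and the label in one record means the three task views are derived from the same source row, so the question, its translation, and its label cannot drift apart.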

Provider:

Science Data Bank

Created:

2024-05-21