
Chinese-Mongolian bilingual legal field question and answer corpus dataset

Source: DataCite Commons. Updated: 2025-04-27. Indexed: 2025-04-16.
Download link:
https://www.scidb.cn/detail?dataSetId=35804ce8c72247f6ba167f1463a54550
Description:
With the development of large-model technology, intelligent question answering is increasingly used in work and daily life. However, because data resources are limited, intelligent question-answering systems for low-resource languages such as Mongolian cannot yet meet users' application needs. Starting from an existing Chinese question-and-answer (Q&A) corpus, this study constructs 50,000 Chinese-Mongolian bilingual Q&A pairs, with corresponding classification labels, through three steps: rule-based screening, Chinese-Mongolian translation, and manual correction. The dataset provides researchers with rich, accurate question-answering samples for training and evaluating intelligent question-answering systems, as well as for tasks such as machine translation and text classification. Manual evaluation verified that 92% of the corpus conforms to Chinese-Mongolian bilingual Q&A in the legal field. The dataset is therefore valuable for advancing research on Chinese-Mongolian multilingual intelligent question answering.
Provider:
Science Data Bank
Created:
2024-05-21