KorNLI and KorSTS
Source: arXiv | Updated: 2020-10-05 | Indexed: 2024-06-21
Download link:
https://github.com/kakaobrain/KorNLUDatasets
Description:
KorNLI and KorSTS are benchmark datasets for Korean natural language understanding (NLU). KorNLI is derived from SNLI, MNLI, and XNLI and contains 942,854 machine-translated training examples and 7,500 human-translated evaluation examples, 950,354 in total. KorSTS is derived from STS-B and contains 5,749 machine-translated training examples and 2,879 human-translated evaluation examples. The training sets were translated automatically; the evaluation sets were machine-translated and then post-edited by human translators to ensure translation quality. The datasets are intended for evaluating Korean natural language inference (NLI) and semantic textual similarity (STS) models and for supporting research on Korean NLU.
Provider:
Kakao Brain
Created:
2020-04-07
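
As a rough illustration of how sentence-pair data such as KorNLI might be read, here is a minimal sketch. It assumes a tab-separated file with a header row and columns named `sentence1`, `sentence2`, and `gold_label`; those column names, the `read_pairs` helper, and the two sample rows are assumptions made for illustration, so check them against the files in the repository before relying on them.

```python
import csv
import io
from collections import Counter


def read_pairs(handle, label_field="gold_label"):
    """Yield (sentence1, sentence2, label) tuples from a tab-separated file.

    The column names are assumptions for this sketch, not a confirmed schema.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row["sentence1"], row["sentence2"], row[label_field]


# Two made-up rows standing in for a real file; they are not dataset content.
sample = io.StringIO(
    "sentence1\tsentence2\tgold_label\n"
    "남자가 기타를 치고 있다.\t남자가 악기를 연주하고 있다.\tentailment\n"
    "아이가 공원에서 뛰고 있다.\t아이가 집에서 자고 있다.\tcontradiction\n"
)

pairs = list(read_pairs(sample))
label_counts = Counter(label for _, _, label in pairs)
print(len(pairs), dict(label_counts))
```

To read a real file, pass an open file handle instead of `sample`, for example `open(path, encoding="utf-8", newline="")`, where `path` points to a file downloaded from the repository above.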



