CMNLI

OpenDataLab | Updated: 2026-04-12 | Indexed: 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/CMNLI

Resource description:

The CMNLI dataset is built from two sources: XNLI and MNLI. The data covers several genres, including fiction, telephone conversations, travel content, government documents, and Slate magazine articles. The original MNLI and XNLI data were translated from English into Chinese, and the original training set is retained. The CMNLI development set is formed by merging the XNLI dev set with the MNLI matched set. The CMNLI test set is formed by merging the XNLI test set with the MNLI mismatched set, and the sample order is shuffled. The dataset is used to classify the relationship between two given sentences into one of three categories: entailment, neutral, or contradiction.
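The split construction described above (merge two source sets, then shuffle) can be sketched as follows. This is a minimal illustration, not the pipeline actually used to build CMNLI: the helper name `merge_and_shuffle`, the field names `sentence1`, `sentence2`, and `label`, the fixed seed, and the toy records are all assumptions made for the example.

```python
import random

# The three relation labels named in the dataset description.
LABELS = ("entailment", "neutral", "contradiction")


def merge_and_shuffle(first, second, seed=0):
    """Concatenate two lists of examples and shuffle them reproducibly.

    Hypothetical helper: the real CMNLI build script and its seed are not
    documented on this page.
    """
    merged = list(first) + list(second)
    random.Random(seed).shuffle(merged)  # local RNG, does not touch global state
    return merged


# Made-up toy records standing in for the XNLI dev set and the MNLI matched set.
xnli_dev = [
    {"sentence1": "A man is playing a guitar.", "sentence2": "A man is making music.", "label": "entailment"},
    {"sentence1": "A man is playing a guitar.", "sentence2": "The man is asleep.", "label": "contradiction"},
]
mnli_matched = [
    {"sentence1": "The hotel is near the beach.", "sentence2": "The hotel has a pool.", "label": "neutral"},
]

cmnli_dev = merge_and_shuffle(xnli_dev, mnli_matched, seed=0)

# Every record should carry one of the three labels.
assert all(example["label"] in LABELS for example in cmnli_dev)
```

Using a dedicated `random.Random(seed)` instance keeps the shuffle reproducible across runs, which matters if the merged split needs to be regenerated identically.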

Provider:

OpenDataLab

Created:

2023-09-04

Collection summary

Dataset introduction