KETI-AIR/kor_snli

Name: KETI-AIR/kor_snli
Creator: KETI-AIR
Published: 2023-11-15 01:12:23
License: CC BY 4.0 (per the dataset card; the listing's own license field is empty)

Source: Hugging Face. Updated 2023-11-15; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/KETI-AIR/kor_snli

Description:

---
language:
- ko
license: cc-by-4.0
size_categories:
- 100K<n<1M
task_categories:
- text-classification
task_ids:
- natural-language-inference
- multi-input-text-classification
dataset_info:
  features:
  - name: premise
    dtype: string
  - name: hypothesis
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': entailment
          '1': neutral
          '2': contradiction
  - name: data_index_by_user
    dtype: int32
  splits:
  - name: train
    num_bytes: 85943643
    num_examples: 550152
  - name: validation
    num_bytes: 1631544
    num_examples: 10000
  - name: test
    num_bytes: 1638084
    num_examples: 10000
  download_size: 27268480
  dataset_size: 89213271
---

# Dataset Card for kor_snli

## Licensing Information

The data is distributed under the [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) license.

## Source Data Citation Information

```
@inproceedings{snli:emnlp2015,
  Author = {Bowman, Samuel R. and Angeli, Gabor and Potts, Christopher and Manning, Christopher D.},
  Booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  Publisher = {Association for Computational Linguistics},
  Title = {A large annotated corpus for learning natural language inference},
  Year = {2015}
}
```

Provider:

KETI-AIR

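Summary of the original information

Dataset overview

Basic information

Language: Korean
License: CC BY 4.0
Size category: 100K<n<1M

Task categories

Text classification
Natural language inference
Multi-input text classification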

Dataset structure

Features

premise: string
hypothesis: string
label: class label
- 0: entailment
- 1: neutral
- 2: contradiction
data_index_by_user: 32-bit integer (int32)
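
The feature schema above can be mirrored in plain Python when decoding integer labels. The following is a minimal sketch that relies only on the field names and label ids listed above. `NliExample`, `decode_label`, and `from_row` are hypothetical names introduced for illustration; they are not part of the dataset or of any library.

```python
from dataclasses import dataclass

# Label ids as declared in the dataset card's class_label names.
ID2LABEL = {0: "entailment", 1: "neutral", 2: "contradiction"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


@dataclass(frozen=True)
class NliExample:
    """Hypothetical container mirroring the four listed features."""
    premise: str
    hypothesis: str
    label: int
    data_index_by_user: int


def decode_label(label_id: int) -> str:
    """Map an integer label to its class name; raise on unknown ids."""
    try:
        return ID2LABEL[label_id]
    except KeyError:
        raise ValueError(f"unknown label id: {label_id}") from None


def from_row(row: dict) -> NliExample:
    """Build an NliExample from a dict keyed by the feature names."""
    return NliExample(
        premise=row["premise"],
        hypothesis=row["hypothesis"],
        label=int(row["label"]),
        data_index_by_user=int(row["data_index_by_user"]),
    )
```

Raising on an unknown id, rather than returning a default, is a design choice in this sketch: it surfaces rows whose label falls outside the three declared classes instead of silently relabeling them.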

Splits

Train
- Bytes: 85943643
- Examples: 550152
Validation
- Bytes: 1631544
- Examples: 10000
Test
- Bytes: 1638084
- Examples: 10000
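
To put the split figures in proportion, the sketch below computes each split's share of the examples and its average bytes per example. The numbers are copied from the listing above; `split_summary` is a hypothetical helper name, and treating the reported byte count as the stored size of all four fields is an assumption, since the listing does not say how the bytes are measured.

```python
# Example and byte counts per split, copied from the listing above.
SPLITS = {
    "train": {"num_bytes": 85943643, "num_examples": 550152},
    "validation": {"num_bytes": 1631544, "num_examples": 10000},
    "test": {"num_bytes": 1638084, "num_examples": 10000},
}


def split_summary(splits: dict) -> dict:
    """Return each split's share of all examples and its mean bytes per example."""
    total = sum(s["num_examples"] for s in splits.values())
    return {
        name: {
            "share": s["num_examples"] / total,
            "avg_bytes": s["num_bytes"] / s["num_examples"],
        }
        for name, s in splits.items()
    }
```

Calling `split_summary(SPLITS)` shows that the training split holds roughly 96.5% of the 570152 examples, with the validation and test splits sharing the remainder equally.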

Size

Download size: 27268480 bytes
Dataset size: 89213271 bytes
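
The two size figures can be cross-checked against the per-split byte counts. The following sketch does this arithmetic; `to_mib` and `size_report` are hypothetical helper names, and reading the download size as a compressed on-disk size is an assumption rather than something the listing states.

```python
# Byte counts copied from the listing above.
SPLIT_BYTES = {"train": 85943643, "validation": 1631544, "test": 1638084}
DOWNLOAD_SIZE = 27268480
DATASET_SIZE = 89213271


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 ** 2)


def size_report() -> dict:
    """Check that split sizes add up and relate download size to dataset size."""
    return {
        "splits_sum_matches": sum(SPLIT_BYTES.values()) == DATASET_SIZE,
        "download_mib": to_mib(DOWNLOAD_SIZE),
        "dataset_mib": to_mib(DATASET_SIZE),
        "download_to_dataset_ratio": DOWNLOAD_SIZE / DATASET_SIZE,
    }
```

The three split byte counts sum exactly to the reported dataset size, and the download is about 0.31 of that size.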
