KETI-AIR/kor_anli
Source: Hugging Face. Updated 2023-11-15. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/KETI-AIR/kor_anli
Resource description:
---
language:
- ko
license:
- cc-by-nc-4.0
size_categories:
- 100K<n<1M
task_categories:
- text-classification
task_ids:
- natural-language-inference
- multi-input-text-classification
paperswithcode_id: anli
pretty_name: Adversarial NLI
dataset_info:
  features:
  - name: data_index_by_user
    dtype: int32
  - name: premise
    dtype: string
  - name: hypothesis
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': entailment
          '1': neutral
          '2': contradiction
  - name: reason
    dtype: string
  splits:
  - name: train_r1
    num_bytes: 8505556
    num_examples: 16946
  - name: train_r2
    num_bytes: 22521662
    num_examples: 45460
  - name: train_r3
    num_bytes: 48605206
    num_examples: 100459
  - name: dev_r1
    num_bytes: 628891
    num_examples: 1000
  - name: dev_r2
    num_bytes: 613763
    num_examples: 1000
  - name: dev_r3
    num_bytes: 740840
    num_examples: 1200
  - name: test_r1
    num_bytes: 626555
    num_examples: 1000
  - name: test_r2
    num_bytes: 633241
    num_examples: 1000
  - name: test_r3
    num_bytes: 736887
    num_examples: 1200
  download_size: 23386318
  dataset_size: 83612601
---
# Dataset Card for anli
## Licensing Information
[CC BY-NC 4.0 (Attribution-NonCommercial)](https://github.com/facebookresearch/anli/blob/main/LICENSE)
## Source Data Citation Information
```
@InProceedings{nie2019adversarial,
title={Adversarial NLI: A New Benchmark for Natural Language Understanding},
author={Nie, Yixin
and Williams, Adina
and Dinan, Emily
and Bansal, Mohit
and Weston, Jason
and Kiela, Douwe},
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
year = "2020",
publisher = "Association for Computational Linguistics",
}
```
Provider:
KETI-AIR
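Summary of original information
Dataset overview
Language
- Korean (ko)
License
- CC BY-NC 4.0
Size category
- 100K<n<1M
Task categories
- Text classification (text-classification)
Task IDs
- Natural language inference (natural-language-inference)
- Multi-input text classification (multi-input-text-classification)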
Dataset information
Features
- data_index_by_user: data index, type int32
- premise: premise, type string
- hypothesis: hypothesis, type string
- label: label, type class_label, with the following classes:
  - 0: entailment
  - 1: neutral
  - 2: contradiction
- reason: reason, type string
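Because the label feature stores integer class ids, it can help to keep the id-to-name mapping next to the record layout. The following is a minimal sketch in plain Python that mirrors the feature list above. The names `LABEL_NAMES`, `AnliRecord`, and `label_name` are hypothetical helpers introduced here for illustration; they are not part of the dataset or of any library.

```python
from dataclasses import dataclass

# Class ids and names exactly as listed in the dataset card's label feature.
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


@dataclass
class AnliRecord:
    """Hypothetical container mirroring the five features listed in the card."""
    data_index_by_user: int  # int32 in the card
    premise: str
    hypothesis: str
    label: int               # class id, see LABEL_NAMES
    reason: str


def label_name(label_id: int) -> str:
    """Return the class name for an integer label id (KeyError if unknown)."""
    return LABEL_NAMES[label_id]


# Placeholder strings only; these are not real rows from the dataset.
example = AnliRecord(0, "<premise text>", "<hypothesis text>", 2, "<reason text>")
print(label_name(example.label))  # contradiction
```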
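Dataset splits
- train_r1: training set, round 1, 8505556 bytes, 16946 examples
- train_r2: training set, round 2, 22521662 bytes, 45460 examples
- train_r3: training set, round 3, 48605206 bytes, 100459 examples
- dev_r1: development set, round 1, 628891 bytes, 1000 examples
- dev_r2: development set, round 2, 613763 bytes, 1000 examples
- dev_r3: development set, round 3, 740840 bytes, 1200 examples
- test_r1: test set, round 1, 626555 bytes, 1000 examples
- test_r2: test set, round 2, 633241 bytes, 1000 examples
- test_r3: test set, round 3, 736887 bytes, 1200 examples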
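The nine split names follow a `<kind>_r<round>` pattern, so they can be generated rather than typed by hand. The sketch below builds the names and shows one way a split might be fetched. `split_name` and `load_split` are hypothetical helpers; the `load_split` call assumes the Hugging Face `datasets` package is installed and that the repository loads without an extra configuration name, neither of which the card confirms.

```python
KINDS = ("train", "dev", "test")
ROUNDS = (1, 2, 3)


def split_name(kind: str, round_no: int) -> str:
    """Build a split name such as 'train_r1' from its kind and round."""
    if kind not in KINDS or round_no not in ROUNDS:
        raise ValueError(f"unknown split: kind={kind!r}, round={round_no!r}")
    return f"{kind}_r{round_no}"


def load_split(kind: str, round_no: int):
    """Fetch one split from the Hub (needs network access).

    Assumption: `datasets.load_dataset` accepts this repository id with
    only a `split` argument; adjust if the repository requires more.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset("KETI-AIR/kor_anli", split=split_name(kind, round_no))


ALL_SPLITS = [split_name(k, r) for k in KINDS for r in ROUNDS]
print(ALL_SPLITS)
```

Calling `load_split("dev", 3)` would then request the `dev_r3` split, under the assumptions stated above.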
Dataset size
- Download size: 23386318 bytes
- Total dataset size: 83612601 bytes
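As a quick consistency check, the per-split figures can be summed and compared with the reported totals. This sketch uses only the numbers listed in the card; the variable names are illustrative.

```python
# (num_bytes, num_examples) per split, copied from the card.
SPLITS = {
    "train_r1": (8505556, 16946),
    "train_r2": (22521662, 45460),
    "train_r3": (48605206, 100459),
    "dev_r1": (628891, 1000),
    "dev_r2": (613763, 1000),
    "dev_r3": (740840, 1200),
    "test_r1": (626555, 1000),
    "test_r2": (633241, 1000),
    "test_r3": (736887, 1200),
}

total_bytes = sum(num_bytes for num_bytes, _ in SPLITS.values())
total_examples = sum(num_examples for _, num_examples in SPLITS.values())

# The byte counts add up to the reported dataset_size.
print(total_bytes == 83612601)  # True
# The example count falls inside the declared 100K<n<1M size category.
print(total_examples)  # 169265
```

The summed byte count matches the reported total dataset size, and the 169265 examples sit within the declared 100K<n<1M category.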



