KETI-AIR/kor_glue
Hugging Face · Updated 2023-12-05 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/KETI-AIR/kor_glue
Resource description:
---
dataset_info:
- config_name: cola
  features:
  - name: data_index_by_user
    dtype: int32
  - name: label
    dtype: int32
  - name: sentence
    dtype: string
  splits:
  - name: train
    num_bytes: 569511
    num_examples: 8551
  - name: validation
    num_bytes: 72661
    num_examples: 1043
  - name: test
    num_bytes: 72979
    num_examples: 1063
  download_size: 381894
  dataset_size: 715151
- config_name: mrpc
  features:
  - name: data_index_by_user
    dtype: int32
  - name: sentence1
    dtype: string
  - name: sentence2
    dtype: string
  - name: label
    dtype: int32
  - name: idx
    dtype: int32
  splits:
  - name: train
    num_bytes: 1078522
    num_examples: 3668
  - name: validation
    num_bytes: 120306
    num_examples: 408
  - name: test
    num_bytes: 504069
    num_examples: 1725
  download_size: 1176356
  dataset_size: 1702897
- config_name: qnli
  features:
  - name: data_index_by_user
    dtype: int32
  - name: label
    dtype: int32
  - name: question
    dtype: string
  - name: sentence
    dtype: string
  splits:
  - name: train
    num_bytes: 28343211
    num_examples: 104743
  - name: validation
    num_bytes: 1507016
    num_examples: 5463
  - name: test
    num_bytes: 1510880
    num_examples: 5463
  download_size: 21097078
  dataset_size: 31361107
- config_name: qqp
  features:
  - name: data_index_by_user
    dtype: int32
  - name: question1
    dtype: string
  - name: question2
    dtype: string
  - name: label
    dtype: int32
  - name: idx
    dtype: int32
  splits:
  - name: train
    num_bytes: 64564524
    num_examples: 363846
  download_size: 40798086
  dataset_size: 64564524
- config_name: wnli
  features:
  - name: data_index_by_user
    dtype: int32
  - name: sentence1
    dtype: string
  - name: sentence2
    dtype: string
  - name: label
    dtype: int32
  - name: idx
    dtype: int32
  splits:
  - name: train
    num_bytes: 132171
    num_examples: 635
  - name: validation
    num_bytes: 15331
    num_examples: 71
  - name: test
    num_bytes: 47430
    num_examples: 146
  download_size: 80151
  dataset_size: 194932
configs:
- config_name: cola
  data_files:
  - split: train
    path: cola/train-*
  - split: validation
    path: cola/validation-*
  - split: test
    path: cola/test-*
- config_name: mrpc
  data_files:
  - split: train
    path: mrpc/train-*
  - split: validation
    path: mrpc/validation-*
  - split: test
    path: mrpc/test-*
- config_name: qnli
  data_files:
  - split: train
    path: qnli/train-*
  - split: validation
    path: qnli/validation-*
  - split: test
    path: qnli/test-*
- config_name: qqp
  data_files:
  - split: train
    path: qqp/train-*
- config_name: wnli
  data_files:
  - split: train
    path: wnli/train-*
  - split: validation
    path: wnli/validation-*
  - split: test
    path: wnli/test-*
license: cc-by-4.0
---
# Dataset Card for "kor_glue"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
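The card does not show how to load the data, so here is a minimal sketch. It assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the ID `KETI-AIR/kor_glue`. The helper name `load_kor_glue` and the `CONFIG_SPLITS` table are illustrative, not part of the dataset. The table is transcribed from the `configs` section of the card metadata.

```python
# Config names and their available splits, transcribed from the card metadata.
CONFIG_SPLITS = {
    "cola": ["train", "validation", "test"],
    "mrpc": ["train", "validation", "test"],
    "qnli": ["train", "validation", "test"],
    "qqp": ["train"],  # the card lists only a train split for qqp
    "wnli": ["train", "validation", "test"],
}


def load_kor_glue(config_name, split="train"):
    """Load one split of one kor_glue config (hypothetical helper)."""
    if config_name not in CONFIG_SPLITS:
        raise ValueError(f"unknown config: {config_name!r}")
    if split not in CONFIG_SPLITS[config_name]:
        raise ValueError(f"config {config_name!r} has no split {split!r}")
    # Imported lazily so the checks above run even without the package installed.
    from datasets import load_dataset

    return load_dataset("KETI-AIR/kor_glue", config_name, split=split)


# Example call (needs network access and the `datasets` package):
#   cola_train = load_kor_glue("cola", "train")
#   print(cola_train.features)
```

Validating the config and split before calling `load_dataset` gives a clearer error for cases such as requesting a `test` split of `qqp`, which the card does not declare.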
# Source Data Citation Information
```
@article{warstadt2018neural,
title={Neural Network Acceptability Judgments},
author={Warstadt, Alex and Singh, Amanpreet and Bowman, Samuel R},
journal={arXiv preprint arXiv:1805.12471},
year={2018}
}
@inproceedings{wang2019glue,
title={{GLUE}: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding},
author={Wang, Alex and Singh, Amanpreet and Michael, Julian and Hill, Felix and Levy, Omer and Bowman, Samuel R.},
note={In the Proceedings of ICLR.},
year={2019}
}
```
Note that each GLUE dataset has its own citation. Please see the source for the
correct citation for each contained dataset.
Provider:
KETI-AIR
Summary of original information
Dataset overview
Dataset configurations
COLA
- Features:
  - data_index_by_user: int32
  - label: int32
  - sentence: string
- Splits:
  - train: 569511 bytes, 8551 examples
  - validation: 72661 bytes, 1043 examples
  - test: 72979 bytes, 1063 examples
- Download size: 381894 bytes
- Dataset size: 715151 bytes
MRPC
- Features:
  - data_index_by_user: int32
  - sentence1: string
  - sentence2: string
  - label: int32
  - idx: int32
- Splits:
  - train: 1078522 bytes, 3668 examples
  - validation: 120306 bytes, 408 examples
  - test: 504069 bytes, 1725 examples
- Download size: 1176356 bytes
- Dataset size: 1702897 bytes
QNLI
- Features:
  - data_index_by_user: int32
  - label: int32
  - question: string
  - sentence: string
- Splits:
  - train: 28343211 bytes, 104743 examples
  - validation: 1507016 bytes, 5463 examples
  - test: 1510880 bytes, 5463 examples
- Download size: 21097078 bytes
- Dataset size: 31361107 bytes
QQP
- Features:
  - data_index_by_user: int32
  - question1: string
  - question2: string
  - label: int32
  - idx: int32
- Splits:
  - train: 64564524 bytes, 363846 examples
- Download size: 40798086 bytes
- Dataset size: 64564524 bytes
WNLI
- Features:
  - data_index_by_user: int32
  - sentence1: string
  - sentence2: string
  - label: int32
  - idx: int32
- Splits:
  - train: 132171 bytes, 635 examples
  - validation: 15331 bytes, 71 examples
  - test: 47430 bytes, 146 examples
- Download size: 80151 bytes
- Dataset size: 194932 bytes
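The sizes above can be cross-checked with simple arithmetic: in the card metadata, each configuration's dataset size equals the sum of its per-split byte counts. The sketch below runs that check using figures copied from the card. The names `SPLITS`, `DATASET_SIZE`, `total_bytes`, and `total_examples` are illustrative.

```python
# (num_bytes, num_examples) per split, copied from the card metadata.
SPLITS = {
    "cola": {"train": (569511, 8551), "validation": (72661, 1043), "test": (72979, 1063)},
    "mrpc": {"train": (1078522, 3668), "validation": (120306, 408), "test": (504069, 1725)},
    "qnli": {"train": (28343211, 104743), "validation": (1507016, 5463), "test": (1510880, 5463)},
    "qqp": {"train": (64564524, 363846)},
    "wnli": {"train": (132171, 635), "validation": (15331, 71), "test": (47430, 146)},
}

# dataset_size per config, also copied from the card metadata.
DATASET_SIZE = {
    "cola": 715151,
    "mrpc": 1702897,
    "qnli": 31361107,
    "qqp": 64564524,
    "wnli": 194932,
}


def total_bytes(config):
    """Sum of num_bytes over all splits of a config."""
    return sum(num_bytes for num_bytes, _ in SPLITS[config].values())


def total_examples(config):
    """Sum of num_examples over all splits of a config."""
    return sum(num_examples for _, num_examples in SPLITS[config].values())


for config in SPLITS:
    # The reported dataset_size should equal the sum of the split sizes.
    assert total_bytes(config) == DATASET_SIZE[config], config
    print(f"{config}: {total_examples(config)} examples, {total_bytes(config)} bytes")
```

The download size is a separate figure in the metadata and does not take part in this sum.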
Data file paths
COLA
- train: `cola/train-*`
- validation: `cola/validation-*`
- test: `cola/test-*`
MRPC
- train: `mrpc/train-*`
- validation: `mrpc/validation-*`
- test: `mrpc/test-*`
QNLI
- train: `qnli/train-*`
- validation: `qnli/validation-*`
- test: `qnli/test-*`
QQP
- train: `qqp/train-*`
WNLI
- train: `wnli/train-*`
- validation: `wnli/validation-*`
- test: `wnli/test-*`
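The data file paths above are glob patterns, one per split. As a rough illustration of how such a pattern assigns a file to a split, the sketch below uses Python's standard `fnmatch` module. This approximates glob matching and is not necessarily the resolution logic the Hugging Face Hub uses. The shard filename in the usage comment is hypothetical, since the card does not list actual filenames.

```python
from fnmatch import fnmatchcase

# Splits per config, as listed in the card; patterns follow "<config>/<split>-*".
CONFIG_SPLITS = {
    "cola": ["train", "validation", "test"],
    "mrpc": ["train", "validation", "test"],
    "qnli": ["train", "validation", "test"],
    "qqp": ["train"],
    "wnli": ["train", "validation", "test"],
}

DATA_FILES = {
    config: {split: f"{config}/{split}-*" for split in splits}
    for config, splits in CONFIG_SPLITS.items()
}


def split_for_file(config, filename):
    """Return the split whose pattern matches filename, or None if none does."""
    for split, pattern in DATA_FILES[config].items():
        if fnmatchcase(filename, pattern):
            return split
    return None


# Hypothetical shard name, used only for illustration:
#   split_for_file("cola", "cola/train-00000-of-00001.parquet")
```

Because `qqp` declares only a train pattern, any validation or test file under `qqp/` would match no split in this sketch.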
License
- cc-by-4.0



