mbzuai-ugrip-statement-tuning/xstorycloze
Hugging Face · Updated 2024-06-06 · Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/mbzuai-ugrip-statement-tuning/xstorycloze
Resource description:
---
dataset_info:
- config_name: ar
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 1172623
    num_examples: 1511
  download_size: 581963
  dataset_size: 1172623
- config_name: en
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 822860
    num_examples: 1511
  download_size: 459654
  dataset_size: 822860
- config_name: es
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 894779
    num_examples: 1511
  download_size: 507534
  dataset_size: 894779
- config_name: eu
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 896518
    num_examples: 1511
  download_size: 485370
  dataset_size: 896518
- config_name: hi
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 1958974
    num_examples: 1511
  download_size: 739647
  dataset_size: 1958974
- config_name: id
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 913463
    num_examples: 1511
  download_size: 476175
  dataset_size: 913463
- config_name: my
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 2735802
    num_examples: 1511
  download_size: 849229
  dataset_size: 2735802
- config_name: ru
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 1415824
    num_examples: 1511
  download_size: 688117
  dataset_size: 1415824
- config_name: sw
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 879761
    num_examples: 1511
  download_size: 468032
  dataset_size: 879761
- config_name: te
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 2037222
    num_examples: 1511
  download_size: 745049
  dataset_size: 2037222
- config_name: zh
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 806541
    num_examples: 1511
  download_size: 485889
  dataset_size: 806541
configs:
- config_name: ar
  data_files:
  - split: test
    path: ar/test-*
- config_name: en
  data_files:
  - split: test
    path: en/test-*
- config_name: es
  data_files:
  - split: test
    path: es/test-*
- config_name: eu
  data_files:
  - split: test
    path: eu/test-*
- config_name: hi
  data_files:
  - split: test
    path: hi/test-*
- config_name: id
  data_files:
  - split: test
    path: id/test-*
- config_name: my
  data_files:
  - split: test
    path: my/test-*
- config_name: ru
  data_files:
  - split: test
    path: ru/test-*
- config_name: sw
  data_files:
  - split: test
    path: sw/test-*
- config_name: te
  data_files:
  - split: test
    path: te/test-*
- config_name: zh
  data_files:
  - split: test
    path: zh/test-*
---
Provider:
mbzuai-ugrip-statement-tuning
Summary of original information
Dataset overview
Dataset configurations
- config_name: One configuration per language: ar, en, es, eu, hi, id, my, ru, sw, te, zh.
- features: Each configuration has three features:
  - answer_right_ending: int32
  - statement1: string
  - statement2: string
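As a minimal sketch of the record layout described above, the following stdlib-only Python models one row and checks it against the three listed features. The class name `XStoryClozeRow` and the helper `parse_row` are hypothetical and not part of the dataset or of any library. The meaning of the `answer_right_ending` values is not documented on this page, so the sketch validates types and the int32 range only.

```python
from dataclasses import dataclass

# Bounds of a signed 32-bit integer, matching the declared dtype int32.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


@dataclass(frozen=True)
class XStoryClozeRow:  # hypothetical name, for illustration only
    answer_right_ending: int  # declared as int32
    statement1: str
    statement2: str


def parse_row(raw: dict) -> XStoryClozeRow:
    """Validate a raw mapping against the three documented features."""
    label = raw["answer_right_ending"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(label, bool) or not isinstance(label, int):
        raise TypeError("answer_right_ending must be an integer")
    if not INT32_MIN <= label <= INT32_MAX:
        raise ValueError("answer_right_ending does not fit in int32")
    for key in ("statement1", "statement2"):
        if not isinstance(raw[key], str):
            raise TypeError(f"{key} must be a string")
    return XStoryClozeRow(label, raw["statement1"], raw["statement2"])
```

A row that passes `parse_row` has exactly the shape the configuration metadata declares; anything else raises early instead of failing later in evaluation code.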
Dataset splits
- split: Every configuration contains a single split named test.
- num_examples: Each test split contains 1511 examples.
- num_bytes: The size of the test split varies by language configuration:
  - ar: 1172623 bytes
  - en: 822860 bytes
  - es: 894779 bytes
  - eu: 896518 bytes
  - hi: 1958974 bytes
  - id: 913463 bytes
  - my: 2735802 bytes
  - ru: 1415824 bytes
  - sw: 879761 bytes
  - te: 2037222 bytes
  - zh: 806541 bytes
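Because every configuration has the same number of examples, the byte counts above translate directly into an average size per example. The sketch below does that arithmetic using only the figures listed on this page; the names `NUM_BYTES` and `bytes_per_example` are illustrative.

```python
NUM_EXAMPLES = 1511  # identical for every language configuration

# num_bytes of the test split, copied from the list above.
NUM_BYTES = {
    "ar": 1172623, "en": 822860, "es": 894779, "eu": 896518,
    "hi": 1958974, "id": 913463, "my": 2735802, "ru": 1415824,
    "sw": 879761, "te": 2037222, "zh": 806541,
}


def bytes_per_example(lang: str) -> float:
    """Average stored size of one example, in bytes."""
    return NUM_BYTES[lang] / NUM_EXAMPLES


# Languages with the largest and smallest test split.
largest = max(NUM_BYTES, key=NUM_BYTES.get)
smallest = min(NUM_BYTES, key=NUM_BYTES.get)

if __name__ == "__main__":
    for lang in sorted(NUM_BYTES, key=NUM_BYTES.get, reverse=True):
        print(f"{lang}: {bytes_per_example(lang):.1f} bytes/example")
```

With these figures the largest split is my and the smallest is zh. The page does not say why sizes differ; a plausible but unverified explanation is that scripts differ in how many bytes each character needs when encoded.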
Dataset size and download size
- download_size: The download size varies by language configuration:
  - ar: 581963 bytes
  - en: 459654 bytes
  - es: 507534 bytes
  - eu: 485370 bytes
  - hi: 739647 bytes
  - id: 476175 bytes
  - my: 849229 bytes
  - ru: 688117 bytes
  - sw: 468032 bytes
  - te: 745049 bytes
  - zh: 485889 bytes
- dataset_size: Identical to num_bytes for every configuration; it reflects the actual size of the dataset.
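The relationship between the two size fields can be made concrete by dividing download_size by dataset_size for each language. The sketch below uses only the numbers listed on this page; `SIZES` and `download_ratio` are illustrative names, and interpreting the ratio as a compression factor is an assumption rather than something the page states.

```python
# lang: (download_size, dataset_size) in bytes, copied from the lists above.
SIZES = {
    "ar": (581963, 1172623),
    "en": (459654, 822860),
    "es": (507534, 894779),
    "eu": (485370, 896518),
    "hi": (739647, 1958974),
    "id": (476175, 913463),
    "my": (849229, 2735802),
    "ru": (688117, 1415824),
    "sw": (468032, 879761),
    "te": (745049, 2037222),
    "zh": (485889, 806541),
}


def download_ratio(lang: str) -> float:
    """Fraction of the dataset size that has to be downloaded."""
    download_size, dataset_size = SIZES[lang]
    return download_size / dataset_size


# Configuration whose download is smallest relative to its dataset size.
most_compact = min(SIZES, key=download_ratio)

if __name__ == "__main__":
    for lang in sorted(SIZES, key=download_ratio):
        print(f"{lang}: {download_ratio(lang):.2f}")
```

Every configuration downloads to less than its dataset_size, and the ratio is lowest for my and highest for zh.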
Data file paths
- path: The data files for each configuration's test split follow the pattern [language code]/test-*.
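The path convention above can be sketched as a small helper that builds the glob pattern for a language and checks file names against it. The file name used in the usage note is a hypothetical example, since the page gives only the pattern and not actual file names. The optional `load_test_split` function assumes the Hugging Face `datasets` library is installed and that the repository is reachable over the network; it imports the library lazily so the rest of the sketch runs without it.

```python
from fnmatch import fnmatch

REPO_ID = "mbzuai-ugrip-statement-tuning/xstorycloze"
CONFIGS = ["ar", "en", "es", "eu", "hi", "id", "my", "ru", "sw", "te", "zh"]


def split_pattern(lang: str) -> str:
    """Return the documented glob pattern for a language's test split."""
    if lang not in CONFIGS:
        raise ValueError(f"unknown configuration: {lang!r}")
    return f"{lang}/test-*"


def belongs_to_test_split(lang: str, filename: str) -> bool:
    """Check whether a repository-relative file name matches the pattern."""
    return fnmatch(filename, split_pattern(lang))


def load_test_split(lang: str):
    """Load one configuration's test split (needs `datasets` and network)."""
    from datasets import load_dataset  # imported lazily on purpose

    split_pattern(lang)  # validates the language code
    return load_dataset(REPO_ID, lang, split="test")
```

For instance, `belongs_to_test_split("ar", "ar/test-00000-of-00001.parquet")` is true for that hypothetical file name, while the same file checked against `"en"` is not.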



