Cognitive-Lab/GoogleIndicGenBench_flores_enxx_in
Source: Hugging Face. Updated 2024-06-04; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/Cognitive-Lab/GoogleIndicGenBench_flores_enxx_in
Resource description:
---
dataset_info:
- config_name: gu
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 485428
    num_examples: 1012
  - name: dev
    num_bytes: 485428
    num_examples: 1012
  download_size: 500702
  dataset_size: 970856
- config_name: hi
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 491438
    num_examples: 1012
  - name: dev
    num_bytes: 491438
    num_examples: 1012
  download_size: 499392
  dataset_size: 982876
- config_name: kn
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 529755
    num_examples: 1012
  - name: dev
    num_bytes: 529755
    num_examples: 1012
  download_size: 530084
  dataset_size: 1059510
- config_name: ml
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 566113
    num_examples: 1012
  - name: dev
    num_bytes: 566113
    num_examples: 1012
  download_size: 550094
  dataset_size: 1132226
- config_name: mr
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 509877
    num_examples: 1012
  - name: dev
    num_bytes: 509877
    num_examples: 1012
  download_size: 516800
  dataset_size: 1019754
- config_name: ta
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 576001
    num_examples: 1012
  - name: dev
    num_bytes: 576001
    num_examples: 1012
  download_size: 537366
  dataset_size: 1152002
- config_name: te
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 508047
    num_examples: 1012
  - name: dev
    num_bytes: 508047
    num_examples: 1012
  download_size: 515080
  dataset_size: 1016094
configs:
- config_name: gu
  data_files:
  - split: test
    path: gu/test-*
  - split: dev
    path: gu/dev-*
- config_name: hi
  data_files:
  - split: test
    path: hi/test-*
  - split: dev
    path: hi/dev-*
- config_name: kn
  data_files:
  - split: test
    path: kn/test-*
  - split: dev
    path: kn/dev-*
- config_name: ml
  data_files:
  - split: test
    path: ml/test-*
  - split: dev
    path: ml/dev-*
- config_name: mr
  data_files:
  - split: test
    path: mr/test-*
  - split: dev
    path: mr/dev-*
- config_name: ta
  data_files:
  - split: test
    path: ta/test-*
  - split: dev
    path: ta/dev-*
- config_name: te
  data_files:
  - split: test
    path: te/test-*
  - split: dev
    path: te/dev-*
---
Provider:
Cognitive-Lab
Summary of original information
Dataset overview
Dataset configurations
| Config name | Features |
|---|---|
| gu | target: string, source: string, translation_direction: string, lang: string |
| hi | target: string, source: string, translation_direction: string, lang: string |
| kn | target: string, source: string, translation_direction: string, lang: string |
| ml | target: string, source: string, translation_direction: string, lang: string |
| mr | target: string, source: string, translation_direction: string, lang: string |
| ta | target: string, source: string, translation_direction: string, lang: string |
| te | target: string, source: string, translation_direction: string, lang: string |
Dataset splits
| Config name | Split | Size (bytes) | Examples |
|---|---|---|---|
| gu | test | 485428 | 1012 |
| gu | dev | 485428 | 1012 |
| hi | test | 491438 | 1012 |
| hi | dev | 491438 | 1012 |
| kn | test | 529755 | 1012 |
| kn | dev | 529755 | 1012 |
| ml | test | 566113 | 1012 |
| ml | dev | 566113 | 1012 |
| mr | test | 509877 | 1012 |
| mr | dev | 509877 | 1012 |
| ta | test | 576001 | 1012 |
| ta | dev | 576001 | 1012 |
| te | test | 508047 | 1012 |
| te | dev | 508047 | 1012 |
Dataset sizes
| Config name | Download size (bytes) | Dataset size (bytes) |
|---|---|---|
| gu | 500702 | 970856 |
| hi | 499392 | 982876 |
| kn | 530084 | 1059510 |
| ml | 550094 | 1132226 |
| mr | 516800 | 1019754 |
| ta | 537366 | 1152002 |
| te | 515080 | 1016094 |
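
The dataset size reported for each config appears to be the sum of its two split sizes (for example, 485428 + 485428 = 970856 for gu). A minimal Python sketch that checks this relationship against the figures in the tables above; the dictionary and function names are illustrative, not part of the dataset:

```python
# Per-split byte counts, copied from the splits table.
SPLIT_BYTES = {
    "gu": {"test": 485428, "dev": 485428},
    "hi": {"test": 491438, "dev": 491438},
    "kn": {"test": 529755, "dev": 529755},
    "ml": {"test": 566113, "dev": 566113},
    "mr": {"test": 509877, "dev": 509877},
    "ta": {"test": 576001, "dev": 576001},
    "te": {"test": 508047, "dev": 508047},
}

# Reported dataset sizes, copied from the sizes table.
DATASET_SIZE = {
    "gu": 970856,
    "hi": 982876,
    "kn": 1059510,
    "ml": 1132226,
    "mr": 1019754,
    "ta": 1152002,
    "te": 1016094,
}


def total_split_bytes(config: str) -> int:
    """Sum the byte counts of every split belonging to one config."""
    return sum(SPLIT_BYTES[config].values())


def mismatched_configs() -> list[str]:
    """Return the configs whose split total differs from the reported size."""
    return [c for c in SPLIT_BYTES if total_split_bytes(c) != DATASET_SIZE[c]]


# An empty list means every reported size equals the sum of its splits.
print(mismatched_configs())
```

The download size is a separate figure (the size of the stored files) and is not derived from the split sizes.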
Data file paths
| Config name | Split | File path |
|---|---|---|
| gu | test | gu/test-* |
| gu | dev | gu/dev-* |
| hi | test | hi/test-* |
| hi | dev | hi/dev-* |
| kn | test | kn/test-* |
| kn | dev | kn/dev-* |
| ml | test | ml/test-* |
| ml | dev | ml/dev-* |
| mr | test | mr/test-* |
| mr | dev | mr/dev-* |
| ta | test | ta/test-* |
| ta | dev | ta/dev-* |
| te | test | te/test-* |
| te | dev | te/dev-* |
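
Each config name and split listed above maps to one file pattern of the form `<config>/<split>-*`. The sketch below rebuilds that pattern and, as one possible way to read the data, passes the config and split to `load_dataset` from the Hugging Face `datasets` library. It assumes `datasets` is installed and the repository is reachable; the helper names `data_file_pattern` and `load_split` are hypothetical.

```python
REPO_ID = "Cognitive-Lab/GoogleIndicGenBench_flores_enxx_in"
CONFIGS = ("gu", "hi", "kn", "ml", "mr", "ta", "te")
SPLITS = ("test", "dev")


def data_file_pattern(config: str, split: str) -> str:
    """Return the file pattern from the table above, e.g. 'hi/dev-*'."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return f"{config}/{split}-*"


def load_split(config: str, split: str):
    """Load one split of one config (needs network access)."""
    data_file_pattern(config, split)  # validate the names before downloading
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


# Example usage (downloads data, so it is left commented out):
# hindi_dev = load_split("hi", "dev")
# print(hindi_dev[0]["source"], hindi_dev[0]["target"])
```

Validating the names locally first means a typo in a config or split name fails immediately instead of after a network round trip.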



