
Cognitive-Lab/GoogleIndicGenBench_flores_enxx_in

Source: Hugging Face. Updated 2024-06-04. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/Cognitive-Lab/GoogleIndicGenBench_flores_enxx_in
Resource description:
---
dataset_info:
- config_name: gu
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 485428
    num_examples: 1012
  - name: dev
    num_bytes: 485428
    num_examples: 1012
  download_size: 500702
  dataset_size: 970856
- config_name: hi
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 491438
    num_examples: 1012
  - name: dev
    num_bytes: 491438
    num_examples: 1012
  download_size: 499392
  dataset_size: 982876
- config_name: kn
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 529755
    num_examples: 1012
  - name: dev
    num_bytes: 529755
    num_examples: 1012
  download_size: 530084
  dataset_size: 1059510
- config_name: ml
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 566113
    num_examples: 1012
  - name: dev
    num_bytes: 566113
    num_examples: 1012
  download_size: 550094
  dataset_size: 1132226
- config_name: mr
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 509877
    num_examples: 1012
  - name: dev
    num_bytes: 509877
    num_examples: 1012
  download_size: 516800
  dataset_size: 1019754
- config_name: ta
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 576001
    num_examples: 1012
  - name: dev
    num_bytes: 576001
    num_examples: 1012
  download_size: 537366
  dataset_size: 1152002
- config_name: te
  features:
  - name: target
    dtype: string
  - name: source
    dtype: string
  - name: translation_direction
    dtype: string
  - name: lang
    dtype: string
  splits:
  - name: test
    num_bytes: 508047
    num_examples: 1012
  - name: dev
    num_bytes: 508047
    num_examples: 1012
  download_size: 515080
  dataset_size: 1016094
configs:
- config_name: gu
  data_files:
  - split: test
    path: gu/test-*
  - split: dev
    path: gu/dev-*
- config_name: hi
  data_files:
  - split: test
    path: hi/test-*
  - split: dev
    path: hi/dev-*
- config_name: kn
  data_files:
  - split: test
    path: kn/test-*
  - split: dev
    path: kn/dev-*
- config_name: ml
  data_files:
  - split: test
    path: ml/test-*
  - split: dev
    path: ml/dev-*
- config_name: mr
  data_files:
  - split: test
    path: mr/test-*
  - split: dev
    path: mr/dev-*
- config_name: ta
  data_files:
  - split: test
    path: ta/test-*
  - split: dev
    path: ta/dev-*
- config_name: te
  data_files:
  - split: test
    path: te/test-*
  - split: dev
    path: te/dev-*
---
Provider:
Cognitive-Lab
Summary of original information

Dataset overview

Dataset configuration

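Config  Features
gu      target: string, source: string, translation_direction: string, lang: string
hi      target: string, source: string, translation_direction: string, lang: string
kn      target: string, source: string, translation_direction: string, lang: string
ml      target: string, source: string, translation_direction: string, lang: string
mr      target: string, source: string, translation_direction: string, lang: string
ta      target: string, source: string, translation_direction: string, lang: string
te      target: string, source: string, translation_direction: string, lang: string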
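
Every config shares the same four string fields named in the dataset_info metadata (target, source, translation_direction, lang). As a minimal sketch, the Python snippet below checks that a row has that shape. The helper name `validate_row` and the placeholder row values are hypothetical and are not taken from the dataset.

```python
from typing import Any, List, Mapping

# Field names as they appear in the dataset_info metadata; all four are strings.
EXPECTED_FIELDS = ("target", "source", "translation_direction", "lang")


def validate_row(row: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in one row (empty if the row looks valid)."""
    problems = []
    for field in EXPECTED_FIELDS:
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], str):
            problems.append(f"field {field} is not a string")
    return problems


# Placeholder row, made up for illustration only; real values will differ.
example_row = {
    "target": "<target sentence>",
    "source": "<source sentence>",
    "translation_direction": "<direction>",
    "lang": "<language code>",
}
print(validate_row(example_row))
```

A check like this is cheap to run over a whole split before evaluation, since a missing or non-string field usually points to a loading problem rather than a data problem.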

Dataset splits

Config  Split  Bytes   Examples
gu      test   485428  1012
gu      dev    485428  1012
hi      test   491438  1012
hi      dev    491438  1012
kn      test   529755  1012
kn      dev    529755  1012
ml      test   566113  1012
ml      dev    566113  1012
mr      test   509877  1012
mr      dev    509877  1012
ta      test   576001  1012
ta      dev    576001  1012
te      test   508047  1012
te      dev    508047  1012

Dataset sizes

Config  Download size (bytes)  Dataset size (bytes)
gu      500702                 970856
hi      499392                 982876
kn      530084                 1059510
ml      550094                 1132226
mr      516800                 1019754
ta      537366                 1152002
te      515080                 1016094
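
In the figures listed here, each config's dataset size equals the sum of its test and dev byte counts. The Python sketch below re-checks that arithmetic and totals the sizes across configs. The numbers are copied from the metadata; the names `SIZES` and `check_sizes` are hypothetical. Note that test and dev report identical byte counts within every config; the source does not say whether their contents are also identical.

```python
# config -> (test_bytes, dev_bytes, download_size, dataset_size), copied from the metadata.
SIZES = {
    "gu": (485428, 485428, 500702, 970856),
    "hi": (491438, 491438, 499392, 982876),
    "kn": (529755, 529755, 530084, 1059510),
    "ml": (566113, 566113, 550094, 1132226),
    "mr": (509877, 509877, 516800, 1019754),
    "ta": (576001, 576001, 537366, 1152002),
    "te": (508047, 508047, 515080, 1016094),
}


def check_sizes(sizes):
    """Return the configs whose dataset_size differs from the sum of split byte counts."""
    return [
        name
        for name, (test_bytes, dev_bytes, _download, dataset_size) in sizes.items()
        if test_bytes + dev_bytes != dataset_size
    ]


mismatched = check_sizes(SIZES)
total_download = sum(entry[2] for entry in SIZES.values())
total_dataset = sum(entry[3] for entry in SIZES.values())

print("configs with inconsistent sizes:", mismatched)
print("total download size (bytes):", total_download)
print("total dataset size (bytes):", total_dataset)
```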

Data file paths

Config  Split  File path
gu      test   gu/test-*
gu      dev    gu/dev-*
hi      test   hi/test-*
hi      dev    hi/dev-*
kn      test   kn/test-*
kn      dev    kn/dev-*
ml      test   ml/test-*
ml      dev    ml/dev-*
mr      test   mr/test-*
mr      dev    mr/dev-*
ta      test   ta/test-*
ta      dev    ta/dev-*
te      test   te/test-*
te      dev    te/dev-*
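
The config and split names map directly onto file patterns of the form config/split-*. The Python sketch below rebuilds those patterns and shows one possible way to load a split with the Hugging Face `datasets` library. It assumes the repository is reachable on the Hugging Face Hub under the ID shown in the title; the helper names `data_file_pattern` and `load_split` are hypothetical, and `load_split` needs network access and an installed `datasets` package.

```python
CONFIGS = ("gu", "hi", "kn", "ml", "mr", "ta", "te")
SPLITS = ("test", "dev")
# Assumption: the Hub repository ID matches the title of this listing.
REPO_ID = "Cognitive-Lab/GoogleIndicGenBench_flores_enxx_in"


def data_file_pattern(config: str, split: str) -> str:
    """Build the glob pattern listed in the card for one config and split."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split}")
    return f"{config}/{split}-*"


def load_split(config: str, split: str):
    """Load one split with the `datasets` library (requires network access)."""
    data_file_pattern(config, split)  # validates the names before any download
    from datasets import load_dataset  # lazy import so the helper above works without it

    return load_dataset(REPO_ID, config, split=split)


# Offline usage: list every pattern from the table.
for config in CONFIGS:
    for split in SPLITS:
        print(config, split, data_file_pattern(config, split))
```

Calling `load_split("hi", "test")` would then return the Hindi test split, provided the assumptions above hold.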