hgissbkh/ALMA-R-Preference
Hugging Face. Updated 2024-05-20; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/hgissbkh/ALMA-R-Preference
Resource description:
---
dataset_info:
- config_name: cs-en
  features:
  - name: translation
    struct:
    - name: Delta
      dtype: float64
    - name: alma_cs
      dtype: string
    - name: alma_cs_kiwi
      dtype: float64
    - name: alma_cs_kiwi_xcomet
      dtype: float64
    - name: alma_cs_xcomet
      dtype: float64
    - name: alma_en
      dtype: string
    - name: alma_en_kiwi
      dtype: float64
    - name: alma_en_kiwi_xcomet
      dtype: float64
    - name: alma_en_xcomet
      dtype: float64
    - name: cs
      dtype: string
    - name: en
      dtype: string
    - name: gpt4_cs
      dtype: string
    - name: gpt4_cs_kiwi
      dtype: float64
    - name: gpt4_cs_kiwi_xcomet
      dtype: float64
    - name: gpt4_cs_xcomet
      dtype: float64
    - name: gpt4_en
      dtype: string
    - name: gpt4_en_kiwi
      dtype: float64
    - name: gpt4_en_kiwi_xcomet
      dtype: float64
    - name: gpt4_en_xcomet
      dtype: float64
    - name: language_pair
      dtype: string
    - name: ref_cs_kiwi
      dtype: float64
    - name: ref_cs_kiwi_xcomet
      dtype: float64
    - name: ref_cs_xcomet
      dtype: float64
    - name: ref_en_kiwi
      dtype: float64
    - name: ref_en_kiwi_xcomet
      dtype: float64
    - name: ref_en_xcomet
      dtype: float64
    - name: required_directions
      dtype: string
    - name: systems
      sequence: string
  splits:
  - name: train
    num_bytes: 2027881
    num_examples: 2009
  download_size: 1408495
  dataset_size: 2027881
- config_name: de-en
  features:
  - name: translation
    struct:
    - name: Delta
      dtype: float64
    - name: alma_de
      dtype: string
    - name: alma_de_kiwi
      dtype: float64
    - name: alma_de_kiwi_xcomet
      dtype: float64
    - name: alma_de_xcomet
      dtype: float64
    - name: alma_en
      dtype: string
    - name: alma_en_kiwi
      dtype: float64
    - name: alma_en_kiwi_xcomet
      dtype: float64
    - name: alma_en_xcomet
      dtype: float64
    - name: de
      dtype: string
    - name: en
      dtype: string
    - name: gpt4_de
      dtype: string
    - name: gpt4_de_kiwi
      dtype: float64
    - name: gpt4_de_kiwi_xcomet
      dtype: float64
    - name: gpt4_de_xcomet
      dtype: float64
    - name: gpt4_en
      dtype: string
    - name: gpt4_en_kiwi
      dtype: float64
    - name: gpt4_en_kiwi_xcomet
      dtype: float64
    - name: gpt4_en_xcomet
      dtype: float64
    - name: language_pair
      dtype: string
    - name: ref_de_kiwi
      dtype: float64
    - name: ref_de_kiwi_xcomet
      dtype: float64
    - name: ref_de_xcomet
      dtype: float64
    - name: ref_en_kiwi
      dtype: float64
    - name: ref_en_kiwi_xcomet
      dtype: float64
    - name: ref_en_xcomet
      dtype: float64
    - name: required_directions
      dtype: string
    - name: systems
      sequence: string
  splits:
  - name: train
    num_bytes: 2826030
    num_examples: 3065
  download_size: 1784636
  dataset_size: 2826030
- config_name: is-en
  features:
  - name: translation
    struct:
    - name: Delta
      dtype: float64
    - name: alma_en
      dtype: string
    - name: alma_en_kiwi
      dtype: float64
    - name: alma_en_kiwi_xcomet
      dtype: float64
    - name: alma_en_xcomet
      dtype: float64
    - name: alma_is
      dtype: string
    - name: alma_is_kiwi
      dtype: float64
    - name: alma_is_kiwi_xcomet
      dtype: float64
    - name: alma_is_xcomet
      dtype: float64
    - name: en
      dtype: string
    - name: gpt4_en
      dtype: string
    - name: gpt4_en_kiwi
      dtype: float64
    - name: gpt4_en_kiwi_xcomet
      dtype: float64
    - name: gpt4_en_xcomet
      dtype: float64
    - name: gpt4_is
      dtype: string
    - name: gpt4_is_kiwi
      dtype: float64
    - name: gpt4_is_kiwi_xcomet
      dtype: float64
    - name: gpt4_is_xcomet
      dtype: float64
    - name: is
      dtype: string
    - name: language_pair
      dtype: string
    - name: ref_en_kiwi
      dtype: float64
    - name: ref_en_kiwi_xcomet
      dtype: float64
    - name: ref_en_xcomet
      dtype: float64
    - name: ref_is_kiwi
      dtype: float64
    - name: ref_is_kiwi_xcomet
      dtype: float64
    - name: ref_is_xcomet
      dtype: float64
    - name: required_directions
      dtype: string
    - name: systems
      sequence: string
  splits:
  - name: train
    num_bytes: 2044849
    num_examples: 2009
  download_size: 1387081
  dataset_size: 2044849
- config_name: ru-en
  features:
  - name: translation
    struct:
    - name: Delta
      dtype: float64
    - name: alma_en
      dtype: string
    - name: alma_en_kiwi
      dtype: float64
    - name: alma_en_kiwi_xcomet
      dtype: float64
    - name: alma_en_xcomet
      dtype: float64
    - name: alma_ru
      dtype: string
    - name: alma_ru_kiwi
      dtype: float64
    - name: alma_ru_kiwi_xcomet
      dtype: float64
    - name: alma_ru_xcomet
      dtype: float64
    - name: en
      dtype: string
    - name: gpt4_en
      dtype: string
    - name: gpt4_en_kiwi
      dtype: float64
    - name: gpt4_en_kiwi_xcomet
      dtype: float64
    - name: gpt4_en_xcomet
      dtype: float64
    - name: gpt4_ru
      dtype: string
    - name: gpt4_ru_kiwi
      dtype: float64
    - name: gpt4_ru_kiwi_xcomet
      dtype: float64
    - name: gpt4_ru_xcomet
      dtype: float64
    - name: language_pair
      dtype: string
    - name: ref_en_kiwi
      dtype: float64
    - name: ref_en_kiwi_xcomet
      dtype: float64
    - name: ref_en_xcomet
      dtype: float64
    - name: ref_ru_kiwi
      dtype: float64
    - name: ref_ru_kiwi_xcomet
      dtype: float64
    - name: ref_ru_xcomet
      dtype: float64
    - name: required_directions
      dtype: string
    - name: ru
      dtype: string
    - name: systems
      sequence: string
  splits:
  - name: train
    num_bytes: 2720806
    num_examples: 2009
  download_size: 1628749
  dataset_size: 2720806
- config_name: zh-en
  features:
  - name: translation
    struct:
    - name: Delta
      dtype: float64
    - name: alma_en
      dtype: string
    - name: alma_en_kiwi
      dtype: float64
    - name: alma_en_kiwi_xcomet
      dtype: float64
    - name: alma_en_xcomet
      dtype: float64
    - name: alma_zh
      dtype: string
    - name: alma_zh_kiwi
      dtype: float64
    - name: alma_zh_kiwi_xcomet
      dtype: float64
    - name: alma_zh_xcomet
      dtype: float64
    - name: en
      dtype: string
    - name: gpt4_en
      dtype: string
    - name: gpt4_en_kiwi
      dtype: float64
    - name: gpt4_en_kiwi_xcomet
      dtype: float64
    - name: gpt4_en_xcomet
      dtype: float64
    - name: gpt4_zh
      dtype: string
    - name: gpt4_zh_kiwi
      dtype: float64
    - name: gpt4_zh_kiwi_xcomet
      dtype: float64
    - name: gpt4_zh_xcomet
      dtype: float64
    - name: language_pair
      dtype: string
    - name: ref_en_kiwi
      dtype: float64
    - name: ref_en_kiwi_xcomet
      dtype: float64
    - name: ref_en_xcomet
      dtype: float64
    - name: ref_zh_kiwi
      dtype: float64
    - name: ref_zh_kiwi_xcomet
      dtype: float64
    - name: ref_zh_xcomet
      dtype: float64
    - name: required_directions
      dtype: string
    - name: systems
      sequence: string
    - name: zh
      dtype: string
  splits:
  - name: train
    num_bytes: 2544865
    num_examples: 3065
  download_size: 1699012
  dataset_size: 2544865
configs:
- config_name: cs-en
  data_files:
  - split: train
    path: cs-en/train-*
- config_name: de-en
  data_files:
  - split: train
    path: de-en/train-*
- config_name: is-en
  data_files:
  - split: train
    path: is-en/train-*
- config_name: ru-en
  data_files:
  - split: train
    path: ru-en/train-*
- config_name: zh-en
  data_files:
  - split: train
    path: zh-en/train-*
---
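
The field names in the schema above appear to follow a `{system}_{lang}` pattern for translation text and `{system}_{lang}_{metric}` for scores, with `alma`, `gpt4`, and `ref` as system prefixes. The following Python sketch shows one way to pick the highest- and lowest-scoring candidate for a target language from a single `translation` struct. It rests on several assumptions that the card does not state: that higher scores mean better translations, that the reference text is stored under the bare language code (for example `en`), and that `kiwi_xcomet` is a sensible default metric. The helper names and the sample record are hypothetical.

```python
from typing import Dict, Tuple

# System prefixes observed in the schema's field names.
SYSTEMS = ("alma", "gpt4", "ref")


def score_key(system: str, lang: str, metric: str = "kiwi_xcomet") -> str:
    """Build a score field name such as 'gpt4_en_kiwi_xcomet'."""
    return f"{system}_{lang}_{metric}"


def text_key(system: str, lang: str) -> str:
    """Assumption: reference text sits under the bare language code."""
    return lang if system == "ref" else f"{system}_{lang}"


def best_and_worst(
    translation: Dict, lang: str, metric: str = "kiwi_xcomet"
) -> Tuple[str, str]:
    """Return (highest-scoring text, lowest-scoring text) for `lang`.

    Assumption: a higher score means a better translation.
    """
    ranked = sorted(SYSTEMS, key=lambda s: translation[score_key(s, lang, metric)])
    worst, best = ranked[0], ranked[-1]
    return translation[text_key(best, lang)], translation[text_key(worst, lang)]


# Hypothetical record; the texts and scores are invented for illustration.
example = {
    "en": "The cat sleeps.",
    "alma_en": "Cat is sleeping.",
    "gpt4_en": "The cat is asleep.",
    "ref_en_kiwi_xcomet": 0.91,
    "alma_en_kiwi_xcomet": 0.78,
    "gpt4_en_kiwi_xcomet": 0.95,
}

chosen, rejected = best_and_worst(example, "en")
print(chosen)
print(rejected)
```

With a real row, the argument would be `row["translation"]`, since every config nests its fields under the `translation` struct. The meaning of `Delta`, `required_directions`, and `systems` is not documented here, so the sketch does not use them.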
Provider:
hgissbkh
Summary of original information
Dataset overview
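Config name: cs-en
- Features:
- Name: translation
- Structure: contains multiple fields, such as Delta, alma_cs, alma_en, gpt4_cs, and gpt4_en; data types include float64 and string.
- Splits:
- Name: train
- Size: 2027881 bytes
- Number of examples: 2009
- Download size: 1408495 bytes
- Dataset size: 2027881 bytes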
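Config name: de-en
- Features:
- Name: translation
- Structure: contains multiple fields, such as Delta, alma_de, alma_en, gpt4_de, and gpt4_en; data types include float64 and string.
- Splits:
- Name: train
- Size: 2826030 bytes
- Number of examples: 3065
- Download size: 1784636 bytes
- Dataset size: 2826030 bytes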
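Config name: is-en
- Features:
- Name: translation
- Structure: contains multiple fields, such as Delta, alma_is, alma_en, gpt4_is, and gpt4_en; data types include float64 and string.
- Splits:
- Name: train
- Size: 2044849 bytes
- Number of examples: 2009
- Download size: 1387081 bytes
- Dataset size: 2044849 bytes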
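Config name: ru-en
- Features:
- Name: translation
- Structure: contains multiple fields, such as Delta, alma_ru, alma_en, gpt4_ru, and gpt4_en; data types include float64 and string.
- Splits:
- Name: train
- Size: 2720806 bytes
- Number of examples: 2009
- Download size: 1628749 bytes
- Dataset size: 2720806 bytes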
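Config name: zh-en
- Features:
- Name: translation
- Structure: contains multiple fields, such as Delta, alma_zh, alma_en, gpt4_zh, and gpt4_en; data types include float64 and string.
- Splits:
- Name: train
- Size: 2544865 bytes
- Number of examples: 3065
- Download size: 1699012 bytes
- Dataset size: 2544865 bytes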
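
The five configs listed above can be loaded one at a time, and the example counts give a simple sanity check after download. The sketch below assumes the Hugging Face `datasets` library is installed and that the repository id matches the page title (`hgissbkh/ALMA-R-Preference`); `load_config` and `check_row_count` are hypothetical helper names. The loader imports `datasets` lazily, so the counting part runs without network access.

```python
# Example counts per config, copied from the overview above.
EXPECTED_ROWS = {
    "cs-en": 2009,
    "de-en": 3065,
    "is-en": 2009,
    "ru-en": 2009,
    "zh-en": 3065,
}


def load_config(name, repo_id="hgissbkh/ALMA-R-Preference"):
    """Load the train split of one config (needs `datasets` and network access)."""
    from datasets import load_dataset  # lazy import keeps the rest offline-friendly

    return load_dataset(repo_id, name, split="train")


def check_row_count(name, n_rows):
    """Return True if a loaded split has the example count listed on this page."""
    return EXPECTED_ROWS[name] == n_rows


# Total number of training examples across all five configs.
total_rows = sum(EXPECTED_ROWS.values())
print(total_rows)  # 12157
```

A typical use would be `ds = load_config("cs-en")` followed by `check_row_count("cs-en", len(ds))`; a mismatch would suggest the hosted data has changed since this page was indexed.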



