ZurichNLP/mlit-alpaca-eval
Source: Hugging Face. Updated 2023-12-22; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ZurichNLP/mlit-alpaca-eval
Resource description:
---
dataset_info:
- config_name: ca
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 154255
    num_examples: 805
  download_size: 99320
  dataset_size: 154255
- config_name: da
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 144724
    num_examples: 805
  download_size: 96555
  dataset_size: 144724
- config_name: de
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 164871
    num_examples: 805
  download_size: 109435
  dataset_size: 164871
- config_name: el
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 287985
    num_examples: 805
  download_size: 143043
  dataset_size: 287985
- config_name: en
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 136100
    num_examples: 805
  download_size: 88817
  dataset_size: 136100
- config_name: es
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 157880
    num_examples: 805
  download_size: 100029
  dataset_size: 157880
- config_name: fr
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 168389
    num_examples: 805
  download_size: 104885
  dataset_size: 168389
- config_name: hi
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 353161
    num_examples: 805
  download_size: 140012
  dataset_size: 353161
- config_name: is
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 152739
    num_examples: 805
  download_size: 99913
  dataset_size: 152739
- config_name: 'no'
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 141316
    num_examples: 805
  download_size: 94018
  dataset_size: 141316
- config_name: ru
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 262317
    num_examples: 805
  download_size: 133403
  dataset_size: 262317
- config_name: sv
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 146366
    num_examples: 805
  download_size: 96223
  dataset_size: 146366
- config_name: zh
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 125499
    num_examples: 805
  download_size: 87092
  dataset_size: 125499
configs:
- config_name: ca
  data_files:
  - split: test
    path: ca/test-*
- config_name: da
  data_files:
  - split: test
    path: da/test-*
- config_name: de
  data_files:
  - split: test
    path: de/test-*
- config_name: el
  data_files:
  - split: test
    path: el/test-*
- config_name: en
  data_files:
  - split: test
    path: en/test-*
- config_name: es
  data_files:
  - split: test
    path: es/test-*
- config_name: fr
  data_files:
  - split: test
    path: fr/test-*
- config_name: hi
  data_files:
  - split: test
    path: hi/test-*
- config_name: is
  data_files:
  - split: test
    path: is/test-*
- config_name: 'no'
  data_files:
  - split: test
    path: no/test-*
- config_name: ru
  data_files:
  - split: test
    path: ru/test-*
- config_name: sv
  data_files:
  - split: test
    path: sv/test-*
- config_name: zh
  data_files:
  - split: test
    path: zh/test-*
---
# Description
Translated versions of the [AlpacaEval prompt dataset](https://huggingface.co/datasets/tatsu-lab/alpaca_eval) for evaluating the performance of chat LLMs.
Translations were generated with `gpt-3.5-turbo-0613` using the following prompt template (adapted from [Lai et al., 2023](https://arxiv.org/pdf/2307.16039.pdf)):
```
You are a helpful assistant.
Translate the following text into {{target_language}}.
Keep the structure of the original text and preserve things like code and names.
Please ensure that your response contains only the translated text.
The translation must convey the same meaning as the original and be natural for
native speakers with correct grammar and proper word choices.
Your translation must also use exact terminology to provide
accurate information even for the experts in the related fields.
Original: {{source_text}}
Translation into {{target_language}}:
```
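To make the two placeholders concrete, here is a minimal sketch of filling the template for one instruction. The helper name `fill_prompt` and the use of plain string replacement are assumptions for illustration; the card does not say how the authors rendered the template.

```python
# Prompt template as given above; {{...}} marks the two placeholders.
TEMPLATE = """You are a helpful assistant.
Translate the following text into {{target_language}}.
Keep the structure of the original text and preserve things like code and names.
Please ensure that your response contains only the translated text.
The translation must convey the same meaning as the original and be natural for
native speakers with correct grammar and proper word choices.
Your translation must also use exact terminology to provide
accurate information even for the experts in the related fields.
Original: {{source_text}}
Translation into {{target_language}}:"""


def fill_prompt(source_text: str, target_language: str) -> str:
    """Substitute both placeholders (hypothetical helper, not from the card)."""
    # Fill the language first so that a source text which happens to contain
    # the literal "{{target_language}}" is left untouched.
    return (
        TEMPLATE
        .replace("{{target_language}}", target_language)
        .replace("{{source_text}}", source_text)
    )


prompt = fill_prompt("What is the capital of France?", "German")
print(prompt)
```

The resulting string would then be sent to the translation model; that request is left out here because the card does not describe how it was made.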
# Usage
```python
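from datasets import load_dataset

# Load the Catalan ('ca') configuration; each configuration has a single 'test' split.
ds = load_dataset('ZurichNLP/mlit-alpaca-eval', 'ca')
print(ds)
# Expected output:
# DatasetDict({
#     test: Dataset({
#         features: ['instruction'],
#         num_rows: 805
#     })
# })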
```
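The same call can be repeated for every language. The sketch below assumes the configuration names listed in the card metadata; the helper `load_all_languages` is hypothetical. Note that the YAML metadata quotes `'no'` because YAML 1.1 parsers read a bare `no` as a boolean, whereas in Python it is an ordinary string.

```python
# Configuration names as listed in the dataset card metadata.
CONFIGS = ["ca", "da", "de", "el", "en", "es", "fr", "hi", "is", "no", "ru", "sv", "zh"]


def load_all_languages(configs=CONFIGS):
    """Return {config_name: Dataset} for the 'test' split of each configuration."""
    # Imported here so the list above can be reused without `datasets` installed.
    from datasets import load_dataset

    return {
        name: load_dataset("ZurichNLP/mlit-alpaca-eval", name, split="test")
        for name in configs
    }


# Example (downloads data from the Hub):
# data = load_all_languages()
# print(data["de"][0]["instruction"])
```

Passing `split="test"` returns a `Dataset` directly rather than a `DatasetDict`, which saves one indexing step when only the test split exists.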
# Citation
```
@misc{kew2023turning,
  title={Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?},
  author={Tannon Kew and Florian Schottmann and Rico Sennrich},
  year={2023},
  eprint={2312.12683},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
---
license: cc
task_categories:
- conversational
- question-answering
language:
- en
- ca
- bg
- da
- de
- el
- es
- fr
- hi
- is
- 'no'
- ru
- sv
- zh
---
Provider:
ZurichNLP
Summary of original information
Dataset overview
Dataset configurations

| Config | Feature (dtype) | Split | Examples | Split size (bytes) | Download size (bytes) | Dataset size (bytes) | Data files |
|---|---|---|---|---|---|---|---|
| ca | instruction (string) | test | 805 | 154255 | 99320 | 154255 | ca/test-* |
| da | instruction (string) | test | 805 | 144724 | 96555 | 144724 | da/test-* |
| de | instruction (string) | test | 805 | 164871 | 109435 | 164871 | de/test-* |
| el | instruction (string) | test | 805 | 287985 | 143043 | 287985 | el/test-* |
| en | instruction (string) | test | 805 | 136100 | 88817 | 136100 | en/test-* |
| es | instruction (string) | test | 805 | 157880 | 100029 | 157880 | es/test-* |
| fr | instruction (string) | test | 805 | 168389 | 104885 | 168389 | fr/test-* |
| hi | instruction (string) | test | 805 | 353161 | 140012 | 353161 | hi/test-* |
| is | instruction (string) | test | 805 | 152739 | 99913 | 152739 | is/test-* |
| no | instruction (string) | test | 805 | 141316 | 94018 | 141316 | no/test-* |
| ru | instruction (string) | test | 805 | 262317 | 133403 | 262317 | ru/test-* |
| sv | instruction (string) | test | 805 | 146366 | 96223 | 146366 | sv/test-* |
| zh | instruction (string) | test | 805 | 125499 | 87092 | 125499 | zh/test-* |
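
The sizes listed above can be compared across languages with a little arithmetic. The following sketch copies the `num_bytes` and `download_size` figures from the metadata and computes the average bytes per instruction for each configuration; the variable names are illustrative only. Any reading of why the averages differ (for example, scripts that need more bytes per character) is an interpretation, not something the card states.

```python
# (num_bytes, download_size) per configuration, copied from the metadata above.
SIZES = {
    "ca": (154255, 99320),
    "da": (144724, 96555),
    "de": (164871, 109435),
    "el": (287985, 143043),
    "en": (136100, 88817),
    "es": (157880, 100029),
    "fr": (168389, 104885),
    "hi": (353161, 140012),
    "is": (152739, 99913),
    "no": (141316, 94018),
    "ru": (262317, 133403),
    "sv": (146366, 96223),
    "zh": (125499, 87092),
}
NUM_EXAMPLES = 805  # every configuration has the same number of test examples

# Average size of one instruction, in bytes, for each language.
avg_bytes = {name: nb / NUM_EXAMPLES for name, (nb, _) in SIZES.items()}

# Totals across all configurations.
total_bytes = sum(nb for nb, _ in SIZES.values())
total_examples = NUM_EXAMPLES * len(SIZES)

# Print languages from smallest to largest average instruction size.
for name, avg in sorted(avg_bytes.items(), key=lambda kv: kv[1]):
    print(f"{name}: {avg:.1f} bytes per instruction")
print(f"{total_examples} examples, {total_bytes} bytes in total")
```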



