
ZurichNLP/mlit-alpaca-eval

Source: Hugging Face. Updated 2023-12-22; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ZurichNLP/mlit-alpaca-eval
Resource description:

---
dataset_info:
- config_name: ca
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 154255
    num_examples: 805
  download_size: 99320
  dataset_size: 154255
- config_name: da
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 144724
    num_examples: 805
  download_size: 96555
  dataset_size: 144724
- config_name: de
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 164871
    num_examples: 805
  download_size: 109435
  dataset_size: 164871
- config_name: el
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 287985
    num_examples: 805
  download_size: 143043
  dataset_size: 287985
- config_name: en
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 136100
    num_examples: 805
  download_size: 88817
  dataset_size: 136100
- config_name: es
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 157880
    num_examples: 805
  download_size: 100029
  dataset_size: 157880
- config_name: fr
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 168389
    num_examples: 805
  download_size: 104885
  dataset_size: 168389
- config_name: hi
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 353161
    num_examples: 805
  download_size: 140012
  dataset_size: 353161
- config_name: is
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 152739
    num_examples: 805
  download_size: 99913
  dataset_size: 152739
- config_name: 'no'
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 141316
    num_examples: 805
  download_size: 94018
  dataset_size: 141316
- config_name: ru
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 262317
    num_examples: 805
  download_size: 133403
  dataset_size: 262317
- config_name: sv
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 146366
    num_examples: 805
  download_size: 96223
  dataset_size: 146366
- config_name: zh
  features:
  - name: instruction
    dtype: string
  splits:
  - name: test
    num_bytes: 125499
    num_examples: 805
  download_size: 87092
  dataset_size: 125499
configs:
- config_name: ca
  data_files:
  - split: test
    path: ca/test-*
- config_name: da
  data_files:
  - split: test
    path: da/test-*
- config_name: de
  data_files:
  - split: test
    path: de/test-*
- config_name: el
  data_files:
  - split: test
    path: el/test-*
- config_name: en
  data_files:
  - split: test
    path: en/test-*
- config_name: es
  data_files:
  - split: test
    path: es/test-*
- config_name: fr
  data_files:
  - split: test
    path: fr/test-*
- config_name: hi
  data_files:
  - split: test
    path: hi/test-*
- config_name: is
  data_files:
  - split: test
    path: is/test-*
- config_name: 'no'
  data_files:
  - split: test
    path: no/test-*
- config_name: ru
  data_files:
  - split: test
    path: ru/test-*
- config_name: sv
  data_files:
  - split: test
    path: sv/test-*
- config_name: zh
  data_files:
  - split: test
    path: zh/test-*
license: cc
task_categories:
- conversational
- question-answering
language:
- en
- ca
- bg
- da
- de
- el
- es
- fr
- hi
- is
- 'no'
- ru
- sv
- zh
---

# Description

Translated versions of the [AlpacaEval prompt dataset](https://huggingface.co/datasets/tatsu-lab/alpaca_eval) for evaluating the performance of chat LLMs. Translations were generated with `gpt-3.5-turbo-0613` using the following prompt template (adapted from [Lai et al, 2023](https://arxiv.org/pdf/2307.16039.pdf)):

```
You are a helpful assistant.
Translate the following text into {{target_language}}.
Keep the structure of the original text and preserve things like code and names.
Please ensure that your response contains only the translated text.
The translation must convey the same meaning as the original and be natural for native speakers with correct grammar and proper word choices.
Your translation must also use exact terminology to provide accurate information even for the experts in the related fields.

Original: {{source_text}}

Translation into {{target_language}}:
```

# Usage

```python
from datasets import load_dataset

ds = load_dataset('ZurichNLP/mlit-alpaca-eval', 'ca')
print(ds)
# Output:
# DatasetDict({
#     test: Dataset({
#         features: ['instruction'],
#         num_rows: 805
#     })
# })
```

# Citation

```
@misc{kew2023turning,
  title={Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?},
  author={Tannon Kew and Florian Schottmann and Rico Sennrich},
  year={2023},
  eprint={2312.12683},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
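The prompt template in the description has two placeholders, `{{target_language}}` and `{{source_text}}`. The following is a minimal sketch of how such a template could be filled in before being sent to a translation model. The function name `render_prompt` and the line breaks inside the template are assumptions made for illustration; the dataset card does not show the authors' actual code.

```python
# Template text copied from the dataset card; the exact line breaks are an assumption.
PROMPT_TEMPLATE = (
    "You are a helpful assistant.\n"
    "Translate the following text into {{target_language}}.\n"
    "Keep the structure of the original text and preserve things like code and names.\n"
    "Please ensure that your response contains only the translated text.\n"
    "The translation must convey the same meaning as the original and be natural "
    "for native speakers with correct grammar and proper word choices.\n"
    "Your translation must also use exact terminology to provide accurate "
    "information even for the experts in the related fields.\n"
    "\n"
    "Original: {{source_text}}\n"
    "\n"
    "Translation into {{target_language}}:"
)


def render_prompt(source_text: str, target_language: str,
                  template: str = PROMPT_TEMPLATE) -> str:
    """Fill both placeholders (hypothetical helper, not part of the dataset).

    The language is substituted first and the source text last, so any
    placeholder-like string inside the source text is left untouched.
    """
    filled = template.replace("{{target_language}}", target_language)
    return filled.replace("{{source_text}}", source_text)


if __name__ == "__main__":
    print(render_prompt("What are the names of some famous actors?", "German"))
```

Plain string replacement is used here instead of a templating library so the sketch has no dependencies; a real pipeline might prefer a template engine that understands the `{{...}}` syntax.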
Provider:
ZurichNLP
Summary of original information

Dataset overview

Dataset configurations

Config name: ca

  • Features:
    • instruction: data type string
  • Splits:
    • test: 154255 bytes, 805 examples
  • Download size: 99320 bytes
  • Dataset size: 154255 bytes
  • Data files:
    • test: path ca/test-*

Config name: da

  • Features:
    • instruction: data type string
  • Splits:
    • test: 144724 bytes, 805 examples
  • Download size: 96555 bytes
  • Dataset size: 144724 bytes
  • Data files:
    • test: path da/test-*

Config name: de

  • Features:
    • instruction: data type string
  • Splits:
    • test: 164871 bytes, 805 examples
  • Download size: 109435 bytes
  • Dataset size: 164871 bytes
  • Data files:
    • test: path de/test-*

Config name: el

  • Features:
    • instruction: data type string
  • Splits:
    • test: 287985 bytes, 805 examples
  • Download size: 143043 bytes
  • Dataset size: 287985 bytes
  • Data files:
    • test: path el/test-*

Config name: en

  • Features:
    • instruction: data type string
  • Splits:
    • test: 136100 bytes, 805 examples
  • Download size: 88817 bytes
  • Dataset size: 136100 bytes
  • Data files:
    • test: path en/test-*

Config name: es

  • Features:
    • instruction: data type string
  • Splits:
    • test: 157880 bytes, 805 examples
  • Download size: 100029 bytes
  • Dataset size: 157880 bytes
  • Data files:
    • test: path es/test-*

Config name: fr

  • Features:
    • instruction: data type string
  • Splits:
    • test: 168389 bytes, 805 examples
  • Download size: 104885 bytes
  • Dataset size: 168389 bytes
  • Data files:
    • test: path fr/test-*

Config name: hi

  • Features:
    • instruction: data type string
  • Splits:
    • test: 353161 bytes, 805 examples
  • Download size: 140012 bytes
  • Dataset size: 353161 bytes
  • Data files:
    • test: path hi/test-*

Config name: is

  • Features:
    • instruction: data type string
  • Splits:
    • test: 152739 bytes, 805 examples
  • Download size: 99913 bytes
  • Dataset size: 152739 bytes
  • Data files:
    • test: path is/test-*

Config name: no

  • Features:
    • instruction: data type string
  • Splits:
    • test: 141316 bytes, 805 examples
  • Download size: 94018 bytes
  • Dataset size: 141316 bytes
  • Data files:
    • test: path no/test-*

Config name: ru

  • Features:
    • instruction: data type string
  • Splits:
    • test: 262317 bytes, 805 examples
  • Download size: 133403 bytes
  • Dataset size: 262317 bytes
  • Data files:
    • test: path ru/test-*

Config name: sv

  • Features:
    • instruction: data type string
  • Splits:
    • test: 146366 bytes, 805 examples
  • Download size: 96223 bytes
  • Dataset size: 146366 bytes
  • Data files:
    • test: path sv/test-*

Config name: zh

  • Features:
    • instruction: data type string
  • Splits:
    • test: 125499 bytes, 805 examples
  • Download size: 87092 bytes
  • Dataset size: 125499 bytes
  • Data files:
    • test: path zh/test-*
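
Since every configuration listed above has a single `test` split with 805 examples, loading all of them can be sketched as one loop with a row-count check. This is an illustrative sketch only: `load_test_splits`, `CONFIGS`, and `EXPECTED_ROWS` are hypothetical names introduced here, and the `loader` parameter exists so the function can be exercised without network access. By default it falls back to `datasets.load_dataset`, which must be installed separately.

```python
from typing import Callable, Dict, Iterable, Optional

REPO_ID = "ZurichNLP/mlit-alpaca-eval"
# Config names as listed above; 'no' (Norwegian) is an ordinary string in Python.
CONFIGS = ["ca", "da", "de", "el", "en", "es", "fr",
           "hi", "is", "no", "ru", "sv", "zh"]
EXPECTED_ROWS = 805  # every config reports 805 test examples


def load_test_splits(configs: Iterable[str] = CONFIGS,
                     loader: Optional[Callable] = None) -> Dict[str, object]:
    """Load the test split of each config and check its size (hypothetical helper)."""
    if loader is None:
        # Imported lazily so the module can be used with a stub loader offline.
        from datasets import load_dataset
        loader = load_dataset

    splits = {}
    for name in configs:
        split = loader(REPO_ID, name, split="test")
        if len(split) != EXPECTED_ROWS:
            raise ValueError(
                f"config {name!r}: expected {EXPECTED_ROWS} rows, got {len(split)}"
            )
        splits[name] = split
    return splits


if __name__ == "__main__":
    data = load_test_splits(["en", "de"])  # downloads from the Hugging Face Hub
    print(data["de"][0]["instruction"])
```

Because all configs hold translations of the same 805 prompts, the size check is a cheap way to catch a partial download or a mismatched config before running an evaluation.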