
mbzuai-ugrip-statement-tuning/xstorycloze

Source: Hugging Face. Updated 2024-06-06; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/mbzuai-ugrip-statement-tuning/xstorycloze
Resource description:
---
dataset_info:
- config_name: ar
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 1172623
    num_examples: 1511
  download_size: 581963
  dataset_size: 1172623
- config_name: en
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 822860
    num_examples: 1511
  download_size: 459654
  dataset_size: 822860
- config_name: es
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 894779
    num_examples: 1511
  download_size: 507534
  dataset_size: 894779
- config_name: eu
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 896518
    num_examples: 1511
  download_size: 485370
  dataset_size: 896518
- config_name: hi
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 1958974
    num_examples: 1511
  download_size: 739647
  dataset_size: 1958974
- config_name: id
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 913463
    num_examples: 1511
  download_size: 476175
  dataset_size: 913463
- config_name: my
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 2735802
    num_examples: 1511
  download_size: 849229
  dataset_size: 2735802
- config_name: ru
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 1415824
    num_examples: 1511
  download_size: 688117
  dataset_size: 1415824
- config_name: sw
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 879761
    num_examples: 1511
  download_size: 468032
  dataset_size: 879761
- config_name: te
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 2037222
    num_examples: 1511
  download_size: 745049
  dataset_size: 2037222
- config_name: zh
  features:
  - name: answer_right_ending
    dtype: int32
  - name: statement1
    dtype: string
  - name: statement2
    dtype: string
  splits:
  - name: test
    num_bytes: 806541
    num_examples: 1511
  download_size: 485889
  dataset_size: 806541
configs:
- config_name: ar
  data_files:
  - split: test
    path: ar/test-*
- config_name: en
  data_files:
  - split: test
    path: en/test-*
- config_name: es
  data_files:
  - split: test
    path: es/test-*
- config_name: eu
  data_files:
  - split: test
    path: eu/test-*
- config_name: hi
  data_files:
  - split: test
    path: hi/test-*
- config_name: id
  data_files:
  - split: test
    path: id/test-*
- config_name: my
  data_files:
  - split: test
    path: my/test-*
- config_name: ru
  data_files:
  - split: test
    path: ru/test-*
- config_name: sw
  data_files:
  - split: test
    path: sw/test-*
- config_name: te
  data_files:
  - split: test
    path: te/test-*
- config_name: zh
  data_files:
  - split: test
    path: zh/test-*
---
Provider:
mbzuai-ugrip-statement-tuning
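Summary of original information

Dataset overview

Dataset configurations

  • config_name: one configuration per language: ar, en, es, eu, hi, id, my, ru, sw, te, zh.
  • features: every configuration has the same three features:
    • answer_right_ending: dtype int32
    • statement1: dtype string
    • statement2: dtype string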
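
The configuration list above maps directly onto the `name` argument of `datasets.load_dataset`. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable; the helper names `load_config` and `check_columns` are hypothetical and not part of the dataset card.

```python
# Language configurations and column types as listed on this page.
CONFIGS = ["ar", "en", "es", "eu", "hi", "id", "my", "ru", "sw", "te", "zh"]
REPO_ID = "mbzuai-ugrip-statement-tuning/xstorycloze"
EXPECTED_COLUMNS = {
    "answer_right_ending": "int32",
    "statement1": "string",
    "statement2": "string",
}


def load_config(lang: str):
    """Load the test split of one language config (hypothetical helper).

    Requires network access and the `datasets` package.
    """
    if lang not in CONFIGS:
        raise ValueError(f"unknown config {lang!r}; expected one of {CONFIGS}")
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, lang, split="test")


def check_columns(column_types: dict) -> bool:
    """Return True if a {column name: dtype} mapping matches the card."""
    return column_types == EXPECTED_COLUMNS
```

A call such as `load_config("en")` should return a dataset whose rows are dictionaries keyed by the three feature names. The page does not document what the integer in `answer_right_ending` encodes, so inspect a few rows before relying on it.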

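Dataset splits

  • split: every configuration contains a single split named test.
  • num_examples: each test split contains 1511 examples.
  • num_bytes: the size of the test split varies by language configuration: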
    • ar: 1172623 bytes
    • en: 822860 bytes
    • es: 894779 bytes
    • eu: 896518 bytes
    • hi: 1958974 bytes
    • id: 913463 bytes
    • my: 2735802 bytes
    • ru: 1415824 bytes
    • sw: 879761 bytes
    • te: 2037222 bytes
    • zh: 806541 bytes
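
Because every configuration has the same number of examples, the byte counts above can be compared directly as an average size per example. This is a small illustrative calculation using only the figures listed on this page; the variable and function names are hypothetical.

```python
NUM_EXAMPLES = 1511  # examples per test split, identical for all configs

# num_bytes per language configuration, copied from the list above.
NUM_BYTES = {
    "ar": 1172623, "en": 822860, "es": 894779, "eu": 896518,
    "hi": 1958974, "id": 913463, "my": 2735802, "ru": 1415824,
    "sw": 879761, "te": 2037222, "zh": 806541,
}


def bytes_per_example(num_bytes: dict) -> dict:
    """Average uncompressed size of one example, per language."""
    return {lang: size / NUM_EXAMPLES for lang, size in num_bytes.items()}


avg = bytes_per_example(NUM_BYTES)
largest = max(avg, key=avg.get)    # config with the largest average example
smallest = min(avg, key=avg.get)   # config with the smallest average example
total_examples = NUM_EXAMPLES * len(NUM_BYTES)
total_bytes = sum(NUM_BYTES.values())

for lang in sorted(avg, key=avg.get, reverse=True):
    print(f"{lang}: {avg[lang]:.0f} bytes/example")
```

The differences most likely reflect how many bytes each script needs per character in the stored encoding rather than differences in story length, but the page itself does not say so.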

Dataset size and download size

  • download_size: the download size varies by language configuration:
    • ar: 581963 bytes
    • en: 459654 bytes
    • es: 507534 bytes
    • eu: 485370 bytes
    • hi: 739647 bytes
    • id: 476175 bytes
    • my: 849229 bytes
    • ru: 688117 bytes
    • sw: 468032 bytes
    • te: 745049 bytes
    • zh: 485889 bytes
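  • dataset_size: identical to num_bytes for every configuration; it is the size of the dataset itself, as opposed to the size of the downloaded files.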
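
The relationship between download_size and dataset_size can be made concrete as a ratio per configuration. The sketch below uses only the numbers listed on this page; that the gap comes from compressed storage of the data files is an assumption, not something the page states.

```python
# dataset_size (equal to num_bytes) and download_size per config,
# copied from the lists on this page.
DATASET_SIZE = {
    "ar": 1172623, "en": 822860, "es": 894779, "eu": 896518,
    "hi": 1958974, "id": 913463, "my": 2735802, "ru": 1415824,
    "sw": 879761, "te": 2037222, "zh": 806541,
}
DOWNLOAD_SIZE = {
    "ar": 581963, "en": 459654, "es": 507534, "eu": 485370,
    "hi": 739647, "id": 476175, "my": 849229, "ru": 688117,
    "sw": 468032, "te": 745049, "zh": 485889,
}


def download_ratio(lang: str) -> float:
    """Fraction of the dataset size that actually has to be downloaded."""
    return DOWNLOAD_SIZE[lang] / DATASET_SIZE[lang]


ratios = {lang: download_ratio(lang) for lang in DATASET_SIZE}
most_compact = min(ratios, key=ratios.get)   # smallest download relative to size
least_compact = max(ratios, key=ratios.get)  # largest download relative to size
total_download = sum(DOWNLOAD_SIZE.values())

for lang, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{lang}: {ratio:.2f}")
```

Summing `DOWNLOAD_SIZE` gives the total transfer needed to fetch all eleven configurations, which is useful when budgeting bandwidth for a full multilingual evaluation run.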

Data file paths

  • path: for each configuration, the test split's data files match the pattern [language code]/test-*
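
The path pattern above is a shell-style glob. As a rough illustration of how such a pattern selects files, the sketch below uses Python's standard `fnmatch` module; the file names in the usage example are hypothetical, since the page does not list the actual file names in the repository.

```python
from fnmatch import fnmatch


def split_pattern(lang: str, split: str = "test") -> str:
    """Build the glob pattern described above, e.g. 'en/test-*'."""
    return f"{lang}/{split}-*"


def belongs_to(path: str, lang: str, split: str = "test") -> bool:
    """Return True if a repository-relative path matches the config's pattern."""
    return fnmatch(path, split_pattern(lang, split))


# Hypothetical file names, used only to show how the pattern behaves.
candidates = [
    "en/test-00000-of-00001.parquet",
    "zh/test-00000-of-00001.parquet",
    "en/README.md",
]
english_files = [p for p in candidates if belongs_to(p, "en")]
```

Only the first candidate matches the `en` configuration: the second sits under a different language directory and the third does not start with `test-`.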