
ibragim-bad/hs_multilang

Source: Hugging Face. Updated 2024-02-22. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ibragim-bad/hs_multilang
Resource summary:
---
dataset_info:
- config_name: ar
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: validation
    num_bytes: 15026500
    num_examples: 9176
  download_size: 7468005
  dataset_size: 15026500
- config_name: de
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: validation
    num_bytes: 12344284
    num_examples: 9368
  download_size: 7095322
  dataset_size: 12344284
- config_name: es
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: validation
    num_bytes: 11630674
    num_examples: 9374
  download_size: 6725858
  dataset_size: 11630674
- config_name: fr
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: validation
    num_bytes: 12527721
    num_examples: 9338
  download_size: 7040656
  dataset_size: 12527721
- config_name: he
  features:
  - name: ind
    dtype: int64
  - name: ctx
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: endings
    sequence: string
  - name: activity_label
    dtype: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: validation
    num_bytes: 11346822
    num_examples: 8355
  download_size: 5155175
  dataset_size: 11346822
- config_name: it
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: validation
    num_bytes: 11458511
    num_examples: 9193
  download_size: 6651885
  dataset_size: 11458511
- config_name: ru
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: validation
    num_bytes: 18603749
    num_examples: 9272
  download_size: 9065335
  dataset_size: 18603749
configs:
- config_name: ar
  data_files:
  - split: validation
    path: ar/validation-*
- config_name: de
  data_files:
  - split: validation
    path: de/validation-*
- config_name: es
  data_files:
  - split: validation
    path: es/validation-*
- config_name: fr
  data_files:
  - split: validation
    path: fr/validation-*
- config_name: he
  data_files:
  - split: validation
    path: he/validation-*
- config_name: it
  data_files:
  - split: validation
    path: it/validation-*
- config_name: ru
  data_files:
  - split: validation
    path: ru/validation-*
---
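The metadata above defines seven language configurations, each with a single `validation` split. A minimal sketch of loading one of them with the Hugging Face `datasets` library could look like the following. It assumes the repository is reachable on the Hub under the ID `ibragim-bad/hs_multilang` (this page links to a mirror), and the helper name `load_validation` is hypothetical.

```python
# Language configurations listed in the dataset metadata.
CONFIGS = ["ar", "de", "es", "fr", "he", "it", "ru"]


def load_validation(config_name: str):
    """Load the validation split of one language configuration.

    Assumption: the dataset is available on the Hugging Face Hub under
    the repository ID shown on this page.
    """
    if config_name not in CONFIGS:
        raise ValueError(f"unknown config {config_name!r}; expected one of {CONFIGS}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("ibragim-bad/hs_multilang", config_name, split="validation")
```

Calling `load_validation("ar")` needs network access. If the metadata is accurate, the returned split should contain 9176 rows.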
Provider:
ibragim-bad
Summary of original information

Dataset overview

Configurations

Arabic (ar)

  • Features:
    • ind: int32
    • activity_label: string
    • ctx_a: string
    • ctx_b: string
    • ctx: string
    • endings: sequence of string
    • source_id: string
    • split: string
    • split_type: string
    • label: string
  • Splits:
    • validation:
      • Bytes: 15026500
      • Examples: 9176
  • Download size: 7468005 bytes
  • Dataset size: 15026500 bytes
  • Data files:
    • validation: ar/validation-*
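The feature list describes a multiple-choice completion record: a context (`ctx`), several candidate `endings`, and a string-typed `label`. The sketch below shows one way such a record might be consumed. It assumes that `label` holds the index of the correct ending within `endings`, which this page does not state explicitly. The record and the helper names (`build_candidates`, `gold_index`, `accuracy`) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def build_candidates(record: Dict) -> List[str]:
    """Join the context with each candidate ending."""
    return [f"{record['ctx']} {ending}" for ending in record["endings"]]


def gold_index(record: Dict) -> int:
    """Convert the string-typed `label` to an integer index.

    Assumption: `label` is the position of the correct ending in `endings`.
    """
    return int(record["label"])


def accuracy(records: Sequence[Dict], score: Callable[[str], float]) -> float:
    """Fraction of records whose highest-scoring candidate is the gold ending."""
    correct = 0
    for record in records:
        scores = [score(text) for text in build_candidates(record)]
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += predicted == gold_index(record)
    return correct / len(records)


# Hypothetical record written for illustration only (not taken from the dataset).
example = {
    "ind": 0,
    "activity_label": "Making tea",
    "ctx_a": "A person fills a kettle with water.",
    "ctx_b": "They",
    "ctx": "A person fills a kettle with water. They",
    "endings": [
        "throw the kettle out of the window.",
        "put the kettle on the stove and wait for it to boil.",
    ],
    "label": "1",
}

# Toy scorer that prefers longer text; a real evaluation would plug in a model score.
print(accuracy([example], score=len))  # 1.0
```

The scorer is passed in as a function so that the same loop can be reused with any model. The `len` scorer only demonstrates the control flow.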

German (de)

  • Features:
    • ind: int32
    • activity_label: string
    • ctx_a: string
    • ctx_b: string
    • ctx: string
    • endings: sequence of string
    • source_id: string
    • split: string
    • split_type: string
    • label: string
  • Splits:
    • validation:
      • Bytes: 12344284
      • Examples: 9368
  • Download size: 7095322 bytes
  • Dataset size: 12344284 bytes
  • Data files:
    • validation: de/validation-*

Spanish (es)

  • Features:
    • ind: int32
    • activity_label: string
    • ctx_a: string
    • ctx_b: string
    • ctx: string
    • endings: sequence of string
    • source_id: string
    • split: string
    • split_type: string
    • label: string
  • Splits:
    • validation:
      • Bytes: 11630674
      • Examples: 9374
  • Download size: 6725858 bytes
  • Dataset size: 11630674 bytes
  • Data files:
    • validation: es/validation-*

French (fr)

  • Features:
    • ind: int32
    • activity_label: string
    • ctx_a: string
    • ctx_b: string
    • ctx: string
    • endings: sequence of string
    • source_id: string
    • split: string
    • split_type: string
    • label: string
  • Splits:
    • validation:
      • Bytes: 12527721
      • Examples: 9338
  • Download size: 7040656 bytes
  • Dataset size: 12527721 bytes
  • Data files:
    • validation: fr/validation-*

Hebrew (he)

  • Features:
    • ind: int64
    • ctx: string
    • ctx_a: string
    • ctx_b: string
    • endings: sequence of string
    • activity_label: string
    • source_id: string
    • split: string
    • split_type: string
    • label: string
  • Splits:
    • validation:
      • Bytes: 11346822
      • Examples: 8355
  • Download size: 5155175 bytes
  • Dataset size: 11346822 bytes
  • Data files:
    • validation: he/validation-*
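The `he` configuration is listed with a slightly different schema from the others: `ind` is `int64` rather than `int32`, and the columns appear in a different order. The following sketch compares the two listings so the mismatch is visible before any attempt to combine configurations. The dictionaries are transcribed from this page, and `schema_diff` is a hypothetical helper.

```python
# Column dtypes as listed on this page, in listed order.
AR_SCHEMA = {
    "ind": "int32",
    "activity_label": "string",
    "ctx_a": "string",
    "ctx_b": "string",
    "ctx": "string",
    "endings": "sequence of string",
    "source_id": "string",
    "split": "string",
    "split_type": "string",
    "label": "string",
}
HE_SCHEMA = {
    "ind": "int64",
    "ctx": "string",
    "ctx_a": "string",
    "ctx_b": "string",
    "endings": "sequence of string",
    "activity_label": "string",
    "source_id": "string",
    "split": "string",
    "split_type": "string",
    "label": "string",
}


def schema_diff(a: dict, b: dict) -> dict:
    """Return {column: (dtype_in_a, dtype_in_b)} for columns that differ or are missing."""
    columns = sorted(set(a) | set(b))
    return {c: (a.get(c), b.get(c)) for c in columns if a.get(c) != b.get(c)}


print(schema_diff(AR_SCHEMA, HE_SCHEMA))   # {'ind': ('int32', 'int64')}
print(list(AR_SCHEMA) == list(HE_SCHEMA))  # False: the column order differs
```

When working with the `datasets` library, casting `ind` to a common integer type (for example with `Dataset.cast_column`) before concatenating configurations is one plausible way to reconcile the difference.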

Italian (it)

  • Features:
    • ind: int32
    • activity_label: string
    • ctx_a: string
    • ctx_b: string
    • ctx: string
    • endings: sequence of string
    • source_id: string
    • split: string
    • split_type: string
    • label: string
  • Splits:
    • validation:
      • Bytes: 11458511
      • Examples: 9193
  • Download size: 6651885 bytes
  • Dataset size: 11458511 bytes
  • Data files:
    • validation: it/validation-*

Russian (ru)

  • Features:
    • ind: int32
    • activity_label: string
    • ctx_a: string
    • ctx_b: string
    • ctx: string
    • endings: sequence of string
    • source_id: string
    • split: string
    • split_type: string
    • label: string
  • Splits:
    • validation:
      • Bytes: 18603749
      • Examples: 9272
  • Download size: 9065335 bytes
  • Dataset size: 18603749 bytes
  • Data files:
    • validation: ru/validation-*
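The per-configuration figures can be aggregated to get a sense of the overall size. The sketch below sums the example counts and byte sizes and prints a download-to-dataset size ratio per language. The numbers are copied from the listings on this page; the `STATS` name and tuple layout are choices made for this illustration.

```python
# (num_examples, dataset_size in bytes, download_size in bytes) per configuration.
STATS = {
    "ar": (9176, 15026500, 7468005),
    "de": (9368, 12344284, 7095322),
    "es": (9374, 11630674, 6725858),
    "fr": (9338, 12527721, 7040656),
    "he": (8355, 11346822, 5155175),
    "it": (9193, 11458511, 6651885),
    "ru": (9272, 18603749, 9065335),
}

total_examples = sum(n for n, _, _ in STATS.values())
total_bytes = sum(size for _, size, _ in STATS.values())
total_download = sum(dl for _, _, dl in STATS.values())

print(total_examples)  # 64076
print(total_bytes)     # 92938261
print(total_download)  # 49202236

# Average record size and download/dataset size ratio for each language.
for name, (n, size, dl) in STATS.items():
    print(f"{name}: {size / n:.0f} bytes per example, download/dataset ratio {dl / size:.2f}")
```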