
awinml/test-MultiFin

Source: Hugging Face. Updated 2024-05-01. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/awinml/test-MultiFin
Resource description:
---
dataset_info:
- config_name: da_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 88789.7435458787
    num_examples: 891
  - name: validation
    num_bytes: 22445.89303482587
    num_examples: 223
  - name: test
    num_bytes: 27936.783582089553
    num_examples: 279
  download_size: 71192
  dataset_size: 139172.42016279412
- config_name: el_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 17040.455832037325
    num_examples: 171
  - name: validation
    num_bytes: 4328.13184079602
    num_examples: 43
  - name: test
    num_bytes: 5407.119402985075
    num_examples: 54
  download_size: 22653
  dataset_size: 26775.70707581842
- config_name: en_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 174091.67449455676
    num_examples: 1747
  - name: validation
    num_bytes: 43985.89800995025
    num_examples: 437
  - name: test
    num_bytes: 54671.985074626864
    num_examples: 546
  download_size: 110216
  dataset_size: 272749.5575791339
- config_name: en_lowlevel
  features:
  - name: text
    dtype: string
  - name: labels
    sequence: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 172316.0
    num_examples: 1747
  - name: validation
    num_bytes: 43479.0
    num_examples: 437
  download_size: 90710
  dataset_size: 215795.0
- config_name: es_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 73144.41275272162
    num_examples: 734
  - name: validation
    num_bytes: 18520.378109452737
    num_examples: 184
  - name: test
    num_bytes: 23030.32338308458
    num_examples: 230
  download_size: 56032
  dataset_size: 114695.11424525894
- config_name: es_lowlevel
  features:
  - name: text
    dtype: string
  - name: labels
    sequence: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 0.0
    num_examples: 0
  - name: validation
    num_bytes: 0.0
    num_examples: 0
  download_size: 2132
  dataset_size: 0.0
- config_name: fi_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 16143.589735614309
    num_examples: 162
  - name: validation
    num_bytes: 4026.169154228856
    num_examples: 40
  - name: test
    num_bytes: 5106.723880597015
    num_examples: 51
  download_size: 13810
  dataset_size: 25276.48277044018
- config_name: he_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 14848.116485225506
    num_examples: 149
  - name: validation
    num_bytes: 3724.2064676616915
    num_examples: 37
  - name: test
    num_bytes: 4706.1965174129355
    num_examples: 47
  download_size: 19087
  dataset_size: 23278.519470300133
- config_name: hu_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 13452.991446345257
    num_examples: 135
  - name: validation
    num_bytes: 3422.2437810945275
    num_examples: 34
  - name: test
    num_bytes: 4205.537313432836
    num_examples: 42
  download_size: 20042
  dataset_size: 21080.772540872622
- config_name: is_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 3786.767962674961
    num_examples: 38
  - name: validation
    num_bytes: 905.8880597014926
    num_examples: 9
  - name: test
    num_bytes: 1101.4502487562188
    num_examples: 11
  download_size: 7523
  dataset_size: 5794.106271132672
- config_name: it_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 5381.196578538103
    num_examples: 54
  - name: validation
    num_bytes: 1409.1592039800994
    num_examples: 14
  - name: test
    num_bytes: 1702.2412935323382
    num_examples: 17
  download_size: 10079
  dataset_size: 8492.59707605054
- config_name: ja_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 14648.812908242613
    num_examples: 147
  - name: validation
    num_bytes: 3724.2064676616915
    num_examples: 37
  - name: test
    num_bytes: 4606.064676616916
    num_examples: 46
  download_size: 24222
  dataset_size: 22979.08405252122
- config_name: no_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 10064.830637636082
    num_examples: 101
  - name: validation
    num_bytes: 2516.355721393035
    num_examples: 25
  - name: test
    num_bytes: 3104.087064676617
    num_examples: 31
  download_size: 13022
  dataset_size: 15685.273423705734
- config_name: pl_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 54310.22472783826
    num_examples: 545
  - name: validation
    num_bytes: 13688.97512437811
    num_examples: 136
  - name: test
    num_bytes: 17022.412935323384
    num_examples: 170
  download_size: 55564
  dataset_size: 85021.61278753975
- config_name: ru_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 9466.919906687403
    num_examples: 95
  - name: validation
    num_bytes: 2415.7014925373132
    num_examples: 24
  - name: test
    num_bytes: 3003.955223880597
    num_examples: 30
  download_size: 24800
  dataset_size: 14886.576623105313
- config_name: sv_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 2491.2947122861588
    num_examples: 25
  - name: validation
    num_bytes: 603.9253731343283
    num_examples: 6
  - name: test
    num_bytes: 700.9228855721393
    num_examples: 7
  download_size: 8301
  dataset_size: 3796.1429709926265
- config_name: tr_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 143099.96827371695
    num_examples: 1436
  - name: validation
    num_bytes: 36134.86815920398
    num_examples: 359
  - name: test
    num_bytes: 44959.19651741294
    num_examples: 449
  download_size: 104477
  dataset_size: 224194.03295033387
configs:
- config_name: da_highlevel
  data_files:
  - split: train
    path: da_highlevel/train-*
  - split: validation
    path: da_highlevel/validation-*
  - split: test
    path: da_highlevel/test-*
- config_name: el_highlevel
  data_files:
  - split: train
    path: el_highlevel/train-*
  - split: validation
    path: el_highlevel/validation-*
  - split: test
    path: el_highlevel/test-*
- config_name: en_highlevel
  data_files:
  - split: train
    path: en_highlevel/train-*
  - split: validation
    path: en_highlevel/validation-*
  - split: test
    path: en_highlevel/test-*
- config_name: en_lowlevel
  data_files:
  - split: train
    path: en_lowlevel/train-*
  - split: validation
    path: en_lowlevel/validation-*
- config_name: es_highlevel
  data_files:
  - split: train
    path: es_highlevel/train-*
  - split: validation
    path: es_highlevel/validation-*
  - split: test
    path: es_highlevel/test-*
- config_name: es_lowlevel
  data_files:
  - split: train
    path: es_lowlevel/train-*
  - split: validation
    path: es_lowlevel/validation-*
- config_name: fi_highlevel
  data_files:
  - split: train
    path: fi_highlevel/train-*
  - split: validation
    path: fi_highlevel/validation-*
  - split: test
    path: fi_highlevel/test-*
- config_name: he_highlevel
  data_files:
  - split: train
    path: he_highlevel/train-*
  - split: validation
    path: he_highlevel/validation-*
  - split: test
    path: he_highlevel/test-*
- config_name: hu_highlevel
  data_files:
  - split: train
    path: hu_highlevel/train-*
  - split: validation
    path: hu_highlevel/validation-*
  - split: test
    path: hu_highlevel/test-*
- config_name: is_highlevel
  data_files:
  - split: train
    path: is_highlevel/train-*
  - split: validation
    path: is_highlevel/validation-*
  - split: test
    path: is_highlevel/test-*
- config_name: it_highlevel
  data_files:
  - split: train
    path: it_highlevel/train-*
  - split: validation
    path: it_highlevel/validation-*
  - split: test
    path: it_highlevel/test-*
- config_name: ja_highlevel
  data_files:
  - split: train
    path: ja_highlevel/train-*
  - split: validation
    path: ja_highlevel/validation-*
  - split: test
    path: ja_highlevel/test-*
- config_name: no_highlevel
  data_files:
  - split: train
    path: no_highlevel/train-*
  - split: validation
    path: no_highlevel/validation-*
  - split: test
    path: no_highlevel/test-*
- config_name: pl_highlevel
  data_files:
  - split: train
    path: pl_highlevel/train-*
  - split: validation
    path: pl_highlevel/validation-*
  - split: test
    path: pl_highlevel/test-*
- config_name: ru_highlevel
  data_files:
  - split: train
    path: ru_highlevel/train-*
  - split: validation
    path: ru_highlevel/validation-*
  - split: test
    path: ru_highlevel/test-*
- config_name: sv_highlevel
  data_files:
  - split: train
    path: sv_highlevel/train-*
  - split: validation
    path: sv_highlevel/validation-*
  - split: test
    path: sv_highlevel/test-*
- config_name: tr_highlevel
  data_files:
  - split: train
    path: tr_highlevel/train-*
  - split: validation
    path: tr_highlevel/validation-*
  - split: test
    path: tr_highlevel/test-*
---
Provider:
awinml
Summary of original information

Dataset overview

1. Dataset configurations

  • da_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 891 examples, 88789.7435458787 bytes
      • validation: 223 examples, 22445.89303482587 bytes
      • test: 279 examples, 27936.783582089553 bytes
    • Download size: 71192 bytes
    • Dataset size: 139172.42016279412 bytes
  • el_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 171 examples, 17040.455832037325 bytes
      • validation: 43 examples, 4328.13184079602 bytes
      • test: 54 examples, 5407.119402985075 bytes
    • Download size: 22653 bytes
    • Dataset size: 26775.70707581842 bytes
  • en_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 1747 examples, 174091.67449455676 bytes
      • validation: 437 examples, 43985.89800995025 bytes
      • test: 546 examples, 54671.985074626864 bytes
    • Download size: 110216 bytes
    • Dataset size: 272749.5575791339 bytes
  • en_lowlevel

    • Features:
      • text: string
      • labels: sequence of strings
      • id: string
    • Splits:
      • train: 1747 examples, 172316.0 bytes
      • validation: 437 examples, 43479.0 bytes
    • Download size: 90710 bytes
    • Dataset size: 215795.0 bytes
  • es_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 734 examples, 73144.41275272162 bytes
      • validation: 184 examples, 18520.378109452737 bytes
      • test: 230 examples, 23030.32338308458 bytes
    • Download size: 56032 bytes
    • Dataset size: 114695.11424525894 bytes
  • es_lowlevel

    • Features:
      • text: string
      • labels: sequence of strings
      • id: string
    • Splits:
      • train: 0 examples, 0.0 bytes
      • validation: 0 examples, 0.0 bytes
    • Download size: 2132 bytes
    • Dataset size: 0.0 bytes
  • fi_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 162 examples, 16143.589735614309 bytes
      • validation: 40 examples, 4026.169154228856 bytes
      • test: 51 examples, 5106.723880597015 bytes
    • Download size: 13810 bytes
    • Dataset size: 25276.48277044018 bytes
  • he_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 149 examples, 14848.116485225506 bytes
      • validation: 37 examples, 3724.2064676616915 bytes
      • test: 47 examples, 4706.1965174129355 bytes
    • Download size: 19087 bytes
    • Dataset size: 23278.519470300133 bytes
  • hu_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 135 examples, 13452.991446345257 bytes
      • validation: 34 examples, 3422.2437810945275 bytes
      • test: 42 examples, 4205.537313432836 bytes
    • Download size: 20042 bytes
    • Dataset size: 21080.772540872622 bytes
  • is_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 38 examples, 3786.767962674961 bytes
      • validation: 9 examples, 905.8880597014926 bytes
      • test: 11 examples, 1101.4502487562188 bytes
    • Download size: 7523 bytes
    • Dataset size: 5794.106271132672 bytes
  • it_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 54 examples, 5381.196578538103 bytes
      • validation: 14 examples, 1409.1592039800994 bytes
      • test: 17 examples, 1702.2412935323382 bytes
    • Download size: 10079 bytes
    • Dataset size: 8492.59707605054 bytes
  • ja_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 147 examples, 14648.812908242613 bytes
      • validation: 37 examples, 3724.2064676616915 bytes
      • test: 46 examples, 4606.064676616916 bytes
    • Download size: 24222 bytes
    • Dataset size: 22979.08405252122 bytes
  • no_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 101 examples, 10064.830637636082 bytes
      • validation: 25 examples, 2516.355721393035 bytes
      • test: 31 examples, 3104.087064676617 bytes
    • Download size: 13022 bytes
    • Dataset size: 15685.273423705734 bytes
  • pl_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 545 examples, 54310.22472783826 bytes
      • validation: 136 examples, 13688.97512437811 bytes
      • test: 170 examples, 17022.412935323384 bytes
    • Download size: 55564 bytes
    • Dataset size: 85021.61278753975 bytes
  • ru_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 95 examples, 9466.919906687403 bytes
      • validation: 24 examples, 2415.7014925373132 bytes
      • test: 30 examples, 3003.955223880597 bytes
    • Download size: 24800 bytes
    • Dataset size: 14886.576623105313 bytes
  • sv_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 25 examples, 2491.2947122861588 bytes
      • validation: 6 examples, 603.9253731343283 bytes
      • test: 7 examples, 700.9228855721393 bytes
    • Download size: 8301 bytes
    • Dataset size: 3796.1429709926265 bytes
  • tr_highlevel

    • Features:
      • text: string
      • label: string
      • id: string
    • Splits:
      • train: 1436 examples, 143099.96827371695 bytes
      • validation: 359 examples, 36134.86815920398 bytes
      • test: 449 examples, 44959.19651741294 bytes
    • Download size: 104477 bytes
    • Dataset size: 224194.03295033387 bytes
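
The per-split example counts listed for each configuration can be tabulated to compare configuration sizes and split proportions. The following is a minimal sketch; the counts are copied from the list above, while the names `SPLIT_COUNTS`, `total_examples`, and `split_fractions` are hypothetical helpers introduced only for illustration.

```python
from typing import Dict

# Example counts per split, copied from the configuration list above.
SPLIT_COUNTS: Dict[str, Dict[str, int]] = {
    "da_highlevel": {"train": 891, "validation": 223, "test": 279},
    "el_highlevel": {"train": 171, "validation": 43, "test": 54},
    "en_highlevel": {"train": 1747, "validation": 437, "test": 546},
    "en_lowlevel": {"train": 1747, "validation": 437},
    "es_highlevel": {"train": 734, "validation": 184, "test": 230},
    "es_lowlevel": {"train": 0, "validation": 0},
    "fi_highlevel": {"train": 162, "validation": 40, "test": 51},
    "he_highlevel": {"train": 149, "validation": 37, "test": 47},
    "hu_highlevel": {"train": 135, "validation": 34, "test": 42},
    "is_highlevel": {"train": 38, "validation": 9, "test": 11},
    "it_highlevel": {"train": 54, "validation": 14, "test": 17},
    "ja_highlevel": {"train": 147, "validation": 37, "test": 46},
    "no_highlevel": {"train": 101, "validation": 25, "test": 31},
    "pl_highlevel": {"train": 545, "validation": 136, "test": 170},
    "ru_highlevel": {"train": 95, "validation": 24, "test": 30},
    "sv_highlevel": {"train": 25, "validation": 6, "test": 7},
    "tr_highlevel": {"train": 1436, "validation": 359, "test": 449},
}


def total_examples(config: str) -> int:
    """Sum the example counts over all splits of one configuration."""
    return sum(SPLIT_COUNTS[config].values())


def split_fractions(config: str) -> Dict[str, float]:
    """Return each split's share of the configuration's examples.

    An empty configuration (such as es_lowlevel) yields 0.0 for every
    split instead of dividing by zero.
    """
    counts = SPLIT_COUNTS[config]
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    for config in SPLIT_COUNTS:
        shares = ", ".join(
            f"{name}={frac:.2f}" for name, frac in split_fractions(config).items()
        )
        print(f"{config}: {total_examples(config)} examples ({shares})")
```

Note that es_lowlevel is listed with zero examples in both splits, so any code iterating over the configurations should tolerate an empty one.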

2. Dataset file paths

  • The data files for each configuration follow this path pattern:
    • config_name/split-*
    • For example: da_highlevel/train-*
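
The path pattern above can be illustrated with a short sketch that builds the glob for a configuration and split and checks a file name against it. The helper names `data_file_pattern` and `load_split` and the shard file name are hypothetical and used only for illustration. `load_split` assumes the third-party `datasets` library is installed and that the `awinml/test-MultiFin` repository is reachable from the configured hub endpoint; it is defined but not called here.

```python
from fnmatch import fnmatch


def data_file_pattern(config_name: str, split: str) -> str:
    """Build the glob pattern config_name/split-* described above."""
    return f"{config_name}/{split}-*"


def load_split(config_name: str, split: str):
    """Load one split of one configuration (requires network access).

    The import is local so the rest of this sketch runs without the
    third-party `datasets` package installed.
    """
    from datasets import load_dataset

    return load_dataset("awinml/test-MultiFin", config_name, split=split)


pattern = data_file_pattern("da_highlevel", "train")

# Hypothetical shard name, chosen only to show how the pattern matches.
example_file = "da_highlevel/train-00000-of-00001.parquet"
matches = fnmatch(example_file, pattern)

if __name__ == "__main__":
    print(pattern, matches)
```

With the library available, a call such as `load_split("da_highlevel", "train")` would be expected to return the training split, whose rows carry the `text`, `label`, and `id` fields listed for that configuration.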