awinml/test-MultiFin

Name: awinml/test-MultiFin
Creator: awinml
Published: 2024-05-01 09:34:59
License: No description available

Hugging Face | Updated: 2024-05-01 | Indexed: 2024-06-12

Download link:

https://hf-mirror.com/datasets/awinml/test-MultiFin

Resource description:

---
dataset_info:
- config_name: da_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 88789.7435458787
    num_examples: 891
  - name: validation
    num_bytes: 22445.89303482587
    num_examples: 223
  - name: test
    num_bytes: 27936.783582089553
    num_examples: 279
  download_size: 71192
  dataset_size: 139172.42016279412
- config_name: el_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 17040.455832037325
    num_examples: 171
  - name: validation
    num_bytes: 4328.13184079602
    num_examples: 43
  - name: test
    num_bytes: 5407.119402985075
    num_examples: 54
  download_size: 22653
  dataset_size: 26775.70707581842
- config_name: en_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 174091.67449455676
    num_examples: 1747
  - name: validation
    num_bytes: 43985.89800995025
    num_examples: 437
  - name: test
    num_bytes: 54671.985074626864
    num_examples: 546
  download_size: 110216
  dataset_size: 272749.5575791339
- config_name: en_lowlevel
  features:
  - name: text
    dtype: string
  - name: labels
    sequence: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 172316.0
    num_examples: 1747
  - name: validation
    num_bytes: 43479.0
    num_examples: 437
  download_size: 90710
  dataset_size: 215795.0
- config_name: es_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 73144.41275272162
    num_examples: 734
  - name: validation
    num_bytes: 18520.378109452737
    num_examples: 184
  - name: test
    num_bytes: 23030.32338308458
    num_examples: 230
  download_size: 56032
  dataset_size: 114695.11424525894
- config_name: es_lowlevel
  features:
  - name: text
    dtype: string
  - name: labels
    sequence: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 0.0
    num_examples: 0
  - name: validation
    num_bytes: 0.0
    num_examples: 0
  download_size: 2132
  dataset_size: 0.0
- config_name: fi_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 16143.589735614309
    num_examples: 162
  - name: validation
    num_bytes: 4026.169154228856
    num_examples: 40
  - name: test
    num_bytes: 5106.723880597015
    num_examples: 51
  download_size: 13810
  dataset_size: 25276.48277044018
- config_name: he_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 14848.116485225506
    num_examples: 149
  - name: validation
    num_bytes: 3724.2064676616915
    num_examples: 37
  - name: test
    num_bytes: 4706.1965174129355
    num_examples: 47
  download_size: 19087
  dataset_size: 23278.519470300133
- config_name: hu_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 13452.991446345257
    num_examples: 135
  - name: validation
    num_bytes: 3422.2437810945275
    num_examples: 34
  - name: test
    num_bytes: 4205.537313432836
    num_examples: 42
  download_size: 20042
  dataset_size: 21080.772540872622
- config_name: is_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 3786.767962674961
    num_examples: 38
  - name: validation
    num_bytes: 905.8880597014926
    num_examples: 9
  - name: test
    num_bytes: 1101.4502487562188
    num_examples: 11
  download_size: 7523
  dataset_size: 5794.106271132672
- config_name: it_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 5381.196578538103
    num_examples: 54
  - name: validation
    num_bytes: 1409.1592039800994
    num_examples: 14
  - name: test
    num_bytes: 1702.2412935323382
    num_examples: 17
  download_size: 10079
  dataset_size: 8492.59707605054
- config_name: ja_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 14648.812908242613
    num_examples: 147
  - name: validation
    num_bytes: 3724.2064676616915
    num_examples: 37
  - name: test
    num_bytes: 4606.064676616916
    num_examples: 46
  download_size: 24222
  dataset_size: 22979.08405252122
- config_name: no_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 10064.830637636082
    num_examples: 101
  - name: validation
    num_bytes: 2516.355721393035
    num_examples: 25
  - name: test
    num_bytes: 3104.087064676617
    num_examples: 31
  download_size: 13022
  dataset_size: 15685.273423705734
- config_name: pl_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 54310.22472783826
    num_examples: 545
  - name: validation
    num_bytes: 13688.97512437811
    num_examples: 136
  - name: test
    num_bytes: 17022.412935323384
    num_examples: 170
  download_size: 55564
  dataset_size: 85021.61278753975
- config_name: ru_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 9466.919906687403
    num_examples: 95
  - name: validation
    num_bytes: 2415.7014925373132
    num_examples: 24
  - name: test
    num_bytes: 3003.955223880597
    num_examples: 30
  download_size: 24800
  dataset_size: 14886.576623105313
- config_name: sv_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 2491.2947122861588
    num_examples: 25
  - name: validation
    num_bytes: 603.9253731343283
    num_examples: 6
  - name: test
    num_bytes: 700.9228855721393
    num_examples: 7
  download_size: 8301
  dataset_size: 3796.1429709926265
- config_name: tr_highlevel
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 143099.96827371695
    num_examples: 1436
  - name: validation
    num_bytes: 36134.86815920398
    num_examples: 359
  - name: test
    num_bytes: 44959.19651741294
    num_examples: 449
  download_size: 104477
  dataset_size: 224194.03295033387
configs:
- config_name: da_highlevel
  data_files:
  - split: train
    path: da_highlevel/train-*
  - split: validation
    path: da_highlevel/validation-*
  - split: test
    path: da_highlevel/test-*
- config_name: el_highlevel
  data_files:
  - split: train
    path: el_highlevel/train-*
  - split: validation
    path: el_highlevel/validation-*
  - split: test
    path: el_highlevel/test-*
- config_name: en_highlevel
  data_files:
  - split: train
    path: en_highlevel/train-*
  - split: validation
    path: en_highlevel/validation-*
  - split: test
    path: en_highlevel/test-*
- config_name: en_lowlevel
  data_files:
  - split: train
    path: en_lowlevel/train-*
  - split: validation
    path: en_lowlevel/validation-*
- config_name: es_highlevel
  data_files:
  - split: train
    path: es_highlevel/train-*
  - split: validation
    path: es_highlevel/validation-*
  - split: test
    path: es_highlevel/test-*
- config_name: es_lowlevel
  data_files:
  - split: train
    path: es_lowlevel/train-*
  - split: validation
    path: es_lowlevel/validation-*
- config_name: fi_highlevel
  data_files:
  - split: train
    path: fi_highlevel/train-*
  - split: validation
    path: fi_highlevel/validation-*
  - split: test
    path: fi_highlevel/test-*
- config_name: he_highlevel
  data_files:
  - split: train
    path: he_highlevel/train-*
  - split: validation
    path: he_highlevel/validation-*
  - split: test
    path: he_highlevel/test-*
- config_name: hu_highlevel
  data_files:
  - split: train
    path: hu_highlevel/train-*
  - split: validation
    path: hu_highlevel/validation-*
  - split: test
    path: hu_highlevel/test-*
- config_name: is_highlevel
  data_files:
  - split: train
    path: is_highlevel/train-*
  - split: validation
    path: is_highlevel/validation-*
  - split: test
    path: is_highlevel/test-*
- config_name: it_highlevel
  data_files:
  - split: train
    path: it_highlevel/train-*
  - split: validation
    path: it_highlevel/validation-*
  - split: test
    path: it_highlevel/test-*
- config_name: ja_highlevel
  data_files:
  - split: train
    path: ja_highlevel/train-*
  - split: validation
    path: ja_highlevel/validation-*
  - split: test
    path: ja_highlevel/test-*
- config_name: no_highlevel
  data_files:
  - split: train
    path: no_highlevel/train-*
  - split: validation
    path: no_highlevel/validation-*
  - split: test
    path: no_highlevel/test-*
- config_name: pl_highlevel
  data_files:
  - split: train
    path: pl_highlevel/train-*
  - split: validation
    path: pl_highlevel/validation-*
  - split: test
    path: pl_highlevel/test-*
- config_name: ru_highlevel
  data_files:
  - split: train
    path: ru_highlevel/train-*
  - split: validation
    path: ru_highlevel/validation-*
  - split: test
    path: ru_highlevel/test-*
- config_name: sv_highlevel
  data_files:
  - split: train
    path: sv_highlevel/train-*
  - split: validation
    path: sv_highlevel/validation-*
  - split: test
    path: sv_highlevel/test-*
- config_name: tr_highlevel
  data_files:
  - split: train
    path: tr_highlevel/train-*
  - split: validation
    path: tr_highlevel/validation-*
  - split: test
    path: tr_highlevel/test-*
---

Provider:

awinml

Summary of original information

Dataset overview

1. Dataset configuration information

da_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 891 examples, 88789.7435458787 bytes
  - validation: 223 examples, 22445.89303482587 bytes
  - test: 279 examples, 27936.783582089553 bytes
- Download size: 71192 bytes
- Dataset size: 139172.42016279412 bytes
el_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 171 examples, 17040.455832037325 bytes
  - validation: 43 examples, 4328.13184079602 bytes
  - test: 54 examples, 5407.119402985075 bytes
- Download size: 22653 bytes
- Dataset size: 26775.70707581842 bytes
en_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 1747 examples, 174091.67449455676 bytes
  - validation: 437 examples, 43985.89800995025 bytes
  - test: 546 examples, 54671.985074626864 bytes
- Download size: 110216 bytes
- Dataset size: 272749.5575791339 bytes
en_lowlevel
- Features:
  - text: string
  - labels: sequence of strings
  - id: string
- Splits:
  - train: 1747 examples, 172316.0 bytes
  - validation: 437 examples, 43479.0 bytes
- Download size: 90710 bytes
- Dataset size: 215795.0 bytes
es_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 734 examples, 73144.41275272162 bytes
  - validation: 184 examples, 18520.378109452737 bytes
  - test: 230 examples, 23030.32338308458 bytes
- Download size: 56032 bytes
- Dataset size: 114695.11424525894 bytes
es_lowlevel
- Features:
  - text: string
  - labels: sequence of strings
  - id: string
- Splits:
  - train: 0 examples, 0.0 bytes
  - validation: 0 examples, 0.0 bytes
- Download size: 2132 bytes
- Dataset size: 0.0 bytes
fi_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 162 examples, 16143.589735614309 bytes
  - validation: 40 examples, 4026.169154228856 bytes
  - test: 51 examples, 5106.723880597015 bytes
- Download size: 13810 bytes
- Dataset size: 25276.48277044018 bytes
he_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 149 examples, 14848.116485225506 bytes
  - validation: 37 examples, 3724.2064676616915 bytes
  - test: 47 examples, 4706.1965174129355 bytes
- Download size: 19087 bytes
- Dataset size: 23278.519470300133 bytes
hu_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 135 examples, 13452.991446345257 bytes
  - validation: 34 examples, 3422.2437810945275 bytes
  - test: 42 examples, 4205.537313432836 bytes
- Download size: 20042 bytes
- Dataset size: 21080.772540872622 bytes
is_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 38 examples, 3786.767962674961 bytes
  - validation: 9 examples, 905.8880597014926 bytes
  - test: 11 examples, 1101.4502487562188 bytes
- Download size: 7523 bytes
- Dataset size: 5794.106271132672 bytes
it_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 54 examples, 5381.196578538103 bytes
  - validation: 14 examples, 1409.1592039800994 bytes
  - test: 17 examples, 1702.2412935323382 bytes
- Download size: 10079 bytes
- Dataset size: 8492.59707605054 bytes
ja_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 147 examples, 14648.812908242613 bytes
  - validation: 37 examples, 3724.2064676616915 bytes
  - test: 46 examples, 4606.064676616916 bytes
- Download size: 24222 bytes
- Dataset size: 22979.08405252122 bytes
no_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 101 examples, 10064.830637636082 bytes
  - validation: 25 examples, 2516.355721393035 bytes
  - test: 31 examples, 3104.087064676617 bytes
- Download size: 13022 bytes
- Dataset size: 15685.273423705734 bytes
pl_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 545 examples, 54310.22472783826 bytes
  - validation: 136 examples, 13688.97512437811 bytes
  - test: 170 examples, 17022.412935323384 bytes
- Download size: 55564 bytes
- Dataset size: 85021.61278753975 bytes
ru_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 95 examples, 9466.919906687403 bytes
  - validation: 24 examples, 2415.7014925373132 bytes
  - test: 30 examples, 3003.955223880597 bytes
- Download size: 24800 bytes
- Dataset size: 14886.576623105313 bytes
sv_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 25 examples, 2491.2947122861588 bytes
  - validation: 6 examples, 603.9253731343283 bytes
  - test: 7 examples, 700.9228855721393 bytes
- Download size: 8301 bytes
- Dataset size: 3796.1429709926265 bytes
tr_highlevel
- Features:
  - text: string
  - label: string
  - id: string
- Splits:
  - train: 1436 examples, 143099.96827371695 bytes
  - validation: 359 examples, 36134.86815920398 bytes
  - test: 449 examples, 44959.19651741294 bytes
- Download size: 104477 bytes
- Dataset size: 224194.03295033387 bytes
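
The per-split example counts listed for each configuration can be turned into totals and split proportions with a few lines of Python. The sketch below is illustrative only: it copies the counts for four of the configurations, and the names `SPLIT_COUNTS` and `split_shares` are hypothetical helpers, not part of the dataset or of any library.

```python
# Example counts copied from the configuration list (a subset, for brevity).
SPLIT_COUNTS = {
    "da_highlevel": {"train": 891, "validation": 223, "test": 279},
    "en_highlevel": {"train": 1747, "validation": 437, "test": 546},
    "tr_highlevel": {"train": 1436, "validation": 359, "test": 449},
    "en_lowlevel": {"train": 1747, "validation": 437},  # no test split listed
}


def split_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each split's fraction of the configuration's total examples."""
    total = sum(counts.values())
    return {split: n / total for split, n in counts.items()}


for config, counts in SPLIT_COUNTS.items():
    total = sum(counts.values())
    shares = ", ".join(
        f"{split}={share:.1%}" for split, share in split_shares(counts).items()
    )
    print(f"{config}: {total} examples ({shares})")
```

For the three `*_highlevel` configurations included here, the shares work out to roughly 64% train, 16% validation, and 20% test.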

2. Dataset file paths

The data files for each configuration follow this path pattern:
- config_name/split-*
- For example: da_highlevel/train-*
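
As a rough illustration of the path pattern, the sketch below builds the glob pattern for every split of a given configuration. It assumes, based on the metadata listed on this page, that `*_highlevel` configurations have train, validation, and test splits while `*_lowlevel` configurations have only train and validation. The helper `data_files_for` and the shard file name used in the check are hypothetical.

```python
from fnmatch import fnmatch

# Splits per configuration family, as listed in the metadata on this page.
SPLITS_BY_LEVEL = {
    "highlevel": ("train", "validation", "test"),
    "lowlevel": ("train", "validation"),
}


def data_files_for(config_name: str) -> dict[str, str]:
    """Map each split of a configuration to its config_name/split-* pattern."""
    level = config_name.rsplit("_", 1)[-1]
    if level not in SPLITS_BY_LEVEL:
        raise ValueError(f"unrecognized configuration name: {config_name!r}")
    return {split: f"{config_name}/{split}-*" for split in SPLITS_BY_LEVEL[level]}


patterns = data_files_for("da_highlevel")
for split, pattern in patterns.items():
    print(split, pattern)

# Hypothetical shard name, used only to show how the glob pattern matches.
example_file = "da_highlevel/train-00000-of-00001.parquet"
print(fnmatch(example_file, patterns["train"]))
```

In practice these patterns are normally resolved for you: with the Hugging Face `datasets` library, a call such as `load_dataset("awinml/test-MultiFin", "da_highlevel")` should select the configuration by name, provided the repository is reachable.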
