qmeeus/MSNER-nlp
Source: Hugging Face. Updated 2024-03-28; indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/qmeeus/MSNER-nlp
Resource description:
---
dataset_info:
- config_name: de
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 41616289
    num_examples: 108473
  - name: validation
    num_bytes: 791188
    num_examples: 2109
  - name: test
    num_bytes: 747121
    num_examples: 1966
  download_size: 10480059
  dataset_size: 43154598
- config_name: en
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 2204014
    num_examples: 5000
  - name: validation
    num_bytes: 735967
    num_examples: 1753
  - name: test
    num_bytes: 742319
    num_examples: 1842
  download_size: 745400
  dataset_size: 3682300
- config_name: es
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 25555845
    num_examples: 50922
  - name: validation
    num_bytes: 829913
    num_examples: 1631
  - name: test
    num_bytes: 810712
    num_examples: 1512
  download_size: 5770971
  dataset_size: 27196470
- config_name: fr
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 37492920
    num_examples: 73561
  - name: validation
    num_bytes: 895731
    num_examples: 1727
  - name: test
    num_bytes: 816506
    num_examples: 1656
  download_size: 8204258
  dataset_size: 39205157
- config_name: nl
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 7597460
    num_examples: 20968
  - name: validation
    num_bytes: 453646
    num_examples: 1230
  - name: test
    num_bytes: 434877
    num_examples: 1120
  download_size: 1947747
  dataset_size: 8485983
configs:
- config_name: de
  data_files:
  - split: train
    path: de/train-*
  - split: validation
    path: de/validation-*
  - split: test
    path: de/test-*
- config_name: en
  data_files:
  - split: train
    path: en/train-*
  - split: validation
    path: en/validation-*
  - split: test
    path: en/test-*
- config_name: es
  data_files:
  - split: train
    path: es/train-*
  - split: validation
    path: es/validation-*
  - split: test
    path: es/test-*
- config_name: fr
  data_files:
  - split: train
    path: fr/train-*
  - split: validation
    path: fr/validation-*
  - split: test
    path: fr/test-*
- config_name: nl
  data_files:
  - split: train
    path: nl/train-*
  - split: validation
    path: nl/validation-*
  - split: test
    path: nl/test-*
task_categories:
- token-classification
language:
- de
- fr
- nl
- es
- en
---
# Dataset Card for "MSNER-nlp"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
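
As a minimal sketch of how the configurations declared in the metadata might be loaded, the helper below wraps `load_dataset` from the Hugging Face `datasets` library. The helper name `load_msner` and the two constants are illustrative and not part of the dataset. The sketch assumes `datasets` is installed and the Hub is reachable at call time.

```python
# Configuration and split names as listed in the card's metadata.
CONFIGS = ("de", "en", "es", "fr", "nl")
SPLITS = ("train", "validation", "test")


def load_msner(config: str, split: str):
    """Load one split of one language configuration (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the argument checks work without the library installed.
    from datasets import load_dataset

    # The second positional argument selects the language configuration.
    return load_dataset("qmeeus/MSNER-nlp", config, split=split)
```

A call such as `load_msner("nl", "validation")` should return a dataset whose rows carry the `tokens` and `tags` columns declared in the metadata.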
Provider:
qmeeus
Summary of original information
Dataset overview
Dataset configurations
- config_name: de
  - features:
    - tokens: sequence (string)
    - tags: sequence (string)
  - splits:
    - train: 108473 examples, 41616289 bytes
    - validation: 2109 examples, 791188 bytes
    - test: 1966 examples, 747121 bytes
  - download_size: 10480059 bytes
  - dataset_size: 43154598 bytes
- config_name: en
  - features:
    - tokens: sequence (string)
    - tags: sequence (string)
  - splits:
    - train: 5000 examples, 2204014 bytes
    - validation: 1753 examples, 735967 bytes
    - test: 1842 examples, 742319 bytes
  - download_size: 745400 bytes
  - dataset_size: 3682300 bytes
- config_name: es
  - features:
    - tokens: sequence (string)
    - tags: sequence (string)
  - splits:
    - train: 50922 examples, 25555845 bytes
    - validation: 1631 examples, 829913 bytes
    - test: 1512 examples, 810712 bytes
  - download_size: 5770971 bytes
  - dataset_size: 27196470 bytes
- config_name: fr
  - features:
    - tokens: sequence (string)
    - tags: sequence (string)
  - splits:
    - train: 73561 examples, 37492920 bytes
    - validation: 1727 examples, 895731 bytes
    - test: 1656 examples, 816506 bytes
  - download_size: 8204258 bytes
  - dataset_size: 39205157 bytes
- config_name: nl
  - features:
    - tokens: sequence (string)
    - tags: sequence (string)
  - splits:
    - train: 20968 examples, 7597460 bytes
    - validation: 1230 examples, 453646 bytes
    - test: 1120 examples, 434877 bytes
  - download_size: 1947747 bytes
  - dataset_size: 8485983 bytes
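
The split statistics above can be cross-checked with a little arithmetic. The sketch below copies the example and byte counts from the metadata and confirms that each reported `dataset_size` equals the sum of its split sizes. The names `SPLIT_STATS`, `DATASET_SIZE`, and `summarize` are illustrative only.

```python
# (num_examples, num_bytes) per split, copied from the card's metadata.
SPLIT_STATS = {
    "de": {"train": (108473, 41616289), "validation": (2109, 791188), "test": (1966, 747121)},
    "en": {"train": (5000, 2204014), "validation": (1753, 735967), "test": (1842, 742319)},
    "es": {"train": (50922, 25555845), "validation": (1631, 829913), "test": (1512, 810712)},
    "fr": {"train": (73561, 37492920), "validation": (1727, 895731), "test": (1656, 816506)},
    "nl": {"train": (20968, 7597460), "validation": (1230, 453646), "test": (1120, 434877)},
}

# dataset_size values as reported in the card.
DATASET_SIZE = {"de": 43154598, "en": 3682300, "es": 27196470, "fr": 39205157, "nl": 8485983}


def summarize(config: str):
    """Return (total examples, total bytes, train share) for one configuration."""
    splits = SPLIT_STATS[config]
    total_examples = sum(n for n, _ in splits.values())
    total_bytes = sum(b for _, b in splits.values())
    train_share = splits["train"][0] / total_examples
    return total_examples, total_bytes, train_share


for config in SPLIT_STATS:
    examples, size, share = summarize(config)
    # The reported dataset_size should equal the sum of the split sizes.
    assert size == DATASET_SIZE[config]
    print(f"{config}: {examples} examples, {size} bytes, {share:.1%} in train")
```

The same numbers show how uneven the configurations are: the `en` train split holds only 5000 examples, roughly 58% of that configuration, while the other languages keep the large majority of their examples in train.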
Dataset file paths
- config_name: de
  - train: de/train-*
  - validation: de/validation-*
  - test: de/test-*
- config_name: en
  - train: en/train-*
  - validation: en/validation-*
  - test: en/test-*
- config_name: es
  - train: es/train-*
  - validation: es/validation-*
  - test: es/test-*
- config_name: fr
  - train: fr/train-*
  - validation: fr/validation-*
  - test: fr/test-*
- config_name: nl
  - train: nl/train-*
  - validation: nl/validation-*
  - test: nl/test-*
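
The path entries above are glob patterns of the form `<config>/<split>-*`. The sketch below shows how such a pattern selects files from a repository listing. The shard names in `listing` are hypothetical, since the card does not give the actual file names, and `split_pattern` and `files_for` are illustrative helpers.

```python
from fnmatch import fnmatchcase


def split_pattern(config: str, split: str) -> str:
    """Build the glob pattern used in the card for a config/split pair."""
    return f"{config}/{split}-*"


def files_for(config: str, split: str, filenames):
    """Return the file names that match the pattern for one config/split."""
    pattern = split_pattern(config, split)
    return [name for name in filenames if fnmatchcase(name, pattern)]


# Hypothetical repository listing; the real shard names are not stated in the card.
listing = [
    "de/train-00000-of-00001.parquet",
    "de/validation-00000-of-00001.parquet",
    "nl/test-00000-of-00001.parquet",
]

print(files_for("de", "train", listing))
print(files_for("en", "test", listing))
```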
Task category
- token-classification
Supported languages
- de
- fr
- nl
- es
- en
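
Because the `tags` feature is stored as a sequence of strings rather than integer class ids, a token-classification pipeline usually needs a tag-to-id mapping first. The following is a minimal sketch of that step. The example record and its BIO-style tags are assumptions made for illustration, since the card does not list the actual tag inventory.

```python
def build_label_map(tag_sequences):
    """Map each distinct tag string to an integer id, in sorted order."""
    labels = sorted({tag for tags in tag_sequences for tag in tags})
    return {label: idx for idx, label in enumerate(labels)}


def encode(tags, label2id):
    """Convert one sequence of tag strings into integer ids."""
    return [label2id[tag] for tag in tags]


# Hypothetical record with the card's two columns; the tag names are assumed.
example = {
    "tokens": ["Anna", "woont", "in", "Gent"],
    "tags": ["B-PER", "O", "O", "B-LOC"],
}

label2id = build_label_map([example["tags"]])
print(label2id)
print(encode(example["tags"], label2id))
```

In practice the mapping would be built once from the train split and reused for validation and test, so that ids stay consistent across splits.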



