
qmeeus/MSNER-nlp

Source: Hugging Face. Updated 2024-03-28, indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/qmeeus/MSNER-nlp
Resource description:
---
dataset_info:
- config_name: de
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 41616289
    num_examples: 108473
  - name: validation
    num_bytes: 791188
    num_examples: 2109
  - name: test
    num_bytes: 747121
    num_examples: 1966
  download_size: 10480059
  dataset_size: 43154598
- config_name: en
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 2204014
    num_examples: 5000
  - name: validation
    num_bytes: 735967
    num_examples: 1753
  - name: test
    num_bytes: 742319
    num_examples: 1842
  download_size: 745400
  dataset_size: 3682300
- config_name: es
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 25555845
    num_examples: 50922
  - name: validation
    num_bytes: 829913
    num_examples: 1631
  - name: test
    num_bytes: 810712
    num_examples: 1512
  download_size: 5770971
  dataset_size: 27196470
- config_name: fr
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 37492920
    num_examples: 73561
  - name: validation
    num_bytes: 895731
    num_examples: 1727
  - name: test
    num_bytes: 816506
    num_examples: 1656
  download_size: 8204258
  dataset_size: 39205157
- config_name: nl
  features:
  - name: tokens
    sequence: string
  - name: tags
    sequence: string
  splits:
  - name: train
    num_bytes: 7597460
    num_examples: 20968
  - name: validation
    num_bytes: 453646
    num_examples: 1230
  - name: test
    num_bytes: 434877
    num_examples: 1120
  download_size: 1947747
  dataset_size: 8485983
configs:
- config_name: de
  data_files:
  - split: train
    path: de/train-*
  - split: validation
    path: de/validation-*
  - split: test
    path: de/test-*
- config_name: en
  data_files:
  - split: train
    path: en/train-*
  - split: validation
    path: en/validation-*
  - split: test
    path: en/test-*
- config_name: es
  data_files:
  - split: train
    path: es/train-*
  - split: validation
    path: es/validation-*
  - split: test
    path: es/test-*
- config_name: fr
  data_files:
  - split: train
    path: fr/train-*
  - split: validation
    path: fr/validation-*
  - split: test
    path: fr/test-*
- config_name: nl
  data_files:
  - split: train
    path: nl/train-*
  - split: validation
    path: nl/validation-*
  - split: test
    path: nl/test-*
task_categories:
- token-classification
language:
- de
- fr
- nl
- es
- en
---

# Dataset Card for "MSNER-nlp"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
qmeeus
Summary of original information

Dataset overview

Dataset configurations

  • config_name: de

    • features:
      • tokens: sequence (string)
      • tags: sequence (string)
    • splits:
      • train: 108473 examples, 41616289 bytes
      • validation: 2109 examples, 791188 bytes
      • test: 1966 examples, 747121 bytes
    • download_size: 10480059 bytes
    • dataset_size: 43154598 bytes
  • config_name: en

    • features:
      • tokens: sequence (string)
      • tags: sequence (string)
    • splits:
      • train: 5000 examples, 2204014 bytes
      • validation: 1753 examples, 735967 bytes
      • test: 1842 examples, 742319 bytes
    • download_size: 745400 bytes
    • dataset_size: 3682300 bytes
  • config_name: es

    • features:
      • tokens: sequence (string)
      • tags: sequence (string)
    • splits:
      • train: 50922 examples, 25555845 bytes
      • validation: 1631 examples, 829913 bytes
      • test: 1512 examples, 810712 bytes
    • download_size: 5770971 bytes
    • dataset_size: 27196470 bytes
  • config_name: fr

    • features:
      • tokens: sequence (string)
      • tags: sequence (string)
    • splits:
      • train: 73561 examples, 37492920 bytes
      • validation: 1727 examples, 895731 bytes
      • test: 1656 examples, 816506 bytes
    • download_size: 8204258 bytes
    • dataset_size: 39205157 bytes
  • config_name: nl

    • features:
      • tokens: sequence (string)
      • tags: sequence (string)
    • splits:
      • train: 20968 examples, 7597460 bytes
      • validation: 1230 examples, 453646 bytes
      • test: 1120 examples, 434877 bytes
    • download_size: 1947747 bytes
    • dataset_size: 8485983 bytes
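
The figures above can be cross-checked with a few lines of Python. The sketch below copies the per-split numbers from the list and verifies that each reported dataset_size equals the sum of its split sizes. The names `SPLITS`, `DATASET_SIZE`, and `summarize` are illustrative and not part of the dataset.

```python
# Split statistics copied from the configuration list above:
# config -> split -> (num_examples, num_bytes).
SPLITS = {
    "de": {"train": (108473, 41616289), "validation": (2109, 791188), "test": (1966, 747121)},
    "en": {"train": (5000, 2204014), "validation": (1753, 735967), "test": (1842, 742319)},
    "es": {"train": (50922, 25555845), "validation": (1631, 829913), "test": (1512, 810712)},
    "fr": {"train": (73561, 37492920), "validation": (1727, 895731), "test": (1656, 816506)},
    "nl": {"train": (20968, 7597460), "validation": (1230, 453646), "test": (1120, 434877)},
}

# Reported dataset_size per config, also copied from the list above.
DATASET_SIZE = {
    "de": 43154598,
    "en": 3682300,
    "es": 27196470,
    "fr": 39205157,
    "nl": 8485983,
}


def summarize(config):
    """Return (total_examples, total_bytes) summed over all splits of one config."""
    splits = SPLITS[config]
    total_examples = sum(n for n, _ in splits.values())
    total_bytes = sum(b for _, b in splits.values())
    return total_examples, total_bytes


for config in SPLITS:
    examples, size = summarize(config)
    status = "matches" if size == DATASET_SIZE[config] else "differs from"
    print(f"{config}: {examples} examples, {size} bytes ({status} dataset_size)")
```

Note that the English configuration has by far the smallest training split (5000 examples), while its validation and test splits are comparable in size to those of the other languages.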

Dataset file paths

  • config_name: de

    • train: de/train-*
    • validation: de/validation-*
    • test: de/test-*
  • config_name: en

    • train: en/train-*
    • validation: en/validation-*
    • test: en/test-*
  • config_name: es

    • train: es/train-*
    • validation: es/validation-*
    • test: es/test-*
  • config_name: fr

    • train: fr/train-*
    • validation: fr/validation-*
    • test: fr/test-*
  • config_name: nl

    • train: nl/train-*
    • validation: nl/validation-*
    • test: nl/test-*
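
Each path above is a glob pattern that selects the files belonging to one split of one configuration. As a rough illustration of how such a pattern picks out files, the sketch below uses `fnmatch` from the Python standard library. The shard file names are hypothetical (the listing does not show the actual files), and the Hugging Face tooling may resolve patterns with its own glob implementation, so treat this as an approximation.

```python
from fnmatch import fnmatchcase


def split_pattern(config, split):
    """Build the glob pattern for one split, following the '<config>/<split>-*' scheme above."""
    return f"{config}/{split}-*"


def files_for_split(files, config, split):
    """Return the file names that match the pattern of the given config and split."""
    pattern = split_pattern(config, split)
    return [name for name in files if fnmatchcase(name, pattern)]


# Hypothetical shard names, used only to exercise the patterns.
files = [
    "de/train-00000-of-00001.parquet",
    "de/validation-00000-of-00001.parquet",
    "de/test-00000-of-00001.parquet",
    "nl/train-00000-of-00001.parquet",
]

print(files_for_split(files, "de", "train"))
```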

Task categories

  • token-classification
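
In a token-classification dataset, each record usually pairs one tag with each token. The sketch below assumes that convention holds for the `tokens` and `tags` features and shows how a record could be validated and its tags counted. The toy record and its tag labels (`B-PER`, `B-LOC`, `O`) are hypothetical; the listing does not document the actual tag set.

```python
from collections import Counter


def check_aligned(example):
    """Pair each token with its tag, raising if the two sequences differ in length."""
    tokens, tags = example["tokens"], example["tags"]
    if len(tokens) != len(tags):
        raise ValueError(f"{len(tokens)} tokens but {len(tags)} tags")
    return list(zip(tokens, tags))


def tag_histogram(examples):
    """Count how often each tag occurs across an iterable of records."""
    counts = Counter()
    for example in examples:
        counts.update(tag for _, tag in check_aligned(example))
    return counts


# Hypothetical record; the tag labels are assumptions, not taken from the dataset.
toy = [
    {"tokens": ["Anna", "woont", "in", "Gent"], "tags": ["B-PER", "O", "O", "B-LOC"]},
]

print(tag_histogram(toy))
```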

Supported languages

  • de
  • fr
  • nl
  • es
  • en
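
Each language code above is also the name of a configuration, so one language can be loaded at a time. A minimal sketch, assuming the Hugging Face `datasets` package is installed and the repository `qmeeus/MSNER-nlp` is reachable over the network; `load_config` is a hypothetical helper name, not part of any library.

```python
# Configuration names and splits as listed in this document.
LANGUAGES = ("de", "fr", "nl", "es", "en")
SPLIT_NAMES = ("train", "validation", "test")


def load_config(language, split="validation"):
    """Load one split of one language configuration from the Hugging Face Hub."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language {language!r}; expected one of {LANGUAGES}")
    if split not in SPLIT_NAMES:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLIT_NAMES}")
    # Imported lazily so the argument checks work even without the package installed.
    from datasets import load_dataset

    return load_dataset("qmeeus/MSNER-nlp", language, split=split)
```

For example, `load_config("nl", "test")` would be expected to return the 1120 test records listed for the nl configuration, provided the repository has not changed since this page was indexed.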