
AresEkb/prof_standards_sbert_large_mt_nlu_ru

Source: Hugging Face. Updated: 2023-01-11. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/AresEkb/prof_standards_sbert_large_mt_nlu_ru
Description:
---
language:
- ru
dataset_info:
- config_name: domains
  features:
  - name: reg_number
    dtype: string
  - name: standard_name
    dtype: string
  - name: name
    dtype: string
  - name: purpose
    dtype: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 7293978
    num_examples: 1510
  download_size: 7789662
  dataset_size: 7293978
- config_name: generalized_functions
  features:
  - name: generalized_function_id
    dtype: string
  - name: reg_number
    dtype: string
  - name: name
    dtype: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 24536711
    num_examples: 5520
  download_size: 26728782
  dataset_size: 24536711
- config_name: jobs
  features:
  - name: generalized_function_id
    dtype: string
  - name: reg_number
    dtype: string
  - name: name
    dtype: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 64746734
    num_examples: 14991
  download_size: 68906153
  dataset_size: 64746734
- config_name: particular_functions
  features:
  - name: generalized_function_id
    dtype: string
  - name: particular_function_id
    dtype: string
  - name: reg_number
    dtype: string
  - name: name
    dtype: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 83618997
    num_examples: 18730
  download_size: 89697328
  dataset_size: 83618997
- config_name: actions
  features:
  - name: generalized_function_id
    dtype: string
  - name: particular_function_id
    dtype: string
  - name: reg_number
    dtype: string
  - name: name
    dtype: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 642320840
    num_examples: 143024
  download_size: 680158888
  dataset_size: 642320840
- config_name: skills
  features:
  - name: generalized_function_id
    dtype: string
  - name: particular_function_id
    dtype: string
  - name: reg_number
    dtype: string
  - name: name
    dtype: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 724280125
    num_examples: 161473
  download_size: 747889457
  dataset_size: 724280125
- config_name: knowledges
  features:
  - name: generalized_function_id
    dtype: string
  - name: particular_function_id
    dtype: string
  - name: reg_number
    dtype: string
  - name: name
    dtype: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 1041374369
    num_examples: 234283
  download_size: 1022695670
  dataset_size: 1041374369
pretty_name: Professional Standards
size_categories:
- 100K<n<1M
---
Provider:
AresEkb
Summary of original information

Dataset overview

Dataset name

  • pretty_name: Professional Standards

Dataset size category

  • size_categories: 100K<n<1M

Dataset configurations and features

1. domains

  • config_name: domains
  • features:
    • reg_number: string
    • standard_name: string
    • name: string
    • purpose: string
    • embeddings: sequence of float32
  • splits:
    • train: 1510 examples, 7293978 bytes
  • download_size: 7789662 bytes
  • dataset_size: 7293978 bytes
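
Every configuration carries an `embeddings` column of float32 values, so rows can be ranked by similarity to a query vector. The following is a minimal sketch, assuming cosine similarity is a suitable measure for these embeddings (the listing does not say). The rows, the three-dimensional vectors, the query, and the `rank_by_cosine` helper are all invented for illustration and do not come from the dataset.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_cosine(rows, query, top_k=3):
    """Return the top_k rows sorted by descending similarity to query."""
    scored = [(cosine(row["embeddings"], query), row) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:top_k]]


# Toy rows shaped like the `domains` config; all values are hypothetical.
rows = [
    {"reg_number": "A", "name": "first", "embeddings": [1.0, 0.0, 0.0]},
    {"reg_number": "B", "name": "second", "embeddings": [0.0, 1.0, 0.0]},
    {"reg_number": "C", "name": "third", "embeddings": [0.7, 0.7, 0.0]},
]

best = rank_by_cosine(rows, query=[1.0, 0.1, 0.0], top_k=2)
print([row["reg_number"] for row in best])
```

With real data, the query vector would have to come from the same embedding model that produced the column; otherwise the similarities are not meaningful.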

2. generalized_functions

  • config_name: generalized_functions
  • features:
    • generalized_function_id: string
    • reg_number: string
    • name: string
    • embeddings: sequence of float32
  • splits:
    • train: 5520 examples, 24536711 bytes
  • download_size: 26728782 bytes
  • dataset_size: 24536711 bytes

3. jobs

  • config_name: jobs
  • features:
    • generalized_function_id: string
    • reg_number: string
    • name: string
    • embeddings: sequence of float32
  • splits:
    • train: 14991 examples, 64746734 bytes
  • download_size: 68906153 bytes
  • dataset_size: 64746734 bytes

4. particular_functions

  • config_name: particular_functions
  • features:
    • generalized_function_id: string
    • particular_function_id: string
    • reg_number: string
    • name: string
    • embeddings: sequence of float32
  • splits:
    • train: 18730 examples, 83618997 bytes
  • download_size: 89697328 bytes
  • dataset_size: 83618997 bytes
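
The `generalized_function_id` column shared by `generalized_functions` and `particular_functions` suggests that the second configuration can be grouped under the first. That relationship is inferred from the column names only; the listing does not state it, and it does not say whether IDs are unique across standards. The sketch below therefore keys on the pair (`reg_number`, `generalized_function_id`) and uses invented rows; `group_by_parent` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_parent(children, key_fields=("reg_number", "generalized_function_id")):
    """Group child row names by the tuple of values in key_fields."""
    groups = defaultdict(list)
    for row in children:
        key = tuple(row[field] for field in key_fields)
        groups[key].append(row["name"])
    return dict(groups)


# Invented rows shaped like the `particular_functions` config (embeddings omitted).
particular = [
    {"reg_number": "X1", "generalized_function_id": "A",
     "particular_function_id": "A1", "name": "p1"},
    {"reg_number": "X1", "generalized_function_id": "A",
     "particular_function_id": "A2", "name": "p2"},
    {"reg_number": "X2", "generalized_function_id": "A",
     "particular_function_id": "A1", "name": "p3"},
]

grouped = group_by_parent(particular)
for key, names in grouped.items():
    print(key, names)
```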

5. actions

  • config_name: actions
  • features:
    • generalized_function_id: string
    • particular_function_id: string
    • reg_number: string
    • name: string
    • embeddings: sequence of float32
  • splits:
    • train: 143024 examples, 642320840 bytes
  • download_size: 680158888 bytes
  • dataset_size: 642320840 bytes

6. skills

  • config_name: skills
  • features:
    • generalized_function_id: string
    • particular_function_id: string
    • reg_number: string
    • name: string
    • embeddings: sequence of float32
  • splits:
    • train: 161473 examples, 724280125 bytes
  • download_size: 747889457 bytes
  • dataset_size: 724280125 bytes

7. knowledges

  • config_name: knowledges
  • features:
    • generalized_function_id: string
    • particular_function_id: string
    • reg_number: string
    • name: string
    • embeddings: sequence of float32
  • splits:
    • train: 234283 examples, 1041374369 bytes
  • download_size: 1022695670 bytes
  • dataset_size: 1041374369 bytes
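
The per-configuration counts above can be checked against the declared size category, and each configuration can be loaded by name. The sketch below assumes the size category refers to the total number of train examples across all configurations, and that the third-party `datasets` package is installed when `load_config` is called. `load_config` is a hypothetical helper; the repository ID and the counts are copied from this listing.

```python
REPO_ID = "AresEkb/prof_standards_sbert_large_mt_nlu_ru"

# num_examples of the train split per configuration, as listed above.
TRAIN_EXAMPLES = {
    "domains": 1510,
    "generalized_functions": 5520,
    "jobs": 14991,
    "particular_functions": 18730,
    "actions": 143024,
    "skills": 161473,
    "knowledges": 234283,
}


def load_config(name):
    """Load the train split of one configuration from the Hugging Face Hub.

    Requires the `datasets` package and network access, so the import is
    deferred until the function is actually called.
    """
    if name not in TRAIN_EXAMPLES:
        raise ValueError(f"unknown configuration: {name}")
    from datasets import load_dataset
    return load_dataset(REPO_ID, name, split="train")


total = sum(TRAIN_EXAMPLES.values())
print(total)  # 579531
in_declared_range = 100_000 < total < 1_000_000
print(in_declared_range)  # True
```

Loading the smallest configuration first, for example `load_config("domains")`, is a cheap way to inspect the schema before downloading the larger ones such as `knowledges` (about 1 GB).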