maywell/korean_textbooks

Hugging Face | Updated 2024-01-10 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/maywell/korean_textbooks
Description:
---
language:
- ko
license: apache-2.0
size_categories:
- 1M<n<10M
pretty_name: 대규모 한국어 Synthetic 데이터
dataset_info:
- config_name: claude_evol
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 992896186
    num_examples: 239102
  download_size: 380188122
  dataset_size: 992896186
- config_name: code-alpaca
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 273836723
    num_examples: 64112
  download_size: 100817441
  dataset_size: 273836723
- config_name: helpsteer
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 101753037
    num_examples: 25253
  download_size: 38660919
  dataset_size: 101753037
- config_name: ko_wikidata
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 527306289
    num_examples: 127614
  download_size: 197029339
  dataset_size: 527306289
- config_name: mmlu_abstract_algebra
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 369008992
    num_examples: 88848
  download_size: 135822870
  dataset_size: 369008992
- config_name: mmlu_all
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 406126621
    num_examples: 97765
  download_size: 149486712
  dataset_size: 406126621
- config_name: mmlu_anatomy
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 404317465
    num_examples: 97463
  download_size: 148806011
  dataset_size: 404317465
- config_name: mmlu_astronomy
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 404137638
    num_examples: 97347
  download_size: 148705490
  dataset_size: 404137638
- config_name: mmlu_business_ethics
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 404250245
    num_examples: 97327
  download_size: 148763276
  dataset_size: 404250245
- config_name: mmlu_clinical_knowledge
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 403659005
    num_examples: 97226
  download_size: 148688069
  dataset_size: 403659005
- config_name: mmlu_college_biology
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 404028634
    num_examples: 97285
  download_size: 148722802
  dataset_size: 404028634
- config_name: mmlu_college_chemistry
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 404667385
    num_examples: 97435
  download_size: 148855223
  dataset_size: 404667385
- config_name: mmlu_college_computer_science
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 385176880
    num_examples: 92606
  download_size: 141868873
  dataset_size: 385176880
- config_name: mmlu_college_mathematics
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 390603751
    num_examples: 94070
  download_size: 143833823
  dataset_size: 390603751
- config_name: mmlu_college_medicine
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 395144479
    num_examples: 95156
  download_size: 145271248
  dataset_size: 395144479
- config_name: mmlu_college_physics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 404906114
    num_examples: 97452
  download_size: 148870088
  dataset_size: 404906114
- config_name: mmlu_computer_security
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 403699674
    num_examples: 97212
  download_size: 148755211
  dataset_size: 403699674
- config_name: mmlu_conceptual_physics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 366231421
    num_examples: 88216
  download_size: 134989933
  dataset_size: 366231421
- config_name: mmlu_econometrics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 380851762
    num_examples: 91854
  download_size: 140295665
  dataset_size: 380851762
- config_name: mmlu_electrical_engineering
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 364564129
    num_examples: 87826
  download_size: 134376902
  dataset_size: 364564129
- config_name: mmlu_elementary_mathematics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 371101672
    num_examples: 89307
  download_size: 136622044
  dataset_size: 371101672
- config_name: mmlu_formal_logic
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 395937096
    num_examples: 95483
  download_size: 145736493
  dataset_size: 395937096
- config_name: mmlu_global_facts
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 394596084
    num_examples: 94984
  download_size: 145284966
  dataset_size: 394596084
- config_name: mmlu_high_school_biology
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 402382699
    num_examples: 97117
  download_size: 148038235
  dataset_size: 402382699
- config_name: mmlu_high_school_chemistry
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 402886667
    num_examples: 96907
  download_size: 148323317
  dataset_size: 402886667
- config_name: mmlu_high_school_computer_science
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 403966380
    num_examples: 97351
  download_size: 148666121
  dataset_size: 403966380
- config_name: mmlu_high_school_european_history
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 403671884
    num_examples: 97222
  download_size: 148454177
  dataset_size: 403671884
- config_name: mmlu_high_school_geography
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 404040602
    num_examples: 97261
  download_size: 148657890
  dataset_size: 404040602
- config_name: mmlu_high_school_government_and_politics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 403990139
    num_examples: 97311
  download_size: 148568388
  dataset_size: 403990139
- config_name: mmlu_high_school_macroeconomics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 404170166
    num_examples: 97400
  download_size: 148591243
  dataset_size: 404170166
- config_name: mmlu_high_school_mathematics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 404846407
    num_examples: 97396
  download_size: 149076619
  dataset_size: 404846407
- config_name: mmlu_high_school_microeconomics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 404613760
    num_examples: 97435
  download_size: 148970422
  dataset_size: 404613760
- config_name: mmlu_high_school_physics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 397678253
    num_examples: 95740
  download_size: 146340167
  dataset_size: 397678253
- config_name: mmlu_high_school_psychology
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 334767526
    num_examples: 80626
  download_size: 123054403
  dataset_size: 334767526
- config_name: mmlu_high_school_statistics
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 315209112
    num_examples: 76033
  download_size: 115876698
  dataset_size: 315209112
- config_name: mmlu_high_school_us_history
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 329179309
    num_examples: 79322
  download_size: 120972668
  dataset_size: 329179309
- config_name: mmlu_high_school_world_history
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 357910528
    num_examples: 85990
  download_size: 131809165
  dataset_size: 357910528
- config_name: mmlu_human_aging
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 325427761
    num_examples: 78341
  download_size: 119430234
  dataset_size: 325427761
- config_name: mmlu_human_sexuality
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 328912659
    num_examples: 79327
  download_size: 121032722
  dataset_size: 328912659
- config_name: mmlu_international_law
  features:
  - name: '0'
    dtype: string
  splits:
  - name: train
    num_bytes: 327874597
    num_examples: 78989
  download_size: 120785769
  dataset_size: 327874597
- config_name: normal_instructions
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 956305865
    num_examples: 240523
  download_size: 362796244
  dataset_size: 956305865
- config_name: tiny-textbooks
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 1722063576
    num_examples: 395985
  download_size: 635724860
  dataset_size: 1722063576
configs:
- config_name: claude_evol
  data_files:
  - split: train
    path: claude_evol/train-*
- config_name: code-alpaca
  data_files:
  - split: train
    path: code-alpaca/train-*
- config_name: helpsteer
  data_files:
  - split: train
    path: helpsteer/train-*
- config_name: ko_wikidata
  data_files:
  - split: train
    path: ko_wikidata/train-*
- config_name: mmlu_abstract_algebra
  data_files:
  - split: train
    path: mmlu_abstract_algebra/train-*
- config_name: mmlu_all
  data_files:
  - split: train
    path: mmlu_all/train-*
- config_name: mmlu_anatomy
  data_files:
  - split: train
    path: mmlu_anatomy/train-*
- config_name: mmlu_astronomy
  data_files:
  - split: train
    path: mmlu_astronomy/train-*
- config_name: mmlu_business_ethics
  data_files:
  - split: train
    path: mmlu_business_ethics/train-*
- config_name: mmlu_clinical_knowledge
  data_files:
  - split: train
    path: mmlu_clinical_knowledge/train-*
- config_name: mmlu_college_biology
  data_files:
  - split: train
    path: mmlu_college_biology/train-*
- config_name: mmlu_college_chemistry
  data_files:
  - split: train
    path: mmlu_college_chemistry/train-*
- config_name: mmlu_college_computer_science
  data_files:
  - split: train
    path: mmlu_college_computer_science/train-*
- config_name: mmlu_college_mathematics
  data_files:
  - split: train
    path: mmlu_college_mathematics/train-*
- config_name: mmlu_college_medicine
  data_files:
  - split: train
    path: mmlu_college_medicine/train-*
- config_name: mmlu_college_physics
  data_files:
  - split: train
    path: mmlu_college_physics/train-*
- config_name: mmlu_computer_security
  data_files:
  - split: train
    path: mmlu_computer_security/train-*
- config_name: mmlu_conceptual_physics
  data_files:
  - split: train
    path: mmlu_conceptual_physics/train-*
- config_name: mmlu_econometrics
  data_files:
  - split: train
    path: mmlu_econometrics/train-*
- config_name: mmlu_electrical_engineering
  data_files:
  - split: train
    path: mmlu_electrical_engineering/train-*
- config_name: mmlu_elementary_mathematics
  data_files:
  - split: train
    path: mmlu_elementary_mathematics/train-*
- config_name: mmlu_formal_logic
  data_files:
  - split: train
    path: mmlu_formal_logic/train-*
- config_name: mmlu_global_facts
  data_files:
  - split: train
    path: mmlu_global_facts/train-*
- config_name: mmlu_high_school_biology
  data_files:
  - split: train
    path: mmlu_high_school_biology/train-*
- config_name: mmlu_high_school_chemistry
  data_files:
  - split: train
    path: mmlu_high_school_chemistry/train-*
- config_name: mmlu_high_school_computer_science
  data_files:
  - split: train
    path: mmlu_high_school_computer_science/train-*
- config_name: mmlu_high_school_european_history
  data_files:
  - split: train
    path: mmlu_high_school_european_history/train-*
- config_name: mmlu_high_school_geography
  data_files:
  - split: train
    path: mmlu_high_school_geography/train-*
- config_name: mmlu_high_school_government_and_politics
  data_files:
  - split: train
    path: mmlu_high_school_government_and_politics/train-*
- config_name: mmlu_high_school_macroeconomics
  data_files:
  - split: train
    path: mmlu_high_school_macroeconomics/train-*
- config_name: mmlu_high_school_mathematics
  data_files:
  - split: train
    path: mmlu_high_school_mathematics/train-*
- config_name: mmlu_high_school_microeconomics
  data_files:
  - split: train
    path: mmlu_high_school_microeconomics/train-*
- config_name: mmlu_high_school_physics
  data_files:
  - split: train
    path: mmlu_high_school_physics/train-*
- config_name: mmlu_high_school_psychology
  data_files:
  - split: train
    path: mmlu_high_school_psychology/train-*
- config_name: mmlu_high_school_statistics
  data_files:
  - split: train
    path: mmlu_high_school_statistics/train-*
- config_name: mmlu_high_school_us_history
  data_files:
  - split: train
    path: mmlu_high_school_us_history/train-*
- config_name: mmlu_high_school_world_history
  data_files:
  - split: train
    path: mmlu_high_school_world_history/train-*
- config_name: mmlu_human_aging
  data_files:
  - split: train
    path: mmlu_human_aging/train-*
- config_name: mmlu_human_sexuality
  data_files:
  - split: train
    path: mmlu_human_sexuality/train-*
- config_name: mmlu_international_law
  data_files:
  - split: train
    path: mmlu_international_law/train-*
- config_name: normal_instructions
  data_files:
  - split: train
    path: normal_instructions/train-*
- config_name: tiny-textbooks
  data_files:
  - split: train
    path: tiny-textbooks/train-*
---

# Massive Korean synthetic dataset

This is a large-scale synthetic Korean dataset generated with Gemini Pro. It was created using the methodology described in *Creation of synthetic textbook-quality datasets* in [Textbooks Are All You Need](https://arxiv.org/abs/2306.11644).

## Data overview

**The name of each subset does not indicate the contents of that subset.**

**Further processing is required before using this dataset for training.**

**Rather than using this dataset as-is, we recommend processing it to fit your task first, e.g. converting it into a QA set with a local model.**

| subset | row count | link | notes |
|---|---|---|---|
| tiny-textbooks | 395,985 | [nampdn-ai/tiny-textbooks](https://huggingface.co/datasets/nampdn-ai/tiny-textbooks) | |
| ko_wikidata | 127,614 | [maywell/ko_wikidata_QA](https://huggingface.co/datasets/maywell/ko_wikidata_QA) | |
| normal_instructions | 240,523 | [KonstantyM/science_qa](https://huggingface.co/datasets/KonstantyM/science_qa) | with science texts |
| claude_evol | 239,102 | [Norquinal/claude_evol_instruct_210k](https://huggingface.co/datasets/Norquinal/claude_evol_instruct_210k) | used 250k files from that repo |
| code-alpaca | 64,112 | [theblackcat102/evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1) | original is a coding dataset, but generated data is not mainly a coding dataset |
| helpsteer | 25,253 | [nvidia/HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer) | |
| mmlu_abstract_algebra | 88,848 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_all | 97,765 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_anatomy | 97,463 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_astronomy | 97,347 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_business_ethics | 97,327 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_clinical_knowledge | 97,226 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_college_biology | 97,285 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_college_chemistry | 97,435 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_college_computer_science | 92,606 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_college_mathematics | 94,070 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_college_medicine | 95,156 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_college_physics | 97,452 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_computer_security | 97,212 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_conceptual_physics | 88,216 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_econometrics | 91,854 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_electrical_engineering | 87,826 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_elementary_mathematics | 89,307 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_formal_logic | 95,483 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_global_facts | 94,984 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_biology | 97,117 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_chemistry | 96,907 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_computer_science | 97,351 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_european_history | 97,222 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_geography | 97,261 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_government_and_politics | 97,311 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_macroeconomics | 97,400 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_mathematics | 97,396 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_microeconomics | 97,435 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_physics | 95,740 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_psychology | 80,626 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_statistics | 76,033 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_us_history | 79,322 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_high_school_world_history | 85,990 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_human_aging | 78,341 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_human_sexuality | 79,327 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |
| mmlu_international_law | 78,989 | [cais/mmlu](https://huggingface.co/datasets/cais/mmlu) | |

## If you find a problem

If you find any issues with the dataset, please let us know in the discussions or send us a pull request.
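
Since the card asks for further processing before training, a first practical step is reading the text out of each subset. The metadata above declares the column as `text` for most subsets but as `0` for `mmlu_college_physics` through `mmlu_international_law`, so a small helper that tolerates both names may be useful. The following is a minimal sketch, not the author's tooling: `extract_text` and `load_subset` are hypothetical helper names, and `load_subset` assumes the Hugging Face `datasets` library is installed and that the repository id matches the page title.

```python
from typing import Any, Mapping

# Column names taken from the card's metadata: most subsets declare "text",
# while mmlu_college_physics through mmlu_international_law declare "0".
TEXT_COLUMNS = ("text", "0")


def extract_text(row: Mapping[str, Any]) -> str:
    """Return the document text from a row, whichever column holds it."""
    for column in TEXT_COLUMNS:
        value = row.get(column)
        if isinstance(value, str):
            return value
    raise KeyError(f"none of {TEXT_COLUMNS} found in row with keys {list(row)}")


def load_subset(config_name: str):
    """Load one subset's train split (assumes `datasets` is installed and the Hub is reachable)."""
    # Imported lazily so extract_text stays usable without the library.
    from datasets import load_dataset

    return load_dataset("maywell/korean_textbooks", config_name, split="train")
```

With network access, something like `extract_text(load_subset("helpsteer")[0])` should return the first document of that subset as a string, ready to be passed to whatever task-specific conversion you choose.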
Provider:
maywell
Summary of original information

Dataset overview

Basic information

  • Language: Korean (ko)
  • License: Apache-2.0
  • Size category: 1M<n<10M
  • Pretty name: 대규모 한국어 Synthetic 데이터
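
The size category can be cross-checked against the per-subset row counts in the description's subset table. The sketch below simply sums those counts and tests the result against the `1M<n<10M` bounds. It assumes the category refers to the total number of rows across all subsets, which the page does not state, and `total_rows` and `in_size_category` are hypothetical helper names.

```python
# Row counts copied from the subset table in the dataset card.
ROW_COUNTS = {
    "tiny-textbooks": 395_985,
    "ko_wikidata": 127_614,
    "normal_instructions": 240_523,
    "claude_evol": 239_102,
    "code-alpaca": 64_112,
    "helpsteer": 25_253,
    "mmlu_abstract_algebra": 88_848,
    "mmlu_all": 97_765,
    "mmlu_anatomy": 97_463,
    "mmlu_astronomy": 97_347,
    "mmlu_business_ethics": 97_327,
    "mmlu_clinical_knowledge": 97_226,
    "mmlu_college_biology": 97_285,
    "mmlu_college_chemistry": 97_435,
    "mmlu_college_computer_science": 92_606,
    "mmlu_college_mathematics": 94_070,
    "mmlu_college_medicine": 95_156,
    "mmlu_college_physics": 97_452,
    "mmlu_computer_security": 97_212,
    "mmlu_conceptual_physics": 88_216,
    "mmlu_econometrics": 91_854,
    "mmlu_electrical_engineering": 87_826,
    "mmlu_elementary_mathematics": 89_307,
    "mmlu_formal_logic": 95_483,
    "mmlu_global_facts": 94_984,
    "mmlu_high_school_biology": 97_117,
    "mmlu_high_school_chemistry": 96_907,
    "mmlu_high_school_computer_science": 97_351,
    "mmlu_high_school_european_history": 97_222,
    "mmlu_high_school_geography": 97_261,
    "mmlu_high_school_government_and_politics": 97_311,
    "mmlu_high_school_macroeconomics": 97_400,
    "mmlu_high_school_mathematics": 97_396,
    "mmlu_high_school_microeconomics": 97_435,
    "mmlu_high_school_physics": 95_740,
    "mmlu_high_school_psychology": 80_626,
    "mmlu_high_school_statistics": 76_033,
    "mmlu_high_school_us_history": 79_322,
    "mmlu_high_school_world_history": 85_990,
    "mmlu_human_aging": 78_341,
    "mmlu_human_sexuality": 79_327,
    "mmlu_international_law": 78_989,
}


def total_rows(counts: dict[str, int]) -> int:
    """Sum the row counts of all subsets."""
    return sum(counts.values())


def in_size_category(n: int, low: int = 1_000_000, high: int = 10_000_000) -> bool:
    """Check n against the open interval implied by '1M<n<10M'."""
    return low < n < high


if __name__ == "__main__":
    n = total_rows(ROW_COUNTS)
    print(f"{len(ROW_COUNTS)} subsets, {n:,} rows, within 1M<n<10M: {in_size_category(n)}")
```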

Dataset configuration details

  1. claude_evol

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 239,102
        • Dataset size: 992,896,186 bytes
        • Download size: 380,188,122 bytes
  2. code-alpaca

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 64,112
        • Dataset size: 273,836,723 bytes
        • Download size: 100,817,441 bytes
  3. helpsteer

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 25,253
        • Dataset size: 101,753,037 bytes
        • Download size: 38,660,919 bytes
  4. ko_wikidata

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 127,614
        • Dataset size: 527,306,289 bytes
        • Download size: 197,029,339 bytes
  5. mmlu_abstract_algebra

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 88,848
        • Dataset size: 369,008,992 bytes
        • Download size: 135,822,870 bytes
  6. mmlu_all

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 97,765
        • Dataset size: 406,126,621 bytes
        • Download size: 149,486,712 bytes
  7. mmlu_anatomy

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 97,463
        • Dataset size: 404,317,465 bytes
        • Download size: 148,806,011 bytes
  8. mmlu_astronomy

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 97,347
        • Dataset size: 404,137,638 bytes
        • Download size: 148,705,490 bytes
  9. mmlu_business_ethics

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 97,327
        • Dataset size: 404,250,245 bytes
        • Download size: 148,763,276 bytes
  10. mmlu_clinical_knowledge

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 97,226
        • Dataset size: 403,659,005 bytes
        • Download size: 148,688,069 bytes
  11. mmlu_college_biology

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 97,285
        • Dataset size: 404,028,634 bytes
        • Download size: 148,722,802 bytes
  12. mmlu_college_chemistry

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 97,435
        • Dataset size: 404,667,385 bytes
        • Download size: 148,855,223 bytes
  13. mmlu_college_computer_science

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 92,606
        • Dataset size: 385,176,880 bytes
        • Download size: 141,868,873 bytes
  14. mmlu_college_mathematics

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 94,070
        • Dataset size: 390,603,751 bytes
        • Download size: 143,833,823 bytes
  15. mmlu_college_medicine

    • Features:
      • text (string)
    • Splits:
      • train
        • Examples: 95,156
        • Dataset size: 395,144,479 bytes
        • Download size: 145,271,248 bytes
  16. mmlu_college_physics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,452
        • Dataset size: 404,906,114 bytes
        • Download size: 148,870,088 bytes
  17. mmlu_computer_security

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,212
        • Dataset size: 403,699,674 bytes
        • Download size: 148,755,211 bytes
  18. mmlu_conceptual_physics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 88,216
        • Dataset size: 366,231,421 bytes
        • Download size: 134,989,933 bytes
  19. mmlu_econometrics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 91,854
        • Dataset size: 380,851,762 bytes
        • Download size: 140,295,665 bytes
  20. mmlu_electrical_engineering

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 87,826
        • Dataset size: 364,564,129 bytes
        • Download size: 134,376,902 bytes
  21. mmlu_elementary_mathematics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 89,307
        • Dataset size: 371,101,672 bytes
        • Download size: 136,622,044 bytes
  22. mmlu_formal_logic

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 95,483
        • Dataset size: 395,937,096 bytes
        • Download size: 145,736,493 bytes
  23. mmlu_global_facts

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 94,984
        • Dataset size: 394,596,084 bytes
        • Download size: 145,284,966 bytes
  24. mmlu_high_school_biology

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,117
        • Dataset size: 402,382,699 bytes
        • Download size: 148,038,235 bytes
  25. mmlu_high_school_chemistry

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 96,907
        • Dataset size: 402,886,667 bytes
        • Download size: 148,323,317 bytes
  26. mmlu_high_school_computer_science

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,351
        • Dataset size: 403,966,380 bytes
        • Download size: 148,666,121 bytes
  27. mmlu_high_school_european_history

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,222
        • Dataset size: 403,671,884 bytes
        • Download size: 148,454,177 bytes
  28. mmlu_high_school_geography

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,261
        • Dataset size: 404,040,602 bytes
        • Download size: 148,657,890 bytes
  29. mmlu_high_school_government_and_politics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,311
        • Dataset size: 403,990,139 bytes
        • Download size: 148,568,388 bytes
  30. mmlu_high_school_macroeconomics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,400
        • Dataset size: 404,170,166 bytes
        • Download size: 148,591,243 bytes
  31. mmlu_high_school_mathematics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,396
        • Dataset size: 404,846,407 bytes
        • Download size: 149,076,619 bytes
  32. mmlu_high_school_microeconomics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 97,435
        • Dataset size: 404,613,760 bytes
        • Download size: 148,970,422 bytes
  33. mmlu_high_school_physics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 95,740
        • Dataset size: 397,678,253 bytes
        • Download size: 146,340,167 bytes
  34. mmlu_high_school_psychology

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 80,626
        • Dataset size: 334,767,526 bytes
        • Download size: 123,054,403 bytes
  35. mmlu_high_school_statistics

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 76,033
        • Dataset size: 315,209,112 bytes
        • Download size: 115,876,698 bytes
  36. mmlu_high_school_us_history

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 79,322
        • Dataset size: 329,179,309 bytes
        • Download size: 120,972,668 bytes
  37. mmlu_high_school_world_history

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 85,990
        • Dataset size: 357,910,528 bytes
        • Download size: 131,809,165 bytes
  38. mmlu_human_aging

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 78,341
        • Dataset size: 325,427,761 bytes
        • Download size: 119,430,234 bytes
  39. mmlu_human_sexuality

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 79,327
        • Dataset size: 328,912,659 bytes
        • Download size: 121,032,722 bytes
  40. mmlu_international_law

    • Features:
      • 0 (string)
    • Splits:
      • train
        • Examples: 78,989
        • Dataset size: 327,874,597 bytes
        • Download size: 120,785,769 bytes
  41. normal_instructions
