
OpenLLMTurkishLeadboardv2/details_KOCDIGITAL__Kocdigital-LLM-8b-v0.1

Source: Hugging Face. Updated: 2024-05-03. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/OpenLLMTurkishLeadboardv2/details_KOCDIGITAL__Kocdigital-LLM-8b-v0.1
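Resource summary:
This dataset was created automatically during the evaluation of the model KOCDIGITAL/Kocdigital-LLM-8b-v0.1 on the Open LLM Turkish Leaderboard v0.2. It contains detailed results for a range of tasks and subtasks, such as winogrande_tr_v0.2, truthfulqa_v0.2, and mmlu_tr_v0.2. The dataset is organized into results, groups, and group subtasks, and reports accuracy and standard error for each task.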
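
The organization described above can be pictured as a nested mapping. The following is only a sketch: the key names `results`, `groups`, `group_subtasks`, `acc`, and `acc_stderr` are assumptions derived from the summary, not a confirmed schema, and the assignment of the two subtasks to `mmlu_stem_v0.2` is inferred from the order of the listing. The numbers are taken from the evaluation results on this page.

```python
# Hypothetical layout: key names are assumptions, not the dataset's confirmed schema.
sample = {
    "results": {
        "mmlu_stem_v0.2": {"acc": 0.41123595505617977, "acc_stderr": 0.008665314387467683},
        "mmlu_abstract_algebra_v0.2": {"acc": 0.3, "acc_stderr": 0.046056618647183814},
        "mmlu_anatomy_v0.2": {"acc": 0.4198473282442748, "acc_stderr": 0.04328577215262973},
    },
    "groups": {
        "mmlu_stem_v0.2": {"acc": 0.41123595505617977, "acc_stderr": 0.008665314387467683},
    },
    # Assumed grouping: which subtasks roll up into which group.
    "group_subtasks": {
        "mmlu_stem_v0.2": ["mmlu_abstract_algebra_v0.2", "mmlu_anatomy_v0.2"],
    },
}


def subtask_scores(data, group):
    """Return {subtask: accuracy} for every subtask listed under a group."""
    return {name: data["results"][name]["acc"] for name in data["group_subtasks"][group]}


stem_scores = subtask_scores(sample, "mmlu_stem_v0.2")
for name, acc in stem_scores.items():
    print(f"{name}: {acc:.3f}")
```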
Provider:
OpenLLMTurkishLeadboardv2
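Summary of original information

Dataset overview

The dataset was created automatically during the evaluation run of the model KOCDIGITAL/Kocdigital-LLM-8b-v0.1 on the Open LLM Turkish Leaderboard v0.2.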

Evaluation results

Accuracy (Acc)

  • winogrande_tr_v0.2: 0.5560821484992101
  • truthfulqa_v0.2: 0.47707655422455664
  • mmlu_tr_v0.2: 0.4734896102935739
  • mmlu_humanities_v0.2: 0.42541562286495105
  • mmlu_formal_logic_v0.2: 0.3333333333333333
  • mmlu_high_school_european_history_v0.2: 0.5666666666666667
  • mmlu_high_school_us_history_v0.2: 0.6033519553072626
  • mmlu_high_school_world_history_v0.2: 0.6103286384976526
  • mmlu_international_law_v0.2: 0.628099173553719
  • mmlu_jurisprudence_v0.2: 0.6698113207547169
  • mmlu_logical_fallacies_v0.2: 0.45962732919254656
  • mmlu_moral_disputes_v0.2: 0.5324675324675324
  • mmlu_moral_scenarios_v0.2: 0.24770642201834864
  • mmlu_philosophy_v0.2: 0.5886287625418061
  • mmlu_prehistory_v0.2: 0.5133333333333333
  • mmlu_professional_law_v0.2: 0.33213256484149856
  • mmlu_world_religions_v0.2: 0.6607142857142857
  • mmlu_other_v0.2: 0.542136695421367
  • mmlu_business_ethics_v0.2: 0.5959595959595959
  • mmlu_clinical_knowledge_v0.2: 0.55859375
  • mmlu_college_medicine_v0.2: 0.4523809523809524
  • mmlu_global_facts_v0.2: 0.37755102040816324
  • mmlu_human_aging_v0.2: 0.47641509433962265
  • mmlu_management_v0.2: 0.6464646464646465
  • mmlu_marketing_v0.2: 0.6682027649769585
  • mmlu_medical_genetics_v0.2: 0.631578947368421
  • mmlu_miscellaneous_v0.2: 0.6370757180156658
  • mmlu_nutrition_v0.2: 0.5606557377049181
  • mmlu_professional_accounting_v0.2: 0.31899641577060933
  • mmlu_professional_medicine_v0.2: 0.5019157088122606
  • mmlu_virology_v0.2: 0.44025157232704404
  • mmlu_social_sciences_v0.2: 0.5394605394605395
  • mmlu_econometrics_v0.2: 0.3157894736842105
  • mmlu_high_school_geography_v0.2: 0.6548223350253807
  • mmlu_high_school_government_and_politics_v0.2: 0.5561497326203209
  • mmlu_high_school_macroeconomics_v0.2: 0.47692307692307695
  • mmlu_high_school_microeconomics_v0.2: 0.510548523206751
  • mmlu_high_school_psychology_v0.2: 0.6060037523452158
  • mmlu_human_sexuality_v0.2: 0.5826086956521739
  • mmlu_professional_psychology_v0.2: 0.43602693602693604
  • mmlu_public_relations_v0.2: 0.5277777777777778
  • mmlu_security_studies_v0.2: 0.5854700854700855
  • mmlu_sociology_v0.2: 0.676923076923077
  • mmlu_us_foreign_policy_v0.2: 0.696969696969697
  • mmlu_stem_v0.2: 0.41123595505617977
  • mmlu_abstract_algebra_v0.2: 0.3
  • mmlu_anatomy_v0.2: 0.4198473282442748
  • mmlu_astronomy: 0.5562913907284768
  • mmlu_college_biology_v0.2: 0.4788732394366197
  • mmlu_college_chemistry_v0.2: 0.3939393939393939
  • mmlu_college_computer_science_v0.2: 0.45454545454545453
  • mmlu_college_mathematics_v0.2: 0.33
  • mmlu_college_physics_v0.2: 0.3564356435643564
  • mmlu_computer_security_v0.2: 0.5
  • mmlu_conceptual_physics_v0.2: 0.4592274678111588
  • mmlu_electrical_engineering_v0.2: 0.5208333333333334
  • mmlu_elementary_mathematics_v0.2: 0.29222520107238603
  • mmlu_high_school_biology_v0.2: 0.5533333333333333
  • mmlu_high_school_chemistry_v0.2: 0.41116751269035534
  • mmlu_high_school_computer_science_v0.2: 0.59
  • mmlu_high_school_mathematics_v0.2: 0.28888888888888886
  • mmlu_high_school_physics_v0.2: 0.32653061224489793
  • mmlu_high_school_statistics_v0.2: 0.37962962962962965
  • mmlu_machine_learning_v0.2: 0.32142857142857145
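
The four category scores in the list above (humanities, other, social sciences, STEM) can be compared with the overall `mmlu_tr_v0.2` score. The sketch below uses only values from that list and assumes these four entries are the categories that make up `mmlu_tr_v0.2`, which the naming suggests but the source does not state.

```python
from statistics import mean

# Category-level accuracies copied from the list above.
group_acc = {
    "mmlu_humanities_v0.2": 0.42541562286495105,
    "mmlu_other_v0.2": 0.542136695421367,
    "mmlu_social_sciences_v0.2": 0.5394605394605395,
    "mmlu_stem_v0.2": 0.41123595505617977,
}
reported_overall = 0.4734896102935739  # mmlu_tr_v0.2

# Unweighted (macro) average: every category counts equally.
macro_average = mean(list(group_acc.values()))
gap = macro_average - reported_overall

print(f"macro average of categories: {macro_average:.4f}")
print(f"reported mmlu_tr_v0.2:       {reported_overall:.4f}")
print(f"gap:                         {gap:+.4f}")
```

The unweighted mean comes out slightly above the reported overall score. One possible explanation, which the source does not confirm, is that the overall score weights each subtask by its number of questions rather than averaging categories equally.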

Accuracy standard error (Acc_stderr)

  • winogrande_tr_v0.2: 0.013969328135351856
  • truthfulqa_v0.2: 0.015238053713783683
  • mmlu_tr_v0.2: 0.004153344496444143
  • mmlu_humanities_v0.2: 0.007147532375992986
  • mmlu_formal_logic_v0.2: 0.04216370213557835
  • mmlu_high_school_european_history_v0.2: 0.040595860168112737
  • mmlu_high_school_us_history_v0.2: 0.03666722301252672
  • mmlu_high_school_world_history_v0.2: 0.03349370481032241
  • mmlu_international_law_v0.2: 0.044120158066245044
  • mmlu_jurisprudence_v0.2: 0.045894715469579954
  • mmlu_logical_fallacies_v0.2: 0.03939940096720004
  • mmlu_moral_disputes_v0.2: 0.028476280736968677
  • mmlu_moral_scenarios_v0.2: 0.01462693167826288
  • mmlu_philosophy_v0.2: 0.028505559474849656
  • mmlu_prehistory_v0.2: 0.028905463615555037
  • mmlu_professional_law_v0.2: 0.012646275336803221
  • mmlu_world_religions_v0.2: 0.036637969765626804
  • mmlu_other_v0.2: 0.00888852342477086
  • mmlu_business_ethics_v0.2: 0.04956872738042619
  • mmlu_clinical_knowledge_v0.2: 0.031095474260005376
  • mmlu_college_medicine_v0.2: 0.038515291799729304
  • mmlu_global_facts_v0.2: 0.049221385784280064
  • mmlu_human_aging_v0.2: 0.03438310454050883
  • mmlu_management_v0.2: 0.048292065023611885
  • mmlu_marketing_v0.2: 0.032037870375751454
  • mmlu_medical_genetics_v0.2: 0.049753325624911644
  • mmlu_miscellaneous_v0.2: 0.0173849250124041
  • mmlu_nutrition_v0.2: 0.028465172711779244
  • mmlu_professional_accounting_v0.2: 0.027954079926164443
  • mmlu_professional_medicine_v0.2: 0.031008456046434155
  • mmlu_virology_v0.2: 0.03949283907134624
  • mmlu_social_sciences_v0.2: 0.008955688723395153
  • mmlu_econometrics_v0.2: 0.04372748290278006
  • mmlu_high_school_geography_v0.2: 0.03395901225226425
  • mmlu_high_school_government_and_politics_v0.2: 0.03642987131924726
  • mmlu_high_school_macroeconomics_v0.2: 0.025323990861736118
  • mmlu_high_school_microeconomics_v0.2: 0.032539983791662855
  • mmlu_high_school_psychology_v0.2: 0.02118497146460828
  • mmlu_human_sexuality_v0.2: 0.046185723795122625
  • mmlu_professional_psychology_v0.2: 0.020363784567183178
  • mmlu_public_relations_v0.2: 0.04826217294139894
  • mmlu_security_studies_v0.2: 0.03227396567623779
  • mmlu_sociology_v0.2: 0.03357544396403132
  • mmlu_us_foreign_policy_v0.2: 0.046423399544431185
  • mmlu_stem_v0.2: 0.008665314387467683
  • mmlu_abstract_algebra_v0.2: 0.046056618647183814
  • mmlu_anatomy_v0.2: 0.04328577215262973
  • mmlu_astronomy: 0.04056527902281732
  • mmlu_college_biology_…
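
An accuracy and its standard error can be combined into an approximate confidence interval, and the pair also hints at how many questions a task contains. The sketch below makes two assumptions that the source does not state: that a normal approximation is adequate (z = 1.96 for roughly 95% coverage), and that the standard error is the sample standard deviation of per-question 0/1 scores divided by the square root of the question count. The helper names are hypothetical.

```python
def confidence_interval(acc, stderr, z=1.96):
    """Normal-approximation interval: acc ± z * stderr."""
    return acc - z * stderr, acc + z * stderr


def implied_sample_size(acc, stderr):
    """Question count n implied by stderr = sqrt(acc * (1 - acc) / (n - 1)).

    This formula is an assumption about how the standard error was computed.
    """
    return round(acc * (1 - acc) / stderr**2) + 1


# (accuracy, standard error) pairs copied from the two lists above.
winogrande = (0.5560821484992101, 0.013969328135351856)
abstract_algebra = (0.3, 0.046056618647183814)

low, high = confidence_interval(*winogrande)
print(f"winogrande_tr_v0.2 interval: [{low:.3f}, {high:.3f}]")
print("implied n, winogrande_tr_v0.2:", implied_sample_size(*winogrande))
print("implied n, mmlu_abstract_algebra_v0.2:", implied_sample_size(*abstract_algebra))
```

Under these assumptions the winogrande_tr_v0.2 interval spans roughly 0.529 to 0.583. Subtasks with standard errors near 0.04 to 0.05 are based on far fewer questions, so their individual scores should be read with correspondingly more caution.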