OpenLLMTurkishLeadboardv2/details_KOCDIGITAL__Kocdigital-LLM-8b-v0.1
Source: Hugging Face. Updated 2024-05-03; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/OpenLLMTurkishLeadboardv2/details_KOCDIGITAL__Kocdigital-LLM-8b-v0.1
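Resource description:
This dataset was created automatically during the evaluation of the model KOCDIGITAL/Kocdigital-LLM-8b-v0.1 on the Open LLM Turkish Leaderboard v0.2. It contains detailed results for tasks and subtasks such as winogrande_tr_v0.2, truthfulqa_v0.2, and mmlu_tr_v0.2. The dataset is structured into results, groups, and group subtasks, and reports accuracy and standard error for each task.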
Provider:
OpenLLMTurkishLeadboardv2
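Summary of original information
Dataset overview
The dataset was created automatically during an evaluation run of the model KOCDIGITAL/Kocdigital-LLM-8b-v0.1 on the Open LLM Turkish Leaderboard v0.2.
Evaluation results
Accuracy (acc)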
- winogrande_tr_v0.2: 0.5560821484992101
- truthfulqa_v0.2: 0.47707655422455664
- mmlu_tr_v0.2: 0.4734896102935739
- mmlu_humanities_v0.2: 0.42541562286495105
- mmlu_formal_logic_v0.2: 0.3333333333333333
- mmlu_high_school_european_history_v0.2: 0.5666666666666667
- mmlu_high_school_us_history_v0.2: 0.6033519553072626
- mmlu_high_school_world_history_v0.2: 0.6103286384976526
- mmlu_international_law_v0.2: 0.628099173553719
- mmlu_jurisprudence_v0.2: 0.6698113207547169
- mmlu_logical_fallacies_v0.2: 0.45962732919254656
- mmlu_moral_disputes_v0.2: 0.5324675324675324
- mmlu_moral_scenarios_v0.2: 0.24770642201834864
- mmlu_philosophy_v0.2: 0.5886287625418061
- mmlu_prehistory_v0.2: 0.5133333333333333
- mmlu_professional_law_v0.2: 0.33213256484149856
- mmlu_world_religions_v0.2: 0.6607142857142857
- mmlu_other_v0.2: 0.542136695421367
- mmlu_business_ethics_v0.2: 0.5959595959595959
- mmlu_clinical_knowledge_v0.2: 0.55859375
- mmlu_college_medicine_v0.2: 0.4523809523809524
- mmlu_global_facts_v0.2: 0.37755102040816324
- mmlu_human_aging_v0.2: 0.47641509433962265
- mmlu_management_v0.2: 0.6464646464646465
- mmlu_marketing_v0.2: 0.6682027649769585
- mmlu_medical_genetics_v0.2: 0.631578947368421
- mmlu_miscellaneous_v0.2: 0.6370757180156658
- mmlu_nutrition_v0.2: 0.5606557377049181
- mmlu_professional_accounting_v0.2: 0.31899641577060933
- mmlu_professional_medicine_v0.2: 0.5019157088122606
- mmlu_virology_v0.2: 0.44025157232704404
- mmlu_social_sciences_v0.2: 0.5394605394605395
- mmlu_econometrics_v0.2: 0.3157894736842105
- mmlu_high_school_geography_v0.2: 0.6548223350253807
- mmlu_high_school_government_and_politics_v0.2: 0.5561497326203209
- mmlu_high_school_macroeconomics_v0.2: 0.47692307692307695
- mmlu_high_school_microeconomics_v0.2: 0.510548523206751
- mmlu_high_school_psychology_v0.2: 0.6060037523452158
- mmlu_human_sexuality_v0.2: 0.5826086956521739
- mmlu_professional_psychology_v0.2: 0.43602693602693604
- mmlu_public_relations_v0.2: 0.5277777777777778
- mmlu_security_studies_v0.2: 0.5854700854700855
- mmlu_sociology_v0.2: 0.676923076923077
- mmlu_us_foreign_policy_v0.2: 0.696969696969697
- mmlu_stem_v0.2: 0.41123595505617977
- mmlu_abstract_algebra_v0.2: 0.3
- mmlu_anatomy_v0.2: 0.4198473282442748
- mmlu_astronomy: 0.5562913907284768
- mmlu_college_biology_v0.2: 0.4788732394366197
- mmlu_college_chemistry_v0.2: 0.3939393939393939
- mmlu_college_computer_science_v0.2: 0.45454545454545453
- mmlu_college_mathematics_v0.2: 0.33
- mmlu_college_physics_v0.2: 0.3564356435643564
- mmlu_computer_security_v0.2: 0.5
- mmlu_conceptual_physics_v0.2: 0.4592274678111588
- mmlu_electrical_engineering_v0.2: 0.5208333333333334
- mmlu_elementary_mathematics_v0.2: 0.29222520107238603
- mmlu_high_school_biology_v0.2: 0.5533333333333333
- mmlu_high_school_chemistry_v0.2: 0.41116751269035534
- mmlu_high_school_computer_science_v0.2: 0.59
- mmlu_high_school_mathematics_v0.2: 0.28888888888888886
- mmlu_high_school_physics_v0.2: 0.32653061224489793
- mmlu_high_school_statistics_v0.2: 0.37962962962962965
- mmlu_machine_learning_v0.2: 0.32142857142857145
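
Each accuracy in this listing is paired with a standard error (acc_stderr), which helps show how much weight a single score can bear. The sketch below is an illustration, not part of the dataset: it builds a normal-approximation 95% interval from a reported (acc, acc_stderr) pair, and estimates how many items a task contained. The sample-size estimate rests on an assumption that the listing does not state, namely that acc_stderr is the sample standard deviation of per-item 0/1 scores divided by sqrt(n). The helper names `confidence_interval` and `implied_sample_size` are hypothetical; the numbers are copied from the listing.

```python
def confidence_interval(acc, stderr, z=1.96):
    """Normal-approximation interval: acc +/- z * stderr (z=1.96 for ~95%)."""
    return acc - z * stderr, acc + z * stderr


def implied_sample_size(acc, stderr):
    """Estimate the number of items n behind a reported accuracy.

    ASSUMPTION (not stated in the listing): stderr is the sample standard
    deviation of 0/1 scores divided by sqrt(n), which simplifies to
    stderr**2 = acc * (1 - acc) / (n - 1).
    """
    return round(acc * (1.0 - acc) / stderr ** 2) + 1


# (acc, acc_stderr) pairs copied from the listing.
reported = {
    "winogrande_tr_v0.2": (0.5560821484992101, 0.013969328135351856),
    "mmlu_abstract_algebra_v0.2": (0.3, 0.046056618647183814),
    "mmlu_formal_logic_v0.2": (0.3333333333333333, 0.04216370213557835),
}

for task, (acc, se) in reported.items():
    low, high = confidence_interval(acc, se)
    n = implied_sample_size(acc, se)
    print(f"{task}: acc={acc:.4f}, 95% CI=({low:.4f}, {high:.4f}), implied n={n}")
```

Under that assumption, the pair reported for mmlu_abstract_algebra_v0.2 implies about 100 items, so its interval spans roughly nine percentage points on either side of 0.30. Small subtasks like this one are best compared with caution.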
Accuracy standard error (acc_stderr)
- winogrande_tr_v0.2: 0.013969328135351856
- truthfulqa_v0.2: 0.015238053713783683
- mmlu_tr_v0.2: 0.004153344496444143
- mmlu_humanities_v0.2: 0.007147532375992986
- mmlu_formal_logic_v0.2: 0.04216370213557835
- mmlu_high_school_european_history_v0.2: 0.040595860168112737
- mmlu_high_school_us_history_v0.2: 0.03666722301252672
- mmlu_high_school_world_history_v0.2: 0.03349370481032241
- mmlu_international_law_v0.2: 0.044120158066245044
- mmlu_jurisprudence_v0.2: 0.045894715469579954
- mmlu_logical_fallacies_v0.2: 0.03939940096720004
- mmlu_moral_disputes_v0.2: 0.028476280736968677
- mmlu_moral_scenarios_v0.2: 0.01462693167826288
- mmlu_philosophy_v0.2: 0.028505559474849656
- mmlu_prehistory_v0.2: 0.028905463615555037
- mmlu_professional_law_v0.2: 0.012646275336803221
- mmlu_world_religions_v0.2: 0.036637969765626804
- mmlu_other_v0.2: 0.00888852342477086
- mmlu_business_ethics_v0.2: 0.04956872738042619
- mmlu_clinical_knowledge_v0.2: 0.031095474260005376
- mmlu_college_medicine_v0.2: 0.038515291799729304
- mmlu_global_facts_v0.2: 0.049221385784280064
- mmlu_human_aging_v0.2: 0.03438310454050883
- mmlu_management_v0.2: 0.048292065023611885
- mmlu_marketing_v0.2: 0.032037870375751454
- mmlu_medical_genetics_v0.2: 0.049753325624911644
- mmlu_miscellaneous_v0.2: 0.0173849250124041
- mmlu_nutrition_v0.2: 0.028465172711779244
- mmlu_professional_accounting_v0.2: 0.027954079926164443
- mmlu_professional_medicine_v0.2: 0.031008456046434155
- mmlu_virology_v0.2: 0.03949283907134624
- mmlu_social_sciences_v0.2: 0.008955688723395153
- mmlu_econometrics_v0.2: 0.04372748290278006
- mmlu_high_school_geography_v0.2: 0.03395901225226425
- mmlu_high_school_government_and_politics_v0.2: 0.03642987131924726
- mmlu_high_school_macroeconomics_v0.2: 0.025323990861736118
- mmlu_high_school_microeconomics_v0.2: 0.032539983791662855
- mmlu_high_school_psychology_v0.2: 0.02118497146460828
- mmlu_human_sexuality_v0.2: 0.046185723795122625
- mmlu_professional_psychology_v0.2: 0.020363784567183178
- mmlu_public_relations_v0.2: 0.04826217294139894
- mmlu_security_studies_v0.2: 0.03227396567623779
- mmlu_sociology_v0.2: 0.03357544396403132
- mmlu_us_foreign_policy_v0.2: 0.046423399544431185
- mmlu_stem_v0.2: 0.008665314387467683
- mmlu_abstract_algebra_v0.2: 0.046056618647183814
- mmlu_anatomy_v0.2: 0.04328577215262973
- mmlu_astronomy: 0.04056527902281732
- **mmlu_college_biology_



