
OpenLLMTurkishLeadboardv2/details_Orbina__Orbita-v0.1

Hugging Face | Updated 2024-04-27 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/OpenLLMTurkishLeadboardv2/details_Orbina__Orbita-v0.1
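Summary:
This dataset was created automatically during the evaluation of the model Orbina/Orbita-v0.1 on the Open LLM Turkish Leaderboard v0.2. It contains results for multiple evaluation tasks, such as winogrande_tr, truthfulqa_v0.2, and mmlu_tr_v0.2, each reported with an accuracy and a standard error. It also lists the configuration of each task in detail, including the task name, dataset path, test split, few-shot split, and the document-to-text conversion method.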
Provider:
OpenLLMTurkishLeadboardv2
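Summary of original information

Dataset overview

The dataset was created automatically during the evaluation run of the model Orbina/Orbita-v0.1 on the Open LLM Turkish Leaderboard v0.2.

Evaluation results

Accuracy (Acc) and standard error (Acc_stderr)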

  • winogrande_tr:

    • Acc: 0.5616113744075829
    • Acc_stderr: 0.01395090314113922
  • truthfulqa_v0.2:

    • Acc: 0.50778392726845
    • Acc_stderr: 0.015415009310400483
  • mmlu_tr_v0.2:

    • Acc: 0.49515640020705465
    • Acc_stderr: 0.004171392981675135
  • mmlu_humanities_v0.2:

    • Acc: 0.44067410612616714
    • Acc_stderr: 0.0072042837016912074
  • mmlu_formal_logic_v0.2:

    • Acc: 0.42857142857142855
    • Acc_stderr: 0.04426266681379909
  • mmlu_high_school_european_history_v0.2:

    • Acc: 0.6133333333333333
    • Acc_stderr: 0.03989546370031041
  • mmlu_high_school_us_history_v0.2:

    • Acc: 0.5810055865921788
    • Acc_stderr: 0.03698147842986738
  • mmlu_high_school_world_history_v0.2:

    • Acc: 0.6666666666666666
    • Acc_stderr: 0.0323761954119088
  • mmlu_international_law_v0.2:

    • Acc: 0.6859504132231405
    • Acc_stderr: 0.042369647530410184
  • mmlu_jurisprudence_v0.2:

    • Acc: 0.5849056603773585
    • Acc_stderr: 0.048086333949706635
  • mmlu_logical_fallacies_v0.2:

    • Acc: 0.4968944099378882
    • Acc_stderr: 0.039527708265086496
  • mmlu_moral_disputes_v0.2:

    • Acc: 0.5324675324675324
    • Acc_stderr: 0.028476280736968677
  • mmlu_moral_scenarios_v0.2:

    • Acc: 0.23853211009174313
    • Acc_stderr: 0.01444076314197968
  • mmlu_philosophy_v0.2:

    • Acc: 0.5518394648829431
    • Acc_stderr: 0.028808128856107652
  • mmlu_prehistory_v0.2:

    • Acc: 0.5366666666666666
    • Acc_stderr: 0.02883789055433726
  • mmlu_professional_law_v0.2:

    • Acc: 0.37319884726224783
    • Acc_stderr: 0.012986640233707492
  • mmlu_world_religions_v0.2:

    • Acc: 0.6071428571428571
    • Acc_stderr: 0.03779240554853983
  • mmlu_other_v0.2:

    • Acc: 0.5325149303251493
    • Acc_stderr: 0.008911579610042552
  • mmlu_business_ethics_v0.2:

    • Acc: 0.5959595959595959
    • Acc_stderr: 0.04956872738042618
  • mmlu_clinical_knowledge_v0.2:

    • Acc: 0.5390625
    • Acc_stderr: 0.0312155140597541
  • mmlu_college_medicine_v0.2:

    • Acc: 0.5178571428571429
    • Acc_stderr: 0.0386664782674949
  • mmlu_global_facts_v0.2:

    • Acc: 0.29591836734693877
    • Acc_stderr: 0.04634593001555603
  • mmlu_human_aging_v0.2:

    • Acc: 0.5188679245283019
    • Acc_stderr: 0.03439690285738042
  • mmlu_management_v0.2:

    • Acc: 0.6262626262626263
    • Acc_stderr: 0.04887069039502487
  • mmlu_marketing_v0.2:

    • Acc: 0.7004608294930875
    • Acc_stderr: 0.03116677479474285
  • mmlu_medical_genetics_v0.2:

    • Acc: 0.5578947368421052
    • Acc_stderr: 0.051224183891818126
  • mmlu_miscellaneous_v0.2:

    • Acc: 0.6096605744125326
    • Acc_stderr: 0.017637399302140862
  • mmlu_nutrition_v0.2:

    • Acc: 0.521311475409836
    • Acc_stderr: 0.02865090594093615
  • mmlu_professional_accounting_v0.2:

    • Acc: 0.3154121863799283
    • Acc_stderr: 0.027869643826017407
  • mmlu_professional_medicine_v0.2:

    • Acc: 0.4789272030651341
    • Acc_stderr: 0.03098113180316629
  • mmlu_virology_v0.2:

    • Acc: 0.4779874213836478
    • Acc_stderr: 0.03973929649561242
  • mmlu_social_sciences_v0.2:

    • Acc: 0.572094572094572
    • Acc_stderr: 0.00888927240965376
  • mmlu_econometrics_v0.2:

    • Acc: 0.37719298245614036
    • Acc_stderr: 0.04559522141958216
  • mmlu_high_school_geography_v0.2:

    • Acc: 0.6395939086294417
    • Acc_stderr: 0.03429416121196761
  • mmlu_high_school_government_and_politics_v0.2:

    • Acc: 0.5989304812834224
    • Acc_stderr: 0.03593697887872985
  • mmlu_high_school_macroeconomics_v0.2:

    • Acc: 0.5128205128205128
    • Acc_stderr: 0.025342671293807257
  • mmlu_high_school_microeconomics_v0.2:

    • Acc: 0.5569620253164557
    • Acc_stderr: 0.032335327775334835
  • mmlu_high_school_psychology_v0.2:

    • Acc: 0.6435272045028143
    • Acc_stderr: 0.020765425535814862
  • mmlu_human_sexuality_v0.2:

    • Acc: 0.6173913043478261
    • Acc_stderr: 0.04552031372871532
  • mmlu_professional_psychology_v0.2:

    • Acc: 0.45286195286195285
    • Acc_stderr: 0.020441088985356612
  • mmlu_public_relations_v0.2:

    • Acc: 0.6296296296296297
    • Acc_stderr: 0.0466840803302493
  • mmlu_security_studies_v0.2:

    • Acc: 0.6068376068376068
    • Acc_stderr: 0.03199957924651048
  • mmlu_sociology_v0.2:

    • Acc: 0.7128205128205128
    • Acc_stderr: 0.032483733385398866
  • mmlu_us_foreign_policy_v0.2:

    • Acc: 0.7373737373737373
    • Acc_stderr: 0.04445287676983945
  • mmlu_stem_v0.2:

    • Acc: 0.46163723916532906
    • Acc_stderr: 0.008775995548684237
  • mmlu_abstract_algebra_v0.2:

    • Acc: 0.27
    • Acc_stderr: 0.04461960433384741
  • mmlu_anatomy_v0.2:

    • Acc: 0.44274809160305345
    • Acc_stderr: 0.04356447202665069
  • mmlu_astronomy:

    • Acc: 0.5231788079470199
    • Acc_stderr: 0.04078093859163085
  • mmlu_college_biology_v0.2:

    • Acc: 0.528169014084507
    • Acc_stderr: 0.042040718749170536
  • mmlu_college_chemistry_v0.2:

    • Acc: 0.3838383838383838
    • Acc_stderr: 0.04912566964083466
  • mmlu_college_computer_science_v0.2:

    • Acc: 0.494949494949495
    • Acc_stderr: 0.05050505050505048
  • mmlu_college_mathematics_v0.2:

    • Acc: 0.4
    • Acc_stderr: 0.04923659639173309
  • mmlu_college_physics_v0.2:

    • Acc: 0.297029702970297
    • Acc_stderr: 0.04569497330381909
  • mmlu_computer_security_v0.2:

    • Acc: 0.6
    • Acc_stderr: 0.049236596391733084
  • mmlu_conceptual_physics_v0.2:

    • Acc: 0.5107296137339056
    • Acc_stderr: 0.03281904904358935
  • mmlu_electrical_engineering_v0.2:

    • Acc: 0.4861111111111111
    • Acc_stderr: 0.041795966175810016
  • mmlu_elementary_mathematics_v0.2:

    • Acc: 0.4772117962466488
    • Acc_stderr: 0.025896853805342974
  • mmlu_high_school_biology_v0.2:

    • Acc: 0.6433333333333333
    • Acc_stderr: 0.027702163901059916
  • mmlu_high_school_chemistry_v0.2:

    • Acc: 0.4010152284263959
    • Acc_stderr: 0.03500743470573262
  • mmlu_high_school_computer_science_v0
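
The Acc and Acc_stderr pairs listed above can be turned into approximate confidence intervals. The following is a minimal sketch rather than part of the dataset: the helper names are hypothetical, the 95% interval uses a normal approximation, and the sample-size estimate assumes that Acc_stderr was computed as sqrt(Acc * (1 - Acc) / (n - 1)) over per-item 0/1 scores. The source does not state how the standard error was derived, so that formula is an assumption, and it would not apply to aggregated groups such as mmlu_tr_v0.2 or to non-binary metrics.

```python
import math

# Values copied from the list above: task -> (Acc, Acc_stderr).
RESULTS = {
    "winogrande_tr": (0.5616113744075829, 0.01395090314113922),
    "mmlu_abstract_algebra_v0.2": (0.27, 0.04461960433384741),
    "mmlu_college_computer_science_v0.2": (0.494949494949495, 0.05050505050505048),
}

Z_95 = 1.96  # normal-approximation critical value for a 95% interval


def confidence_interval(acc, stderr, z=Z_95):
    """Return (low, high) for acc +/- z * stderr, clipped to [0, 1]."""
    return max(0.0, acc - z * stderr), min(1.0, acc + z * stderr)


def implied_sample_size(acc, stderr):
    """Estimate n, ASSUMING stderr = sqrt(acc * (1 - acc) / (n - 1)).

    This assumption is not confirmed by the dataset description; it only
    makes sense for tasks scored as per-item 0/1 accuracy.
    """
    return round(acc * (1.0 - acc) / stderr**2 + 1)


def overlaps(interval_a, interval_b):
    """True if two (low, high) intervals share at least one point."""
    return interval_a[0] <= interval_b[1] and interval_b[0] <= interval_a[1]


if __name__ == "__main__":
    for task, (acc, stderr) in RESULTS.items():
        low, high = confidence_interval(acc, stderr)
        n = implied_sample_size(acc, stderr)
        print(f"{task}: Acc={acc:.4f}, 95% CI=[{low:.4f}, {high:.4f}], implied n ~ {n}")
```

Under the stated assumption, small subtasks such as mmlu_abstract_algebra_v0.2 come out at roughly 100 items, which is why their intervals are wide (about ±0.09). Differences of a few points between such subtasks should therefore be read with caution.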
