
OpenLLMTurkishLeadboardv2/details_Trendyol__Trendyol-LLM-7b-chat-v1.0

Source: Hugging Face. Updated: 2024-04-27. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/OpenLLMTurkishLeadboardv2/details_Trendyol__Trendyol-LLM-7b-chat-v1.0
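Resource summary:
This dataset was created automatically during the evaluation of the model Trendyol/Trendyol-LLM-7b-chat-v1.0 on the Open LLM Turkish Leaderboard v0.2. It contains the results of multiple evaluation tasks, such as winogrande_tr, truthfulqa_v0.2, and mmlu_tr_v0.2, each reported with an accuracy and a standard error. It also provides configuration details for each task, such as the task name, dataset path, test split, and few-shot split.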
Provider:
OpenLLMTurkishLeadboardv2
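Summary of original information

Dataset overview

The dataset was created automatically during the evaluation run of the model Trendyol/Trendyol-LLM-7b-chat-v1.0 on the Open LLM Turkish Leaderboard v0.2.

Evaluation results

Accuracy (Acc) and standard error (Acc_stderr)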
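Because the dataset is hosted on Hugging Face, its per-task configurations can probably be discovered programmatically. The following is a minimal sketch, assuming the `datasets` library is installed and the repository is publicly reachable. The helper name `list_task_configs` is hypothetical, and the configuration names it returns are not documented on this page.

```python
# Repository identifier taken from the download link on this page.
REPO_ID = "OpenLLMTurkishLeadboardv2/details_Trendyol__Trendyol-LLM-7b-chat-v1.0"


def list_task_configs(repo_id: str = REPO_ID) -> list:
    """Return the configuration names published in the dataset repo.

    Hypothetical helper: it needs network access, and the names it
    returns depend on how the leaderboard uploaded the details.
    """
    # Imported lazily so this module loads even without `datasets` installed.
    from datasets import get_dataset_config_names

    return get_dataset_config_names(repo_id)
```

Calling `list_task_configs()` should return one entry per published configuration. Any one of them can then be passed as the second argument to `datasets.load_dataset(REPO_ID, name)` to fetch the corresponding records.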

  • winogrande_tr:

    • Acc: 0.5639810426540285
    • Acc_stderr: 0.013942468639504206
  • truthfulqa_v0.2:

    • Acc: 0.431264994699023
    • Acc_stderr: 0.01584776908566045
  • mmlu_tr_v0.2:

    • Acc: 0.3948827922798196
    • Acc_stderr: 0.0041515196850484635
  • mmlu_humanities_v0.2:

    • Acc: 0.37144158506035074
    • Acc_stderr: 0.0072290666516993994
  • mmlu_formal_logic_v0.2:

    • Acc: 0.31746031746031744
    • Acc_stderr: 0.04163453031302859
  • mmlu_high_school_european_history_v0.2:

    • Acc: 0.4533333333333333
    • Acc_stderr: 0.040782795278808064
  • mmlu_high_school_us_history_v0.2:

    • Acc: 0.39664804469273746
    • Acc_stderr: 0.03666722301252672
  • mmlu_high_school_world_history_v0.2:

    • Acc: 0.4835680751173709
    • Acc_stderr: 0.03432159174112686
  • mmlu_international_law_v0.2:

    • Acc: 0.5289256198347108
    • Acc_stderr: 0.04556710331269498
  • mmlu_jurisprudence_v0.2:

    • Acc: 0.39622641509433965
    • Acc_stderr: 0.047732492983673595
  • mmlu_logical_fallacies_v0.2:

    • Acc: 0.37267080745341613
    • Acc_stderr: 0.03822525970525206
  • mmlu_moral_disputes_v0.2:

    • Acc: 0.42207792207792205
    • Acc_stderr: 0.028187838402155125
  • mmlu_moral_scenarios_v0.2:

    • Acc: 0.31880733944954126
    • Acc_stderr: 0.015790288247596724
  • mmlu_philosophy_v0.2:

    • Acc: 0.4414715719063545
    • Acc_stderr: 0.028765099513410327
  • mmlu_prehistory_v0.2:

    • Acc: 0.39666666666666667
    • Acc_stderr: 0.028291496425144964
  • mmlu_professional_law_v0.2:

    • Acc: 0.31412103746397696
    • Acc_stderr: 0.012463327930569887
  • mmlu_world_religions_v0.2:

    • Acc: 0.5238095238095238
    • Acc_stderr: 0.038647269200068196
  • mmlu_other_v0.2:

    • Acc: 0.4455872594558726
    • Acc_stderr: 0.008953386445254787
  • mmlu_business_ethics_v0.2:

    • Acc: 0.48484848484848486
    • Acc_stderr: 0.050484431990002604
  • mmlu_clinical_knowledge_v0.2:

    • Acc: 0.41015625
    • Acc_stderr: 0.030801585176036275
  • mmlu_college_medicine_v0.2:

    • Acc: 0.4226190476190476
    • Acc_stderr: 0.03822500265005227
  • mmlu_global_facts_v0.2:

    • Acc: 0.35714285714285715
    • Acc_stderr: 0.048651065269982086
  • mmlu_human_aging_v0.2:

    • Acc: 0.4339622641509434
    • Acc_stderr: 0.0341198763105892
  • mmlu_management_v0.2:

    • Acc: 0.5656565656565656
    • Acc_stderr: 0.05007027870966083
  • mmlu_marketing_v0.2:

    • Acc: 0.5529953917050692
    • Acc_stderr: 0.03382905613755031
  • mmlu_medical_genetics_v0.2:

    • Acc: 0.5157894736842106
    • Acc_stderr: 0.05154534179593067
  • mmlu_miscellaneous_v0.2:

    • Acc: 0.5287206266318538
    • Acc_stderr: 0.018047690113669693
  • mmlu_nutrition_v0.2:

    • Acc: 0.3770491803278688
    • Acc_stderr: 0.02779643435707082
  • mmlu_professional_accounting_v0.2:

    • Acc: 0.31899641577060933
    • Acc_stderr: 0.02795407992616445
  • mmlu_professional_medicine_v0.2:

    • Acc: 0.3486590038314176
    • Acc_stderr: 0.029554116131305524
  • mmlu_virology_v0.2:

    • Acc: 0.42138364779874216
    • Acc_stderr: 0.039283090474869206
  • mmlu_social_sciences_v0.2:

    • Acc: 0.4368964368964369
    • Acc_stderr: 0.008983743720354284
  • mmlu_econometrics_v0.2:

    • Acc: 0.32456140350877194
    • Acc_stderr: 0.04404556157374768
  • mmlu_high_school_geography_v0.2:

    • Acc: 0.5177664974619289
    • Acc_stderr: 0.03569173227650389
  • mmlu_high_school_government_and_politics_v0.2:

    • Acc: 0.37433155080213903
    • Acc_stderr: 0.03548492341343032
  • mmlu_high_school_macroeconomics_v0.2:

    • Acc: 0.3769230769230769
    • Acc_stderr: 0.024570975364225995
  • mmlu_high_school_microeconomics_v0.2:

    • Acc: 0.41350210970464135
    • Acc_stderr: 0.03205649904851858
  • mmlu_high_school_psychology_v0.2:

    • Acc: 0.49530956848030017
    • Acc_stderr: 0.02167679538974159
  • mmlu_human_sexuality_v0.2:

    • Acc: 0.5130434782608696
    • Acc_stderr: 0.04681335351503156
  • mmlu_professional_psychology_v0.2:

    • Acc: 0.37542087542087543
    • Acc_stderr: 0.019884999970068064
  • mmlu_public_relations_v0.2:

    • Acc: 0.5277777777777778
    • Acc_stderr: 0.04826217294139894
  • mmlu_security_studies_v0.2:

    • Acc: 0.4230769230769231
    • Acc_stderr: 0.032366121762202014
  • mmlu_sociology_v0.2:

    • Acc: 0.49743589743589745
    • Acc_stderr: 0.03589743589743589
  • mmlu_us_foreign_policy_v0.2:

    • Acc: 0.5959595959595959
    • Acc_stderr: 0.049568727380426184
  • mmlu_stem_v0.2:

    • Acc: 0.33836276083467093
    • Acc_stderr: 0.008421487382708258
  • mmlu_abstract_algebra_v0.2:

    • Acc: 0.27
    • Acc_stderr: 0.0446196043338474
  • mmlu_anatomy_v0.2:

    • Acc: 0.37404580152671757
    • Acc_stderr: 0.04243869242230524
  • mmlu_astronomy:

    • Acc: 0.4370860927152318
    • Acc_stderr: 0.04050035722230636
  • mmlu_college_biology_v0.2:

    • Acc: 0.39436619718309857
    • Acc_stderr: 0.04115715424330713
  • mmlu_college_chemistry_v0.2:

    • Acc: 0.3434343434343434
    • Acc_stderr: 0.0479675905875748
  • mmlu_college_computer_science_v0.2:

    • Acc: 0.3333333333333333
    • Acc_stderr: 0.04761904761904759
  • mmlu_college_mathematics_v0.2:

    • Acc: 0.36
    • Acc_stderr: 0.048241815132442176
  • mmlu_college_physics_v0.2:

    • Acc: 0.37623762376237624
    • Acc_stderr: 0.048444078505841884
  • mmlu_computer_security_v0.2:

    • Acc: 0.43
    • Acc_stderr: 0.049756985195624284
  • mmlu_conceptual_physics_v0.2:

    • Acc: 0.3218884120171674
    • Acc_stderr: 0.03067321238265887
  • mmlu_electrical_engineering_v0.2:

    • Acc: 0.4236111111111111
    • Acc_stderr: 0.0413212501972337
  • mmlu_elementary_mathematics_v0.2:

    • Acc: 0.3136729222520107
    • Acc_stderr: 0.024056509418958694
  • mmlu_high_school_biology_v0.2:

    • Acc: 0.41333333333333333
    • Acc_stderr: 0.028478055207315847
  • mmlu_high_school_chemistry_v0.2:

    • Acc: 0.28426395939086296
    • Acc_stderr: 0
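The accuracy and standard-error pairs above can be turned into approximate confidence intervals, and the standard error also hints at how many questions each task contains. The sketch below rests on two assumptions not stated on this page: a normal approximation for the 95% interval (z = 1.96), and a standard error computed as sqrt(acc * (1 - acc) / (n - 1)), which is consistent with the listed values for the tasks checked here. The function names are illustrative.

```python
def confidence_interval(acc: float, stderr: float, z: float = 1.96):
    """Approximate confidence interval for an accuracy (normal approximation)."""
    return acc - z * stderr, acc + z * stderr


def implied_sample_size(acc: float, stderr: float) -> int:
    """Estimate the number of evaluated items n.

    Assumes stderr = sqrt(acc * (1 - acc) / (n - 1)); this is an
    assumption about how the leaderboard computed Acc_stderr.
    """
    return round(acc * (1.0 - acc) / stderr**2 + 1)


# A few (Acc, Acc_stderr) pairs copied from the list above.
results = {
    "winogrande_tr": (0.5639810426540285, 0.013942468639504206),
    "mmlu_abstract_algebra_v0.2": (0.27, 0.0446196043338474),
    "mmlu_sociology_v0.2": (0.49743589743589745, 0.03589743589743589),
}

for task, (acc, stderr) in results.items():
    low, high = confidence_interval(acc, stderr)
    n = implied_sample_size(acc, stderr)
    print(f"{task}: acc={acc:.3f}, 95% CI=({low:.3f}, {high:.3f}), n~{n}")
```

Under these assumptions, mmlu_abstract_algebra_v0.2 works out to about 100 items, so its interval is wide: subtask scores based on so few questions should be compared with caution.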