OpenLLMTurkishLeadboardv2/details_umarigan__LLama-3-8B-Instruction-tr

Name: OpenLLMTurkishLeadboardv2/details_umarigan__LLama-3-8B-Instruction-tr
Creator: OpenLLMTurkishLeadboardv2
Published: 2024-04-28 12:59:11
License: No description available

Source: Hugging Face. Updated 2024-04-28; indexed 2024-06-12.

Download link:

https://hf-mirror.com/datasets/OpenLLMTurkishLeadboardv2/details_umarigan__LLama-3-8B-Instruction-tr

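Resource summary:

This dataset was created automatically when the model umarigan/LLama-3-8B-Instruction-tr was evaluated on the Open LLM Turkish Leaderboard v0.2. It contains results for multiple evaluation tasks, such as winogrande_tr-v0.2, truthfulqa_v0.2, and mmlu_tr_v0.2, each with an accuracy and its standard error. It also lists the configuration of each task in detail, including the task name, dataset path, test split, and few-shot split.

Provided by: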

OpenLLMTurkishLeadboardv2

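Summary of original information

Dataset overview

The dataset was created automatically during an evaluation run of the model umarigan/LLama-3-8B-Instruction-tr for the Open LLM Turkish Leaderboard v0.2.

Evaluation results

The dataset contains evaluation results for several tasks and task groups: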

winogrande_tr-v0.2
- Accuracy (acc,none): 0.5489731437598736
- Accuracy standard error (acc_stderr,none): 0.013990443694712388
truthfulqa_v0.2
- Accuracy (acc,none): 0.5027720461732642
- Accuracy standard error (acc_stderr,none): 0.015581965524095102
mmlu_tr_v0.2
- Accuracy (acc,none): 0.4736375064704577
- Accuracy standard error (acc_stderr,none): 0.004175832220391306
mmlu_humanities_v0.2
- Accuracy (acc,none): 0.43475290366659075
- Accuracy standard error (acc_stderr,none): 0.007162716390522436
mmlu_other_v0.2
- Accuracy (acc,none): 0.5248838752488387
- Accuracy standard error (acc_stderr,none): 0.008917781405763416
mmlu_social_sciences_v0.2
- Accuracy (acc,none): 0.5244755244755245
- Accuracy standard error (acc_stderr,none): 0.008982696903864927
mmlu_stem_v0.2
- Accuracy (acc,none): 0.4298555377207063
- Accuracy standard error (acc_stderr,none): 0.00878908320599801
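
Each accuracy above is reported together with a standard error, so the two can be combined into an approximate confidence interval. The following is a minimal sketch, assuming a normal approximation (accuracy ± 1.96 × standard error for a 95% interval). The names `results`, `Z_95`, `confidence_interval`, and `intervals` are illustrative and are not part of the dataset.

```python
# Approximate 95% confidence intervals from the reported accuracy and
# standard error. Assumption: a normal approximation is adequate here.

# (accuracy, standard error) pairs copied from the results listed above.
results = {
    "winogrande_tr-v0.2": (0.5489731437598736, 0.013990443694712388),
    "truthfulqa_v0.2": (0.5027720461732642, 0.015581965524095102),
    "mmlu_tr_v0.2": (0.4736375064704577, 0.004175832220391306),
    "mmlu_humanities_v0.2": (0.43475290366659075, 0.007162716390522436),
    "mmlu_other_v0.2": (0.5248838752488387, 0.008917781405763416),
    "mmlu_social_sciences_v0.2": (0.5244755244755245, 0.008982696903864927),
    "mmlu_stem_v0.2": (0.4298555377207063, 0.00878908320599801),
}

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def confidence_interval(acc, stderr, z=Z_95):
    """Return (low, high) bounds of acc +/- z * stderr."""
    half_width = z * stderr
    return acc - half_width, acc + half_width


intervals = {
    task: confidence_interval(acc, stderr)
    for task, (acc, stderr) in results.items()
}

for task, (low, high) in intervals.items():
    print(f"{task}: [{low:.4f}, {high:.4f}]")
```

Under this approximation, the interval for truthfulqa_v0.2 straddles 0.5, while the lower bound for winogrande_tr-v0.2 stays above 0.5.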

Detailed subtask results

The dataset also provides more detailed per-subtask results, including but not limited to:

mmlu_high_school_european_history_v0.2
- Accuracy (acc,none): 0.6
- Accuracy standard error (acc_stderr,none): 0.04013400372543903
mmlu_international_law_v0.2
- Accuracy (acc,none): 0.6859504132231405
- Accuracy standard error (acc_stderr,none): 0.04236964753041019
mmlu_world_religions_v0.2
- Accuracy (acc,none): 0.6964285714285714
- Accuracy standard error (acc_stderr,none): 0.03558037341028271
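
The standard errors for these subtasks are larger than those of the aggregate groups, which is what one would expect from smaller question sets. As a rough sanity check, the number of questions behind each figure can be estimated from the accuracy and its standard error. This is a sketch under a stated assumption: that the reported value is the sample standard error of per-question 0/1 scores, i.e. sqrt(p(1 - p) / (n - 1)). The page does not document how the standard error was computed, and `estimated_sample_size` is a hypothetical helper.

```python
# Estimate how many questions a subtask contains from its accuracy p and
# standard error se. Assumption (not stated by the source):
#   se = sqrt(p * (1 - p) / (n - 1))
# which rearranges to n = p * (1 - p) / se**2 + 1.

# (accuracy, standard error) pairs copied from the subtask results above.
subtasks = {
    "mmlu_high_school_european_history_v0.2": (0.6, 0.04013400372543903),
    "mmlu_international_law_v0.2": (0.6859504132231405, 0.04236964753041019),
    "mmlu_world_religions_v0.2": (0.6964285714285714, 0.03558037341028271),
}


def estimated_sample_size(acc, stderr):
    """Solve se = sqrt(acc * (1 - acc) / (n - 1)) for n."""
    return acc * (1.0 - acc) / stderr**2 + 1.0


estimates = {
    task: estimated_sample_size(acc, stderr)
    for task, (acc, stderr) in subtasks.items()
}

for task, n in estimates.items():
    print(f"{task}: n is approximately {n:.2f}")
```

If the assumption holds, each estimate should land very close to a whole number, and multiplying it by the accuracy should give a near-integer count of correct answers.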

These results show how the model performs across different domains and tasks.