open-llm-leaderboard/details_meta-llama__Meta-Llama-3-8B-Instruct

Name: open-llm-leaderboard/details_meta-llama__Meta-Llama-3-8B-Instruct
Creator: open-llm-leaderboard
Published: 2024-04-19 09:24:56
License: No description available

Hugging Face | Updated 2024-04-19 | Indexed 2024-04-21

Download link:

https://hf-mirror.com/datasets/open-llm-leaderboard/details_meta-llama__Meta-Llama-3-8B-Instruct

Description:

This dataset was created automatically during the evaluation of the model meta-llama/Meta-Llama-3-8B-Instruct on the Open LLM Leaderboard. It contains 63 configurations, each corresponding to one evaluation task. The dataset was created from 1 run. Each run appears as a separate split in every configuration, and the split is named after the run's timestamp. The 'train' split always points to the latest results. In addition, the 'results' configuration stores the aggregated results of all runs and is used to compute and display the aggregated metrics on the Open LLM Leaderboard.

Provider:

open-llm-leaderboard

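Summary of original information

Dataset overview

Basic information

Name: Evaluation run of meta-llama/Meta-Llama-3-8B-Instruct
Purpose: created automatically during the evaluation run of the model meta-llama/Meta-Llama-3-8B-Instruct on the Open LLM Leaderboard.
Composition: 63 configurations, each corresponding to one evaluation task.
Number of runs: created from 1 run.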

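Dataset structure

Configuration details:
- Each configuration contains the evaluation data for one specific task.
- The data in each configuration is divided into splits, each named after the timestamp of a run.
- The "train" split points to the latest results.
Additional configuration:
- The "results" configuration stores the aggregated results of all runs and is used to compute and display the aggregated metrics.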
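
The split layout described above can be inspected programmatically. The following is a minimal sketch, not the leaderboard's own tooling: `latest_timestamp_split` and `describe_repo` are hypothetical helper names, and the sketch assumes that timestamp split names are zero-padded and year-first, so that they sort chronologically as plain strings. The exact split-name format is not stated in this page; the value used in the usage note is illustrative only.

```python
from typing import Iterable, Optional

REPO_ID = "open-llm-leaderboard/details_meta-llama__Meta-Llama-3-8B-Instruct"


def latest_timestamp_split(
    split_names: Iterable[str], alias: str = "train"
) -> Optional[str]:
    """Return the most recent timestamp-named split, ignoring the alias split.

    Assumption: timestamp names sort chronologically as plain strings,
    which holds for zero-padded, year-first timestamps.
    """
    timestamped = [name for name in split_names if name != alias]
    return max(timestamped) if timestamped else None


def describe_repo(repo_id: str = REPO_ID) -> dict:
    """Map each configuration name to its split names (needs network access)."""
    # Imported lazily so the pure helper above works without `datasets` installed.
    from datasets import get_dataset_config_names, get_dataset_split_names

    return {
        config: get_dataset_split_names(repo_id, config)
        for config in get_dataset_config_names(repo_id)
    }
```

Since the dataset was created from 1 run, each configuration would be expected to expose one timestamp split next to "train". For example, `latest_timestamp_split(["train", "2024_04_19T09_19_13.454877"])` returns the timestamp name, which should hold the same rows as "train".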

Example: loading the data

```python
from datasets import load_dataset

# Load the latest results ("train" split) for the 5-shot Winogrande task.
data = load_dataset(
    "open-llm-leaderboard/details_meta-llama__Meta-Llama-3-8B-Instruct",
    "harness_winogrande_5",
    split="train",
)
```

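Dataset configurations

Configuration list:
- harness_arc_challenge_25
- harness_gsm8k_5
- harness_hellaswag_10
- harness_hendrycksTest_5
Data files:
- Each configuration has its own data files, covering the timestamped splits and the latest split.
- Example path: **/details_harness|arc:challenge|25_2024-04-19T09-19-13.454877.parquet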
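
The example path above encodes both the task and the run timestamp. The sketch below shows one way to pull them apart. It is based only on that single example path, so the file-name pattern is an assumption, and `parse_details_filename`, `DetailsFile`, and the `config_guess` field are hypothetical names. The guess that a configuration name is the task with "|" and ":" replaced by "_" is inferred from the pairing of this path with `harness_arc_challenge_25` in the configuration list, not from documentation.

```python
import re
from dataclasses import dataclass
from datetime import datetime

# Assumed pattern: details_<task>_<YYYY-MM-DDTHH-MM-SS.ffffff>.parquet
_DETAILS_RE = re.compile(
    r"^details_(?P<task>.+)_"
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.\d+)\.parquet$"
)


@dataclass(frozen=True)
class DetailsFile:
    task: str            # e.g. "harness|arc:challenge|25"
    timestamp: datetime  # run timestamp parsed from the file name
    config_guess: str    # assumed configuration name for this task


def parse_details_filename(path: str) -> DetailsFile:
    """Split a details file path into task, run timestamp, and config guess."""
    filename = path.rsplit("/", 1)[-1]  # drop any directory or glob prefix
    match = _DETAILS_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised details file name: {filename!r}")
    task = match.group("task")
    timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H-%M-%S.%f")
    # Assumption: config names replace "|" and ":" in the task with "_".
    config_guess = re.sub(r"[|:]", "_", task)
    return DetailsFile(task=task, timestamp=timestamp, config_guess=config_guess)
```

Sorting the parsed `timestamp` values for one task is a simple way to find the newest file when more than one run is present.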
