OALL/details_Sakalti__ultiima-78B

Name: OALL/details_Sakalti__ultiima-78B
Creator: OALL
Published: 2025-02-07 08:17:39
License: No description available

Source: Hugging Face. Updated 2025-02-07; indexed 2025-02-15.

Download link:

https://hf-mirror.com/datasets/OALL/details_Sakalti__ultiima-78B

Description:

This dataset was automatically created during the evaluation run of the model [Sakalti/ultiima-78B](https://huggingface.co/Sakalti/ultiima-78B). It is composed of 136 configurations, each corresponding to one of the evaluated tasks.

The dataset was created from 1 run. Each run appears as a specific split in each configuration, with the split named after the run's timestamp. The "train" split always points to the latest results. An additional configuration, "results", stores the aggregated results of the run.

To load the details from a run, you can, for instance, do the following:

```python
from datasets import load_dataset

data = load_dataset(
    "OALL/details_Sakalti__ultiima-78B",
    "lighteval_xstory_cloze_ar_0_2025_02_07T08_14_47_656279_parquet",
    split="train",
)
```

## Latest results

These are the [latest results from run 2025-02-07T08:14:47.656279](https://huggingface.co/datasets/OALL/details_Sakalti__ultiima-78B/blob/main/results_2025-02-07T08-14-47.656279.json). Note that the repository might contain results for other tasks if successive evals didn't cover the same tasks. You can find each in the "results" configuration and in the "latest" split for each eval.
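The configuration name and the results file name quoted in this description both embed the run timestamp. The following is a minimal sketch of that naming pattern, inferred only from the single example given here; the helper names (`results_filename`, `results_url`, `config_name`, `load_run_details`) are hypothetical, and the pattern is an assumption that may not hold for other runs or repositories.

```python
import re

REPO_ID = "OALL/details_Sakalti__ultiima-78B"
RUN_TIMESTAMP = "2025-02-07T08:14:47.656279"


def results_filename(run_timestamp: str) -> str:
    """Build the results file name (assumed pattern: colons become hyphens)."""
    return f"results_{run_timestamp.replace(':', '-')}.json"


def results_url(repo_id: str, run_timestamp: str) -> str:
    """Build the URL of the aggregated results file on the Hugging Face Hub."""
    name = results_filename(run_timestamp)
    return f"https://huggingface.co/datasets/{repo_id}/blob/main/{name}"


def config_name(task: str, run_timestamp: str) -> str:
    """Build a configuration name (assumed pattern: task, sanitized timestamp, '_parquet')."""
    # Every character that is not a letter or digit is replaced by an underscore.
    sanitized = re.sub(r"[^0-9A-Za-z]", "_", run_timestamp)
    return f"{task}_{sanitized}_parquet"


def load_run_details(task: str, run_timestamp: str = RUN_TIMESTAMP):
    """Load the per-sample details for one task (requires network access)."""
    # Imported lazily so the naming helpers work without the datasets package.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name(task, run_timestamp), split="train")
```

Calling `load_run_details("lighteval_xstory_cloze_ar_0")` would request the same configuration as the example in the description. Keeping the name construction separate from the download makes it easy to check a configuration name before fetching anything.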

Provider:

OALL
