OALL/details_tiiuae__Falcon3-10B-Base_v2_alrage
Source: Hugging Face. Updated 2025-02-13; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/OALL/details_tiiuae__Falcon3-10B-Base_v2_alrage
Description:
This dataset was created automatically during the evaluation of the model tiiuae/Falcon3-10B-Base. It contains one configuration, which corresponds to the evaluated task, and aggregates the results of one or more runs. Within each configuration, every run is stored as a split named after the run's timestamp, and the train split always points to the latest results. An additional configuration named results stores all aggregated results. The data can be loaded with the Hugging Face datasets library; the README provides an example. The latest results include metrics such as llm_as_judge and llm_as_judge_stderr.
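As a rough illustration of the loading step described above, the following minimal sketch reads the latest aggregated results with the datasets library. It assumes the repository ID matches the listing title and that the dataset is reachable on the Hugging Face Hub; the helper name load_latest_results is hypothetical and not part of the README. The configuration name results and the split name train are taken from the description.

```python
# Repository ID assumed from the listing title; adjust it if the dataset
# is hosted under a different name or mirror.
REPO_ID = "OALL/details_tiiuae__Falcon3-10B-Base_v2_alrage"


def load_latest_results(repo_id: str = REPO_ID):
    """Load the aggregated results of the most recent evaluation run.

    Hypothetical helper for illustration. The "results" configuration holds
    the aggregated results, and the "train" split points to the latest run.
    """
    # Imported lazily so this module can be imported without the library.
    from datasets import load_dataset

    return load_dataset(repo_id, "results", split="train")


if __name__ == "__main__":
    # Requires network access and `pip install datasets`.
    latest = load_latest_results()
    print(latest)
```

To inspect one specific run instead of the latest one, the split argument could be set to that run's timestamp-named split, whose exact name has to be looked up on the dataset page.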
Provider:
OALL



