OALL/details_rombodawg__Rombos-LLM-V2.5-Qwen-72b_v2_alrage
Source: Hugging Face. Updated: 2025-02-15. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/OALL/details_rombodawg__Rombos-LLM-V2.5-Qwen-72b_v2_alrage
Description:
Dataset automatically created during the evaluation of the model rombodawg/Rombos-LLM-V2.5-Qwen-72b. The dataset consists of one configuration, corresponding to one evaluated task. It was created from a single run; each run appears as a specific split in each configuration, named after the run's timestamp. The train split always points to the latest results. An additional configuration, results, stores the aggregated results across runs. The dataset files are in Parquet format and can be loaded with the datasets library in Python. The latest results include metrics such as llm_as_judge and llm_as_judge_stderr.
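The loading step described above can be sketched as follows. This is a minimal sketch, not the dataset card's official snippet: it assumes the repository ID matches the path in the download link, that the `datasets` package is installed, and that the aggregated configuration is literally named `results` with a `train` split, as the description states. The helper names `list_configs` and `load_latest_results` are hypothetical. The per-task configuration name is not given in the listing, so the sketch queries it rather than guessing.

```python
# Repository ID taken from the download link in this listing (assumption:
# the same ID is valid on the Hugging Face Hub itself).
REPO_ID = "OALL/details_rombodawg__Rombos-LLM-V2.5-Qwen-72b_v2_alrage"


def list_configs(repo_id: str = REPO_ID) -> list[str]:
    """Return the configuration names published in the repository."""
    # Imported lazily so the module can be imported without the package.
    from datasets import get_dataset_config_names  # pip install datasets

    return get_dataset_config_names(repo_id)


def load_latest_results(repo_id: str = REPO_ID):
    """Load the aggregated "results" configuration.

    The "train" split is described as always pointing to the latest run.
    """
    from datasets import load_dataset

    return load_dataset(repo_id, "results", split="train")


# Example usage (requires network access):
#   print(list_configs())
#   print(load_latest_results())
```

To read through the mirror in the download link rather than the default hub, setting the `HF_ENDPOINT` environment variable to the mirror's base URL before importing `datasets` should work, though this has not been checked against this particular mirror.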
Provider:
OALL



