lvogel123/gpqa-diamond-claude-4.5-sonnet
Source: Hugging Face | Updated: 2025-10-22 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/lvogel123/gpqa-diamond-claude-4.5-sonnet
Description:
The dataset has three parts: experiment results, samples, and statistics. The experiment results record the log path, evaluation ID, run ID, creation time, task information, model name, total number of samples, number of completed samples, accuracy, standard error, and standard deviation. The samples include the sample ID, epoch, target, messages, and the MCQ scorer's value and answer. The statistics record the experiment's start and end times, model usage, and input and output token counts, among other fields. The dataset is intended for training and evaluating machine learning models.
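As a rough illustration of how the summary fields relate to the per-sample scorer values, the sketch below recomputes accuracy, standard deviation, and standard error from a list of scorer values. It is not taken from the dataset: the function name `summarize_scores` is hypothetical, and the encoding of a correct answer as "C" is an assumption that should be checked against the actual data.

```python
import math


def summarize_scores(values):
    """Return (accuracy, standard deviation, standard error) for scorer values."""
    # Assumption: "C" marks a correct answer; any other value counts as incorrect.
    scores = [1.0 if v == "C" else 0.0 for v in values]
    n = len(scores)
    if n == 0:
        raise ValueError("no samples to summarize")
    accuracy = sum(scores) / n
    if n < 2:
        # A single sample has no spread to estimate.
        return accuracy, 0.0, 0.0
    # Sample variance with Bessel's correction (divide by n - 1).
    variance = sum((s - accuracy) ** 2 for s in scores) / (n - 1)
    std = math.sqrt(variance)
    stderr = std / math.sqrt(n)
    return accuracy, std, stderr


if __name__ == "__main__":
    # Made-up values for illustration only, not real dataset rows.
    print(summarize_scores(["C", "C", "I", "C"]))
```

The reported standard error in the dataset may have been computed differently (for example, with a different variance estimator), so small discrepancies with this sketch would not necessarily indicate an error.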
Provided by:
lvogel123



