t2ance/coderm-ef-trajectories-o4-mini-qwen3-30b

Name: t2ance/coderm-ef-trajectories-o4-mini-qwen3-30b
Creator: t2ance
Published: 2025-12-17 08:32:57
License: Not specified

Source: Hugging Face (updated 2025-12-17, indexed 2025-12-20)

Download link:

https://hf-mirror.com/datasets/t2ance/coderm-ef-trajectories-o4-mini-qwen3-30b

Description:

This dataset contains trajectories of an LLM judge verifying candidate code solutions to programming problems. Each trajectory records the full evaluation process: the problem, the candidate solution, the judge's reasoning, the predicted correctness score, and the ground-truth execution result. Intended uses include training outcome reward models, best-of-N selection, calibration analysis, error analysis, and verifier ensembling.

The dataset holds 1,292 trajectories produced by the judge model Qwen/Qwen3-Coder-30B-A3B-Instruct. Platforms include atcoder and leetcode, and the difficulty distribution is 316 easy, 408 medium, and 568 hard. The dataset structure documentation describes the meaning and usage of each field.
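As a rough illustration of two of the listed uses, best-of-N selection and calibration analysis, the sketch below works on a handful of made-up records. The field names (`problem_id`, `solution_id`, `judge_score`, `passed`) are hypothetical placeholders, not the dataset's documented columns, and the values are toy data rather than rows from the dataset; check the actual schema before adapting it.

```python
from collections import defaultdict

# Toy records for illustration only; these are NOT rows from the dataset.
# The field names below are hypothetical stand-ins for the real columns.
trajectories = [
    {"problem_id": "p1", "solution_id": "a", "judge_score": 0.9, "passed": True},
    {"problem_id": "p1", "solution_id": "b", "judge_score": 0.4, "passed": False},
    {"problem_id": "p2", "solution_id": "a", "judge_score": 0.7, "passed": False},
    {"problem_id": "p2", "solution_id": "b", "judge_score": 0.6, "passed": True},
]


def best_of_n(records):
    """Pick, for each problem, the candidate with the highest judge score."""
    by_problem = defaultdict(list)
    for rec in records:
        by_problem[rec["problem_id"]].append(rec)
    return {
        pid: max(cands, key=lambda r: r["judge_score"])
        for pid, cands in by_problem.items()
    }


def selection_accuracy(records):
    """Fraction of problems whose selected candidate actually passed."""
    picks = best_of_n(records)
    return sum(p["passed"] for p in picks.values()) / len(picks)


def brier_score(records):
    """Mean squared gap between the predicted score and the 0/1 outcome."""
    return sum(
        (r["judge_score"] - float(r["passed"])) ** 2 for r in records
    ) / len(records)


print(selection_accuracy(trajectories))
print(brier_score(trajectories))
```

A lower Brier score means the judge's scores track the execution results more closely, while selection accuracy measures how often trusting the judge's top-ranked candidate yields a passing solution. Both functions assume the predicted score lies in [0, 1], which is also an assumption about the data.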

Provider:

t2ance
