AlienKevin/SWE-bench-Multilingual-trajectories
Source: Hugging Face. Updated 2025-12-12; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/AlienKevin/SWE-bench-Multilingual-trajectories
Description:
This dataset contains trajectories from several models evaluated on SWE-bench Multilingual (Rust). It is divided into several splits, including trajectories from Qwen2.5-Coder-32B-Instruct, SWE-agent-LM-32B (from SWE-smith), and Multi-SWE-agent-32B-Rust. Multi-SWE-agent-32B-Rust was trained for 3 epochs on 142 trajectories sampled from GLM 4.6 on a subset of Rust task instances from Multi-SWE-smith. The dataset has the following columns: id (the task instance ID, i.e., the subfolder name), trajectory (the content of the *.traj file, in JSON format), resolved (a boolean indicating whether the task was resolved), errored (a boolean indicating whether task execution errored), and completed (a boolean indicating whether the task was completed).
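The column layout described above can be illustrated with a minimal sketch that tallies the three boolean status columns over a set of rows. The rows below are made-up placeholders, not real dataset entries; the helper names `parse_trajectory` and `summarize` are hypothetical, and the assumption that the trajectory column may arrive as a JSON string (rather than an already-decoded object) is not confirmed by the description.

```python
import json
from typing import Any, Dict, Iterable, Mapping


def parse_trajectory(raw: Any) -> Any:
    """Decode the trajectory column if it is stored as a JSON string.

    Assumption: the column may hold either a JSON string or an
    already-decoded object; both cases are handled here.
    """
    return json.loads(raw) if isinstance(raw, str) else raw


def summarize(rows: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """Count rows by the resolved / errored / completed columns."""
    total = resolved = errored = completed = 0
    for row in rows:
        total += 1
        resolved += bool(row["resolved"])
        errored += bool(row["errored"])
        completed += bool(row["completed"])
    return {
        "total": total,
        "resolved": resolved,
        "errored": errored,
        "completed": completed,
        # Fraction of rows marked resolved; 0.0 for an empty input.
        "resolve_rate": resolved / total if total else 0.0,
    }


# Made-up rows for illustration only: the ids and trajectory contents
# are placeholders, not taken from the dataset.
sample_rows = [
    {"id": "example-task-1", "trajectory": '{"note": "placeholder"}',
     "resolved": True, "errored": False, "completed": True},
    {"id": "example-task-2", "trajectory": '{"note": "placeholder"}',
     "resolved": False, "errored": True, "completed": False},
    {"id": "example-task-3", "trajectory": '{"note": "placeholder"}',
     "resolved": False, "errored": False, "completed": True},
]

summary = summarize(sample_rows)
first_trajectory = parse_trajectory(sample_rows[0]["trajectory"])
print(summary)
```

In practice the rows would presumably come from one split of the dataset, for example via `load_dataset` from the Hugging Face `datasets` library. The split names are not listed in the description, so they would need to be checked on the dataset page first.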
Provided by:
AlienKevin



