
laion/CoderForge-Preview-v3-31600

Source: Hugging Face. Updated 2026-04-22. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/laion/CoderForge-Preview-v3-31600
Description:
laion/CoderForge-Preview-v3-31600 is a row subset of the `trajectories-tokenized_qwencoder` subset of `togethercomputer/CoderForge-Preview`, which holds pre-tokenized trajectories. It contains 31,600 rows drawn from a pool of 155,144 rows across 4 source slugs (R2E_Gym, SWE_Rebench, SWE_Smith, filtered_reward1). The data is stored in native pre-tokenized format for Qwen3, and each row carries columns including `input_ids`, `attention_mask`, and `labels`. The dataset is designed for use with axolotl, which detects the pre-tokenized columns and skips the chat_template renderer.
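
To make the per-row layout concrete, here is a minimal sketch of a sanity check for a pre-tokenized row. The function names `is_pretokenized` and `supervised_token_count` are hypothetical helpers written for illustration; they are not part of axolotl or of the dataset. The sketch also assumes that masked positions in `labels` use -100, the common ignore index in PyTorch and Hugging Face training code, which the description above does not confirm. The toy row is made up and is not taken from the dataset.

```python
from typing import Mapping, Sequence

# Columns named in the dataset description.
REQUIRED_COLUMNS = ("input_ids", "attention_mask", "labels")

# Assumption: positions excluded from the loss are marked with -100,
# the usual ignore index. The description does not state this.
IGNORE_INDEX = -100


def is_pretokenized(row: Mapping[str, Sequence[int]]) -> bool:
    """Return True if the row has all three columns with equal lengths."""
    if not all(col in row for col in REQUIRED_COLUMNS):
        return False
    lengths = {len(row[col]) for col in REQUIRED_COLUMNS}
    return len(lengths) == 1


def supervised_token_count(row: Mapping[str, Sequence[int]]) -> int:
    """Count label positions that are not masked with IGNORE_INDEX."""
    return sum(1 for token in row["labels"] if token != IGNORE_INDEX)


# Made-up toy row for illustration only.
example_row = {
    "input_ids": [101, 102, 103, 104],
    "attention_mask": [1, 1, 1, 1],
    "labels": [IGNORE_INDEX, IGNORE_INDEX, 103, 104],
}

if __name__ == "__main__":
    print(is_pretokenized(example_row))
    print(supervised_token_count(example_row))
```

To run the same check on real rows, one option is to load the data with `datasets.load_dataset("laion/CoderForge-Preview-v3-31600")` from the Hugging Face `datasets` library and pass individual rows to these helpers; the available split names are not given in the description and would need to be checked first.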
Provider:
laion