laion/CoderForge-Preview-v3
Source: Hugging Face | Updated: 2026-04-22 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/laion/CoderForge-Preview-v3
Description:
laion/CoderForge-Preview-v3 is a row subset of the trajectories-tokenized_qwencoder subset of the togethercomputer/CoderForge-Preview dataset. It contains 155,144 rows drawn from four sources (R2E_Gym, SWE_Rebench, SWE_Smith, filtered_reward1). The data is pre-tokenized in the native Qwen3 format; each row includes the columns input_ids, attention_mask, labels, chat_template_applied, trajectory_id, reward, and source. The dataset is intended for text-generation tasks and supports the axolotl framework, with the chat_template and sequence_len parameters configured.
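The row layout described above can be checked with a small helper. This is a minimal sketch, not part of the dataset's tooling: the function name summarize_row is hypothetical, the example row is synthetic, and the use of -100 as the masked-label value is an assumption based on the common Hugging Face convention rather than anything stated in the description.

```python
from typing import Any, Dict

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = {
    "input_ids", "attention_mask", "labels",
    "chat_template_applied", "trajectory_id", "reward", "source",
}

# Assumption: positions excluded from the loss are marked with -100 in
# `labels` (a common Hugging Face convention, not confirmed by the source).
IGNORE_INDEX = -100


def summarize_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Validate one pre-tokenized row and count its tokens (hypothetical helper)."""
    missing = EXPECTED_COLUMNS - set(row)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    n = len(row["input_ids"])
    if len(row["attention_mask"]) != n or len(row["labels"]) != n:
        raise ValueError("input_ids, attention_mask and labels must have equal length")

    # Tokens that would contribute to the training loss under the assumption above.
    trained = sum(1 for label in row["labels"] if label != IGNORE_INDEX)
    return {"source": row["source"], "num_tokens": n, "num_trained_tokens": trained}


# Synthetic row for illustration only; none of these values come from the dataset,
# and the types of chat_template_applied, trajectory_id and reward are guesses.
example_row = {
    "input_ids": [11, 12, 13, 14, 15],
    "attention_mask": [1, 1, 1, 1, 1],
    "labels": [-100, -100, -100, 14, 15],
    "chat_template_applied": "placeholder",
    "trajectory_id": "placeholder-id",
    "reward": 1.0,
    "source": "SWE_Smith",
}

summary = summarize_row(example_row)
print(summary)
```

The same helper could be applied to rows loaded from the download link above, provided the actual column types match these assumptions.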
Provider:
laion



