laion/Sera-4.5A-Full-T1-v3-1000
收藏Hugging Face2026-04-22 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/laion/Sera-4.5A-Full-T1-v3-1000
下载链接
链接失效反馈官方服务:
资源简介:
该数据集是allenai/Sera-4.5A-Full-T1的一个子集,包含1,000行数据(完整数据集有72,118行)。数据格式为原始JSONL,采用OpenAI原生消息布局,保留了原始字段如`messages`、`instance_id`、`rollout_patch`、`func_name`、`func_path`、`problem_statement`、`target_patch`和`docker_image`,并添加了指向父数据集的`source`字段。每条助手消息包含一个原生的`tool_calls`数组和一个`train: bool`标志,用于每条消息的损失掩码。数据集适用于axolotl工具,使用`type: chat_template`和`chat_template: chatml`等配置。
Subset of [allenai/Sera-4.5A-Full-T1](https://huggingface.co/datasets/allenai/Sera-4.5A-Full-T1). Size: 1,000 rows (full dataset: 72,118 rows). Format: Raw JSONL, OpenAI-native messages layout. Preserves the original `messages` field (as JSON string), `instance_id`, `rollout_patch`, `func_name`, `func_path`, `problem_statement`, `target_patch`, `docker_image`. Adds a `source` field pointing back to the parent dataset. Each assistant message carries a native `tool_calls` array (OpenAI tool-calling format) and a `train: bool` flag for per-message loss masking — these are **not** flattened into shareGPT. Intended for direct consumption by [axolotl](https://github.com/axolotl-ai-cloud/axolotl) with `type: chat_template`, `chat_template: chatml`, `message_field_training: train`.
提供机构:
laion



