laion/Sera-4.5A-Full-T1-v3-1000

Name: laion/Sera-4.5A-Full-T1-v3-1000
Creator: laion
Published: 2026-04-22 13:29:32
License: 暂无描述

Hugging Face2026-04-22 更新2026-04-26 收录

下载链接：

https://hf-mirror.com/datasets/laion/Sera-4.5A-Full-T1-v3-1000

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集是allenai/Sera-4.5A-Full-T1的一个子集，包含1,000行数据（完整数据集有72,118行）。数据格式为原始JSONL，采用OpenAI原生消息布局，保留了原始字段如`messages`、`instance_id`、`rollout_patch`、`func_name`、`func_path`、`problem_statement`、`target_patch`和`docker_image`，并添加了指向父数据集的`source`字段。每条助手消息包含一个原生的`tool_calls`数组和一个`train: bool`标志，用于每条消息的损失掩码。数据集适用于axolotl工具，使用`type: chat_template`和`chat_template: chatml`等配置。

Subset of [allenai/Sera-4.5A-Full-T1](https://huggingface.co/datasets/allenai/Sera-4.5A-Full-T1). Size: 1,000 rows (full dataset: 72,118 rows). Format: Raw JSONL, OpenAI-native messages layout. Preserves the original `messages` field (as JSON string), `instance_id`, `rollout_patch`, `func_name`, `func_path`, `problem_statement`, `target_patch`, `docker_image`. Adds a `source` field pointing back to the parent dataset. Each assistant message carries a native `tool_calls` array (OpenAI tool-calling format) and a `train: bool` flag for per-message loss masking — these are **not** flattened into shareGPT. Intended for direct consumption by [axolotl](https://github.com/axolotl-ai-cloud/axolotl) with `type: chat_template`, `chat_template: chatml`, `message_field_training: train`.

提供机构：

laion

5,000+

优质数据集

54 个

任务类型

进入经典数据集