laion/Sera-4.5A-Full-T1-v3-316

Name: laion/Sera-4.5A-Full-T1-v3-316
Creator: laion
Published: 2026-04-22 13:29:28
License: 暂无描述

Hugging Face2026-04-22 更新2026-04-26 收录

下载链接：

https://hf-mirror.com/datasets/laion/Sera-4.5A-Full-T1-v3-316

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集是allenai/Sera-4.5A-Full-T1的一个子集，包含316行数据（完整数据集有72,118行）。数据格式为原始JSONL，采用OpenAI原生消息布局，保留了原始字段如messages、instance_id等，并添加了指向父数据集的source字段。每个助手消息包含一个原生的tool_calls数组和一个train: bool标志，用于每条消息的损失掩码。数据集适用于直接通过axolotl使用，配置为type: chat_template和chat_template: chatml。采样方法为确定性随机，种子为42。

Subset of [allenai/Sera-4.5A-Full-T1](https://huggingface.co/datasets/allenai/Sera-4.5A-Full-T1). Size: 316 rows (full dataset: 72,118 rows). Format: Raw JSONL, OpenAI-native messages layout. Preserves the original `messages` field (as JSON string), `instance_id`, `rollout_patch`, `func_name`, `func_path`, `problem_statement`, `target_patch`, `docker_image`. Adds a `source` field pointing back to the parent dataset. Each assistant message carries a native `tool_calls` array (OpenAI tool-calling format) and a `train: bool` flag for per-message loss masking — these are **not** flattened into shareGPT. Intended for direct consumption by [axolotl](https://github.com/axolotl-ai-cloud/axolotl) with `type: chat_template`, `chat_template: chatml`, `message_field_training: train`. Sampling: deterministic random, seed=42, row-indexed into the full dataset.

提供机构：

laion

5,000+

优质数据集

54 个

任务类型

进入经典数据集