thavens/Qwen3-8B-secalign-tools_on_policy

Name: thavens/Qwen3-8B-secalign-tools_on_policy
Creator: thavens
Published: 2025-12-15 12:15:35
License: 暂无描述

Hugging Face2025-12-15 更新2025-12-20 收录

下载链接：

https://hf-mirror.com/datasets/thavens/Qwen3-8B-secalign-tools_on_policy

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集包含与提示和响应相关的数据，具有prompt、chosen、rejected和tools等特征。prompt特征包括content、role、tool_call_id和tool_calls等子特征，表明数据集可能关注于交互数据，用于训练对话或面向任务的系统模型。chosen和rejected特征表明该数据集可能用于偏好建模或从人类反馈中进行强化学习（RLHF）。数据集被划分为一个单一的train分割，包含19,154个示例。

The dataset contains data related to prompts and responses, with features such as prompt, chosen, rejected, and tools. The prompt feature includes sub-features like content, role, tool_call_id, and tool_calls, suggesting a focus on interaction data, possibly for training models in dialogue or task-oriented systems. The chosen and rejected features indicate that the dataset may be used for preference modeling or reinforcement learning from human feedback (RLHF). The dataset is split into a single train split with 19,154 examples.

提供机构：

thavens

5,000+

优质数据集

54 个

任务类型

进入经典数据集