thavens/Qwen3-8B-secalign-tools_on_policy
收藏Hugging Face2025-12-15 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/thavens/Qwen3-8B-secalign-tools_on_policy
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含与提示和响应相关的数据,具有prompt、chosen、rejected和tools等特征。prompt特征包括content、role、tool_call_id和tool_calls等子特征,表明数据集可能关注于交互数据,用于训练对话或面向任务的系统模型。chosen和rejected特征表明该数据集可能用于偏好建模或从人类反馈中进行强化学习(RLHF)。数据集被划分为一个单一的train分割,包含19,154个示例。
The dataset contains data related to prompts and responses, with features such as prompt, chosen, rejected, and tools. The prompt feature includes sub-features like content, role, tool_call_id, and tool_calls, suggesting a focus on interaction data, possibly for training models in dialogue or task-oriented systems. The chosen and rejected features indicate that the dataset may be used for preference modeling or reinforcement learning from human feedback (RLHF). The dataset is split into a single train split with 19,154 examples.
提供机构:
thavens



