five

thavens/Qwen3-8B-secalign-tools_on_policy

收藏
Hugging Face2025-12-15 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/thavens/Qwen3-8B-secalign-tools_on_policy
下载链接
链接失效反馈
官方服务:
资源简介:
该数据集包含与提示和响应相关的数据,具有prompt、chosen、rejected和tools等特征。prompt特征包括content、role、tool_call_id和tool_calls等子特征,表明数据集可能关注于交互数据,用于训练对话或面向任务的系统模型。chosen和rejected特征表明该数据集可能用于偏好建模或从人类反馈中进行强化学习(RLHF)。数据集被划分为一个单一的train分割,包含19,154个示例。

The dataset contains data related to prompts and responses, with features such as prompt, chosen, rejected, and tools. The prompt feature includes sub-features like content, role, tool_call_id, and tool_calls, suggesting a focus on interaction data, possibly for training models in dialogue or task-oriented systems. The chosen and rejected features indicate that the dataset may be used for preference modeling or reinforcement learning from human feedback (RLHF). The dataset is split into a single train split with 19,154 examples.
提供机构:
thavens
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作