graf/medical-o1-reasoning-SFT-bonvoyage-19636_recall_0.4-qwen3-4b-medical-o1-ht-nothink

Name: graf/medical-o1-reasoning-SFT-bonvoyage-19636_recall_0.4-qwen3-4b-medical-o1-ht-nothink
Creator: graf
Published: 2026-04-05 23:51:23
License: no description available

Source: Hugging Face. Updated: 2026-04-05. Indexed: 2026-04-12.

Download link:

https://hf-mirror.com/datasets/graf/medical-o1-reasoning-SFT-bonvoyage-19636_recall_0.4-qwen3-4b-medical-o1-ht-nothink

Resource description:

---
dataset_info:
  features:
  - name: input
    struct:
    - name: Complex_CoT
      dtype: string
    - name: Question
      dtype: string
    - name: Response
      dtype: string
    - name: answer
      dtype: string
    - name: chat_template_prompt
      list:
      - name: content
        dtype: string
      - name: role
        dtype: string
    - name: prompt
      dtype: string
  - name: responses
    list: string
  - name: correctness
    list: int64
  - name: prompt
    dtype: string
  - name: expert_response
    dtype: string
  - name: onpolicy_expert
    dtype: bool
  - name: target_score
    dtype: float64
  splits:
  - name: train
    num_bytes: 641578255
    num_examples: 19636
  download_size: 234344194
  dataset_size: 641578255
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---

Dataset information:

Features:
- input: struct with the following fields:
  - Complex_CoT (complex chain of thought): string
  - Question: string
  - Response: string
  - answer: string
  - chat_template_prompt: list of structs, each containing:
    - content: string
    - role: string
  - prompt: string
- responses: list of strings
- correctness: list of 64-bit integers
- prompt: string
- expert_response: string
- onpolicy_expert (on-policy expert flag): boolean
- target_score: 64-bit float

Splits:
- train: 641578255 bytes, 19636 examples

Download size: 234344194 bytes. Total dataset size: 641578255 bytes.

Configurations:
- default: data files for the train split are at data/train-*
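As an illustration of how the schema above might be used, the following is a minimal Python sketch, not an official loader. It assumes the Hugging Face `datasets` library is installed for the loading step, and it assumes that each entry of `correctness` is a 0/1 flag aligned with the entry of `responses` at the same position; the listing does not state this. The helper names `load_train_split` and `fraction_correct` and the sample record are hypothetical.

```python
from typing import Any, Dict, Optional

REPO_ID = (
    "graf/medical-o1-reasoning-SFT-bonvoyage-19636_recall_0.4-"
    "qwen3-4b-medical-o1-ht-nothink"
)


def load_train_split(repo_id: str = REPO_ID):
    """Load the single `train` split (19636 examples). Needs network access."""
    # Imported lazily so the helper below works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split="train")


def fraction_correct(row: Dict[str, Any]) -> Optional[float]:
    """Share of responses flagged as correct, or None if the row has no flags.

    Assumption: `correctness` holds 0/1 flags, one per entry in `responses`.
    """
    flags = row["correctness"]
    if not flags:
        return None
    return sum(1 for flag in flags if flag) / len(flags)


# Hypothetical record that mirrors the top-level fields of the schema.
# The values are invented for illustration and are not taken from the dataset.
example_row = {
    "prompt": "hypothetical prompt text",
    "responses": ["answer A", "answer B", "answer C", "answer D"],
    "correctness": [1, 0, 1, 0],
    "expert_response": "hypothetical expert answer",
    "onpolicy_expert": False,
    "target_score": 0.5,
}

print(fraction_correct(example_row))
```

To work with real rows, call `load_train_split()` and pass individual rows to `fraction_correct`. When downloading through the mirror listed above instead of huggingface.co, setting the `HF_ENDPOINT` environment variable to `https://hf-mirror.com` before importing the library is the usual approach, though this is an assumption about the reader's setup rather than something the listing specifies.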

Provider:

graf