ChatDoctor-RL
Source: ModelScope community · Updated 2025-12-05 · Indexed 2025-07-05
Download link:
https://modelscope.cn/datasets/Intelligent-Internet/ChatDoctor-RL
Description:
# Intelligent-Internet/ChatDoctor-Improved-Answer Dataset
This dataset is a curated subset of the original ChatDoctor-HealthCareMagic-100k dataset (lavita/ChatDoctor-HealthCareMagic-100k) in which the responses have been substantially improved. The answers were refined for greater detail, clarity, and precision, with added emphasis on safety awareness to support responsible use.
# Data Decontamination
To ensure the integrity and reliability of the dataset for RL training, a rigorous two-step decontamination process was applied:
## 8-grams Decontamination
We followed the open-r1 methodology to identify and remove samples that overlap with evaluation datasets, matching on 8-gram sequences.
This step ensures that the dataset does not contain sequences that could bias evaluation results.
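The idea behind n-gram decontamination can be sketched as follows. This is a minimal illustration, not the open-r1 script itself: the function names (`ngrams`, `build_eval_index`, `decontaminate`) are hypothetical, and the lowercased whitespace tokenization is an assumption that may differ from what was actually used.

```python
from typing import Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(text: str, n: int = 8) -> Set[NGram]:
    """Return the set of word-level n-grams in `text`.

    Assumption: lowercased whitespace tokenization. Texts shorter
    than n tokens yield an empty set.
    """
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_eval_index(eval_texts: Iterable[str], n: int = 8) -> Set[NGram]:
    """Collect every n-gram that appears in any evaluation text."""
    index: Set[NGram] = set()
    for text in eval_texts:
        index |= ngrams(text, n)
    return index


def decontaminate(train_texts: Iterable[str],
                  eval_texts: Iterable[str],
                  n: int = 8) -> List[str]:
    """Keep only training texts that share no n-gram with the eval set."""
    index = build_eval_index(eval_texts, n)
    return [t for t in train_texts if ngrams(t, n).isdisjoint(index)]


if __name__ == "__main__":
    eval_set = ["the patient reports a persistent dry cough and mild fever for three days"]
    train_set = [
        "I think the patient reports a persistent dry cough and mild fever today",
        "my knee hurts when I climb stairs what should I do about it",
    ]
    # The first training sample shares an 8-gram with the eval text and is dropped.
    print(decontaminate(train_set, eval_set))
```

Exact n-gram matching is cheap and catches verbatim leakage, but it misses paraphrased or lightly edited copies.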
## Fuzzy Decontamination
We then applied the s1k method with an 80% similarity threshold to remove near-duplicate or highly similar samples.
This additional step further reduces overlap with evaluation datasets beyond what exact 8-gram matching catches.
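A threshold-based fuzzy filter of this kind might look like the sketch below. It is not the s1k implementation: the similarity measure here is Python's standard-library `difflib.SequenceMatcher` ratio, chosen only for illustration, and the function names `similarity` and `fuzzy_decontaminate` are hypothetical. Only the 0.8 threshold comes from the description above.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], case-insensitive.

    Assumption: SequenceMatcher.ratio() stands in for whatever
    similarity measure the actual pipeline used.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_decontaminate(train_texts: Iterable[str],
                        eval_texts: Iterable[str],
                        threshold: float = 0.8) -> List[str]:
    """Drop training texts whose similarity to any eval text reaches `threshold`."""
    eval_list = list(eval_texts)
    kept = []
    for text in train_texts:
        if all(similarity(text, e) < threshold for e in eval_list):
            kept.append(text)
    return kept


if __name__ == "__main__":
    eval_set = ["What are the common symptoms of iron deficiency anemia?"]
    train_set = [
        "What are common symptoms of iron deficiency anemia?",  # near-duplicate
        "How should I treat a sprained ankle at home?",
    ]
    print(fuzzy_decontaminate(train_set, eval_set))
```

This pairwise comparison is quadratic in the number of samples, so a real pipeline at the 100k scale would likely need blocking or an approximate index before scoring candidates.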
Provider:
maas
Created:
2025-07-04



