Nitral-Archive/ToxistructThinkTag-Primer
Source: Hugging Face. Updated 2025-06-27; indexed 2025-07-05.
Download link:
https://hf-mirror.com/datasets/Nitral-Archive/ToxistructThinkTag-Primer
Description:
This dataset consists of approximately 10,000 entries from HydrusV2 and 6,800 entries from Toxicity-ShareGPT. Every entry is prefixed with empty <think> tags, and the combined set is randomly shuffled for specialized pre-training or fine-tuning. It has two intended uses: gently guiding non-reasoning models into producing <think> tags, and serving as an anti-censorship scaffold during fine-tuning of reasoning models to prevent over-sanitization.
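The construction described above (prefix each entry with empty <think> tags, then shuffle the two sources together) can be sketched roughly as follows. This is an illustrative sketch, not the author's actual pipeline. It assumes ShareGPT-style records with a "conversations" list of {"from", "value"} turns, assumes the empty tag is placed before each assistant ("gpt") turn, and assumes a particular whitespace layout for the empty tag. The helper names `prefix_empty_think` and `build_mix` are hypothetical.

```python
import random

# Assumed layout of the empty tag; the listing does not specify exact whitespace.
EMPTY_THINK = "<think>\n\n</think>\n\n"


def prefix_empty_think(entry, assistant_role="gpt"):
    """Return a copy of a ShareGPT-style entry with an empty think block
    placed before each assistant turn (hypothetical helper)."""
    turns = []
    for turn in entry["conversations"]:
        value = turn["value"]
        # Skip turns that already start with a think tag to avoid doubling it.
        if turn["from"] == assistant_role and not value.startswith("<think>"):
            value = EMPTY_THINK + value
        turns.append({**turn, "value": value})
    return {**entry, "conversations": turns}


def build_mix(source_a, source_b, seed=0):
    """Prefix every entry from both sources, then shuffle them together."""
    mixed = [prefix_empty_think(e) for e in list(source_a) + list(source_b)]
    random.Random(seed).shuffle(mixed)  # seeded for reproducibility
    return mixed
```

Calling `build_mix(hydrus_entries, toxicity_entries)` on two lists of records returns one shuffled list in which every assistant turn starts with the empty tag. The input records are left unmodified.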
Provided by:
Nitral-Archive



