UCSC-VLAA/STAR-1
Source: Hugging Face. Updated 2025-04-04. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/UCSC-VLAA/STAR-1
Description:
STAR-1 is a high-quality safety dataset designed to improve the safety alignment of large reasoning models (LRMs) such as DeepSeek-R1. It is built on three principles: diversity, deliberative reasoning, and rigorous filtering. It integrates and refines data from multiple sources to provide policy-grounded reasoning samples. The dataset contains 1,000 carefully selected examples, each checked against safety best practices through GPT-4o-based evaluation. Fine-tuning on STAR-1 significantly improves safety across multiple benchmarks, with minimal impact on reasoning capabilities.
Provider:
UCSC-VLAA



