SoftMINER-Group/NicheHazardQA
Source: Hugging Face. Updated 2025-01-26; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/SoftMINER-Group/NicheHazardQA
Resource summary:
This dataset supports two papers, one on the impact of editing language models and one on test-time safety alignment. The README does not describe the dataset's contents in detail.
Provided by:
SoftMINER-Group
Summary of original information
Dataset overview
Citation information
Paper 1
- Title: Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models
- Authors: Rima Hazra, Sayan Layek, Somnath Banerjee, Soujanya Poria
- Journal: CoRR
- Volume: abs/2401.10647
- Year: 2024
- URL: https://doi.org/10.48550/arXiv.2401.10647
- DOI: 10.48550/ARXIV.2401.10647
Paper 2
- Title: Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations
- Authors: Rima Hazra, Sayan Layek, Somnath Banerjee, Soujanya Poria
- Year: 2024
- Eprint: 2406.11801
- Archive prefix: arXiv
- Primary class: Computation and Language



