Zihao1/Moral-RolePlay
Source: Hugging Face | Updated: 2025-11-11 | Indexed: 2025-11-15
Download link:
https://hf-mirror.com/datasets/Zihao1/Moral-RolePlay
Description:
Moral RolePlay is a dataset for evaluating how authentically large language models (LLMs) role-play morally ambiguous or villainous characters. It includes a four-level moral alignment scale and a balanced test set for rigorous evaluation. The accompanying analysis shows that LLMs struggle to portray characters whose traits conflict with safety principles, such as deceit and manipulation. The dataset also features a VRP Leaderboard that assesses models specifically on villain role-play, and it indicates that general chatbot proficiency is a poor predictor of this ability. The README includes a Quick Start Guide covering how to access and use the dataset and how to configure and run experiments with different LLMs.
Provider:
Zihao1
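
The description refers to a Quick Start Guide in the README, which is not reproduced in this listing. As a rough sketch of one way to fetch the repository files, the snippet below assumes the `huggingface_hub` package is installed and that the mirror at hf-mirror.com can be selected through the `HF_ENDPOINT` environment variable. The helper names `dataset_page_url` and `download_dataset`, and the default directory `moral_roleplay`, are hypothetical and chosen only for illustration; the repository's file layout is not described here, so the sketch downloads a snapshot and does not attempt to parse it.

```python
import os

# Repository id and mirror endpoint taken from the listing above.
REPO_ID = "Zihao1/Moral-RolePlay"
MIRROR_ENDPOINT = "https://hf-mirror.com"


def dataset_page_url(endpoint: str = MIRROR_ENDPOINT, repo_id: str = REPO_ID) -> str:
    """Build the dataset page URL in the form shown in the listing."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download_dataset(local_dir: str = "moral_roleplay") -> str:
    """Download a snapshot of the dataset repository and return its local path.

    Assumption: huggingface_hub is installed and the mirror is reachable.
    """
    # huggingface_hub reads HF_ENDPOINT at import time, so the variable is
    # set first and the import is deferred until this function is called.
    os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
    )
```

Calling `download_dataset()` returns the local folder containing the downloaded files. If `huggingface_hub` was already imported earlier in the same process, the endpoint override may not take effect; in that case, set `HF_ENDPOINT` in the shell before starting Python.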



