Persona Biographies for Guardrail Sensitivity Analysis
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/vli31/llm-guardrail-sensitivity
Description:
This dataset contains over 225,000 requests sent to a conversational model, each issued under a simulated user persona, to analyze guardrail sensitivity in ChatGPT-3.5. The personas are generated from demographic attributes including age, gender, race, and sports fan identity, so that the effect of user identity on the model's guardrail responses can be evaluated.
Provided by:
Generated by the authors using ChatGPT



