Values-Targeted Dataset
Source: arXiv | Updated: 2021-11-24 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2106.10328v2
Overview:
The Values-Targeted Dataset consists of 80 carefully designed question-answer samples intended to shift a language model's behavior, through fine-tuning, toward a predefined set of target values. Researchers at OpenAI created it to study how language models can be better adapted to societal needs and kept from producing harmful or biased outputs. The samples cover sensitive topics such as health, interpersonal relationships, and political opinions, and each topic is paired with a position statement describing the behavior the model should exhibit. Using the dataset in an iterative fine-tuning process, the researchers found that even a small, hand-curated dataset can effectively adjust the behavior of a large language model, with a stronger effect on larger models.
Provider:
OpenAI
Created:
2021-06-19



