ValuePrism

Name: ValuePrism
Creator: maas
Published: 2025-11-27 16:35:09
License: No description provided

ModelScope Community | Updated 2025-11-27 | Indexed 2025-06-07

Download link:

https://modelscope.cn/datasets/allenai/ValuePrism

Resource description:

# Dataset Card for ValuePrism

## Dataset Description

- **Paper:** https://arxiv.org/abs/2309.00779
- **Demo:** https://kaleido.allen.ai
- **Repository:** https://github.com/tsor13/kaleido
- **Datasheet for Datasets:** https://drive.google.com/file/d/1zDWvO0NljqxBMfDAGW7Jx60Iw54bjsEE/view?usp=sharing
- **License:** https://allenai.org/licenses/impact-mr
- **Point of Contact:** [Taylor Sorensen](mailto:tsor13@cs.washington.edu)

### Dataset Summary

ValuePrism was created 1) to understand what pluralistic human values, rights, and duties are already present in large language models, and 2) to serve as a resource to support open, value-pluralistic modeling (e.g., [Kaleido](https://huggingface.co/tsor13/kaleido-xl)). It contains human-written situations and machine-generated candidate values, rights, and duties, along with their valences and post-hoc explanations relating them to the situations.

For additional documentation, see ValuePrism's [Datasheet](https://drive.google.com/file/d/1zDWvO0NljqxBMfDAGW7Jx60Iw54bjsEE/view?usp=sharing).

The dataset was created and intended for research purposes. It is openly released under AI2’s ImpACT license as a medium risk artifact.

### Supported Tasks

The dataset supports 4 tasks:

- **Generation (open-text)** *What values, rights, and duties are relevant for a situation?* Generate a value, right, or duty that could be considered when reasoning about the action. Values are generated one at a time rather than as a batch.
- **Relevance (2-way classification)** *Is a value relevant for a situation?* Some values are more relevant than others.
- **Valence (3-way classification)** *Does the value support or oppose the action, or might it depend on context?* Disentangling the valence is critical for understanding how plural considerations may interact with a decision.
- **Explanation (open-text)** *How does the value relate to the action?* Generate a post-hoc rationale for why a value consideration may relate to a situation.
### Languages

All data is in English.

## Dataset Structure

### Dataset Splits

There are 6 data configurations:

- `full`: The full structured dataset of situations paired with values, rights, and duties generated by GPT-4. It has a single split containing all of the data.
- `generative`: Generative task train, val, and test splits.
- `relevance`: Relevance task train, val, and test splits.
- `valence`: Valence task train, val, and test splits.
- `explanation`: Explanation task train, val, and test splits.
- `mixture`: Generative, relevance, valence, and explanation tasks combined, with train, val, and test splits.

### Data Fields

Different configurations have different fields. The full set of fields in the dataset is:

- `situation` (string): A one-sentence description of a particular scenario or situation. For example, "buying some chocolate for my grandparents".
- `vrd` (string): Type of instance, either "Value", "Right", or "Duty".
- `text` (string): The text of the value, right, or duty. For example, "Honesty", "Right to property", "Duty to protect".
- `explanation` (string): A post-hoc explanation of why the specified value, right, or duty is relevant or important in the given situation. For example, "Buying chocolate for your grandparents can strengthen family connections and show appreciation for your relationship with them."
- `valence` (string): Indicates whether the value, right, or duty supports or opposes the action in the situation, or whether it might depend on the context. Either "Supports", "Opposes", or "Either".
- `input` (string): For the seq2seq tasks (generative, relevance, valence, explanation), the input to the model.
- `output` (string): For the seq2seq tasks (generative, relevance, valence, explanation), the output of the model.

### Data Splits

All configurations (except for the raw outputs in `full`) have 80%/10%/10% train/validation/test splits.
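
The field descriptions and split ratio above can be captured in a few lines of Python. This is a minimal sketch, not code from the ValuePrism or Kaleido repositories: the class name `ValuePrismRecord` and the helper `split_sizes` are hypothetical, and the example row at the end is made up for illustration. Only the field names and the allowed values for `vrd` and `valence` come from the card.

```python
from dataclasses import dataclass
from typing import Tuple

# Allowed values as listed under "Data Fields" in the card.
VRD_TYPES = {"Value", "Right", "Duty"}
VALENCES = {"Supports", "Opposes", "Either"}


@dataclass(frozen=True)
class ValuePrismRecord:
    """Hypothetical container for one structured row (name is an assumption)."""

    situation: str
    vrd: str
    text: str
    explanation: str
    valence: str

    def __post_init__(self) -> None:
        # Reject labels outside the sets documented in the card.
        if self.vrd not in VRD_TYPES:
            raise ValueError(f"unexpected vrd: {self.vrd!r}")
        if self.valence not in VALENCES:
            raise ValueError(f"unexpected valence: {self.valence!r}")


def split_sizes(n: int) -> Tuple[int, int, int]:
    """Sizes for an 80/10/10 split of n items; any remainder goes to test."""
    train = n * 8 // 10
    val = n // 10
    return train, val, n - train - val


# Illustrative row only: the text and valence are NOT copied from the dataset.
example = ValuePrismRecord(
    situation="buying some chocolate for my grandparents",
    vrd="Value",
    text="Honesty",
    explanation="(placeholder explanation)",
    valence="Either",
)
```

Validating labels at construction time is one way to catch rows whose `vrd` or `valence` falls outside the documented sets; how the dataset authors actually produced their splits is not specified in the card.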
## Dataset Creation

### Source Data

#### Data Collection

Situations are sourced from the Delphi user demo. Candidate values, rights, and duties, their valences, and the explanations connecting them to the situations are machine-generated by GPT-4.

#### Who are the source language producers?

The situations are sourced from users of the Delphi user demo, for whom we do not have demographic information.

### Personal and Sensitive Information

There is no personal or sensitive information in ValuePrism.

## Considerations for Using the Data

### Social Impact of Dataset

We intend the dataset to enable research, not to be used for real-world applications or decision-making.

### Discussion of Biases

The value, right, and duty data was generated by GPT-4, which is known to exhibit [biases](https://arxiv.org/pdf/2304.03738.pdf). Thus, we expect ValuePrism to inherit biases from GPT-4. That said, we prompted the model to output a diversity of values in an attempt to mitigate bias with breadth.

## Additional Information

91% of values, rights, and duties were marked as high-quality by 3/3 annotators, and 87% of valence scores were marked as correct by 3/3 annotators. Additionally, we performed a human study on the data and did not find large disparities in agreement between the demographic groups tested, although future work in this area is a promising direction. See [our paper](https://arxiv.org/abs/2309.00779) for more details and analysis.
### Licensing Information

ValuePrism is made available under the [**AI2 ImpACT License - Medium Risk Artifacts (“MR Agreement”)**](https://allenai.org/licenses/impact-mr).

### Citation Information

Please cite [our paper](https://arxiv.org/abs/2309.00779) when using this dataset:

```
@misc{sorensen2023value,
    title={Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties},
    author={Taylor Sorensen and Liwei Jiang and Jena Hwang and Sydney Levine and Valentina Pyatkin and Peter West and Nouha Dziri and Ximing Lu and Kavel Rao and Chandra Bhagavatula and Maarten Sap and John Tasioulas and Yejin Choi},
    year={2023},
    eprint={2309.00779},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

#### Raw Dataset Statistics

The table shows the total count, the number of unique entries, and the average number per situation for the generated values, rights, and duties.

| **Type**       | **Total** | **Unique** | **Per Situation** |
|----------------|-----------|------------|-------------------|
| **Situations** | 31.0k     | 31.0k      | 1                 |
| **Values**     | 97.7k     | 4.2k       | 3.15              |
| **Rights**     | 49.0k     | 4.6k       | 1.58              |
| **Duties**     | 71.6k     | 12.8k      | 2.31              |

#### Task Dataset Statistics

|           | **Relevance** | **Valence** | **Generation** | **Explanation** | **Mixture** |
|-----------|---------------|-------------|----------------|-----------------|-------------|
| **Train** | 349k          | 175k        | 175k           | 175k            | 874k        |
| **Val**   | 44k           | 22k         | 22k            | 22k             | 109k        |
| **Test**  | 44k           | 22k         | 22k            | 22k             | 109k        |
| **Total** | 437k          | 219k        | 219k           | 219k            | 1.1M        |
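
The "Per Situation" column in the raw statistics can be reproduced by dividing each total by the number of situations, and the mixture training count is the sum of the four task training counts. The check below is a small sketch that uses only the rounded figures printed in the tables, so it is approximate by construction; the variable and function names are illustrative.

```python
# Rounded counts (in thousands) copied from the Raw Dataset Statistics table.
SITUATIONS_K = 31.0
TOTALS_K = {"Values": 97.7, "Rights": 49.0, "Duties": 71.6}

# Training rows per task (in thousands) from the Task Dataset Statistics table.
TRAIN_K = {"Relevance": 349, "Valence": 175, "Generation": 175, "Explanation": 175}


def per_situation(total_k: float, situations_k: float = SITUATIONS_K) -> float:
    """Average number of generated items per situation, rounded to 2 places."""
    return round(total_k / situations_k, 2)


averages = {name: per_situation(total) for name, total in TOTALS_K.items()}
mixture_train_k = sum(TRAIN_K.values())

print(averages)         # {'Values': 3.15, 'Rights': 1.58, 'Duties': 2.31}
print(mixture_train_k)  # 874
```

The same sum for the validation and test rows gives 110k rather than the listed 109k, presumably because the per-task figures are rounded to the nearest thousand.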


Provider:

maas

Created:

2025-05-27
