cadenmkang/Multi_turn_LLM_dark_patterns

Name: cadenmkang/Multi_turn_LLM_dark_patterns
Creator: cadenmkang
Published: 2025-12-13 07:24:06
License: Not specified

Source: Hugging Face. Updated: 2025-12-13. Indexed: 2025-12-20.

Download link:

https://hf-mirror.com/datasets/cadenmkang/Multi_turn_LLM_dark_patterns

Resource description:

The Multi-Turn LLM Dark Patterns dataset investigates the dark patterns that large language models (LLMs) may exhibit in multi-turn conversations. Dark patterns are traditionally defined as elements of interface design that trick users and impair their decision-making; this study extends the concept to conversational interfaces. The dataset contains prompts and model responses from multi-turn conversations, along with dark pattern classifications and intensity assessments. Its fields are Conversation ID, Turn number, Prompt, Intended Dark Pattern, Emergence Type, Level of User Context, Model Tested, Response, Intensity of Intended Dark Pattern, and Unintended Dark Patterns. The dark pattern categories include Excessive flattery, Sycophantic agreement, Ideological steering, Behavioral profiling, and Simulated authority. The dataset can be used to evaluate open-source models for dark-pattern behavior and to help develop safer, more ethical LLMs.
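As a rough illustration of how the fields listed above fit together, the following Python sketch models one conversation turn as a record and averages the intended-pattern intensity per model and category. The attribute names, the numeric intensity scale, and the sample rows are all assumptions made for illustration; they are not taken from the dataset's actual column names or contents.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Tuple


@dataclass
class TurnRecord:
    # Hypothetical attribute names that mirror the listed fields;
    # the real column names in the dataset may differ.
    conversation_id: str
    turn: int
    prompt: str
    intended_dark_pattern: str
    emergence_type: str
    user_context_level: str
    model_tested: str
    response: str
    intended_intensity: float  # assumed to be numeric
    unintended_dark_patterns: List[str] = field(default_factory=list)


def mean_intensity_by_pattern(
    records: List[TurnRecord],
) -> Dict[Tuple[str, str], float]:
    """Average intended-pattern intensity per (model, pattern) pair."""
    buckets: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for r in records:
        key = (r.model_tested, r.intended_dark_pattern)
        buckets[key].append(r.intended_intensity)
    return {key: mean(values) for key, values in buckets.items()}


# Placeholder rows, invented purely to exercise the function.
sample = [
    TurnRecord("c1", 1, "prompt A", "Excessive flattery", "type-x",
               "low", "model-a", "response A", 1.0),
    TurnRecord("c1", 2, "prompt B", "Excessive flattery", "type-x",
               "low", "model-a", "response B", 3.0,
               ["Sycophantic agreement"]),
    TurnRecord("c2", 1, "prompt C", "Simulated authority", "type-y",
               "high", "model-b", "response C", 2.0),
]

result = mean_intensity_by_pattern(sample)
for (model, pattern), score in sorted(result.items()):
    print(f"{model} / {pattern}: {score:.2f}")
```

Grouping by both model and category keeps per-model comparisons separate; if the intensity field turns out to be categorical rather than numeric, a count per label would be a more suitable summary than a mean.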

Provided by:

cadenmkang
