Simulating Survey Respondents with Large Language Models through Impulse Variables and Persona Conditioning

Name: Simulating Survey Respondents with Large Language Models through Impulse Variables and Persona Conditioning
Creator: Harvard Dataverse
Published: 2026-05-07 21:10:19
License: No description available

DataCite Commons; updated 2026-05-07; indexed 2025-04-15

Download link:

https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/RQGKHW


Description:

Human survey samples are slow and expensive to recruit, and psychometric evidence on whether large language models (LLMs) can stand in for human respondents remains thin. This paper proposes a prompting framework that conditions the LLM on demographic variables, Big Five personality anchors, and what we call "impulse variables", a pragmatic device that assigns each synthetic persona a discrete (low/medium/high) level on the central constructs of a measurement model and counteracts the response homogeneity that frontier LLMs typically display. We validate the framework using GPT-4o-mini, ChatGPT (GPT-4o web interface), and Llama-3.1-70b on a published instrument measuring self-esteem and abusive supervision, and on a held-out networking instrument that was not present in any pre-training corpus. The synthetic data reproduce the directional structure of a recent meta-analysis with acceptable measurement fit, while still showing the inflated reliability and mild positive trait drift typical of LLM responses. The framework makes scale pre-testing and instrument refinement tractable at a fraction of human-sample recruitment cost, with explicit boundary conditions on where it should not be applied.
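As a rough illustration of the persona conditioning described above, the following Python sketch assembles a prompt from demographic variables, Big Five anchors, and impulse variables. It is not the authors' implementation: the function names, the prompt wording, the example demographic values, and the choice to express Big Five anchors on the same low/medium/high scale as the impulse variables are all assumptions made for this example. Only the low/medium/high levels and the two constructs (self-esteem, abusive supervision) come from the description.

```python
import itertools

# Discrete impulse levels named in the description.
LEVELS = ("low", "medium", "high")


def build_persona_prompt(demographics, big_five, impulses):
    """Render one synthetic persona as a prompt string (hypothetical wording)."""
    # Assumption: Big Five anchors use the same three levels as impulse variables.
    for name, level in {**big_five, **impulses}.items():
        if level not in LEVELS:
            raise ValueError(f"{name}: level must be one of {LEVELS}, got {level!r}")
    demo = ", ".join(f"{k}: {v}" for k, v in demographics.items())
    traits = "; ".join(f"{k} is {v}" for k, v in big_five.items())
    constructs = "; ".join(f"{k} is {v}" for k, v in impulses.items())
    return (
        "You are answering a survey as the following person.\n"
        f"Demographics: {demo}.\n"
        f"Personality (Big Five): {traits}.\n"
        f"Construct levels: {constructs}.\n"
        "Answer each item on the stated Likert scale, staying in character."
    )


def impulse_grid(constructs):
    """Enumerate every low/medium/high combination across the given constructs."""
    return [
        dict(zip(constructs, combo))
        for combo in itertools.product(LEVELS, repeat=len(constructs))
    ]


# Two constructs with three levels each give 3 ** 2 = 9 impulse profiles.
grid = impulse_grid(["self-esteem", "abusive supervision"])
prompt = build_persona_prompt(
    {"age": 34, "occupation": "nurse"},  # illustrative values only
    {"extraversion": "high", "neuroticism": "low"},
    grid[0],
)
print(len(grid))
print(prompt)
```

Enumerating the full grid is one simple way to spread synthetic personas across construct levels; sampling levels at random, or weighting them to match a target population, would be equally plausible readings of the description.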

Provider:

Harvard Dataverse

Created:

2024-09-13
