Datasets and code of the paper: 'A personal model of trumpery: Linguistic deception detection in a real-world high-stakes setting'

Name: Datasets and code of the paper: 'A personal model of trumpery: Linguistic deception detection in a real-world high-stakes setting'
Creator: Erasmus University Rotterdam (EUR)
Published: 2025-10-24 13:36:43
License: No description available

Source: DataCite Commons (updated 2025-10-24, indexed 2024-07-13)

Download link:

https://doi.org/10.34894/XPCJNJ

Resource description:

Paper abstract

Language use differs between truthful and deceptive statements, but not all differences are consistent across people and contexts, complicating the identification of deceit in individuals. By relying on fact-checked tweets, we show in three studies (Study 1: 469 tweets; Study 2: 484 tweets; Study 3: 24 models) how well personalized linguistic deception detection performs by developing the first deception model tailored to an individual: the 45th US president. First, we found substantial linguistic differences between factually correct and incorrect tweets. We developed a quantitative model and achieved 73% overall accuracy. Second, we tested out-of-sample prediction and achieved 74% overall accuracy. Third, we compared our personalized model to linguistic models previously reported in the literature. Our model outperformed existing models by 5 percentage points, demonstrating the added value of personalized linguistic analysis in real-world settings. Our results indicate that factually incorrect tweets by the US president are not random mistakes of the sender.

Additional details

The paper is published in Psychological Science (DOI 10.1177/09567976211015941). Datasets and R code are provided. Instructions for using the datasets and R code are given in the Methods section of the paper and in the supplementary materials.

Funded by: European Research Council Starting Grant 638408, Bayesian Markets. For more details, see https://cordis.europa.eu/project/id/638408

For a website with more background information on this paper, see https://apersonalmodeloftrumpery.com/

Provider:

Erasmus University Rotterdam (EUR)

Created:

2021-12-21