Building Trust with a Teachable Artificial Intelligence: The Case of Repeated Trust Games

Mendeley Data, indexed 2026-04-09

Download link:

https://data.mendeley.com/datasets/p2mxm5vhfh/1

Resource description:

This study explores the teaching of Artificial Intelligence (AI) systems in a repeated trust game. We evaluate whether participants trust the AI and teach it to adopt the most beneficial strategies among a set of four options with different levels of benefit. Results indicate that participants are initially cautious with the AI but increase their trust throughout the experiment, especially those with initially low trust. Participants imperfectly teach the AI, initially adopting beneficial strategies, then progressively learning the most advantageous ones while discarding those that offer minimal or no benefit. We also observe that participants inefficiently choose to avoid positive learning when it involves a risk of large losses. In additional tasks, we observe that participants naturally seek ways to exploit the AI, although a significant portion maintains human fairness in the interaction. Moreover, they effectively transfer prior knowledge to similar tasks. We conclude that participants trust an AI they can teach to generate significant benefits, although the risk of loss limits this trust and the efficiency of teaching.
