Performance in Human-AI Teams: Experimental Evidence on Commitment Deficits Under Competition

Mendeley Data · Indexed 2026-04-18

Download link:

https://data.mendeley.com/datasets/y7yhxyf89c

Description:

Overview: This research investigates why human-AI teams (HATs) often underperform despite AI's growing capabilities. While previous explanations have centered on coordination, communication failures, or miscalibrated trust in AI's abilities, this work proposes that teammate commitment, the sense of mutual moral and social obligation, is a key yet overlooked driver of HAT effectiveness, particularly in competitive settings. The research consists of two experimental studies that examine how different contextual factors influence human performance when working with AI versus human teammates.

## Study 1 Description

Study 1 employed a between-subjects design with two conditions: Human-Human (HH) and Human-AI (HA). Participants were randomly assigned to one of these conditions. The study compared human performance across both individual and team competitive contexts while measuring trust through delegation behaviors. The experiment was an online study (N=1,988) conducted using Prolific for participant recruitment and oTree (Chen et al., 2016) for experiment programming. Participants completed three rounds of arithmetic reasoning tasks adapted from the American Armed Services Vocational Aptitude Battery (ASVAB). The three rounds consisted of:

1. Individual Competition Round: Participants competed individually against either a human or AI opponent.
2. Team Competition Round: Participants were paired with either a human or AI teammate and competed as a team against other teams.
3. Delegation Choice Round: Participants were given the option to delegate both competitive rounds to their teammate.

The data includes participant performance in the arithmetic reasoning tasks, as well as delegation choices and demographic information.

### Files Included

- S1_Data_Otree&Prolific.csv: This CSV file contains the raw data from the Study 1 experiment.
- S1_Data_Analysis_JW.R: This R script is used for data structuring and analysis of the Study 1 data.
- Study 1_ Otree.zip: This zip file contains the oTree experiment code, including Python and HTML files, used to conduct Study 1.

## Study 2 Description

Study 2 employed a 2x2 within-subjects design, manipulating teammate type (Human vs. AI) and outcome structure (Independent vs. Interdependent). Participants completed a visual counting task across four rounds. The study examines whether the performance decrements observed in Study 1's competitive setting emerge when outcomes are interdependent. The experiment was a laboratory study (N=214) conducted at the Laboratory of Experimental Economics (LEE) at Warsaw University.

### Files Included

- Study 2_allSessions.csv: This CSV file contains the raw data from the Study 2 experiment (all 20 lab sessions).
- S1_Data_Analysis_JW: Contains the R scripts used for data processing and analysis of the Study 2 data.
- Study 2.otreezip
- Study 2_ Otree.zip: This otreezip file contains the oTree experiment code, including Python and HTML files, used to conduct Study 2.
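For readers who want to see the Study 2 design concretely, the following is a minimal Python sketch of how the four cells of the 2x2 design (teammate type by outcome structure) could be enumerated and how a score might be averaged per cell. It is not the authors' analysis code, which is written in R. The function names and the keys `teammate`, `outcome`, and `score` are hypothetical, and the actual column names in Study 2_allSessions.csv may differ.

```python
from itertools import product
from statistics import mean

# Factor levels as named in the Study 2 description.
TEAMMATE_TYPES = ("Human", "AI")
OUTCOME_STRUCTURES = ("Independent", "Interdependent")


def design_cells():
    """Return the four cells of the 2x2 design as (teammate, outcome) pairs."""
    return list(product(TEAMMATE_TYPES, OUTCOME_STRUCTURES))


def cell_means(rows):
    """Average a score within each design cell.

    `rows` is an iterable of dicts with the hypothetical keys
    'teammate', 'outcome', and 'score'. Cells with no rows are omitted.
    """
    grouped = {cell: [] for cell in design_cells()}
    for row in rows:
        grouped[(row["teammate"], row["outcome"])].append(row["score"])
    return {cell: mean(scores) for cell, scores in grouped.items() if scores}


if __name__ == "__main__":
    # Toy values for illustration only; these are not taken from the dataset.
    toy_rows = [
        {"teammate": "Human", "outcome": "Independent", "score": 14},
        {"teammate": "AI", "outcome": "Interdependent", "score": 10},
        {"teammate": "AI", "outcome": "Interdependent", "score": 12},
    ]
    for cell, avg in cell_means(toy_rows).items():
        print(cell, avg)
```

Because the design is within-subjects, each participant would contribute one row per cell, so a real analysis would also need to account for repeated measures; this sketch only shows the cell structure.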

Created:

2025-03-25
