Data and code on the Moral Machine experiment on large language models (LLMs)
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.d7wm37q6v
Description:
As large language models (LLMs) become more deeply integrated into various sectors, understanding how they make moral judgements has become crucial, particularly in the realm of autonomous driving. This study used the Moral Machine framework to investigate the ethical decision-making tendencies of prominent LLMs, including GPT-3.5, GPT-4, PaLM 2 and Llama 2, and to compare their responses with human preferences. While the preferences of LLMs and humans are broadly aligned, such as prioritizing humans over pets and favouring saving more lives, PaLM 2 and Llama 2 in particular show distinct deviations. Additionally, despite the qualitative similarities between LLM and human preferences, there are significant quantitative disparities, suggesting that LLMs might lean toward more uncompromising decisions than the milder inclinations of humans. These insights elucidate the ethical frameworks of LLMs and their potential implications for autonomous driving.
Methods
Using the Moral Machine (MM) methodology detailed in the supplementary information of https://www.nature.com/articles/s41586-018-0637-6, we implemented code for generating MM scenarios. After generating the scenarios, we collected responses from GPT-3.5, GPT-4, PaLM 2, and Llama 2 through their application programming interfaces (APIs) and the relevant code. We then applied the conjoint analysis framework to evaluate the relative importance of the nine preferences.
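To illustrate the conjoint-analysis step, here is a minimal Python sketch of how the effect of one preference might be estimated from collected responses. It is not the code shipped with this dataset: the record layout, the attribute names `is_human` and `more_characters`, the helper `attribute_effect`, and the toy data are all hypothetical. It uses a plain difference in means, which is a common simple estimator when attributes are randomized independently; the actual analysis may use a regression-based estimator with standard errors.

```python
from statistics import mean


def attribute_effect(records, attribute):
    """Difference in the share of 'saved' outcomes between the two levels
    (1 vs 0) of a binary attribute. Hypothetical helper for illustration."""
    with_attr = [r["saved"] for r in records if r[attribute] == 1]
    without_attr = [r["saved"] for r in records if r[attribute] == 0]
    if not with_attr or not without_attr:
        raise ValueError(f"both levels of {attribute!r} must be present")
    return mean(with_attr) - mean(without_attr)


# Toy data (invented): each record is one side of a dilemma, with binary
# attributes and whether the model chose to save that side.
records = [
    {"is_human": 1, "more_characters": 1, "saved": 1},
    {"is_human": 0, "more_characters": 0, "saved": 0},
    {"is_human": 1, "more_characters": 0, "saved": 1},
    {"is_human": 0, "more_characters": 1, "saved": 0},
    {"is_human": 1, "more_characters": 0, "saved": 0},
    {"is_human": 0, "more_characters": 1, "saved": 1},
]

for name in ("is_human", "more_characters"):
    print(name, round(attribute_effect(records, name), 3))
```

A positive value means sides with that attribute were saved more often than sides without it. Values near +1 or -1 would correspond to the more uncompromising choices described above, and values near 0 to weak or no preference.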
Created:
2023-09-21



