Supported data for manuscript "Can LLM-Augmented autonomous agents cooperate?, An evaluation of their cooperative capabilities through Melting Pot"
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11221749
Resource description:
The data in this repository partially corresponds to the manuscript titled "Can LLM-Augmented Autonomous Agents Cooperate? An Evaluation of Their Cooperative Capabilities through Melting Pot," submitted to IEEE Transactions on Artificial Intelligence. The dataset comprises experiments conducted with Large Language Model-Augmented Autonomous Agents (LAAs), as implemented in the ["Cooperative Agents" repository](https://github.com/Cooperative-IA/CooperativeGPT/tree/main), using substrates from the Melting Pot framework.
Dataset Scope
This dataset is divided into two main experiment categories:
Personality__experiments:
These focus on a single scenario (Commons Harvest) to assess various agent personalities and their cooperative dynamics.
Comparison_baselines__experiments:
These experiments include three distinct scenarios designed by Melting Pot:
Commons Harvest Open
Externality Mushrooms
Coins
These scenarios evaluate different cooperative and competitive behaviors among agents and are used to compare decision-making architectures of LAAs against reinforcement learning (RL) baselines. Unlike the Personality__experiments, these comparisons do not involve bots but exclusively analyze RL and LAA architectures.
Scenarios and Metrics
The metrics and indicators extracted from the experiments depend on the scenario being evaluated:
Commons Harvest Open:
Focus: Resource consumption and environmental impact.
Metrics include:
Number of apples consumed.
Devastation of trees (i.e., depletion of resources).
Externality Mushrooms:
Focus: Self-interest vs. collective benefit.
Agents consume mushrooms with different outcomes:
Mushrooms that benefit the individual.
Mushrooms that benefit everyone.
Mushrooms that benefit only others.
Mushrooms that benefit the individual but penalize others.
Metrics evaluate trade-offs between individual gain and collective welfare.
Coins:
Focus: Reciprocity and fairness.
Agents collect coins with two options:
Collect their own color coin for a reward.
Collect a different color coin, which grants a reward to the collecting agent but penalizes the other agent.
Metrics include reciprocity rates and the balance of mutual benefits.
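As an illustration of the Coins payoff structure described above, the following minimal Python sketch tallies per-agent totals from a list of collection events. The function name `score_coins`, the event format, and the numeric values `REWARD_COLLECT` and `PENALTY_MISMATCH` are assumptions made for this example. The description does not state the actual reward magnitudes or how the dataset stores events, and this is not the authors' metric code.

```python
# Assumed payoff values, for illustration only; the dataset description
# does not state the actual magnitudes.
REWARD_COLLECT = 1.0     # reward to the collector for any coin (assumption)
PENALTY_MISMATCH = -2.0  # penalty to the agent whose color was taken (assumption)


def score_coins(events, agent_colors):
    """Tally per-agent totals for a hypothetical Coins episode.

    events: iterable of (collector, coin_color) pairs, in order of collection.
    agent_colors: dict mapping each agent name to its own coin color.
    """
    totals = {agent: 0.0 for agent in agent_colors}
    for collector, coin_color in events:
        # The collector is always rewarded, whatever the coin color.
        totals[collector] += REWARD_COLLECT
        if coin_color != agent_colors[collector]:
            # A mismatched coin penalizes the agent who owns that color.
            for other, color in agent_colors.items():
                if other != collector and color == coin_color:
                    totals[other] += PENALTY_MISMATCH
    return totals


colors = {"red_agent": "red", "blue_agent": "blue"}
events = [("red_agent", "red"), ("red_agent", "blue"), ("blue_agent", "blue")]
totals = score_coins(events, colors)
```

In this toy episode, `red_agent` collects one coin of its own color and one of the other color, so `blue_agent` ends with a lower total despite collecting only its own coin. This asymmetry is the kind of trade-off the reciprocity metrics are meant to capture.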
Objectives of Comparison Experiments
The Comparison_baselines__experiments aim to:
Assess how LAAs compare to RL baselines in cooperative and competitive tasks across diverse scenarios.
Compare decision-making architectures within LAAs, including chain-of-thought and generative approaches.
These experiments help evaluate the robustness of LAAs in scenarios with varying complexity and social dilemmas, providing insights into their potential applications in real-world cooperative systems.
Simulation Details (Applicable to All Experiments)
In each simulation:
Participants:
Experiments involve predefined numbers of LAAs or RL agents.
No bots are included in Comparison_baselines__experiments.
Action Dynamics:
Each agent performs high-level actions sequentially.
Simulations conclude either after reaching a preset maximum number of rounds (typically 100) or prematurely if the scenario's resources are fully depleted.
Metrics and Indicators:
Extracted metrics depend on the scenario and include measures of individual performance, collective outcomes, and agent reciprocity.
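The action dynamics and stopping rule described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: `ToyCommons`, `run_simulation`, and the `"eat"` / `"wait"` actions are hypothetical names invented for this example, not the Melting Pot API or the authors' code. The round cap of 100 comes from the "typically 100" figure in the description.

```python
from dataclasses import dataclass, field

MAX_ROUNDS = 100  # "typically 100" per the description; treated as configurable


@dataclass
class ToyCommons:
    """Hypothetical stand-in for a scenario with a finite shared resource."""
    apples: int
    consumed: dict = field(default_factory=dict)  # agent name -> apples eaten

    def depleted(self) -> bool:
        return self.apples <= 0

    def step(self, agent: str, action: str) -> None:
        # Only one illustrative high-level action changes the state: "eat".
        if action == "eat" and self.apples > 0:
            self.apples -= 1
            self.consumed[agent] = self.consumed.get(agent, 0) + 1


def run_simulation(env, policies, max_rounds=MAX_ROUNDS):
    """Run until the round cap is reached or the resource is exhausted.

    Returns (rounds_played, ended_early). A round in which depletion
    occurs is counted as played.
    """
    for round_idx in range(max_rounds):
        # Agents act one after another within each round.
        for agent, policy in policies.items():
            env.step(agent, policy(env))
            if env.depleted():
                return round_idx + 1, True
    return max_rounds, False


env = ToyCommons(apples=5)
policies = {"a": lambda e: "eat", "b": lambda e: "wait"}
rounds, ended_early = run_simulation(env, policies)
```

With five apples and a single agent that eats every round, the toy run stops early in round 5, and `env.consumed` holds the per-agent apple counts, analogous to the "number of apples consumed" metric listed for Commons Harvest Open.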
This repository enables reproducibility and serves as a benchmark for advancing research into cooperative and competitive behaviors in LLM-based agents.
Created:
2024-12-06



