ai_society
收藏魔搭社区2026-01-06 更新2025-09-06 收录
下载链接:
https://modelscope.cn/datasets/camel-ai/ai_society
下载链接
链接失效反馈官方服务:
资源简介:
# **CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society**
- **Github:** https://github.com/lightaime/camel
- **Website:** https://www.camel-ai.org/
- **Arxiv Paper:** https://arxiv.org/abs/2303.17760
## Dataset Summary
AI Society dataset is composed of 25K conversations between two gpt-3.5-turbo agents. This dataset is obtained by running role-playing for a combination of 50 user roles and 50 assistant roles with each combination running over 10 tasks.
We provide two formats, one is "chat" format which is `ai_society_chat.tar.gz` file containing the conversational instruction following format. The other format is "instruction" format which is `ai_society_instructions.json`.
## Data Fields
**The data fields for instructions format (`ai_society_instructions.json`) are as follows:**
* `id`: {assistant\_role\_index}\_{user\_role\_index}\_{task\_index}, for example 001_002_003 refers to assistant role 1, user role 2, and task 3 from our text assistant role names, user role names and task text files.
* `role_1`: assistant role
* `role_2`: user role
* `original_task`: the general assigned task for the assistant and user to cooperate on.
* `specified_task`: the task after task specifier, this task is more specific than the original task.
* `role_1_response`: user response text before the instruction.
* `role_1_message_id`: message ID in the full raw conversation.
* `instruction`: describes the task the assistant is supposed to perform.
* `input`: provides further context or information for the requested instruction.
* `output`: the answer to the instruction as generated by 'gpt-3.5-turbo'
* `termination_reason`: refers to the reason of termination of the chat.
**The data fields for chat format (`ai_society_chat.tar.gz`) are as follows:**
* `input`: {assistant\_role\_index}\_{user\_role\_index}\_{task\_index}, for example 001_002_003 refers to assistant role 1, user role 2, and task 3 from our text assistant role names, user role names and task text files.
* `role_1`: assistant role
* `role_2`: user role
* `original_task`: the general assigned task for the assistant and user to cooperate on.
* `specified_task`: the task after task specifier, this task is more specific than the original task.
* `message_k`: refers to the k<sup>_th_</sup> message of the conversation.
* `role_type`: refers to whether the agent is an assistant or a user.
* `role_name`: refers to the assigned assistant/user role.
* `role`: refers to the role of the agent during the message for openai api. [usually not needed]
* `content`: refers to the content of the message.
* `termination_reason`: refers to the reason of termination of the chat.
* `num_messages`: refers to the total number of messages in the chat.
**Download in python**
```
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="camel-ai/ai_society", repo_type="dataset", filename="ai_society_chat.tar.gz",
local_dir="datasets/", local_dir_use_symlinks=False)
```
### Citation
```
@misc{li2023camel,
title={CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society},
author={Guohao Li and Hasan Abed Al Kader Hammoud and Hani Itani and Dmitrii Khizbullin and Bernard Ghanem},
year={2023},
eprint={2303.17760},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
```
## Disclaimer:
This data was synthetically generated by gpt-3.5-turbo and might contain incorrect information. The dataset is there only for research purposes.
---
license: cc-by-nc-4.0
---
# **CAMEL:面向大规模语言模型社群心智探索的对话式智能体**
- **Github:** https://github.com/lightaime/camel
- **Website:** https://www.camel-ai.org/
- **arXiv 论文:** https://arxiv.org/abs/2303.17760
## 数据集概述
AI Society数据集包含2.5万组由两个gpt-3.5-turbo智能体生成的对话。该数据集通过角色扮演生成:共涵盖50种用户角色与50种助手角色的组合,且每一组角色组合均对应超过10项任务。
我们提供两种数据格式:一种为「对话(chat)」格式,对应文件为`ai_society_chat.tar.gz`,内含遵循对话指令格式的会话数据;另一种为「指令(instruction)」格式,对应文件为`ai_society_instructions.json`。
## 数据字段
### 指令格式(`ai_society_instructions.json`)的数据字段如下:
* `id`:格式为`{助手角色索引}_{用户角色索引}_{任务索引}`,例如`001_002_003`代表来自角色名称文本文件、用户角色名称文本文件与任务文本文件中的助手角色1、用户角色2与任务3。
* `role_1`:助手角色
* `role_2`:用户角色
* `original_task`:分配给助手与用户协作完成的通用任务
* `specified_task`:经任务细化后的具体任务,比原始任务更明确
* `role_1_response`:指令生成前的用户回复文本
* `role_1_message_id`:完整原始对话中的消息ID
* `instruction`:描述助手应执行的任务
* `input`:为请求的指令提供进一步的上下文或信息
* `output`:由gpt-3.5-turbo生成的指令对应答案
* `termination_reason`:对话终止的原因
### 对话格式(`ai_society_chat.tar.gz`)的数据字段如下:
* `input`:格式为`{助手角色索引}_{用户角色索引}_{任务索引}`,例如`001_002_003`代表来自角色名称文本文件、用户角色名称文本文件与任务文本文件中的助手角色1、用户角色2与任务3。
* `role_1`:助手角色
* `role_2`:用户角色
* `original_task`:分配给助手与用户协作完成的通用任务
* `specified_task`:经任务细化后的具体任务,比原始任务更明确
* `message_k`:代表对话中的第k条消息
* `role_type`:标识智能体为助手或用户
* `role_name`:分配的助手/用户角色名称
* `role`:该消息中智能体在OpenAI API中的角色标识[通常无需使用]
* `content`:消息的具体内容
* `termination_reason`:对话终止的原因
* `num_messages`:对话中的总消息数
## Python 下载方式
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="camel-ai/ai_society", repo_type="dataset", filename="ai_society_chat.tar.gz",
local_dir="datasets/", local_dir_use_symlinks=False)
### 引用格式
@misc{li2023camel,
title={CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society},
author={Guohao Li and Hasan Abed Al Kader Hammoud and Hani Itani and Dmitrii Khizbullin and Bernard Ghanem},
year={2023},
eprint={2303.17760},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
## 免责声明
本数据集由gpt-3.5-turbo合成生成,可能包含错误信息,仅用于研究用途。
---
许可证:CC BY-NC 4.0
---
提供机构:
maas
创建时间:
2025-09-04



