
ai_society

ModelScope community, updated 2026-01-06, indexed 2025-09-06
Download link:
https://modelscope.cn/datasets/camel-ai/ai_society
Description:
# **CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society**

- **Github:** https://github.com/lightaime/camel
- **Website:** https://www.camel-ai.org/
- **Arxiv Paper:** https://arxiv.org/abs/2303.17760

## Dataset Summary

The AI Society dataset is composed of 25K conversations between two gpt-3.5-turbo agents. It was obtained by running role-playing for every combination of 50 user roles and 50 assistant roles, with each combination run on 10 tasks (50 × 50 × 10 = 25,000 conversations).

We provide two formats. The "chat" format is the `ai_society_chat.tar.gz` file, which contains the conversations in a conversational instruction-following format. The "instruction" format is the `ai_society_instructions.json` file.

## Data Fields

**The data fields for instructions format (`ai_society_instructions.json`) are as follows:**

* `id`: `{assistant_role_index}_{user_role_index}_{task_index}`; for example, `001_002_003` refers to assistant role 1, user role 2, and task 3 from our assistant role names, user role names, and task text files.
* `role_1`: assistant role
* `role_2`: user role
* `original_task`: the general task assigned to the assistant and user to cooperate on.
* `specified_task`: the task after the task specifier; it is more specific than the original task.
* `role_1_response`: user response text before the instruction.
* `role_1_message_id`: message ID in the full raw conversation.
* `instruction`: describes the task the assistant is supposed to perform.
* `input`: provides further context or information for the requested instruction.
* `output`: the answer to the instruction as generated by 'gpt-3.5-turbo'.
* `termination_reason`: the reason the chat was terminated.
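The `id` layout described above can be split back into its three indices. The following is a minimal sketch, assuming every `id` has exactly three underscore-separated, zero-padded integers as in the `001_002_003` example; the names `RecordId` and `parse_record_id` are hypothetical helpers introduced for illustration and are not part of the dataset or the CAMEL code.

```python
from typing import NamedTuple


class RecordId(NamedTuple):
    """Hypothetical container for the three indices encoded in an `id`."""
    assistant_role_index: int
    user_role_index: int
    task_index: int


def parse_record_id(record_id: str) -> RecordId:
    """Split an id such as '001_002_003' into three integer indices."""
    parts = record_id.split("_")
    if len(parts) != 3:
        raise ValueError(
            f"expected three underscore-separated parts, got {record_id!r}"
        )
    # int() drops the zero padding: '001' -> 1
    assistant, user, task = (int(part) for part in parts)
    return RecordId(assistant, user, task)


if __name__ == "__main__":
    # Example id taken from the field description above.
    print(parse_record_id("001_002_003"))
```

A named tuple keeps the order given in the field description (assistant, user, task) while still letting callers access each index by name.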
**The data fields for chat format (`ai_society_chat.tar.gz`) are as follows:**

* `input`: `{assistant_role_index}_{user_role_index}_{task_index}`; for example, `001_002_003` refers to assistant role 1, user role 2, and task 3 from our assistant role names, user role names, and task text files.
* `role_1`: assistant role
* `role_2`: user role
* `original_task`: the general task assigned to the assistant and user to cooperate on.
* `specified_task`: the task after the task specifier; it is more specific than the original task.
* `message_k`: the k<sup>_th_</sup> message of the conversation.
* `role_type`: whether the agent is an assistant or a user.
* `role_name`: the assigned assistant/user role.
* `role`: the role of the agent during the message for the OpenAI API. [usually not needed]
* `content`: the content of the message.
* `termination_reason`: the reason the chat was terminated.
* `num_messages`: the total number of messages in the chat.

**Download in python**

```
from huggingface_hub import hf_hub_download

# Fetch the chat-format archive from the Hugging Face Hub into ./datasets/
hf_hub_download(
    repo_id="camel-ai/ai_society",
    repo_type="dataset",
    filename="ai_society_chat.tar.gz",
    local_dir="datasets/",
    local_dir_use_symlinks=False,
)
```

### Citation

```
@misc{li2023camel,
      title={CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society},
      author={Guohao Li and Hasan Abed Al Kader Hammoud and Hani Itani and Dmitrii Khizbullin and Bernard Ghanem},
      year={2023},
      eprint={2303.17760},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}
```

## Disclaimer:

This data was synthetically generated by gpt-3.5-turbo and might contain incorrect information. The dataset is intended for research purposes only.

License: cc-by-nc-4.0
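For the chat format, the `message_k` keys have to be read in numeric order, since a plain string sort would place `message_10` before `message_2`. The sketch below assumes each chat record is a dictionary whose `message_k` values are themselves dictionaries holding `role_type`, `role_name`, `role`, and `content`; that nesting is an assumption based on the field list, and `ordered_messages` is a hypothetical helper, not part of the dataset tooling.

```python
import re
from typing import Any, Dict, List

# Matches keys such as 'message_1', 'message_12' and captures k.
MESSAGE_KEY = re.compile(r"^message_(\d+)$")


def ordered_messages(chat: Dict[str, Any]) -> List[Any]:
    """Return the message_k values of one chat record, sorted by k."""
    numbered = []
    for key, value in chat.items():
        match = MESSAGE_KEY.match(key)
        if match:
            numbered.append((int(match.group(1)), value))
    # Sort on the integer k only, so message_2 precedes message_10.
    numbered.sort(key=lambda pair: pair[0])
    return [message for _, message in numbered]


if __name__ == "__main__":
    # Hypothetical record with made-up contents, for illustration only.
    sample = {
        "role_1": "assistant role",
        "role_2": "user role",
        "num_messages": 2,
        "message_2": {"role_type": "ASSISTANT", "content": "second"},
        "message_1": {"role_type": "USER", "content": "first"},
    }
    for message in ordered_messages(sample):
        print(message["role_type"], message["content"])
```

Non-message keys such as `role_1` or `num_messages` are skipped by the pattern, so the helper can be applied to a whole record without filtering it first.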

Provider:
maas
Created:
2025-09-04