
ai_society_translated

ModelScope community · Updated 2026-01-08 · Indexed 2025-09-06
Download link:
https://modelscope.cn/datasets/camel-ai/ai_society_translated
Resource overview:
# **CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society**

- **Github:** https://github.com/lightaime/camel
- **Website:** https://www.camel-ai.org/
- **Arxiv Paper:** https://arxiv.org/abs/2303.17760

## Dataset Summary

The original AI Society dataset is in English and consists of 25K conversations between two gpt-3.5-turbo agents. It was obtained by running role-playing for every combination of 50 user roles and 50 assistant roles, with each combination run on 10 tasks (50 × 50 × 10 = 25,000 conversations).

We provide translations of the original English dataset into ten languages: Arabic, Chinese, Korean, Japanese, Hindi, Russian, Spanish, French, German, and Italian, each in ".zip" format.

The dataset was translated by prompting gpt-3.5-turbo to translate the presented sentences into a particular language.

**Note:** gpt-3.5-turbo sometimes leaves particular keywords such as "Instruction", "Input", and "Solution" untranslated. Cleaning might therefore be needed, depending on your use case.

## Data Fields

**The data fields for chat format (`ai_society_chat_{language}.zip`) are as follows:**

* `input`: `{assistant_role_index}_{user_role_index}_{task_index}`; for example, `001_002_003` refers to assistant role 1, user role 2, and task 3 from our assistant role names, user role names, and task text files.
* `role_1`: assistant role
* `role_2`: user role
* `original_task`: the general task assigned to the assistant and user to cooperate on.
* `specified_task`: the task after the task specifier has been applied; it is more specific than the original task.
* `message_k`: the k-th message of the conversation.
* `role_type`: whether the agent is an assistant or a user.
* `role_name`: the assigned assistant/user role.
* `role`: the role of the agent for this message in the OpenAI API. [usually not needed]
* `content`: the content of the message.
* `termination_reason`: the reason the chat was terminated.
* `num_messages`: the total number of messages in the chat.

**Download in Python**

```
from huggingface_hub import hf_hub_download

# Set language to one of: ar, zh, ko, ja, hi, ru, es, fr, de, it
language = "zh"

hf_hub_download(
    repo_id="camel-ai/ai_society_translated",
    repo_type="dataset",
    filename=f"ai_society_chat_{language}.zip",
    local_dir="datasets/",
    local_dir_use_symlinks=False,
)
```

### Citation

```
@misc{li2023camel,
      title={CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society},
      author={Guohao Li and Hasan Abed Al Kader Hammoud and Hani Itani and Dmitrii Khizbullin and Bernard Ghanem},
      year={2023},
      eprint={2303.17760},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}
```

## Disclaimer:

This data was synthetically generated by gpt-3.5-turbo and might contain incorrect information. The dataset is intended for research purposes only.

---
license: cc-by-nc-4.0
---
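Two points from the card lend themselves to a short illustration: the `input` identifier format and the note that some keywords may remain in English after translation. The following is only a minimal sketch. The helper names `parse_conversation_id` and `find_untranslated_keywords` are hypothetical and not part of the dataset or of any CAMEL tooling, and the sample text is invented for illustration. It operates on plain strings and assumes nothing about how records are stored inside the zip archives.

```python
# Keywords the dataset card says gpt-3.5-turbo sometimes leaves untranslated.
UNTRANSLATED_KEYWORDS = ("Instruction", "Input", "Solution")


def parse_conversation_id(conversation_id: str) -> dict:
    """Split an ID such as "001_002_003" into its three integer indices.

    Hypothetical helper: follows the documented pattern
    {assistant_role_index}_{user_role_index}_{task_index}.
    """
    parts = conversation_id.split("_")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"unexpected ID format: {conversation_id!r}")
    assistant_idx, user_idx, task_idx = (int(part) for part in parts)
    return {
        "assistant_role_index": assistant_idx,
        "user_role_index": user_idx,
        "task_index": task_idx,
    }


def find_untranslated_keywords(text: str) -> list:
    """Return the English keywords that still occur in a translated message.

    Uses a plain case-sensitive substring check, so it also matches when the
    keyword is directly followed by non-Latin characters.
    """
    return [keyword for keyword in UNTRANSLATED_KEYWORDS if keyword in text]


if __name__ == "__main__":
    print(parse_conversation_id("001_002_003"))
    # Invented Spanish sample in which one keyword was left in English.
    print(find_untranslated_keywords("Instruction: Enumera tres pasos."))
```

Whether a flagged keyword should be replaced, removed, or left alone depends on the use case; the check only locates candidates for cleaning.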

Provider:
maas
Created:
2025-09-04