UKPLab/MAGneT

Source: Hugging Face (updated 2026-04-23, indexed 2026-04-05)

Download link: https://hf-mirror.com/datasets/UKPLab/MAGneT
---
license: gpl-2.0
language:
- en
size_categories:
- n<1K
---

# Dataset Card

Paper: [MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions](https://arxiv.org/abs/2509.04183)

Language(s) (NLP): English

License: gpl-2.0

# Dataset Summary

MAGneT is a synthetic counseling session dataset generated with a novel multi-agent framework comprising specialized response agents (reflection, questioning, solutions, normalizing, psycho-education), a technique agent, a CBT agent, and a response generation agent. The generations are conditioned on client profiles taken from [Lee et al., 2024](https://aclanthology.org/2024.findings-emnlp.832/). Unlike prior single-agent approaches, MAGneT better captures the structure and nuance of real counseling.

Empirically, MAGneT substantially outperforms existing methods: experts prefer MAGneT-generated sessions in 77.2% of cases, and sessions generated by MAGneT score 3.2% higher on general counseling skills and 4.3% higher on CBT-specific skills on the Cognitive Therapy Rating Scale (CTRS). An open-source Llama3-8B-Instruct model fine-tuned on MAGneT-generated data also outperforms models fine-tuned on baseline synthetic datasets by 6.9% on average on the CTRS.

![MAGneT Framework](MAGneT.png)

*An overview of MAGneT. The counselor response is generated using specialized response agents (reflection, questioning, solutions, normalizing, psycho-education), a technique agent, a CBT agent, and a response generation agent.*

---

## Dataset Structure

Each session is a JSON object with the following fields:

```json
{
  "AI_client": {
    "name": "...",
    "age": "...",
    "presenting_problem": "...",
    "reason_for_counseling": "..."
  },
  "AI_counselor": {
    "CBT": "...",
    "client_information": "...",
    "init_history_counselor": "...",
    "Response": "..."
  },
  "dialogue": [
    { "role": "counselor", "message": "..." },
    { "role": "client", "message": "..." }
  ]
}
```

### Client Profiles

Client profiles are sourced from the **CACTUS** dataset:

> *CACTUS: Counseling and Cognitive resTrUcturing Simulation*. In *Findings of EMNLP 2024*. [[ACL Anthology]](https://aclanthology.org/2024.findings-emnlp.832/)

---

## Usage

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
ds = load_dataset("UKPLab/MAGneT")
```

## Citation

If you use MAGneT or this dataset in your work, please cite:

```bibtex
@misc{mandal2025magnetcoordinatedmultiagentgeneration,
  title={MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions},
  author={Aishik Mandal and Tanmoy Chakraborty and Iryna Gurevych},
  year={2025},
  eprint={2509.04183},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.04183},
}
```

## Contact

For questions or feedback regarding this dataset, please contact: [aishik.mandal@tu-darmstadt.de](mailto:aishik.mandal@tu-darmstadt.de)

## Ethical Considerations

- All client profiles and session dialogues are **synthetically generated** and do not correspond to real individuals.
- This dataset is intended strictly for research purposes. It should not be used to deploy autonomous mental health interventions without appropriate clinical oversight.
- Models fine-tuned on this dataset should be evaluated carefully before any clinical application.
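
As a minimal sketch of working with the session structure described under Dataset Structure, the snippet below counts turns per role and renders a dialogue as a plain-text transcript. It runs on a hand-written `sample_session` whose values are invented placeholders; the helper names `turns_by_role` and `to_transcript` are hypothetical, and whether `load_dataset` exposes each row in exactly this nested form is an assumption that should be checked against the loaded data.

```python
from collections import Counter

# Hypothetical session mirroring the schema in the dataset card.
# All values are invented placeholders, not real dataset content.
sample_session = {
    "AI_client": {
        "name": "Alex",
        "age": "29",
        "presenting_problem": "placeholder problem",
        "reason_for_counseling": "placeholder reason",
    },
    "AI_counselor": {
        "CBT": "placeholder",
        "client_information": "placeholder",
        "init_history_counselor": "placeholder",
        "Response": "placeholder",
    },
    "dialogue": [
        {"role": "counselor", "message": "Hello, what brings you here today?"},
        {"role": "client", "message": "I have been feeling stressed at work."},
        {"role": "counselor", "message": "That sounds difficult."},
    ],
}


def turns_by_role(session):
    """Count how many dialogue turns each role contributes."""
    return Counter(turn["role"] for turn in session["dialogue"])


def to_transcript(session):
    """Render the dialogue as one 'Role: message' line per turn."""
    return "\n".join(
        f"{turn['role'].capitalize()}: {turn['message']}"
        for turn in session["dialogue"]
    )


counts = turns_by_role(sample_session)
transcript = to_transcript(sample_session)
print(counts)
print(transcript)
```

The same two helpers could be applied to each record of the loaded dataset, provided the records carry a `dialogue` list of `role`/`message` pairs as shown in the card.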

Provider: UKPLab