UKPLab/MAGneT

Name: UKPLab/MAGneT
Creator: UKPLab
Published: 2026-04-23 10:30:51
License: not listed here (the dataset card below specifies gpl-2.0)

Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-05.

Download link:

https://hf-mirror.com/datasets/UKPLab/MAGneT


Description:

---
license: gpl-2.0
language:
- en
size_categories:
- n<1K
---

# Dataset Card

Paper: [MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions](https://arxiv.org/abs/2509.04183)

Language(s) (NLP): English

License: gpl-2.0

# Dataset Summary

MAGneT is a synthetic counseling session dataset generated using a novel multi-agent framework that includes specialized response agents (reflection, questioning, solutions, normalizing, psycho-education), a technique agent, a CBT agent, and a response generation agent. The generations are conditioned on client profiles taken from [Lee et al., 2024](https://aclanthology.org/2024.findings-emnlp.832/). Unlike prior single-agent approaches, MAGneT better captures the structure and nuance of real counseling.

Empirically, MAGneT substantially outperforms existing methods: experts prefer MAGneT-generated sessions in 77.2% of cases, and sessions generated by MAGneT score 3.2% higher on general counseling skills and 4.3% higher on CBT-specific skills on the Cognitive Therapy Rating Scale (CTRS). An open-source Llama3-8B-Instruct model fine-tuned on MAGneT-generated data also outperforms models fine-tuned on baseline synthetic datasets by 6.9% on average on CTRS.

![MAGneT Framework](MAGneT.png)

*An overview of MAGneT. The counselor response is generated using specialized response agents (reflection, questioning, solutions, normalizing, psycho-education), a technique agent, a CBT agent, and a response generation agent.*

---

## Dataset Structure

Each session is a JSON object with the following fields:

```json
{
  "AI_client": {
    "name": "...",
    "age": "...",
    "presenting_problem": "...",
    "reason_for_counseling": "..."
  },
  "AI_counselor": {
    "CBT": "...",
    "client_information": "...",
    "init_history_counselor": "...",
    "Response": "..."
  },
  "dialogue": [
    { "role": "counselor", "message": "..." },
    { "role": "client", "message": "..." }
  ]
}
```

### Client Profiles

Client profiles are sourced from the **CACTUS** dataset:

> *CACTUS: Counseling and Cognitive resTrUcturing Simulation*. In *Findings of EMNLP 2024*. [[ACL Anthology]](https://aclanthology.org/2024.findings-emnlp.832/)

---

## Usage

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
ds = load_dataset("UKPLab/MAGneT")
```

## Citation

If you use MAGneT or this dataset in your work, please cite:

```bibtex
@misc{mandal2025magnetcoordinatedmultiagentgeneration,
  title={MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions},
  author={Aishik Mandal and Tanmoy Chakraborty and Iryna Gurevych},
  year={2025},
  eprint={2509.04183},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.04183},
}
```

## Contact

For questions or feedback regarding this dataset, please contact: [aishik.mandal@tu-darmstadt.de](mailto:aishik.mandal@tu-darmstadt.de)

## Ethical Considerations

- All client profiles and session dialogues are **synthetically generated** and do not correspond to real individuals.
- This dataset is intended strictly for research purposes. It should not be used to deploy autonomous mental health interventions without appropriate clinical oversight.
- Models fine-tuned on this dataset should be evaluated carefully before any clinical application.
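## Example: Reading a Session

The session schema under Dataset Structure can be walked with a few lines of Python. The sketch below is illustrative only: `SAMPLE_SESSION`, `ROLE_MAP`, and `to_chat_messages` are hypothetical names, the two dialogue messages are made-up placeholders rather than dataset content, and the counselor-to-assistant mapping is an assumption about how one might prepare data for fine-tuning a counselor model, not the authors' documented pipeline.

```python
from typing import Any

# Hypothetical sample mirroring the documented schema; values are placeholders.
SAMPLE_SESSION: dict[str, Any] = {
    "AI_client": {
        "name": "...",
        "age": "...",
        "presenting_problem": "...",
        "reason_for_counseling": "...",
    },
    "AI_counselor": {
        "CBT": "...",
        "client_information": "...",
        "init_history_counselor": "...",
        "Response": "...",
    },
    "dialogue": [
        {"role": "counselor", "message": "Hello, what brings you here today?"},
        {"role": "client", "message": "I have been feeling anxious lately."},
    ],
}

# Assumed mapping: the counselor is the model being trained ("assistant"),
# and the client plays the "user".
ROLE_MAP = {"counselor": "assistant", "client": "user"}


def to_chat_messages(session: dict[str, Any]) -> list[dict[str, str]]:
    """Convert a session's dialogue into role/content chat messages."""
    messages = []
    for turn in session["dialogue"]:
        role = ROLE_MAP.get(turn["role"])
        if role is None:
            # Fail loudly on roles outside the documented schema.
            raise ValueError(f"unexpected role: {turn['role']!r}")
        messages.append({"role": role, "content": turn["message"]})
    return messages


if __name__ == "__main__":
    for message in to_chat_messages(SAMPLE_SESSION):
        print(f"{message['role']}: {message['content']}")
```

With the real data, each record returned by `load_dataset("UKPLab/MAGneT")` could presumably be passed to `to_chat_messages`, assuming the loader exposes the `dialogue` field exactly as shown in the schema; that has not been verified here.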


Provider:

UKPLab
