UKPLab/MAGneT
Source: Hugging Face (updated 2026-04-23, indexed 2026-04-05)
Download link:
https://hf-mirror.com/datasets/UKPLab/MAGneT
---
license: gpl-2.0
language:
- en
size_categories:
- n<1K
---
# Dataset Card
- **Paper:** [MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions](https://arxiv.org/abs/2509.04183)
- **Language(s) (NLP):** English
- **License:** GPL-2.0
# Dataset Summary
MAGneT is a synthetic counseling session dataset generated with a novel multi-agent framework comprising specialized response agents (reflection, questioning, solutions, normalizing, psycho-education), a technique agent, a CBT agent, and a response generation agent. The generations are conditioned on client profiles taken from [Lee et al., 2024](https://aclanthology.org/2024.findings-emnlp.832/). Unlike prior single-agent approaches, MAGneT better captures the structure and nuance of real counseling. Empirically, MAGneT substantially outperforms existing methods: experts prefer MAGneT-generated sessions in 77.2% of cases, and sessions generated by MAGneT score 3.2% higher on general counseling skills and 4.3% higher on CBT-specific skills on the Cognitive Therapy Rating Scale (CTRS). An open-source Llama3-8B-Instruct model fine-tuned on MAGneT-generated data also outperforms models fine-tuned on baseline synthetic datasets by 6.9% on average on CTRS.

*An overview of MAGneT: the counselor response is generated using specialized response agents (reflection, questioning, solutions, normalizing, psycho-education), a technique agent, a CBT agent, and a response generation agent.*

---
## Dataset Structure
Each session is a JSON object with the following fields:
```json
{
  "AI_client": {
    "name": "...",
    "age": "...",
    "presenting_problem": "...",
    "reason_for_counseling": "..."
  },
  "AI_counselor": {
    "CBT": "...",
    "client_information": "...",
    "init_history_counselor": "...",
    "Response": "..."
  },
  "dialogue": [
    { "role": "counselor", "message": "..." },
    { "role": "client", "message": "..." }
  ]
}
```
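
The structure above can be checked mechanically before further processing. The following is a minimal sketch, assuming the field names are exactly those shown and that `role` only takes the two values in the example. `validate_session`, `render_transcript`, and `example_session` are hypothetical names introduced for illustration; they are not part of the dataset or of any library.

```python
from typing import Any

# Nested fields copied from the structure shown above.
EXPECTED_FIELDS = {
    "AI_client": ["name", "age", "presenting_problem", "reason_for_counseling"],
    "AI_counselor": ["CBT", "client_information", "init_history_counselor", "Response"],
}
# Assumption: only these two roles occur in "dialogue".
KNOWN_ROLES = {"counselor", "client"}


def validate_session(session: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the session matches the layout."""
    problems = []
    for section, fields in EXPECTED_FIELDS.items():
        block = session.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing or non-object section: {section}")
            continue
        for field in fields:
            if field not in block:
                problems.append(f"missing field: {section}.{field}")
    dialogue = session.get("dialogue")
    if not isinstance(dialogue, list):
        problems.append("missing or non-list field: dialogue")
        return problems
    for i, turn in enumerate(dialogue):
        if not isinstance(turn, dict) or "message" not in turn:
            problems.append(f"dialogue[{i}] has no message")
        elif turn.get("role") not in KNOWN_ROLES:
            problems.append(f"dialogue[{i}] has unexpected role: {turn.get('role')!r}")
    return problems


def render_transcript(session: dict[str, Any]) -> str:
    """Format the dialogue as one 'Role: message' line per turn."""
    return "\n".join(
        f"{turn['role'].capitalize()}: {turn['message']}" for turn in session["dialogue"]
    )


# Placeholder session written for this example; the values are not taken from the dataset.
example_session = {
    "AI_client": {
        "name": "...",
        "age": "...",
        "presenting_problem": "...",
        "reason_for_counseling": "...",
    },
    "AI_counselor": {
        "CBT": "...",
        "client_information": "...",
        "init_history_counselor": "...",
        "Response": "...",
    },
    "dialogue": [
        {"role": "counselor", "message": "Hello."},
        {"role": "client", "message": "Hi."},
    ],
}

print(validate_session(example_session))
print(render_transcript(example_session))
```

Returning a list of problems rather than raising on the first one makes it easier to survey a whole file at once, which may be useful if the actual records deviate from the layout documented here.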
### Client Profiles
Client profiles are sourced from the **CACTUS** dataset:
> *CACTUS: Counseling and Cognitive resTrUcturing Simulation*. In *Findings of EMNLP 2024*. [[ACL Anthology]](https://aclanthology.org/2024.findings-emnlp.832/)
---
## Usage
```python
from datasets import load_dataset
ds = load_dataset("UKPLab/MAGneT")
```
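
Because the summary mentions fine-tuning a model on the generated sessions, one plausible preprocessing step is to turn each session into (history, counselor response) examples. The sketch below is an assumption about how this might be done, not the authors' pipeline. `to_training_pairs` is a hypothetical helper that operates on plain dicts in the layout described under Dataset Structure, so it runs without downloading anything.

```python
def to_training_pairs(session):
    """Yield (history, response) for every counselor turn in a session.

    history is a list of all turns that precede the counselor turn;
    response is the text of that counselor turn.
    """
    history = []
    for turn in session["dialogue"]:
        if turn["role"] == "counselor":
            # Copy the list so later appends do not change pairs already yielded.
            yield list(history), turn["message"]
        history.append(turn)


# Placeholder dialogue written for this example; not taken from the dataset.
sample = {
    "dialogue": [
        {"role": "counselor", "message": "What brings you here today?"},
        {"role": "client", "message": "I have been feeling anxious."},
        {"role": "counselor", "message": "That sounds difficult."},
    ]
}

pairs = list(to_training_pairs(sample))
for history, response in pairs:
    print(len(history), "->", response)
```

To apply this to the loaded data, iterate over the rows of one split of `ds`. The split names are not stated in this card, so inspect `ds` first, and confirm that the loaded rows actually follow the documented layout.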
## Citation
If you use MAGneT or this dataset in your work, please cite:
```bibtex
@misc{mandal2025magnetcoordinatedmultiagentgeneration,
  title={MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions},
  author={Aishik Mandal and Tanmoy Chakraborty and Iryna Gurevych},
  year={2025},
  eprint={2509.04183},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.04183},
}
```
## Contact
For questions or feedback regarding this dataset, please contact: [aishik.mandal@tu-darmstadt.de](mailto:aishik.mandal@tu-darmstadt.de)
## Ethical Considerations
- All client profiles and session dialogues are **synthetically generated** and do not correspond to real individuals.
- This dataset is intended strictly for research purposes. It should not be used to deploy autonomous mental health interventions without appropriate clinical oversight.
- Models fine-tuned on this dataset should be evaluated carefully before any clinical application.

Provider: UKPLab



