mental_health_counseling_conversations
收藏魔搭社区2025-12-05 更新2025-11-03 收录
下载链接:
https://modelscope.cn/datasets/Amod/mental_health_counseling_conversations
下载链接
链接失效反馈官方服务:
资源简介:

# Amod/mental_health_counseling_conversations
This dataset is a compilation of high-quality, real one-on-one mental health counseling conversations between individuals and licensed professionals. Each exchange is structured as a clear question–answer pair, making it directly suitable for fine-tuning or instruction-tuning language models that need to handle sensitive, empathetic, and contextually aware dialogue.
Since its public release in 2023, it has been downloaded over 100,000 times (As of Nov 2025); hitting 10,000+ downloads in Nov 2025 alone. The data is provided in a clean format, allowing for straightforward integration into training pipelines with minimal preprocessing.
## Table of Contents
- [Dataset Description](#dataset-description)
- [Dataset Summary](#dataset-summary)
- [Supported Tasks](#supported-tasks)
- [Languages](#languages)
- [Dataset Structure](#dataset-structure)
- [Data Instances](#data-instances)
- [Data Fields](#data-fields)
- [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
- [Curation Rationale](#curation-rationale)
- [Source Data](#source-data)
- [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Licence and Commercial-Use Terms](#licence-and-commercial-use-terms)
## Dataset Description
- **Point of Contact:** amodsahabandu@icloud.com
### Dataset Summary
This dataset is a collection of real counselling question-and-answer pairs taken from two public mental-health platforms.
It is intended for training and evaluating language models that provide safer, context-aware mental-health responses.
### Supported Tasks
Text generation and question-answering with an advice-giving focus.
### Languages
English (en)
## Dataset Structure
### Data Instances
Each instance contains:
- **Context** – the user’s question
- **Response** – the psychologist’s answer
### Data Fields
- `Context` *:string*
- `Response` *:string*
### Data Splits
No predefined splits. Users may create their own.
## Dataset Creation
### Curation Rationale
Created to advance AI systems that deliver compassionate, evidence-based mental-health guidance.
All data were anonymised and retained verbatim to preserve conversational integrity.
### Source Data
Collected directly from two publicly accessible counselling websites; no private or paid sources were used.
### Personal and Sensitive Information
Content is sensitive by nature. All personally identifiable information has been removed during curation.
## Licence and Commercial-Use Terms
This dataset is released under **RAIL-D**.
**Free non-commercial research use**
Academic, scientific, educational and other non-profit uses are royalty-free. Users must comply with the ethical restrictions in `LICENSE-RAIL-D.txt`.
**Commercial use — mandatory donation**
Any commercial use requires a donation of **USD 100 or more** to the CCC Foundation mental-health helpline.
Donation page: <https://1333.lk/donations>
Email proof of donation to **amodsahabandu@icloud.com** within thirty (30) days before or after first commercial deployment.
Commercial rights terminate automatically if proof is not provided.
**No content modification**
Filtering or subsetting is allowed, but individual question-and-answer pairs must not be rewritten, deleted, or altered.
**Redistribution**
Permitted only in the original, unmodified form and with this licence attached.
The full legal text is provided in `LICENSE-RAIL-D.txt` within this repository.
# Amod/心理健康咨询对话数据集(mental_health_counseling_conversations)
本数据集收录了来访者与持证专业心理咨询师之间的高质量真实一对一心理健康咨询对话。每一轮对话均以清晰的问答对形式组织,可直接用于需要处理敏感、共情且具备上下文感知能力的对话的大语言模型(Large Language Model)微调(fine-tuning)或指令微调(instruction-tuning)。
自2023年公开发布以来,截至2025年11月,本数据集累计下载量已超10万次,仅2025年11月单月下载量便突破1万次。该数据集格式规整,可直接集成至训练流程中,仅需极少量预处理工作。
## 目录
- [数据集描述](#dataset-description)
- [数据集摘要](#dataset-summary)
- [支持任务](#supported-tasks)
- [语言](#languages)
- [数据集结构](#dataset-structure)
- [数据实例](#data-instances)
- [数据字段](#data-fields)
- [数据划分](#data-splits)
- [数据集构建](#dataset-creation)
- [构建初衷](#curation-rationale)
- [数据源](#source-data)
- [个人与敏感信息](#personal-and-sensitive-information)
- [许可与商业使用条款](#licence-and-commercial-use-terms)
## 数据集描述
- **联系人**:amodsahabandu@icloud.com
### 数据集摘要
本数据集收录了来自两个公开心理健康平台的真实心理咨询问答对,旨在用于训练与评估能够生成更安全、具备上下文感知能力的心理健康相关回复的大语言模型。
### 支持任务
以提供建议为核心的文本生成与问答任务。
### 语言
英语(en)
## 数据集结构
### 数据实例
每个数据实例包含:
- **上下文(Context)**:来访者的提问
- **回复(Response)**:心理咨询师的解答
### 数据字段
- `Context` *:字符串类型*
- `Response` *:字符串类型*
### 数据划分
无预设划分,使用者可自行创建划分方式。
## 数据集构建
### 构建初衷
本数据集旨在推动能够提供共情且基于循证依据的心理健康指导的人工智能系统发展。所有数据均已完成匿名化处理,并完整保留原始对话内容,以确保对话的完整性。
### 数据源
数据直接采集自两个公开可访问的心理咨询网站,未使用任何私有或付费来源的内容。
### 个人与敏感信息
数据集内容本身具有敏感性,所有可识别个人身份的信息均已在数据整理阶段移除。
## 许可与商业使用条款
本数据集采用**RAIL-D**许可协议发布。
**免费非商业研究使用**
学术、科研、教育及其他非商业用途均免版权费,使用者需遵守`LICENSE-RAIL-D.txt`中规定的伦理限制条款。
**商业使用——强制捐赠**
任何商业使用均需向CCC基金会心理健康热线捐赠**100美元及以上**。
捐赠页面:<https://1333.lk/donations>
需在首次商业部署前后30天内,将捐赠证明发送至邮箱**amodsahabandu@icloud.com**。若未按要求提供捐赠证明,商业使用权限将自动终止。
**禁止修改内容**
允许对数据集进行筛选或子集提取,但不得对单个问答对进行重写、删除或修改。
**重新分发**
仅允许以原始未修改的形式,并附带本许可条款进行重新分发。
完整法律文本可在本仓库的`LICENSE-RAIL-D.txt`文件中查看。
提供机构:
maas
创建时间:
2025-09-18
搜集汇总
数据集介绍

背景与挑战
背景概述
该数据集汇集了真实的心理咨询对话,以清晰的问答对形式呈现,适用于微调语言模型以处理敏感且富有同理心的对话。数据集仅支持英文,并采用RAIL-D许可证,其中商业使用需满足特定捐赠要求。
以上内容由遇见数据集搜集并总结生成



