
dzur658/therapy-conversations-full-small

Source: Hugging Face. Updated 2026-03-21; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/dzur658/therapy-conversations-full-small
Resource description:
---
license: apache-2.0
task_categories:
- token-classification
language:
- en
tags:
- medical
size_categories:
- 1K<n<10K
---

# Therapy Conversations Full Small

## About the dataset

This dataset contains 3161 unique, synthetically generated examples of multi-turn conversations between a patient and a therapist. MiniMax M 2.5 was used to generate the transcripts, and each example also carries associated metadata.

## Understanding an Object in the Dataset

#### Top Level Keys

- `conversation` contains the conversation in JSON format
- `fingerprint` contains metadata about the conversation (demographics, personality traits of both parties, etc.)
- `full text` contains the full text of the transcript in the form in which it would be ingested into the [Knowledge graph generator API](https://github.com/dzur658/therapy-bert/blob/main/knowledge_graph_api.py)
- `knowledge_graph_shard` contains the `entities` and `relations` that MiniMax M 2.5 extracted

#### `conversation`

`conversation` contains a `turns` object. The `turns` object contains sub-objects, each of which is one "turn" of the conversation. Each turn has two keys, `speaker` and `text`. Use this field if you need to inspect lines individually within a conversation; it is pre-parsed for you.

#### `fingerprint`

`fingerprint` contains the following metadata about the conversation:

- `age`: 18-75
- `gender`: male, female, transgender woman, transgender man, non-binary
- `occupation`: software engineer, teacher, nurse, artist, salesperson, retired, student, unemployed, entrepreneur, stay-at-home parent
- `presenting_issue`: anxiety, depression, relationship issues, work stress, grief, self-esteem issues, sexuality issues, trauma, substance abuse, eating disorders, chronic illness, gender dysphoria, identity issues, family conflict, life transitions, body dysmorphia, obsessive-compulsive disorder, phobias, sleep disorders, anger management issues
- `relationship_status`: single, in a relationship, married, divorced, widowed
- `living_situation`: living alone, living with family, living with roommates, living with partner, living in a group home, living in a shelter
- `therapy_modality`: cognitive-behavioral therapy, psychodynamic therapy, humanistic therapy, integrative therapy, mindfulness-based therapy, art therapy, dialectical behavior therapy, acceptance and commitment therapy, eye movement desensitization and reprocessing (EMDR), exposure therapy
- `patient_speaking_style`: verbose, concise, emotional, logical, intellectualizing, narrative, disorganized, rambling, focused, tangential, metaphorical, literal
- `therapist_speaking_style`: empathetic, direct, analytical, supportive, challenging, reflective, encouraging, neutral, collaborative, authoritative, offensive, dismissive, condescending, patronizing, invalidating
- `session`: 1-20 (the number of sessions the patient and therapist have had together)
- `conversation_length`: the requested conversation length (the actual length most likely differs, because the request was not strictly enforced during generation)
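The access pattern described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the author's code: the record below is hand-written to mirror the documented keys, its values (including the `therapist` and `patient` speaker labels) are invented, and the helper names `iter_turns`, `matches_fingerprint`, and `select` are hypothetical. Because the card calls `turns` an "object" of sub-objects without saying whether it is stored as a list or a mapping, the sketch accepts either.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Hypothetical record shaped like the description above. Every value here is
# invented for illustration; none of it is taken from the dataset.
sample_record: Dict[str, Any] = {
    "conversation": {
        "turns": [
            {"speaker": "therapist", "text": "What brings you in today?"},
            {"speaker": "patient", "text": "I have not been sleeping well."},
        ]
    },
    "fingerprint": {
        "age": 34,
        "presenting_issue": "sleep disorders",
        "therapy_modality": "cognitive-behavioral therapy",
        "session": 3,
    },
}


def iter_turns(record: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (speaker, text) pairs from a record's pre-parsed conversation."""
    turns = record["conversation"]["turns"]
    # Assumption: `turns` may be a list of turn objects or a mapping whose
    # values are turn objects; normalize both to a list.
    if isinstance(turns, dict):
        turns = list(turns.values())
    for turn in turns:
        yield turn["speaker"], turn["text"]


def matches_fingerprint(record: Dict[str, Any], **criteria: Any) -> bool:
    """Return True when every given fingerprint field equals the given value."""
    fingerprint = record["fingerprint"]
    return all(fingerprint.get(key) == value for key, value in criteria.items())


def select(records: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Keep only the records whose fingerprint matches all criteria."""
    return [r for r in records if matches_fingerprint(r, **criteria)]


if __name__ == "__main__":
    for speaker, text in iter_turns(sample_record):
        print(f"{speaker}: {text}")
    print(len(select([sample_record], presenting_issue="sleep disorders")))
```

If the rows are loaded through a library such as Hugging Face `datasets`, each row could presumably be passed to these helpers as a plain dictionary, provided the stored field layout matches the description above.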

Provider:
dzur658