SomyaSaraswati/psychoanalysis-dataset-100k

Name: SomyaSaraswati/psychoanalysis-dataset-100k
Creator: SomyaSaraswati
Published: 2025-09-26 10:57:21
License: 暂无描述

Hugging Face2025-09-26 更新2025-10-25 收录

下载链接：

https://hf-mirror.com/datasets/SomyaSaraswati/psychoanalysis-dataset-100k

下载链接

链接失效反馈

官方服务：

资源简介：

Psychoanalysis Synthetic Instruction Dataset (100k)是一个心理分析合成指令数据集，包含10万行数据，分为10个片段，每个片段包含1万行JSONL格式数据。数据集适用于心理分析反思/治疗风格对话领域，涵盖英语和印度语境下的Hinglish（拉丁文写的印地语）。数据集的结构包括聊天风格的messages，以及instruction/input/output，safety和metadata字段。该数据集仅供教育用途，不提供临床建议。数据集只有一个训练集，验证集需要通过train_test_split方法创建。数据生成采用了合成模板和插槽填充方式，不包含诊断/用药指导，并包括升级触发器。

Psychoanalysis Synthetic Instruction Dataset (v1, 100k) is a dataset for psychoanalytic reflection/therapy-style dialogues, containing 100,000 rows of data; 10 shards × 10k JSONL each. The dataset is designed for the domain of psychoanalytic reflection and therapy-style dialogues in the context of English and Hinglish (India context). The schema of the dataset includes chat-style messages, instruction/input/output, safety, and metadata. The dataset is for educational purposes only and does not provide clinical advice. There is only a train split, and validation sets need to be created downstream with train_test_split. Data generation uses synthetic templates and slot-filling, with no diagnosis/medication guidance included, and includes escalation triggers.

提供机构：

SomyaSaraswati

5,000+

优质数据集

54 个

任务类型

进入经典数据集