ecnu-icalk/PsychEval

Source: Hugging Face. Updated 2026-01-13. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/ecnu-icalk/PsychEval
Description:
PsychEval is a comprehensive benchmark for evaluating Large Language Models (LLMs) in psychological counseling. Unlike existing benchmarks, it emphasizes longitudinal, multi-session counseling and competence across multiple therapies. Each case covers a full counseling cycle of 6-10 sessions, divided into three stages: case conceptualization, core intervention, and consolidation. The dataset supports evaluation across different therapeutic approaches (e.g., CBT, SFBT) and includes extensive professional skill annotations (677 meta-skills and 4,577 atomic skills). PsychEval also introduces a multi-agent evaluation framework, comprising a Client Simulator and a Supervisor Agent, to make assessments reliable and realistic.
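The case structure described above (6-10 sessions per case, three stages) can be illustrated with a small sketch. The class names, field names, and stage identifiers below are assumptions made for illustration and are not the dataset's actual schema; the check that stages appear in order is likewise an assumption drawn from the description, not a documented rule.

```python
from dataclasses import dataclass, field
from typing import List

# The three stages named in the description; these identifiers are hypothetical.
STAGES = ("case_conceptualization", "core_intervention", "consolidation")


@dataclass
class Session:
    index: int  # 1-based position within the case
    stage: str  # expected to be one of STAGES
    skills: List[str] = field(default_factory=list)  # hypothetical skill labels


@dataclass
class Case:
    therapy: str  # e.g. "CBT" or "SFBT"
    sessions: List[Session]


def validate_case(case: Case) -> List[str]:
    """Return a list of problems; an empty list means the case fits the description."""
    problems = []
    if not 6 <= len(case.sessions) <= 10:
        problems.append(f"expected 6-10 sessions, got {len(case.sessions)}")
    ranks = []
    for s in case.sessions:
        if s.stage not in STAGES:
            problems.append(f"session {s.index}: unknown stage {s.stage!r}")
        else:
            ranks.append(STAGES.index(s.stage))
    # Assumption: stages follow the order listed in the description.
    if ranks != sorted(ranks):
        problems.append("stages are out of order")
    if ranks and set(ranks) != set(range(len(STAGES))):
        problems.append("not all three stages are present")
    return problems
```

A record loaded from the real dataset would first need to be mapped onto these hypothetical classes; the field names in the published files may differ.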
Provided by:
ecnu-icalk