hysong/MentalBench

Name: hysong/MentalBench
Creator: hysong
Published: 2026-04-06 06:36:57
License: No description available

Hugging Face. Updated: 2026-04-06. Indexed: 2026-04-12.

Download link:

https://hf-mirror.com/datasets/hysong/MentalBench

Resource description:

---
task_categories:
- question-answering
language:
- en
tags:
- medical
pretty_name: MentalBench
size_categories:
- 10K<n<100K
---

# MentalBench: A Benchmark for Evaluating Psychiatric Diagnostic Capability of Large Language Models

## 🌟 Overview

**MentalBench** is a comprehensive benchmark for evaluating the psychiatric diagnostic capabilities of large language models (LLMs). As the use of LLMs in healthcare expands, ensuring their reliability in sensitive domains such as psychiatry is crucial. MentalBench provides a robust evaluation framework, grounded in real-world psychiatric knowledge. To facilitate deeper reasoning and grounded evaluation, this benchmark is integrated with MentalKG, a specialized knowledge graph structured for psychiatric domain knowledge.

## 🎯 Question Types

| Type | Description | Difficulty | Number of Samples |
|------|-------------|------------|-------------------|
| **Type 1** | Medical Chart → Single Answer | Low | 1,725 |
| **Type 2** | Patient Self-Report → Single Answer | Medium | 3,450 |
| **Type 3** | Ambiguous Type → Multiple Answer | High | 6,525 |
| **Type 4** | Clear Type → Single Answer | High | 13,050 |

## 📝 Citation

If you find MentalBench and MentalKG useful for your research, please cite our paper:

```bibtex
@article{song2026mentalbench,
  title={MentalBench: A Benchmark for Evaluating Psychiatric Diagnostic Capability of Large Language Models},
  author={Song, Hoyun and Kang, Migyeong and Shin, Jisu and Kim, Jihyun and Park, Chanbi and Yoo, Hangyeol and An, Jihyun and Oh, Alice and Han, Jinyoung and Lim, KyungTae},
  journal={arXiv preprint arXiv:2602.12871},
  year={2026}
}
```
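As a quick sanity check, the sample counts in the Question Types table can be tallied and compared with the `size_categories` value declared in the card metadata. The sketch below is illustrative only: the dictionary name `QUESTION_TYPE_COUNTS` is hypothetical, the numbers are copied by hand from the table, and it assumes the four listed types cover the whole release, which the card does not state explicitly.

```python
# Sample counts copied by hand from the Question Types table.
# Assumption: these four types make up the entire benchmark.
QUESTION_TYPE_COUNTS = {
    "Type 1": 1725,   # Medical Chart -> Single Answer (Low)
    "Type 2": 3450,   # Patient Self-Report -> Single Answer (Medium)
    "Type 3": 6525,   # Ambiguous Type -> Multiple Answer (High)
    "Type 4": 13050,  # Clear Type -> Single Answer (High)
}

# Total number of samples across all question types.
total = sum(QUESTION_TYPE_COUNTS.values())

# Fraction of the benchmark contributed by each question type.
shares = {name: count / total for name, count in QUESTION_TYPE_COUNTS.items()}

for name, count in QUESTION_TYPE_COUNTS.items():
    print(f"{name}: {count:>6,} samples ({shares[name]:.1%})")
print(f"Total: {total:,} samples")

# The card metadata declares size_categories: 10K<n<100K.
in_declared_bucket = 10_000 < total < 100_000
print(f"Within 10K<n<100K: {in_declared_bucket}")
```

Under that assumption the four types sum to 24,750 samples, which is consistent with the declared `10K<n<100K` size category, and the two high-difficulty types account for most of the benchmark.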

Provided by:

hysong
