THUMedInfo/RareArena
Source: Hugging Face. Updated: 2026-02-26. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/THUMedInfo/RareArena
Resource overview:
RareArena is a comprehensive rare disease diagnosis dataset with nearly 50,000 patients covering more than 4,000 diseases. The dataset is still under development, and the accompanying paper is forthcoming.
The dataset is intended primarily for rare disease diagnosis. Evaluation proceeds in three steps: a model generates its top 5 diagnoses for each case, GPT-4o judges those diagnoses, and the judge's output is parsed to compute top-1 and top-5 recall. To rebuild the data, first reproduce PMC-Patients, then follow the pipeline described in the paper.
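The final step of the evaluation described above, turning parsed judge output into top-1 and top-5 recall, can be sketched as follows. This is a minimal illustration, not the dataset authors' script: it assumes the GPT-4o output for each case has already been parsed into the rank (1 to 5) of the first correct diagnosis, or `None` when none of the five matched. The function name `recall_at_k` and the sample ranks are hypothetical.

```python
from typing import Optional, Sequence


def recall_at_k(first_hit_ranks: Sequence[Optional[int]], k: int) -> float:
    """Return the fraction of cases whose first correct diagnosis is ranked within the top k.

    Assumption: each entry is the 1-based rank of the first correct diagnosis
    as parsed from the judge output, or None if no diagnosis was correct.
    """
    if not first_hit_ranks:
        raise ValueError("no cases to score")
    hits = sum(1 for rank in first_hit_ranks if rank is not None and rank <= k)
    return hits / len(first_hit_ranks)


# Hypothetical parsed results for four cases (not real RareArena data).
ranks = [1, 3, None, 5]
top1 = recall_at_k(ranks, 1)
top5 = recall_at_k(ranks, 5)
print(f"top-1 recall: {top1:.2f}, top-5 recall: {top5:.2f}")
```

Storing only the rank of the first hit keeps the scoring independent of how the judge's text is formatted, so the parsing step can change without touching the metric.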
Provided by:
THUMedInfo
Collected summary
Dataset introduction

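Background and challenges
Background overview
THUMedInfo/RareArena is a comprehensive dataset focused on rare disease diagnosis. It contains data on nearly 50,000 patients and covers more than 4,000 diseases. The dataset is built from PMC case reports, uses GPT-4 for data processing, is suited to question-answering tasks, and is released under the CC BY-NC-SA 4.0 license.
The content above was collected and summarized by 遇见数据集.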



