
Responses of seven tested models for real clinical questions

Indexed from NIAID Data Ecosystem on 2026-05-10
Download link:
https://figshare.com/articles/dataset/Responses_of_seven_tested_models_for_real_clinical_questions/30889556
Description:
This dataset contains 14 files, one for each of 14 question groups. Two groups belong to the AE (accident and emergency) specialty, and the other 12 each belong to a separate specialty (from cardiology to rare diseases). Each file (in JSON format) contains dozens of items, each holding the responses to one specific clinical question. Each question has a unique ID (QID) and is one of two question types, OSCE or realCases. The "modelResponses" field is typically a dictionary with seven keys pointing to seven responses. Each key consists of six random letters followed by a short suffix (three to five characters in the list below) and represents one of the seven models tested. The key suffixes identify the model that produced each response:

CG4L: ChatGPT-4oL (first released May 13, 2024)
QM212: QwenMAX (release date [API availability]: January 29, 2025)
RDS3: DS32B (DeepSeek-R1-Distill-Qwen-32B, released January 20, 2025)
G2FT: Gemini2F (gemini-2.0-flash-thinking-exp-01-21, released January 21, 2025)
Q3B: Qwen32B (Qwen3 32B or QwQ-32B, March 6, 2025)
SR67: DS-r1 671B (released January 20, 2025)
R17: DeS70B (DeepSeek-R1-Distill-Llama-70B, January 20, 2025)

The file hallu-sept-4-2025.xlsx contains the 40 alleged hallucinations.
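The key format described above can be decoded with a small suffix lookup. The following is a minimal sketch, not the dataset authors' tooling: the suffix table is copied from the description, while the helper names `model_for_key` and `label_responses`, the example key prefixes, and the response texts are assumptions made up for illustration. Only the "modelResponses" field name comes from the description; the exact layout of the rest of each JSON item is not specified there and is not assumed here.

```python
# Suffix-to-model table copied from the dataset description.
SUFFIX_TO_MODEL = {
    "CG4L": "ChatGPT-4oL",
    "QM212": "QwenMAX",
    "RDS3": "DS32B",
    "G2FT": "Gemini2F",
    "Q3B": "Qwen32B",
    "SR67": "DS-r1 671B",
    "R17": "DeS70B",
}


def model_for_key(key):
    """Return the model name whose suffix ends the key, or None if none matches."""
    # Try longer suffixes first so a short suffix never shadows a longer one.
    for suffix in sorted(SUFFIX_TO_MODEL, key=len, reverse=True):
        if key.endswith(suffix):
            return SUFFIX_TO_MODEL[suffix]
    return None


def label_responses(model_responses):
    """Re-key a modelResponses dictionary by model name; unmatched keys are kept."""
    return {model_for_key(k) or k: v for k, v in model_responses.items()}


# Hypothetical item: the key prefixes and response texts are invented.
example_item = {
    "modelResponses": {
        "abcdefCG4L": "response text A",
        "ghijklR17": "response text B",
    }
}
labeled = label_responses(example_item["modelResponses"])
```

Matching on the end of the key rather than slicing off the first six characters keeps the lookup working even if a key's random prefix turns out to have a different length than described.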
Created:
2025-12-16