uclanlp/TemMed-Bench

Name: uclanlp/TemMed-Bench
Creator: uclanlp
Published: 2025-10-12 11:58:40
License: cc-by-4.0

Source: Hugging Face. Updated 2025-10-12; indexed 2026-01-03.

Download link:

https://hf-mirror.com/datasets/uclanlp/TemMed-Bench

Description:

---
language:
- en
license: cc-by-4.0
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
- multiple-choice
- text-generation
pretty_name: TemMed-Bench
configs:
- config_name: Image Pair Selection
  data_files:
  - split: test
    path: TestSet_ImagePairSelection.json
- config_name: VQA & Report Generation
  data_files:
  - split: test
    path: TestSet_VQA_ReportGeneration.json
- config_name: VQA_Selected_2000
  data_files:
  - split: test
    path: TestSet_SelectedVQA_2000.json
- config_name: TrainSet KnowledgeCorpus
  data_files:
  - split: train
    path: TrainSet_KnowledgeCorpus.json
---

# TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models

[**🌐 Homepage**](https://temmedbench.github.io/) | [**🐱 Github**](https://github.com/Levi-ZJY/TemMed-Bench) | [**📖 Paper**](https://arxiv.org/abs/2509.25143)

## Intro

<img src="./misc/Teaser_Figure.png" width="750" />

TemMed-Bench features three primary highlights.

- **Temporal reasoning focus:** Each sample in TemMed-Bench includes historical condition information, which challenges models to analyze changes in patient conditions over time.
- **Multi-image input:** Each sample in TemMed-Bench contains multiple images from different visits as input, emphasizing the need for models to process and reason over multiple images.
- **Diverse task suite:** TemMed-Bench comprises three tasks, including VQA, report generation, and image-pair selection. Additionally, TemMed-Bench includes a knowledge corpus with more than 17,000 instances to support retrieval-augmented generation (RAG).

## Benchmark Overview

- **Examples of the three tasks in TemMed-Bench:**

<img src="./misc/Task_Figure.png" width="700" />

- **Key statistics of TemMed-Bench:**

<img src="./misc/Data_Amount.png" width="330" />

## Load Dataset

Please refer to [**🐱 Github**](https://github.com/Levi-ZJY/TemMed-Bench)

## Contact

* Junyi Zhang: JunyiZhang2002@g.ucla.edu

## Citation

```
@misc{zhang2025temmedbenchevaluatingtemporalmedical,
      title={TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models},
      author={Junyi Zhang and Jia-Chen Gu and Wenbo Hu and Yu Zhou and Robinson Piramuthu and Nanyun Peng},
      year={2025},
      eprint={2509.25143},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.25143},
}
```
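The card's metadata declares four configs, each backed by one JSON file. As a rough illustration of how those names line up with files, the sketch below copies that mapping and builds a download URL for each config. It is not the authors' loader (the GitHub repository remains the reference). The helper names `data_file_url` and `load_records` are hypothetical, the `main` revision and the Hub's `resolve` URL layout are assumptions, and the record schema inside each JSON file is not described in the card, so the code does not assume any field names.

```python
import json
from pathlib import Path

REPO_ID = "uclanlp/TemMed-Bench"

# Config name -> (split, file name), copied from the card's YAML metadata.
CONFIGS = {
    "Image Pair Selection": ("test", "TestSet_ImagePairSelection.json"),
    "VQA & Report Generation": ("test", "TestSet_VQA_ReportGeneration.json"),
    "VQA_Selected_2000": ("test", "TestSet_SelectedVQA_2000.json"),
    "TrainSet KnowledgeCorpus": ("train", "TrainSet_KnowledgeCorpus.json"),
}


def data_file_url(config_name, revision="main"):
    """Build a download URL for a config's JSON file.

    Assumes the file sits at the repository root (as the metadata paths
    suggest) and that the default branch is named "main".
    """
    _split, filename = CONFIGS[config_name]
    return f"https://huggingface.co/datasets/{REPO_ID}/resolve/{revision}/{filename}"


def load_records(path):
    """Parse an already-downloaded JSON file and return whatever it contains.

    The top-level structure (list or dict) is not documented in the card,
    so the parsed object is returned unchanged.
    """
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    for name, (split, _filename) in CONFIGS.items():
        print(f"{name} [{split}]: {data_file_url(name)}")
```

If the `datasets` library is installed, `load_dataset("uclanlp/TemMed-Bench", "VQA_Selected_2000", split="test")` would be the usual way to request the same config, though that call has not been verified against this repository here.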

Provided by:

uclanlp
