
uclanlp/TemMed-Bench

Source: Hugging Face. Updated 2025-10-12; indexed 2026-01-03.
Download link:
https://hf-mirror.com/datasets/uclanlp/TemMed-Bench
Description:
---
language:
- en
license: cc-by-4.0
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
- multiple-choice
- text-generation
pretty_name: TemMed-Bench
configs:
- config_name: Image Pair Selection
  data_files:
  - split: test
    path: TestSet_ImagePairSelection.json
- config_name: VQA & Report Generation
  data_files:
  - split: test
    path: TestSet_VQA_ReportGeneration.json
- config_name: VQA_Selected_2000
  data_files:
  - split: test
    path: TestSet_SelectedVQA_2000.json
- config_name: TrainSet KnowledgeCorpus
  data_files:
  - split: train
    path: TrainSet_KnowledgeCorpus.json
---

# TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models

[**🌐 Homepage**](https://temmedbench.github.io/) | [**🐱 Github**](https://github.com/Levi-ZJY/TemMed-Bench) | [**📖 Paper**](https://arxiv.org/abs/2509.25143)

## Intro

<img src="./misc/Teaser_Figure.png" width="750" />

TemMed-Bench features three primary highlights.

- **Temporal reasoning focus:** Each sample in TemMed-Bench includes historical condition information, which challenges models to analyze changes in patient conditions over time.
- **Multi-image input:** Each sample in TemMed-Bench contains multiple images from different visits as input, emphasizing the need for models to process and reason over multiple images.
- **Diverse task suite:** TemMed-Bench comprises three tasks, including VQA, report generation, and image-pair selection. Additionally, TemMed-Bench includes a knowledge corpus with more than 17,000 instances to support retrieval-augmented generation (RAG).

## Benchmark Overview

- **Examples of the three tasks in TemMed-Bench:**

<img src="./misc/Task_Figure.png" width="700" />

- **Key statistics of TemMed-Bench:**

<img src="./misc/Data_Amount.png" width="330" />

## Load Dataset

Please refer to [**🐱 Github**](https://github.com/Levi-ZJY/TemMed-Bench)

## Contact

* Junyi Zhang: JunyiZhang2002@g.ucla.edu

## Citation

```
@misc{zhang2025temmedbenchevaluatingtemporalmedical,
      title={TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models},
      author={Junyi Zhang and Jia-Chen Gu and Wenbo Hu and Yu Zhou and Robinson Piramuthu and Nanyun Peng},
      year={2025},
      eprint={2509.25143},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.25143},
}
```
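The Load Dataset section defers to the GitHub repository for the official loading code. As a rough orientation only, the sketch below maps each config name from the card's front matter to its JSON file and builds a download URL for it. The helper names `file_url` and `load_local` are hypothetical, the `main` branch and the standard Hugging Face `resolve` URL layout are assumptions, and nothing is assumed about the fields inside each JSON record.

```python
import json
from pathlib import Path
from urllib.parse import quote

REPO_ID = "uclanlp/TemMed-Bench"

# Config name -> (split, file name), copied from the dataset card's front matter.
CONFIGS = {
    "Image Pair Selection": ("test", "TestSet_ImagePairSelection.json"),
    "VQA & Report Generation": ("test", "TestSet_VQA_ReportGeneration.json"),
    "VQA_Selected_2000": ("test", "TestSet_SelectedVQA_2000.json"),
    "TrainSet KnowledgeCorpus": ("train", "TrainSet_KnowledgeCorpus.json"),
}


def file_url(config_name, base="https://huggingface.co"):
    """Build a download URL for a config's JSON file.

    Assumption: the file lives on the 'main' branch and the host follows
    the usual /datasets/<repo>/resolve/<revision>/<file> layout.
    """
    _, filename = CONFIGS[config_name]
    return f"{base}/datasets/{REPO_ID}/resolve/main/{quote(filename)}"


def load_local(config_name, data_dir="."):
    """Parse an already-downloaded JSON file for the given config.

    The record schema is not documented on the card, so the parsed
    object is returned as-is.
    """
    _, filename = CONFIGS[config_name]
    with open(Path(data_dir) / filename, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # List every config with its split and the URL it would be fetched from.
    for name, (split, _) in CONFIGS.items():
        print(f"{name} [{split}] -> {file_url(name)}")
```

Once a file has been downloaded, `load_local("VQA_Selected_2000", data_dir=...)` returns the parsed JSON. For the image inputs and the evaluation scripts, the GitHub repository remains the authoritative reference.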
Provider:
uclanlp