AIFEG/BenchLMM

Name: AIFEG/BenchLMM
Creator: AIFEG
Published: 2023-12-06 18:02:22
License: Apache-2.0

Source: Hugging Face. Updated 2023-12-06. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/AIFEG/BenchLMM

Resource description:

---
license: apache-2.0
task_categories:
- visual-question-answering
language:
- en
pretty_name: BenchLMM
size_categories:
- n<1K
---

# Dataset Card for BenchLMM

BenchLMM is a benchmarking dataset focusing on the cross-style visual capability of large multimodal models. It evaluates these models' performance in various visual contexts.

## Dataset Details

### Dataset Description

- **Curated by:** Rizhao Cai, Zirui Song, Dayan Guan, Zhenhao Chen, Xing Luo, Chenyu Yi, and Alex Kot.
- **Funded by:** Supported in part by the Rapid-Rich Object Search (ROSE) Lab of Nanyang Technological University and the NTU-PKU Joint Research Institute.
- **Shared by:** AIFEG.
- **Language(s) (NLP):** English.
- **License:** Apache-2.0.

### Dataset Sources

- **Repository:** [GitHub - AIFEG/BenchLMM](https://github.com/AIFEG/BenchLMM)
- **Paper:** Cai, R., Song, Z., Guan, D., et al. (2023). BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models. arXiv:2312.02896.

## Uses

### Direct Use

The dataset can be used to benchmark large multimodal models, especially focusing on their capability to interpret and respond to different visual styles.

## Dataset Structure

- **Directory Structure:**
  - `baseline/`: Baseline code for LLaVA and InstructBLIP.
  - `evaluate/`: Python code for model evaluation.
  - `evaluate_results/`: Evaluation results of baseline models.
  - `jsonl/`: JSONL files with questions, image locations, and answers.

## Dataset Creation

### Curation Rationale

Developed to assess large multimodal models' performance in diverse visual contexts, helping to understand their capabilities and limitations.

### Source Data

#### Data Collection and Processing

The dataset consists of various visual questions and corresponding answers, structured to evaluate multimodal model performance.

## Bias, Risks, and Limitations

Users should consider the specific visual contexts and question types included in the dataset when interpreting model performance.
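
The `jsonl/` directory listed under Dataset Structure holds questions, image locations, and answers. The card does not document the field names, so the following is a minimal Python sketch, assuming only standard JSON Lines formatting (one JSON object per line), that reads a file and tallies the keys it actually contains. The helper names `read_jsonl` and `summarize_fields` are illustrative and are not part of the BenchLMM repository.

```python
import json


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as error:
                # Report the offending line so a malformed record is easy to find.
                raise ValueError(f"{path}:{line_number}: invalid JSON") from error


def summarize_fields(records):
    """Count how often each top-level key appears across the records.

    The key names are not assumed in advance; this only reports
    whatever keys the file actually uses.
    """
    counts = {}
    for record in records:
        for key in record:
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Calling `summarize_fields(read_jsonl(path))` on one of the files in `jsonl/` would show which keys carry the question, image location, and answer before any evaluation code relies on them.
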
## Citation

**BibTeX:**

    @misc{cai2023benchlmm,
      title={BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models},
      author={Rizhao Cai and Zirui Song and Dayan Guan and Zhenhao Chen and Xing Luo and Chenyu Yi and Alex Kot},
      year={2023},
      eprint={2312.02896},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
    }

**APA:**

Cai, R., Song, Z., Guan, D., Chen, Z., Luo, X., Yi, C., & Kot, A. (2023). BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models. arXiv preprint arXiv:2312.02896.

## Acknowledgements

This research is supported in part by the Rapid-Rich Object Search (ROSE) Lab of Nanyang Technological University and the NTU-PKU Joint Research Institute.

Provided by:

AIFEG

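Summary of original information

Dataset overview

Name: BenchLMM

Purpose: Evaluating the cross-style visual capability of large multimodal models.

Function: Evaluating model performance across a variety of visual contexts.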
