lmms-lab/VisitBench

Name: lmms-lab/VisitBench
Creator: lmms-lab
Published: 2024-03-08 05:15:32
License: No description available

Source: Hugging Face · Updated: 2024-03-08 · Indexed: 2024-06-22

Download link:

https://hf-mirror.com/datasets/lmms-lab/VisitBench


Resource description:

---
dataset_info:
  features:
  - name: instruction_category
    dtype: string
  - name: instruction
    dtype: string
  - name: reference_output
    dtype: string
  - name: is_multiple_images
    dtype: bool
  - name: image_0
    dtype: image
  - name: image_1
    dtype: image
  - name: image_2
    dtype: image
  - name: image_3
    dtype: image
  - name: image_4
    dtype: image
  - name: image_5
    dtype: image
  - name: image_6
    dtype: image
  - name: image_7
    dtype: image
  - name: image_8
    dtype: image
  - name: image_9
    dtype: image
  - name: image_info
    dtype: string
  - name: human_ratings_gpt4_correct
    dtype: bool
  - name: human_ratings_problem_in_caption
    dtype: bool
  - name: human_ratings_problem_in_gpt4
    dtype: bool
  - name: public_images_metadata
    dtype: string
  splits:
  - name: multi_images
    num_bytes: 408530373.0
    num_examples: 678
  - name: single_image
    num_bytes: 408530373.0
    num_examples: 678
  download_size: 813204656
  dataset_size: 817060746.0
configs:
- config_name: default
  data_files:
  - split: multi_images
    path: data/multi_images-*
  - split: single_image
    path: data/single_image-*
---

# Dataset Card for "VisitBench"

<p align="center" width="100%">
<img src="https://i.postimg.cc/g0QRgMVv/WX20240228-113337-2x.png" width="100%" height="80%">
</p>

# Large-scale Multi-modality Models Evaluation Suite

> Accelerating the development of large-scale multi-modality models (LMMs) with `lmms-eval`

🏠 [Homepage](https://lmms-lab.github.io/) | 📚 [Documentation](docs/README.md) | 🤗 [Huggingface Datasets](https://huggingface.co/lmms-lab)

# This Dataset

This is a formatted version of [VisIT-Bench](https://visit-bench.github.io/). It is used in our `lmms-eval` pipeline to allow for one-click evaluations of large multi-modality models.
```
@article{bitton2023visit,
  title={Visit-bench: A benchmark for vision-language instruction following inspired by real-world use},
  author={Bitton, Yonatan and Bansal, Hritik and Hessel, Jack and Shao, Rulin and Zhu, Wanrong and Awadalla, Anas and Gardner, Josh and Taori, Rohan and Schmidt, Ludwig},
  journal={arXiv preprint arXiv:2308.06595},
  year={2023}
}
```

The dataset includes `visit_bench_single.csv` and `visit_bench_multi.csv`, 1.2k items in total. Some items come with a `reference_output`, copied directly from [here](https://docs.google.com/spreadsheets/d/1hi8rGXf2WYufkFvGJ2MZ92JNChliM1QEJwZxNboUFlE/edit#gid=696111549).

For each split, please follow the steps below to submit to VisitBench.

## Leaderboard

The link to our public leaderboard is [here](https://visit-bench.github.io/).

## How to add new models to the Leaderboard?

1. You can access the single-image and multiple-image datasets above.
2. For every instance (row) in the dataset csv, you would have your model's predictions.
3. Create a `predictions.csv` with 4 mandatory columns: `instruction`, `instruction_category`, `image` (single-image case) / `images` (multi-image case), and `<model name> prediction`. Here, `<model name>` should be your model name, with a version if multiple versions are available.
4. Send the `predictions.csv` to us at `yonatanbitton1@gmail.com`.
5. We will use our internal prompting sandbox with reference-free GPT-4 as an evaluator.
6. We will add your model to the leaderboard once we receive all the pairwise judgments from the sandbox.
7. You will receive a confirmation email as soon as your model has been added to the leaderboard.
8. The estimated time for Steps 4-7 is 1-2 weeks; however, we will try to work on your prediction files as soon as they are sent.

Please include in your email 1) a name for your model, 2) your team name (including your affiliation), and optionally, 3) a github repo or paper link.
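Step 3 of the submission procedure can be sketched with Python's standard `csv` module. This is a minimal illustration, not the maintainers' tooling: the model name `my-model-v1`, the helper names, and the example row are hypothetical, and the card does not say what the `image` / `images` cell should contain, so a placeholder string stands in for it.

```python
import csv
import io

MODEL_NAME = "my-model-v1"  # hypothetical model name, including a version


def prediction_columns(model_name, multi_image=False):
    """Return the 4 mandatory column names listed in step 3."""
    image_column = "images" if multi_image else "image"
    return [
        "instruction",
        "instruction_category",
        image_column,
        f"{model_name} prediction",
    ]


def write_predictions(rows, out, model_name, multi_image=False):
    """Write rows (dicts keyed by the mandatory columns) as CSV to `out`."""
    fieldnames = prediction_columns(model_name, multi_image)
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        # Keep only the mandatory columns, in the documented order.
        writer.writerow({name: row[name] for name in fieldnames})
    return fieldnames


# Hypothetical single-image row; the image value is a placeholder because
# its expected format is an assumption, not something the card specifies.
example_rows = [
    {
        "instruction": "Describe the picture.",
        "instruction_category": "example_category",
        "image": "<image reference>",
        f"{MODEL_NAME} prediction": "A dog on a beach.",
    }
]

buffer = io.StringIO()
write_predictions(example_rows, buffer, MODEL_NAME)
print(buffer.getvalue())
```

To produce the actual file, pass an open file handle instead of the in-memory buffer, for example `open("predictions.csv", "w", newline="", encoding="utf-8")`.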
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

Provider:

lmms-lab

Summary of original information

Dataset overview

Dataset information

Features:
- instruction_category: string
- instruction: string
- reference_output: string
- is_multiple_images: bool
- image_0 to image_9: image
- image_info: string
- human_ratings_gpt4_correct: bool
- human_ratings_problem_in_caption: bool
- human_ratings_problem_in_gpt4: bool
- public_images_metadata: string
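
Because each row spreads its images over the ten columns `image_0` to `image_9`, a consumer usually needs to gather the ones that are actually filled in. The sketch below assumes that unused slots are `None`, which this listing does not state; the helper name `collect_images` and the example row are hypothetical, and plain strings stand in for decoded images.

```python
MAX_IMAGES = 10  # columns image_0 .. image_9


def collect_images(row):
    """Return the populated images of a row, in column order.

    Assumes unused image slots are None (an assumption, not confirmed
    by the dataset card).
    """
    images = []
    for index in range(MAX_IMAGES):
        image = row.get(f"image_{index}")
        if image is not None:
            images.append(image)
    return images


# Hypothetical row; strings stand in for the decoded image objects.
example_row = {
    "instruction": "Compare the two pictures.",
    "is_multiple_images": True,
    "image_0": "first-image",
    "image_1": "second-image",
    "image_2": None,
}

print(collect_images(example_row))
```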
Splits:
- multi_images:
  - Bytes: 408530373.0
  - Examples: 678
- single_image:
  - Bytes: 408530373.0
  - Examples: 678
Dataset size:
- Download size: 813204656
- Dataset size: 817060746.0
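
As a quick consistency check on the figures above, the per-split byte counts should add up to the reported dataset size. The snippet only re-computes numbers taken from this listing; the variable names are illustrative.

```python
# Figures copied from the split and size metadata listed above.
splits = {
    "multi_images": {"num_bytes": 408530373, "num_examples": 678},
    "single_image": {"num_bytes": 408530373, "num_examples": 678},
}
dataset_size = 817060746
download_size = 813204656

total_bytes = sum(split["num_bytes"] for split in splits.values())
total_examples = sum(split["num_examples"] for split in splits.values())

# The two split sizes add up exactly to the reported dataset size.
assert total_bytes == dataset_size

print(f"total examples listed: {total_examples}")
print(f"dataset size: {dataset_size / 2**20:.1f} MiB")
print(f"download size: {download_size / 2**20:.1f} MiB")
```

Both splits are listed with identical byte and example counts, and the summed example count (1,356) is higher than the "1.2k items" mentioned in the card text; the listing does not explain the difference.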

Configuration

Default config:
- Data files:
  - multi_images: data/multi_images-*
  - single_image: data/single_image-*
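
The default config maps each split name to a shard pattern. The sketch below shows one way to reason about that mapping and to load a single split with the Hugging Face `datasets` library. It uses `fnmatch` only as an approximation of the Hub's pattern matching, the shard filename in the example is hypothetical, and `load_split` needs `datasets` installed plus network access, so it is defined here but not called.

```python
from fnmatch import fnmatch

# Split name -> shard pattern, as listed in the default config.
DATA_FILES = {
    "multi_images": "data/multi_images-*",
    "single_image": "data/single_image-*",
}


def split_for_file(path):
    """Return the split whose pattern matches `path`, or None."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


def load_split(split):
    """Load one split from the Hub (requires `pip install datasets`)."""
    if split not in DATA_FILES:
        raise ValueError(f"unknown split: {split!r}")
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset("lmms-lab/VisitBench", split=split)


# Hypothetical shard filename, used only to illustrate the pattern.
print(split_for_file("data/single_image-00000-of-00002.parquet"))
```

Calling `load_split("single_image")` would download the shards for that split, so it is best run once and cached.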
