
lmms-lab/VisitBench

Source: Hugging Face. Updated: 2024-03-08. Indexed: 2024-06-22.
Download link: https://hf-mirror.com/datasets/lmms-lab/VisitBench
Resource description:
---
dataset_info:
  features:
  - name: instruction_category
    dtype: string
  - name: instruction
    dtype: string
  - name: reference_output
    dtype: string
  - name: is_multiple_images
    dtype: bool
  - name: image_0
    dtype: image
  - name: image_1
    dtype: image
  - name: image_2
    dtype: image
  - name: image_3
    dtype: image
  - name: image_4
    dtype: image
  - name: image_5
    dtype: image
  - name: image_6
    dtype: image
  - name: image_7
    dtype: image
  - name: image_8
    dtype: image
  - name: image_9
    dtype: image
  - name: image_info
    dtype: string
  - name: human_ratings_gpt4_correct
    dtype: bool
  - name: human_ratings_problem_in_caption
    dtype: bool
  - name: human_ratings_problem_in_gpt4
    dtype: bool
  - name: public_images_metadata
    dtype: string
  splits:
  - name: multi_images
    num_bytes: 408530373.0
    num_examples: 678
  - name: single_image
    num_bytes: 408530373.0
    num_examples: 678
  download_size: 813204656
  dataset_size: 817060746.0
configs:
- config_name: default
  data_files:
  - split: multi_images
    path: data/multi_images-*
  - split: single_image
    path: data/single_image-*
---

# Dataset Card for "VisitBench"

<p align="center" width="100%">
<img src="https://i.postimg.cc/g0QRgMVv/WX20240228-113337-2x.png" width="100%" height="80%">
</p>

# Large-scale Multi-modality Models Evaluation Suite

> Accelerating the development of large-scale multi-modality models (LMMs) with `lmms-eval`

🏠 [Homepage](https://lmms-lab.github.io/) | 📚 [Documentation](docs/README.md) | 🤗 [Huggingface Datasets](https://huggingface.co/lmms-lab)

# This Dataset

This is a formatted version of [VisitBench](https://visit-bench.github.io/). It is used in our `lmms-eval` pipeline to allow for one-click evaluations of large multi-modality models.

```
@article{bitton2023visit,
  title={Visit-bench: A benchmark for vision-language instruction following inspired by real-world use},
  author={Bitton, Yonatan and Bansal, Hritik and Hessel, Jack and Shao, Rulin and Zhu, Wanrong and Awadalla, Anas and Gardner, Josh and Taori, Rohan and Schmidt, Ludwig},
  journal={arXiv preprint arXiv:2308.06595},
  year={2023}
}
```

The dataset includes `visit_bench_single.csv` and `visit_bench_multi.csv`, about 1.2k items in total. Some items come with a `reference_output`, copied directly from [here](https://docs.google.com/spreadsheets/d/1hi8rGXf2WYufkFvGJ2MZ92JNChliM1QEJwZxNboUFlE/edit#gid=696111549).

For each split, please follow the steps below to submit to VisitBench.

## Leaderboard

Our public leaderboard is available [here](https://visit-bench.github.io/).

## How to add new models to the Leaderboard?

1. Access the single-image and multiple-image datasets above.
2. Generate your model's prediction for every instance (row) in the dataset CSV.
3. Create a `predictions.csv` with 4 mandatory columns: `instruction`, `instruction_category`, `image` (single-image case) / `images` (multi-image case), and `<model name> prediction`. Here, `<model name>` should be your model name, including the version if multiple versions are available.
4. Send the `predictions.csv` to us at `yonatanbitton1@gmail.com`.
5. We will use our internal prompting sandbox with reference-free GPT-4 as an evaluator.
6. We will add your model to the leaderboard once we receive all the pairwise judgments from the sandbox.
7. You will receive a confirmation email as soon as your model has been added to the leaderboard.
8. Steps 4 through 7 are estimated to take 1-2 weeks; however, we will try to process your prediction files as soon as they are sent.

Please include in your email 1) a name for your model, 2) your team name (including your affiliation), and optionally, 3) a GitHub repo or paper link.

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
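Step 3 of the submission process describes a CSV with four mandatory columns. The following is a minimal sketch of how such a file might be written with Python's standard `csv` module. The function name `write_predictions`, the input keys `image_ref` and `prediction`, and the example model name are hypothetical, chosen for illustration only; the card does not specify how the image column should be populated, so check the benchmark CSVs for the expected values.

```python
import csv
from pathlib import Path


def write_predictions(rows, model_name, path, multi_image=False):
    """Write a predictions CSV with the four mandatory columns from step 3.

    rows: iterable of dicts with keys "instruction", "instruction_category",
          "image_ref" and "prediction". The last two key names are
          hypothetical: "image_ref" stands for whatever identifies the
          image(s) in the benchmark CSV, "prediction" for the model output.
    """
    # The card names the column `image` for single-image and `images` for multi-image.
    image_column = "images" if multi_image else "image"
    prediction_column = f"{model_name} prediction"
    fieldnames = ["instruction", "instruction_category", image_column, prediction_column]

    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            writer.writerow({
                "instruction": row["instruction"],
                "instruction_category": row["instruction_category"],
                image_column: row["image_ref"],
                prediction_column: row["prediction"],
            })
    return fieldnames
```

Calling `write_predictions(rows, "my-model-v1", "predictions.csv")` would produce a header ending in `my-model-v1 prediction`; pass `multi_image=True` for the multi-image split so the third column is named `images`.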
Provider: lmms-lab
Summary of original information

Dataset overview

Dataset information

  • Features

    • instruction_category: string
    • instruction: string
    • reference_output: string
    • is_multiple_images: bool
    • image_0 through image_9: image
    • image_info: string
    • human_ratings_gpt4_correct: bool
    • human_ratings_problem_in_caption: bool
    • human_ratings_problem_in_gpt4: bool
    • public_images_metadata: string
  • Splits

    • multi_images
      • Bytes: 408530373.0
      • Examples: 678
    • single_image
      • Bytes: 408530373.0
      • Examples: 678
  • Dataset size

    • Download size: 813204656
    • Dataset size: 817060746.0

Configuration

  • Default configuration
    • Data files:
      • multi_images: data/multi_images-*
      • single_image: data/single_image-*
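
Because each row exposes ten fixed image columns (`image_0` through `image_9`), a consumer usually needs to gather only the populated ones. The sketch below shows one way to do that, together with a loader for either split. It assumes that unused image slots are `None`, which the card does not state; the helper names `collect_images` and `load_split` are hypothetical. `load_split` needs the `datasets` library and network access.

```python
IMAGE_COLUMNS = [f"image_{i}" for i in range(10)]  # image_0 ... image_9


def collect_images(row):
    """Return the populated image slots of one row, in column order.

    Assumption: empty slots are stored as None (not confirmed by the card).
    """
    return [row[name] for name in IMAGE_COLUMNS if row.get(name) is not None]


def load_split(split="single_image"):
    """Load one split ("single_image" or "multi_images") from the Hugging Face Hub."""
    # Imported lazily so collect_images stays usable without `datasets` installed.
    from datasets import load_dataset
    return load_dataset("lmms-lab/VisitBench", split=split)
```

A typical use would be `images = collect_images(load_split("multi_images")[0])`, after which `images` holds only the images that accompany that instruction.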