lmms-lab/VizWiz-Caps

Name: lmms-lab/VizWiz-Caps
Creator: lmms-lab
Published: 2024-03-08 05:00:14
License: No description available

Hugging Face: updated 2024-03-08, indexed 2024-06-22

Download link:

https://hf-mirror.com/datasets/lmms-lab/VizWiz-Caps

Resource description:

---
dataset_info:
  features:
  - name: image_id
    dtype: string
  - name: image
    dtype: image
  - name: caption
    dtype: string
  - name: text_detected
    dtype: bool
  splits:
  - name: val
    num_bytes: 3496788418.0
    num_examples: 7750
  - name: test
    num_bytes: 3981888752.0
    num_examples: 8000
  download_size: 7445881828
  dataset_size: 7478677170.0
---

# Dataset Card for "VizWiz-Caps"

<p align="center" width="100%">
<img src="https://i.postimg.cc/g0QRgMVv/WX20240228-113337-2x.png" width="100%" height="80%">
</p>

# Large-scale Multi-modality Models Evaluation Suite

> Accelerating the development of large-scale multi-modality models (LMMs) with `lmms-eval`

🏠 [Homepage](https://lmms-lab.github.io/) | 📚 [Documentation](docs/README.md) | 🤗 [Huggingface Datasets](https://huggingface.co/lmms-lab)

# This Dataset

This is a formatted version of [VizWiz-Caps](https://arxiv.org/abs/2002.08565v2). It is used in our `lmms-eval` pipeline to allow for one-click evaluations of large multi-modality models.

```
@inproceedings{gurari2020captioning,
  title={Captioning images taken by people who are blind},
  author={Gurari, Danna and Zhao, Yinan and Zhang, Meng and Bhattacharya, Nilavra},
  booktitle={Computer Vision--ECCV 2020: 16th European Conference, Glasgow, UK, August 23--28, 2020, Proceedings, Part XVII 16},
  pages={417--434},
  year={2020},
  organization={Springer}
}
```
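As a rough illustration of how the features and splits listed in the card's metadata might be used outside `lmms-eval`, the sketch below loads one split with the Hugging Face `datasets` library and checks a record against the listed schema. The helper names `check_record` and `load_split` are hypothetical and not part of the dataset or of `lmms-eval`; the field types follow the card's metadata as written, and streaming is an assumed choice to avoid downloading the full archive (about 7.4 GB).

```python
from typing import Any, Dict

# Column names and Python-side types, taken from the card's metadata.
# The `image` column is decoded by the `datasets` library, so it is only
# checked for presence here (assumption: no specific image class required).
EXPECTED_FEATURES = {"image_id": str, "caption": str, "text_detected": bool}

# Split sizes as listed in the card's metadata.
SPLIT_SIZES = {"val": 7750, "test": 8000}


def check_record(record: Dict[str, Any]) -> bool:
    """Return True if a record has every column listed in the card (hypothetical helper)."""
    if "image" not in record:
        return False
    return all(
        isinstance(record.get(name), expected)
        for name, expected in EXPECTED_FEATURES.items()
    )


def load_split(split: str = "val", streaming: bool = True):
    """Load one split of lmms-lab/VizWiz-Caps (hypothetical helper)."""
    if split not in SPLIT_SIZES:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(SPLIT_SIZES)}")
    # Imported lazily so the schema helpers work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("lmms-lab/VizWiz-Caps", split=split, streaming=streaming)
```

With `datasets` installed and network access available, `next(iter(load_split("val")))` should yield a single record that can be passed to `check_record`. If the check fails, the published schema may differ from the metadata reproduced here.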

Provided by:

lmms-lab

Original information summary