
lmms-lab/VizWiz-Caps

Source: Hugging Face | Updated: 2024-03-08 | Indexed: 2024-06-22
Download link:
https://hf-mirror.com/datasets/lmms-lab/VizWiz-Caps
Resource description:
---
dataset_info:
  features:
  - name: image_id
    dtype: string
  - name: image
    dtype: image
  - name: caption
    dtype: string
  - name: text_detected
    dtype: bool
  splits:
  - name: val
    num_bytes: 3496788418.0
    num_examples: 7750
  - name: test
    num_bytes: 3981888752.0
    num_examples: 8000
  download_size: 7445881828
  dataset_size: 7478677170.0
---

# Dataset Card for "VizWiz-Caps"

<p align="center" width="100%">
<img src="https://i.postimg.cc/g0QRgMVv/WX20240228-113337-2x.png" width="100%" height="80%">
</p>

# Large-scale Multi-modality Models Evaluation Suite

> Accelerating the development of large-scale multi-modality models (LMMs) with `lmms-eval`

🏠 [Homepage](https://lmms-lab.github.io/) | 📚 [Documentation](docs/README.md) | 🤗 [Huggingface Datasets](https://huggingface.co/lmms-lab)

# This Dataset

This is a formatted version of [VizWiz-Caps](https://arxiv.org/abs/2002.08565v2). It is used in our `lmms-eval` pipeline to allow for one-click evaluations of large multi-modality models.

```
@inproceedings{gurari2020captioning,
  title={Captioning images taken by people who are blind},
  author={Gurari, Danna and Zhao, Yinan and Zhang, Meng and Bhattacharya, Nilavra},
  booktitle={Computer Vision--ECCV 2020: 16th European Conference, Glasgow, UK, August 23--28, 2020, Proceedings, Part XVII 16},
  pages={417--434},
  year={2020},
  organization={Springer}
}
```
Provided by:
lmms-lab
Summary of original information

Dataset overview

Dataset information

Features

  • image_id: string
  • image: image
  • caption: string
  • text_detected: boolean
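
As a minimal sketch of how these four fields might be consumed, the snippet below assumes the Hugging Face `datasets` library is installed and that the repository ID `lmms-lab/VizWiz-Caps` from the download link resolves on the Hub. The helper names `load_split` and `summarize_example` are hypothetical, and the check treats `caption` as a single string, as the feature list states; adjust it if the records you receive differ.

```python
from typing import Any, Dict, Iterable

# Field names and expected Python types, taken from the feature list above.
# The `image` field is decoded by `datasets` itself, so only its presence is checked.
EXPECTED_FIELDS = {"image_id": str, "caption": str, "text_detected": bool}


def load_split(split: str = "val") -> Iterable[Dict[str, Any]]:
    """Stream one split ("val" or "test") instead of downloading it up front."""
    from datasets import load_dataset  # third-party; imported lazily

    return load_dataset("lmms-lab/VizWiz-Caps", split=split, streaming=True)


def summarize_example(example: Dict[str, Any]) -> str:
    """Validate the non-image fields and return a one-line summary (hypothetical helper)."""
    for name, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(example.get(name), expected_type):
            raise TypeError(f"{name!r} should be {expected_type.__name__}")
    if "image" not in example:
        raise KeyError("image")
    flag = "text" if example["text_detected"] else "no text"
    return f"{example['image_id']} [{flag}]: {example['caption']}"


if __name__ == "__main__":
    # Requires network access; summarizes the first validation example.
    first = next(iter(load_split("val")))
    print(summarize_example(first))
```

Streaming is used here because the full download is several gigabytes; dropping `streaming=True` would fetch and cache the whole split first.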

Data splits

  • val:
    • Bytes: 3496788418.0
    • Examples: 7750
  • test:
    • Bytes: 3981888752.0
    • Examples: 8000

Data size

  • Download size: 7445881828
  • Dataset size: 7478677170.0
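
The raw byte counts are easier to reason about after conversion. The sketch below copies the split and size figures listed above, checks that the two split sizes add up to the reported dataset size, and derives sizes in GiB and a mean per-example size. The names `to_gib` and `mean_example_bytes` are illustrative only, and the sizes are assumed to be in bytes.

```python
# Figures copied from the split and size listings above (assumed to be bytes).
SPLITS = {
    "val": {"num_bytes": 3496788418, "num_examples": 7750},
    "test": {"num_bytes": 3981888752, "num_examples": 8000},
}
DOWNLOAD_SIZE = 7445881828
DATASET_SIZE = 7478677170


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


def mean_example_bytes(split: str) -> float:
    """Average size of one example in the given split, in bytes."""
    info = SPLITS[split]
    return info["num_bytes"] / info["num_examples"]


# Consistency check: the per-split byte counts sum to the dataset size.
total = sum(info["num_bytes"] for info in SPLITS.values())
assert total == DATASET_SIZE

if __name__ == "__main__":
    print(f"download: {to_gib(DOWNLOAD_SIZE):.2f} GiB")
    print(f"dataset:  {to_gib(DATASET_SIZE):.2f} GiB")
    for name in SPLITS:
        print(f"{name}: {mean_example_bytes(name) / 1e6:.2f} MB per example")
```

Both totals come out just under 7 GiB, so a local copy needs roughly that much free disk space.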