lmms-lab/TextCaps
Source: Hugging Face | Updated: 2024-03-08 | Indexed: 2024-06-22
Download link:
https://hf-mirror.com/datasets/lmms-lab/TextCaps
Resource description:
---
dataset_info:
  features:
  - name: question_id
    dtype: string
  - name: question
    dtype: string
  - name: image
    dtype: image
  - name: image_id
    dtype: string
  - name: image_classes
    sequence: string
  - name: flickr_original_url
    dtype: string
  - name: flickr_300k_url
    dtype: string
  - name: image_width
    dtype: int64
  - name: image_height
    dtype: int64
  - name: set_name
    dtype: string
  - name: image_name
    dtype: string
  - name: image_path
    dtype: string
  - name: caption_id
    sequence: int64
  - name: caption_str
    sequence: string
  - name: reference_strs
    sequence: string
  splits:
  - name: train
    num_bytes: 6201208209.0
    num_examples: 21953
  - name: val
    num_bytes: 919878416.0
    num_examples: 3166
  - name: test
    num_bytes: 959971875.0
    num_examples: 3289
  download_size: 8064165124
  dataset_size: 8081058500.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: val
    path: data/val-*
  - split: test
    path: data/test-*
---
<p align="center" width="100%">
<img src="https://i.postimg.cc/g0QRgMVv/WX20240228-113337-2x.png" width="100%" height="80%">
</p>
# Large-scale Multi-modality Models Evaluation Suite
> Accelerating the development of large-scale multi-modality models (LMMs) with `lmms-eval`
🏠 [Homepage](https://lmms-lab.github.io/) | 📚 [Documentation](docs/README.md) | 🤗 [Huggingface Datasets](https://huggingface.co/lmms-lab)
# This Dataset
This is a formatted version of [TextCaps](https://textvqa.org/textcaps/). It is used in our `lmms-eval` pipeline to allow for one-click evaluations of large multi-modality models.
```
@inproceedings{sidorov2019textcaps,
title={TextCaps: a Dataset for Image Captioning with Reading Comprehension},
author={Sidorov, Oleksii and Hu, Ronghang and Rohrbach, Marcus and Singh, Amanpreet},
booktitle={European Conference on Computer Vision},
year={2020}
}
```
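For readers who want to inspect the data outside the `lmms-eval` pipeline, the following is a minimal sketch of loading one split with the Hugging Face `datasets` library. It assumes the `datasets` package is installed and that the repository id and split names match the metadata on this page; the helper name `load_textcaps_split` is hypothetical and not part of `lmms-eval`.

```python
# Sketch only: needs the third-party `datasets` package and network access.
REPO_ID = "lmms-lab/TextCaps"        # repository named on this page
SPLITS = ("train", "val", "test")    # split names from the card metadata


def load_textcaps_split(split: str = "val", streaming: bool = True):
    """Return one split, streamed by default to avoid the full ~8 GB download."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split=split, streaming=streaming)


# Example usage (not executed here because it downloads data):
#   ds = load_textcaps_split("val")
#   first = next(iter(ds))
#   print(first["question"], first["caption_str"])
```

Streaming is chosen here only because the listed download size is about 8 GB; pass `streaming=False` to materialize the split locally.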
Provider:
lmms-lab
Summary of original information
Dataset overview
Dataset information
Features
- question_id: string
- question: string
- image: image
- image_id: string
- image_classes: sequence of strings
- flickr_original_url: string
- flickr_300k_url: string
- image_width: 64-bit integer
- image_height: 64-bit integer
- set_name: string
- image_name: string
- image_path: string
- caption_id: sequence of 64-bit integers
- caption_str: sequence of strings
- reference_strs: sequence of strings
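The feature list above can be turned into a lightweight record check. This is a rough sketch, not part of the dataset tooling: the function name `schema_problems` is hypothetical, and the `image` field is skipped because its decoded Python type depends on the loader.

```python
from typing import Any, Dict, List

# Field names and kinds copied from the feature list; `image` is omitted on purpose.
SCALAR_FIELDS = {
    "question_id": str, "question": str, "image_id": str,
    "flickr_original_url": str, "flickr_300k_url": str,
    "image_width": int, "image_height": int,
    "set_name": str, "image_name": str, "image_path": str,
}
SEQUENCE_FIELDS = {
    "image_classes": str, "caption_id": int,
    "caption_str": str, "reference_strs": str,
}


def schema_problems(record: Dict[str, Any]) -> List[str]:
    """Return human-readable schema violations; an empty list means the record fits."""
    problems = []
    for name, kind in SCALAR_FIELDS.items():
        if not isinstance(record.get(name), kind):
            problems.append(f"{name}: expected {kind.__name__}")
    for name, kind in SEQUENCE_FIELDS.items():
        value = record.get(name)
        if not isinstance(value, list) or not all(isinstance(v, kind) for v in value):
            problems.append(f"{name}: expected list of {kind.__name__}")
    return problems
```

Calling `schema_problems(record)` on a row returns one message per missing or mistyped field, which makes it easy to spot a changed schema before running an evaluation.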
Data splits
- train:
  - Bytes: 6201208209.0
  - Examples: 21953
- val:
  - Bytes: 919878416.0
  - Examples: 3166
- test:
  - Bytes: 959971875.0
  - Examples: 3289
Dataset size
- Download size: 8064165124 bytes
- Dataset size: 8081058500.0 bytes
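As a quick consistency check on the numbers above, the per-split byte counts sum exactly to the reported dataset size, and the three splits together contain 28,408 examples. The sketch below reproduces that arithmetic; all values are copied from this page, and the variable names are illustrative.

```python
# Values copied from the split and size metadata on this page.
SPLIT_STATS = {
    "train": {"num_bytes": 6_201_208_209, "num_examples": 21_953},
    "val": {"num_bytes": 919_878_416, "num_examples": 3_166},
    "test": {"num_bytes": 959_971_875, "num_examples": 3_289},
}
DATASET_SIZE = 8_081_058_500
DOWNLOAD_SIZE = 8_064_165_124

total_bytes = sum(s["num_bytes"] for s in SPLIT_STATS.values())
total_examples = sum(s["num_examples"] for s in SPLIT_STATS.values())

# Share of examples per split, as percentages rounded to one decimal place.
shares = {
    name: round(100 * s["num_examples"] / total_examples, 1)
    for name, s in SPLIT_STATS.items()
}

print(total_bytes == DATASET_SIZE)  # the split sizes add up to the reported total
print(total_examples)
print(shares)
```

The train split accounts for roughly 77% of the examples, with val and test at about 11% and 12% respectively.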
Configuration
- config_name: default
- Data files:
  - train: data/train-*
  - val: data/val-*
  - test: data/test-*
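The `data_files` patterns above map shard files to splits by filename prefix. The sketch below approximates that mapping with Python's standard `fnmatch` module; it is an illustration only, not the resolution logic used by the Hugging Face tooling, and the shard filenames in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns copied from the default config on this page.
DATA_FILES = {
    "train": "data/train-*",
    "val": "data/val-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose glob pattern matches `path`, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None
```

For example, a hypothetical shard named `data/val-00000-of-00002.parquet` would resolve to `val`, while a file outside `data/` such as `README.md` matches no split.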



