VikramSingh178/Products-10k-BLIP-captions
Source: Hugging Face. Updated 2024-05-26; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/VikramSingh178/Products-10k-BLIP-captions
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 1024849819
    num_examples: 10000
  download_size: 1018358664
  dataset_size: 1024849819
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
license: mit
language:
- en
tags:
- art
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
- question-answering
- text-to-image
---
## Dataset Description
The **Products-10k BLIP captions** dataset pairs 10,000 product images with automatically generated captions. The captions were produced by the BLIP (Bootstrapping Language-Image Pre-training) model. The dataset is intended to support image captioning, visual recognition, and product classification.
## Dataset Summary
- **Dataset Name**: Products-10k
- **Captioning Model**: Salesforce/blip-image-captioning-large
- **Number of Images**: 10,000
- **Image Formats**: JPEG, PNG
- **Captioning Prompt**: "Photography of"
- **Source**: The images span a variety of product categories.
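
The summary names the captioning model and the prompt, but not the generation settings. The sketch below shows how prompt-conditioned captioning with that checkpoint typically looks in the `transformers` library. The function names `load_captioner` and `caption_image` are hypothetical, and `max_new_tokens=30` is an assumed value, not something the card states.

```python
MODEL_ID = "Salesforce/blip-image-captioning-large"  # from the dataset summary
PROMPT = "Photography of"                             # from the dataset summary


def load_captioner(model_id=MODEL_ID):
    """Load the BLIP processor and model once so they can be reused."""
    # Imported lazily so this module imports without transformers installed.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)
    model.eval()
    return processor, model


def caption_image(image, processor, model, prompt=PROMPT, max_new_tokens=30):
    """Caption a PIL image, conditioning generation on `prompt`.

    max_new_tokens is an assumption; the card does not list the
    settings used to build the dataset.
    """
    import torch

    # Supplying text alongside the image makes BLIP continue the prompt
    # rather than caption from scratch.
    inputs = processor(images=image.convert("RGB"), text=prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_captioner()` once and passing the returned pair to `caption_image` avoids reloading the checkpoint for every image.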
## Dataset Structure
Each example has two fields:
- **image**: Contains the product images in RGB format.
- **text**: Contains the generated captions for each product image.
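
The card describes the images as RGB, but JPEG and PNG files can decode to other modes (for example `RGBA`, `L`, or `P`), so a defensive conversion before feeding examples to a model may be worthwhile. The following is a minimal sketch; `ensure_rgb` and `describe_example` are hypothetical helpers that assume each `image` value behaves like a PIL image (it has `mode`, `size`, and `convert`).

```python
def ensure_rgb(image):
    """Return `image` unchanged if it is already RGB, else an RGB copy."""
    if image.mode == "RGB":
        return image
    return image.convert("RGB")


def describe_example(example):
    """Summarise one dataset row with the fields `image` and `text`."""
    image = ensure_rgb(example["image"])
    width, height = image.size
    return {
        "width": width,
        "height": height,
        "mode": image.mode,
        "caption": example["text"],
    }
```

Whether any stored image actually needs conversion has not been checked here; the helper simply makes the RGB assumption explicit.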
## Usage
You can load and use this dataset with the Hugging Face `datasets` library as follows:
```python
from datasets import load_dataset
dataset = load_dataset("VikramSingh178/Products-10k-BLIP-captions", split="test")
# Display an example
example = dataset[0]
image = example["image"]
caption = example["text"]
image.show()
print("Caption:", caption)
```
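
Because the captions are machine-generated, a quick look at their length and vocabulary can be useful before training on them. The sketch below works on any iterable of caption strings; `caption_stats` is a hypothetical helper, and whether the stored captions keep the prompt as a prefix is an assumption the function checks rather than relies on.

```python
from collections import Counter


def caption_stats(captions, prompt="photography of"):
    """Summarise an iterable of caption strings."""
    lengths = []
    with_prompt = 0
    words = Counter()
    for caption in captions:
        text = caption.strip().lower()
        # Count and strip the prompt prefix if the caption retains it.
        if text.startswith(prompt.lower()):
            with_prompt += 1
            text = text[len(prompt):].strip()
        tokens = text.split()
        lengths.append(len(tokens))
        words.update(tokens)
    count = len(lengths)
    return {
        "count": count,
        "starts_with_prompt": with_prompt,
        "mean_words": sum(lengths) / count if count else 0.0,
        "top_words": words.most_common(3),
    }
```

With the dataset loaded as above, `caption_stats(dataset["text"])` reads only the caption column, so the images are not decoded.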
## Citation
```
@article{bai2020products10k,
  author  = {Yalong Bai and Yuxiang Chen and Wei Yu and Linfang Wang and Wei Zhang},
  title   = {Products-10K: A Large-scale Product Recognition Dataset},
  journal = {arXiv preprint arXiv:2008.10545},
  year    = {2020},
  url     = {https://arxiv.org/abs/2008.10545}
}
```
Provider:
VikramSingh178
Summary of the original information
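Dataset overview
Dataset name
- Name: Products-10k
Dataset contents
- Type: image and text data
- Images: 10,000 product images in JPEG and PNG format
- Text: automatically generated image captions, produced with the BLIP model
Dataset structure
- Features:
  - image: product image, RGB format
  - text: generated caption for each product image
Dataset details
- Captioning model: Salesforce/blip-image-captioning-large
- Number of images: 10,000
- Captioning prompt: "Photography of"
- Source: images span a variety of product categories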
Dataset usage
- Loading: use the Hugging Face `datasets` library, for example `load_dataset("VikramSingh178/Products-10k-BLIP-captions", split="test")`. Each example then exposes the image as `example["image"]` and the caption as `example["text"]`.
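Dataset configuration
- Config name: default
- Data file path: data/test-*
Dataset attributes
- License: MIT
- Language: English
- Tags: art
- Size category: 1K<n<10K
- Task categories: visual question answering, question answering, text-to-image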
Dataset splits
- Test split:
  - Number of examples: 10,000
  - Bytes: 1024849819
  - Download size: 1018358664
  - Dataset size: 1024849819
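
The split statistics above are raw byte counts. As a small worked example, they can be converted to more readable units to estimate disk and bandwidth needs; the constant and function names below are illustrative, and the numbers are copied from the card metadata.

```python
NUM_EXAMPLES = 10_000
DATASET_SIZE_BYTES = 1_024_849_819   # dataset_size from the card
DOWNLOAD_SIZE_BYTES = 1_018_358_664  # download_size from the card


def to_gib(num_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


# Average size of one example (image plus caption) in KiB.
avg_example_kib = DATASET_SIZE_BYTES / NUM_EXAMPLES / 1024

print(f"dataset size : {to_gib(DATASET_SIZE_BYTES):.3f} GiB")
print(f"download size: {to_gib(DOWNLOAD_SIZE_BYTES):.3f} GiB")
print(f"avg example  : {avg_example_kib:.1f} KiB")
```

The download is only slightly smaller than the prepared dataset, as expected for images that are already compressed.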



