ptx0/comicstrips-gpt4o-blip3
Source: Hugging Face. Updated 2024-05-26; indexed 2024-06-15.
Download link:
https://hf-mirror.com/datasets/ptx0/comicstrips-gpt4o-blip3
Resource description:
---
license: mit
configs:
- config_name: default
  data_files:
  - split: Combined
    path: "train.combined.parquet"
  - split: GPT4
    path: "train.gpt4o.parquet"
  - split: BLIP3
    path: "train.blip3.parquet"
---
# Comic Strips
## Dataset Details
### Dataset Description
This dataset contains indie comics collected from Reddit and captioned with GPT4o and BLIP3.
Currently, only the GPT4o captions are available in this repository. The BLIP3 captions will be uploaded soon.
Roughly 1400 images were captioned at a cost of ~$11 using GPT4o (25 May 2024 version).
- **Curated by:** @pseudoterminalx
- **Funded by:** @pseudoterminalx
- **License:** MIT
### Dataset Sources
Unlike other free-to-use datasets released by me, this one contains numerous samples of unknown license. This repository relies on the license granted to the user by Reddit.
## Dataset Structure
- caption (str): the GPT4o caption for the sample
- filename (str): the filename for the captioned image
- width, height (int): the size of the image
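
As a minimal sketch of how these fields might be consumed, the snippet below checks a row against the field list above and wraps a split loader. The helper names (`load_split`, `check_row`, `aspect_ratio`) and the sample row are hypothetical and for illustration only. The split names come from the YAML front matter; loading assumes the Hugging Face `datasets` package and network access, and which splits are actually populated depends on what has been uploaded.

```python
from typing import Any, Mapping

# Field names and types as listed under "Dataset Structure".
EXPECTED_FIELDS = {"caption": str, "filename": str, "width": int, "height": int}

# Split names as declared in the card's YAML front matter.
SPLITS = ("Combined", "GPT4", "BLIP3")


def load_split(split: str = "Combined"):
    """Load one split from the Hub (needs network access and the `datasets` package)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the offline helpers below work without `datasets` installed.
    from datasets import load_dataset
    return load_dataset("ptx0/comicstrips-gpt4o-blip3", split=split)


def check_row(row: Mapping[str, Any]) -> list[str]:
    """Return the problems found in one row; an empty list means the row matches."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in row:
            problems.append(f"missing field: {name}")
        elif not isinstance(row[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    return problems


def aspect_ratio(row: Mapping[str, Any]) -> float:
    """Width divided by height, e.g. to group strips by layout."""
    return row["width"] / row["height"]


# Hypothetical row for illustration only; not taken from the dataset.
example = {
    "caption": "A four-panel comic about a cat.",
    "filename": "example.png",
    "width": 1024,
    "height": 768,
}
print(check_row(example))
print(aspect_ratio(example))
```

With network access, `load_split("GPT4")` would return the split backed by `train.gpt4o.parquet`, and each row could then be passed to `check_row` before use.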
Provider: ptx0



