m6fras/laion2b-aesthetic-squareish-captions
收藏Hugging Face2026-04-03 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/m6fras/laion2b-aesthetic-squareish-captions
下载链接
链接失效反馈官方服务:
资源简介:
---
language:
- en
size_categories:
- 100K<n<1M
---
This dataset contains image captions generated from [LAION2B-en-aesthetic-square](https://huggingface.co/opendiffusionai/lain2b-en-aesthetic-square).
We started with ~300K images after size filtering (2.5k max w/h), a portion of the images were skipped due to inaccessible URLs.<br>
The captions were generated over ~30 hours using [Qwen3-VL-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct) on 1xH100 running SGLang with the prompt `Describe the content of the provided image in detail, in plaintext. Do not make assumptions. Do not use special formatting. Avoid purple prose.`

Total samples: 209,141<br>
Average caption length: 284.6 characters<br>
Median caption length: 231.0 characters
语言:
- 英语
样本量范围:
- 10万 < 样本量 < 100万
---
本数据集包含源自[LAION2B-en-aesthetic-square](https://huggingface.co/opendiffusionai/lain2b-en-aesthetic-square)的图像字幕(image caption)。
我们首先对约30万张图像进行尺寸筛选(最大宽高为2500像素),由于部分图片的URL无法访问,最终跳过了这些样本。<br>
本数据集的图像字幕通过[Qwen3-VL-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct)生成,共耗时约30小时,运行环境为单张H100显卡,搭载SGLang框架,所用提示词为:「详细描述所提供图像的内容,以纯文本形式输出。请勿进行主观臆断,请勿使用特殊格式,避免华而不实的辞藻。」

总样本量:209,141<br>
平均字幕长度:284.6个字符<br>
中位数字幕长度:231.0个字符
提供机构:
m6fras



