moondream2-coyo-5M-captions
Source: ModelScope community. Updated 2026-01-06. Indexed 2024-06-08.
Download link:
https://modelscope.cn/datasets/swift/moondream2-coyo-5M-captions
Description:
## Moondream2 COYO-700M 5M subset captions
A subset of 5 million image-text pairs from the COYO-700M dataset, captioned with Moondream2 (rev=`2024-05-08`). The captioning prompt was `Write a long caption for this image given the alt text: {alt_text}`.
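As a minimal sketch of how the prompt above could be filled in for one sample, the snippet below substitutes an image's alt text into the template. The function name `build_caption_prompt` is hypothetical and not part of the dataset or of Moondream2; the call to the model itself is not shown.

```python
# Prompt template quoted from the dataset card.
PROMPT_TEMPLATE = "Write a long caption for this image given the alt text: {alt_text}"


def build_caption_prompt(alt_text: str) -> str:
    """Fill the captioning prompt with one image's alt text.

    Hypothetical helper for illustration only; the dataset card does not
    say how the alt text was preprocessed, so it is inserted unchanged.
    """
    return PROMPT_TEMPLATE.format(alt_text=alt_text)


if __name__ == "__main__":
    print(build_caption_prompt("a red bicycle leaning against a brick wall"))
```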
### Sampling conditions
The 5 million images were randomly sampled from the COYO-700M images that pass the following filters:
```
filters = [
("width", ">=", 256),
("height", ">=", 256),
("aesthetic_score_laion_v2", ">=", 5.2),
("watermark_score", "<=", 0.40),
("clip_similarity_vitl14", ">=", 0.1),
]
```
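One possible way to apply these filters is sketched below, assuming each COYO-700M metadata record is available as a Python dict keyed by the column names in the filter list. The helpers `passes_filters` and `sample_subset`, and the fixed random seed, are illustrative assumptions; the card does not describe the original sampling code.

```python
import operator
import random

# Filter triples copied from the dataset card: (column, comparison, threshold).
FILTERS = [
    ("width", ">=", 256),
    ("height", ">=", 256),
    ("aesthetic_score_laion_v2", ">=", 5.2),
    ("watermark_score", "<=", 0.40),
    ("clip_similarity_vitl14", ">=", 0.1),
]

# Map the comparison strings onto Python comparison functions.
OPS = {">=": operator.ge, "<=": operator.le}


def passes_filters(row, filters=FILTERS):
    """Return True if a metadata row (a dict) satisfies every filter."""
    return all(OPS[op](row[column], threshold) for column, op, threshold in filters)


def sample_subset(rows, k, seed=0):
    """Randomly sample up to k rows from those that pass the filters.

    The seed is an assumption made here for reproducibility.
    """
    eligible = [row for row in rows if passes_filters(row)]
    rng = random.Random(seed)
    return rng.sample(eligible, min(k, len(eligible)))
```

At the scale of COYO-700M, a columnar tool would likely be a better fit than in-memory lists of dicts; the sketch only makes the filter semantics concrete.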
Provider:
maas
Created:
2024-06-06



