PJMixers-Images/Castollux-Long-Parquet
Source: Hugging Face | Updated: 2025-02-27 | Indexed: 2025-10-18
Download link:
https://hf-mirror.com/datasets/PJMixers-Images/Castollux-Long-Parquet
Description:
The Castollux dataset consists of images paired with captions that were generated using the Gemini API. Most captions were generated by the model `gemini-2.0-flash-thinking-exp-01-21`, and a smaller portion by `gemini-2.0-flash-thinking-exp-1219`. The dataset is intended for image-to-text tasks. Some images have low resolution or compression artifacts, which the dataset description considers beneficial for image-to-text performance. For text-to-image training, however, additional filtering based on resolution and compression artifacts is recommended.
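The resolution half of that filtering recommendation could be sketched roughly as follows. This is a minimal illustration, not part of the dataset or its tooling: the `ResolutionFilter` class, the `filter_by_resolution` helper, and both threshold values are hypothetical and should be tuned for the training setup at hand. Detecting compression artifacts is a separate problem and is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResolutionFilter:
    """Hypothetical thresholds; neither value comes from the dataset card."""
    min_short_side: int = 512           # assumed minimum for the shorter edge, in pixels
    min_total_pixels: int = 512 * 512   # assumed minimum total pixel count

    def keep(self, width: int, height: int) -> bool:
        # Reject degenerate sizes before applying the thresholds.
        if width <= 0 or height <= 0:
            return False
        short_side_ok = min(width, height) >= self.min_short_side
        area_ok = width * height >= self.min_total_pixels
        return short_side_ok and area_ok


def filter_by_resolution(sizes, rule=ResolutionFilter()):
    """Return the indices of the (width, height) pairs that pass the rule."""
    return [i for i, (w, h) in enumerate(sizes) if rule.keep(w, h)]


if __name__ == "__main__":
    example_sizes = [(1024, 768), (320, 240), (2048, 400), (512, 512)]
    print(filter_by_resolution(example_sizes))  # prints [0, 3]
```

The helper works on plain `(width, height)` pairs so that it makes no assumption about how the images are stored. With Pillow, for instance, the pair is available as the `size` attribute of an opened image.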
Provided by:
PJMixers-Images



