
terminusresearch/ideogram-25k

Source: Hugging Face. Updated 2024-07-12; indexed 2024-07-13.
Download link:
https://hf-mirror.com/datasets/terminusresearch/ideogram-25k
Overview:
This dataset contains approximately 27,000 images collected from the Ideogram service. It is intended for fine-tuning or training text-to-image models and classifiers, and for analyzing bias among Ideogram users. It was created to obtain high-quality typography data from a synthetic source free of copyright concerns.

The creators used a custom Selenium application to monitor the Ideogram service and save posts to disk, deduplicating by SHA256 hash. Each filename is the SHA256 hash of the image data, which allows integrity verification. Captions were generated by Microsoft Florence2. The dataset's bias is described as stemming from a single synthetic source, the Llava 34B captioner.
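The filename convention described above can be checked locally. The following is a minimal sketch, not the dataset authors' tooling: it assumes the images have already been downloaded into a single flat directory and that each filename stem is the lowercase hex SHA256 digest of the file's bytes. The function names `sha256_of_file` and `check_images` are hypothetical and introduced only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_images(directory: Path) -> tuple[list[Path], list[Path]]:
    """Return (mismatched, duplicates) for the files in a directory.

    mismatched: files whose name stem differs from their content hash
                (assumed convention: stem == lowercase hex digest).
    duplicates: files whose content hash was already seen earlier in
                sorted order, i.e. byte-identical copies.
    """
    seen: dict[str, Path] = {}
    mismatched: list[Path] = []
    duplicates: list[Path] = []
    for path in sorted(directory.iterdir()):
        if not path.is_file():
            continue
        actual = sha256_of_file(path)
        if path.stem.lower() != actual:
            mismatched.append(path)
        if actual in seen:
            duplicates.append(path)
        else:
            seen[actual] = path
    return mismatched, duplicates
```

Calling `check_images(Path("images"))` on a hypothetical local `images` directory returns two lists. If the convention holds and deduplication was complete, both should be empty.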
Provider:
terminusresearch