
DeepGuardDB: Real and Text-to-Image Synthetic Images Dataset

ieee-dataport.org, indexed 2025-03-25
Download link:
https://ieee-dataport.org/documents/deepguarddb-real-and-text-image-synthetic-images-dataset
Description:
Recent advances in deep learning and generative models have significantly improved text-to-image (T2I) synthesis, making it possible to create highly realistic images from textual inputs. This progress has expanded the creative and practical applications of AI, but it also makes authentic and AI-generated images harder to tell apart, which raises serious concerns in security, privacy, and digital forensics. In response, attention has grown on AI-based detectors designed to reliably differentiate synthetic from real images, ensuring data authenticity and protecting against misuse.

Reliable and diverse datasets of fake and real data are crucial for training and evaluating such models, and the research community has made significant efforts to develop dedicated datasets for this purpose. Because T2I generation tools evolve rapidly, existing datasets must be continually updated and refined to reflect the state of the art in image generation. In this context, we have constructed the DeepGuardDB dataset for evaluating and improving models that differentiate AI-generated images from real ones. The dataset has been curated to address the limitations of existing datasets by incorporating a diverse array of visual content. In addition to Imagen and DALL-E 3, DeepGuardDB uses Stable Diffusion 3, which produces higher-quality images. DeepGuardDB contains 13,000 images, evenly split between real and generated: 6,500 (50%) in each category.

The real images are collected from two well-established datasets, each recognized for its richness and diversity: MS-COCO (Microsoft Common Objects in Context) and Flickr30k. The AI-generated images come from three advanced T2I generation platforms: Stable Diffusion 3, Imagen, and DALL-E 3. The synthetic images were created from the same textual descriptions that accompany the real images, so that the generated images closely resemble the authentic ones. Because both sets share similar themes, subjects, and visual cues, this pairing highlights the challenge of distinguishing real from AI-generated content.
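The pairing described above (one real image and one synthetic image per textual description, giving a 50/50 class split) can be sketched as a small manifest builder. This is only an illustration under stated assumptions: the `Sample` record, the file paths, and the `build_pairs` and `class_balance` helpers are hypothetical, and the listing does not describe the actual archive layout or how images are divided among the three generators.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    caption: str  # text shared by a real image and its synthetic counterpart
    path: str     # hypothetical file path; the real archive layout is not documented here
    label: str    # "real" or "synthetic"
    source: str   # e.g. "MS-COCO", "Flickr30k", "Stable Diffusion 3"


def build_pairs(captioned_real, generated):
    """Pair each real image with the synthetic image made from the same caption.

    captioned_real: iterable of (caption, path, source_dataset) tuples
    generated: dict mapping caption -> (path, generator_name)
    """
    samples = []
    for caption, path, dataset in captioned_real:
        if caption not in generated:
            continue  # skip captions that have no synthetic counterpart
        gen_path, generator = generated[caption]
        samples.append(Sample(caption, path, "real", dataset))
        samples.append(Sample(caption, gen_path, "synthetic", generator))
    return samples


def class_balance(samples):
    """Return the fraction of samples per label (empty dict for no samples)."""
    counts = Counter(s.label for s in samples)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}
```

Applied to a manifest built this way, `class_balance` should report 0.5 for each label, matching the 6,500 real and 6,500 generated images stated in the description.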

Provider:
ieee-dataport.org