Multi-Modal CelebA-HQ

Name: Multi-Modal CelebA-HQ
Creator: TediGAN research team
License: Not specified

Indexed from arXiv on 2025-09-30

Download link:

https://github.com/weihaox/TediGAN

Resource overview:

This dataset is a large-scale collection of face images containing 30,000 high-resolution photographs, each accompanied by a high-quality segmentation mask, a sketch, and descriptive text. It supports text-guided multi-modal synthesis and is widely used to evaluate the quality, diversity, accuracy, and realism of images. The dataset is classed as large in scale, and its task focus is text-guided image generation and manipulation.
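Since each of the 30,000 images comes with a mask, a sketch, and a text description, one sample can be pictured as a record of four paired files. The following is a minimal sketch of that pairing only. The `FaceSample` class, the `sample_paths` helper, the folder names (`image`, `mask`, `sketch`, `text`), and the file extensions are all hypothetical placeholders, not the layout published by the dataset authors; check the repository linked above for the actual structure.

```python
from dataclasses import dataclass
from pathlib import Path

# From the description above: the dataset contains 30,000 images.
NUM_IMAGES = 30_000


@dataclass(frozen=True)
class FaceSample:
    """Paths to the four modalities of one face (hypothetical layout)."""
    image: Path
    mask: Path
    sketch: Path
    caption: Path


def sample_paths(root: Path, index: int) -> FaceSample:
    """Build the paths for sample `index` under an ASSUMED directory layout.

    Folder names and extensions are illustrative placeholders, not the
    layout documented by the dataset authors.
    """
    if not 0 <= index < NUM_IMAGES:
        raise IndexError(f"index must be in [0, {NUM_IMAGES}), got {index}")
    return FaceSample(
        image=root / "image" / f"{index}.jpg",
        mask=root / "mask" / f"{index}.png",
        sketch=root / "sketch" / f"{index}.jpg",
        caption=root / "text" / f"{index}.txt",
    )
```

Keeping the four paths in one immutable record makes it harder to accidentally pair an image with the mask or caption of a different face when building a text-guided generation pipeline.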

Provided by:

TediGAN research team

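Collected Summary

Background and Challenges

Background Overview

Multi-Modal CelebA-HQ is a large-scale face image dataset containing 30,000 high-resolution images, each paired with a segmentation mask, a sketch, and descriptive text, and it supports text-guided multi-modal synthesis. It is mainly used to evaluate image quality, diversity, accuracy, and realism, with a task focus on text-guided image generation and manipulation.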

The content above was collected and summarized by 遇见数据集.
