
student/FFHQ

Source: Hugging Face. Updated 2022-04-16; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/student/FFHQ
Resource description:
FFHQ: 70,000 PNG images. Baidu Netdisk link: https://pan.baidu.com/s/1XDfTKWOhtwAAQQJ0KBU4RQ (extraction code: bowj)

## Flickr-Faces-HQ Dataset (FFHQ)
![Python 3.6](https://img.shields.io/badge/python-3.6-green.svg?style=plastic)
![License CC](https://img.shields.io/badge/license-CC-green.svg?style=plastic)
![Format PNG](https://img.shields.io/badge/format-PNG-green.svg?style=plastic)
![Resolution 1024×1024](https://img.shields.io/badge/resolution-1024&times;1024-green.svg?style=plastic)
![Images 70000](https://img.shields.io/badge/images-70,000-green.svg?style=plastic)

![Teaser image](./ffhq-teaser.png)

Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN):

> **A Style-Based Generator Architecture for Generative Adversarial Networks**<br>
> Tero Karras (NVIDIA), Samuli Laine (NVIDIA), Timo Aila (NVIDIA)<br>
> http://stylegan.xyz/paper

The dataset consists of 70,000 high-quality PNG images at 1024×1024 resolution and contains considerable variation in terms of age, ethnicity and image background. It also has good coverage of accessories such as eyeglasses, sunglasses, hats, etc. The images were crawled from [Flickr](https://www.flickr.com/), thus inheriting all the biases of that website, and automatically aligned and cropped using [dlib](http://dlib.net/). Only images under permissive licenses were collected. Various automatic filters were used to prune the set, and finally [Amazon Mechanical Turk](https://www.mturk.com/) was used to remove the occasional statues, paintings, or photos of photos.

For business inquiries, please contact [researchinquiries@nvidia.com](mailto:researchinquiries@nvidia.com)

For press and other inquiries, please contact Hector Marinez at [hmarinez@nvidia.com](mailto:hmarinez@nvidia.com)

## Licenses

The individual images were published in Flickr by their respective authors under either [Creative Commons BY 2.0](https://creativecommons.org/licenses/by/2.0/), [Creative Commons BY-NC 2.0](https://creativecommons.org/licenses/by-nc/2.0/), [Public Domain Mark 1.0](https://creativecommons.org/publicdomain/mark/1.0/), [Public Domain CC0 1.0](https://creativecommons.org/publicdomain/zero/1.0/), or [U.S. Government Works](http://www.usa.gov/copyright.shtml) license. All of these licenses allow **free use, redistribution, and adaptation for non-commercial purposes**. However, some of them require giving **appropriate credit** to the original author, as well as **indicating any changes** that were made to the images. The license and original author of each image are indicated in the metadata.

* [https://creativecommons.org/licenses/by/2.0/](https://creativecommons.org/licenses/by/2.0/)
* [https://creativecommons.org/licenses/by-nc/2.0/](https://creativecommons.org/licenses/by-nc/2.0/)
* [https://creativecommons.org/publicdomain/mark/1.0/](https://creativecommons.org/publicdomain/mark/1.0/)
* [https://creativecommons.org/publicdomain/zero/1.0/](https://creativecommons.org/publicdomain/zero/1.0/)
* [http://www.usa.gov/copyright.shtml](http://www.usa.gov/copyright.shtml)

The dataset itself (including JSON metadata, download script, and documentation) is made available under [Creative Commons BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license by NVIDIA Corporation. You can **use, redistribute, and adapt it for non-commercial purposes**, as long as you (a) give appropriate credit by **citing our paper**, (b) **indicate any changes** that you've made, and (c) distribute any derivative works **under the same license**.

* [https://creativecommons.org/licenses/by-nc-sa/4.0/](https://creativecommons.org/licenses/by-nc-sa/4.0/)

## Overview

All data is hosted on Google Drive:

| Path | Size | Files | Format | Description
| :--- | :--: | ----: | :----: | :----------
| [ffhq-dataset](https://drive.google.com/open?id=1u2xu7bSrWxrbUxk-dT-UvEJq8IjdmNTP) | 2.56 TB | 210,014 | | Main folder
| &boxvr;&nbsp; [ffhq-dataset-v1.json](https://drive.google.com/open?id=1IB0BFbN_eRZx9UkJqLHSgJiQhqX-PrI6) | 254 MB | 1 | JSON | Metadata including copyright info, URLs, etc.
| &boxvr;&nbsp; [images1024x1024](https://drive.google.com/open?id=1u3Hbfn3Q6jsTlte3BY85CGwId77H-OOu) | 89.1 GB | 70,000 | PNG | Aligned and cropped images at 1024×1024
| &boxvr;&nbsp; [thumbnails128x128](https://drive.google.com/open?id=1uJkWCpLUM-BnXW3H_IgVMdfENeNDFNmC) | 1.95 GB | 70,000 | PNG | Thumbnails at 128×128
| &boxvr;&nbsp; [in-the-wild-images](https://drive.google.com/open?id=1YyuocbwILsHAjTusSUG-_zL343jlVBhf) | 955 GB | 70,000 | PNG | Original images from Flickr
| &boxvr;&nbsp; [tfrecords](https://drive.google.com/open?id=1LTBpJ0W_WLjqza3zdayligS8Dh1V1gA6) | 273 GB | 9 | tfrecords | Multi-resolution data for [StyleGAN](http://stylegan.xyz/code) and [ProGAN](https://github.com/tkarras/progressive_growing_of_gans)
| &boxur;&nbsp; [zips](https://drive.google.com/open?id=1WocxvZ4GEZ1DI8dOz30aSj2zT6pkATYS) | 1.28 TB | 4 | ZIP | Contents of each folder as a ZIP archive.

High-level statistics:

![Pie charts](./ffhq-piecharts.png)

For use cases that require separate training and validation sets, we have appointed the first 60,000 images to be used for training and the remaining 10,000 for validation. In the [StyleGAN paper](http://stylegan.xyz/paper), however, we used all 70,000 images for training.

We have explicitly made sure that there are no duplicate images in the dataset itself. However, please note that the `in-the-wild` folder may contain multiple copies of the same image in cases where we extracted several different faces from the same image.

## Download script

You can either grab the data directly from Google Drive or use the provided [download script](./download_ffhq.py). The script makes things considerably easier by automatically downloading all the requested files, verifying their checksums, retrying each file several times on error, and employing multiple concurrent connections to maximize bandwidth.

```
> python download_ffhq.py -h
usage: download_ffhq.py [-h] [-j] [-s] [-i] [-t] [-w] [-r] [-a]
                        [--num_threads NUM] [--status_delay SEC]
                        [--timing_window LEN] [--chunk_size KB]
                        [--num_attempts NUM]

Download Flickr-Face-HQ (FFHQ) dataset to current working directory.

optional arguments:
  -h, --help            show this help message and exit
  -j, --json            download metadata as JSON (254 MB)
  -s, --stats           print statistics about the dataset
  -i, --images          download 1024x1024 images as PNG (89.1 GB)
  -t, --thumbs          download 128x128 thumbnails as PNG (1.95 GB)
  -w, --wilds           download in-the-wild images as PNG (955 GB)
  -r, --tfrecords       download multi-resolution TFRecords (273 GB)
  -a, --align           recreate 1024x1024 images from in-the-wild images
  --num_threads NUM     number of concurrent download threads (default: 32)
  --status_delay SEC    time between download status prints (default: 0.2)
  --timing_window LEN   samples for estimating download eta (default: 50)
  --chunk_size KB       chunk size for each download thread (default: 128)
  --num_attempts NUM    number of download attempts per file (default: 10)
```

```
> python ..\download_ffhq.py --json --images
Downloading JSON metadata...
\ 100.00% done  1/1 files  0.25/0.25 GB  43.21 MB/s  ETA: done
Parsing JSON metadata...
Downloading 70000 files...
| 100.00% done  70000/70000 files  89.19 GB/89.19 GB  59.87 MB/s  ETA: done
```

The script also serves as a reference implementation of the automated scheme that we used to align and crop the images. Once you have downloaded the in-the-wild images with `python download_ffhq.py --wilds`, you can run `python download_ffhq.py --align` to reproduce exact replicas of the aligned 1024×1024 images using the facial landmark locations included in the metadata.

## Metadata

The `ffhq-dataset-v1.json` file contains the following information for each image in a machine-readable format:

```
{
  "0": {  # Image index
    "category": "training",  # Training or validation
    "metadata": {  # Info about the original Flickr photo:
      "photo_url": "https://www.flickr.com/photos/...",  # - Flickr URL
      "photo_title": "DSCF0899.JPG",  # - File name
      "author": "Jeremy Frumkin",  # - Author
      "country": "",  # - Country where the photo was taken
      "license": "Attribution-NonCommercial License",  # - License name
      "license_url": "https://creativecommons.org/...",  # - License detail URL
      "date_uploaded": "2007-08-16",  # - Date when the photo was uploaded to Flickr
      "date_crawled": "2018-10-10"  # - Date when the photo was crawled from Flickr
    },
    "image": {  # Info about the aligned 1024x1024 image:
      "file_url": "https://drive.google.com/...",  # - Google Drive URL
      "file_path": "images1024x1024/00000.png",  # - Google Drive path
      "file_size": 1488194,  # - Size of the PNG file in bytes
      "file_md5": "ddeaeea6ce59569643715759d537fd1b",  # - MD5 checksum of the PNG file
      "pixel_size": [1024, 1024],  # - Image dimensions
      "pixel_md5": "47238b44dfb87644460cbdcc4607e289",  # - MD5 checksum of the raw pixel data
      "face_landmarks": [...]  # - 68 face landmarks reported by dlib
    },
    "thumbnail": {  # Info about the 128x128 thumbnail:
      "file_url": "https://drive.google.com/...",  # - Google Drive URL
      "file_path": "thumbnails128x128/00000.png",  # - Google Drive path
      "file_size": 29050,  # - Size of the PNG file in bytes
      "file_md5": "bd3e40b2ba20f76b55dc282907b89cd1",  # - MD5 checksum of the PNG file
      "pixel_size": [128, 128],  # - Image dimensions
      "pixel_md5": "38d7e93eb9a796d0e65f8c64de8ba161"  # - MD5 checksum of the raw pixel data
    },
    "in_the_wild": {  # Info about the in-the-wild image:
      "file_url": "https://drive.google.com/...",  # - Google Drive URL
      "file_path": "in-the-wild-images/00000.png",  # - Google Drive path
      "file_size": 3991569,  # - Size of the PNG file in bytes
      "file_md5": "1dc0287e73e485efb0516a80ce9d42b4",  # - MD5 checksum of the PNG file
      "pixel_size": [2016, 1512],  # - Image dimensions
      "pixel_md5": "86b3470c42e33235d76b979161fb2327",  # - MD5 checksum of the raw pixel data
      "face_rect": [667, 410, 1438, 1181],  # - Axis-aligned rectangle of the face region
      "face_landmarks": [...],  # - 68 face landmarks reported by dlib
      "face_quad": [...]  # - Aligned quad of the face region
    }
  },
  ...
}
```

## Acknowledgements

We thank Jaakko Lehtinen, David Luebke, and Tuomas Kynkäänniemi for in-depth discussions and helpful comments; Janne Hellsten, Tero Kuosmanen, and Pekka Jänis for compute infrastructure and help with the code release.

We also thank Vahid Kazemi and Josephine Sullivan for their work on automatic face detection and alignment that enabled us to collect the data in the first place:

> **One Millisecond Face Alignment with an Ensemble of Regression Trees**<br>
> Vahid Kazemi, Josephine Sullivan<br>
> Proc. CVPR 2014<br>
> https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Kazemi_One_Millisecond_Face_2014_CVPR_paper.pdf
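
The `file_size` and `file_md5` fields described in the Metadata section make it possible to check a downloaded file locally. The following is a minimal sketch of such a check, not the logic of `download_ffhq.py` itself. The helper names `file_md5` and `matches_entry` are hypothetical, and it assumes `file_md5` in the metadata is the hex MD5 digest of the file's bytes.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_entry(path, entry):
    """Compare a local file with one metadata sub-entry.

    `entry` is assumed to be a dict such as metadata["0"]["image"],
    carrying the `file_size` and `file_md5` fields shown above.
    """
    # The size comparison is cheap, so do it before hashing the file.
    if os.path.getsize(path) != entry["file_size"]:
        return False
    return file_md5(path) == entry["file_md5"]
```

For example, after loading the JSON with `json.load`, `matches_entry("images1024x1024/00000.png", metadata["0"]["image"])` would report whether the local copy agrees with the recorded size and checksum, assuming the file was saved under that relative path.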

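Provider:
student
Summary of original information

Dataset overview

Name: Flickr-Faces-HQ Dataset (FFHQ)

Description: FFHQ is a high-quality face image dataset containing 70,000 PNG images at 1024×1024 resolution. It varies considerably in age, ethnicity, and image background, and covers accessories such as eyeglasses, sunglasses, and hats. The images were crawled from Flickr and automatically aligned and cropped.

Features:

  • Number of images: 70,000
  • Resolution: 1024×1024
  • Format: PNG
  • Diversity: variation in age, ethnicity, image background, and accessories

License:

  • Individual images were published on Flickr by their respective authors under Creative Commons BY 2.0, Creative Commons BY-NC 2.0, Public Domain Mark 1.0, Public Domain CC0 1.0, or U.S. Government Works. All of these allow free use, redistribution, and adaptation for non-commercial purposes; some also require crediting the original author and indicating any changes.
  • The dataset itself (including JSON metadata, download script, and documentation) is provided by NVIDIA Corporation under Creative Commons BY-NC-SA 4.0.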
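
Because some of these licenses require crediting the original author, a credit line can be assembled from the per-image metadata. The sketch below is illustrative only: `attribution_line` is a hypothetical helper, the credit format is an assumption rather than one prescribed by the licenses or by NVIDIA, and the sample entry reuses the README's example values with the elided URLs left as they appear there.

```python
def attribution_line(index, entry):
    """Build a one-line credit string from one metadata entry.

    `entry` is assumed to have the "metadata" sub-dict shown in the README
    (photo_title, author, license, license_url, photo_url).
    """
    meta = entry["metadata"]
    # Fall back to a placeholder if the author field is empty (assumption).
    author = meta["author"] or "unknown author"
    return '{}: "{}" by {}, {} ({}); source: {}'.format(
        index,
        meta["photo_title"],
        author,
        meta["license"],
        meta["license_url"],
        meta["photo_url"],
    )


# Sample entry copied from the README's example; URLs are elided there too.
sample = {
    "metadata": {
        "photo_url": "https://www.flickr.com/photos/...",
        "photo_title": "DSCF0899.JPG",
        "author": "Jeremy Frumkin",
        "license": "Attribution-NonCommercial License",
        "license_url": "https://creativecommons.org/...",
    }
}
print(attribution_line("0", sample))
```

A credit line alone may not satisfy every license term; indicating any changes made to an image remains a separate obligation.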

Data structure:

  • Main folder: ffhq-dataset (2.56 TB, 210,014 files)
  • Metadata: ffhq-dataset-v1.json (254 MB, 1 JSON file)
  • Images: images1024x1024 (89.1 GB, 70,000 PNG files)
  • Thumbnails: thumbnails128x128 (1.95 GB, 70,000 PNG files)
  • Original images: in-the-wild-images (955 GB, 70,000 PNG files)
  • TFRecords: tfrecords (273 GB, 9 files)
  • ZIP archives: zips (1.28 TB, 4 files)
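
The main folder's file count can be cross-checked against its parts. This short sketch sums the per-folder counts from the README's overview table, including the `zips` folder (4 files); the dictionary name is illustrative.

```python
# Per-folder file counts as listed in the README's overview table.
folder_file_counts = {
    "ffhq-dataset-v1.json": 1,
    "images1024x1024": 70000,
    "thumbnails128x128": 70000,
    "in-the-wild-images": 70000,
    "tfrecords": 9,
    "zips": 4,
}

total_files = sum(folder_file_counts.values())
print(total_files)  # 210014, matching the 210,014 files reported for ffhq-dataset
```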

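Download and use:

  • The data can be downloaded directly from Google Drive or with the provided download script, download_ffhq.py, which automates the downloads, verifies checksums, and retries failed files.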

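Training and validation:

  • For use cases that need separate sets, the first 60,000 images are designated for training and the remaining 10,000 for validation. The StyleGAN paper itself trained on all 70,000 images.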
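
The designated split can be expressed as a function of the image index. This is a minimal sketch, assuming zero-based indices as in the metadata example (`"0"`, `00000.png`); the helper name `split_for_index` is hypothetical, and the label string `"validation"` is an assumption, since the README's example only shows `"training"`.

```python
NUM_TRAIN = 60000   # first 60,000 images are designated for training
NUM_TOTAL = 70000   # total number of images in the dataset


def split_for_index(index):
    """Return the designated split for a zero-based image index."""
    if not 0 <= index < NUM_TOTAL:
        raise ValueError("index out of range: {}".format(index))
    # "validation" is an assumed label; only "training" appears in the README.
    return "training" if index < NUM_TRAIN else "validation"


counts = {"training": 0, "validation": 0}
for i in range(NUM_TOTAL):
    counts[split_for_index(i)] += 1
print(counts)  # {'training': 60000, 'validation': 10000}
```

When the JSON metadata is available, reading each entry's `category` field is the more direct route; the index rule above is only a convenience when the metadata has not been downloaded.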

Metadata:

  • Per-image details are stored in ffhq-dataset-v1.json. They cover the original Flickr photo, the aligned 1024×1024 image, the 128×128 thumbnail, and the in-the-wild image.
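
As one illustration of working with these fields, the sketch below takes the `face_rect` and `pixel_size` values from the README's example entry and computes the face region's size. The helper names are hypothetical. The coordinate order `[left, top, right, bottom]` and the `pixel_size` order `[width, height]` are assumptions; the README states neither.

```python
def rect_size(face_rect):
    """Width and height of an axis-aligned rectangle.

    Assumes the order [left, top, right, bottom] (not stated in the README).
    """
    left, top, right, bottom = face_rect
    return right - left, bottom - top


def rect_inside(face_rect, pixel_size):
    """True if the rectangle lies within an image of the given size.

    Assumes pixel_size is [width, height] (not stated in the README).
    """
    left, top, right, bottom = face_rect
    width, height = pixel_size
    return 0 <= left <= right <= width and 0 <= top <= bottom <= height


# Values copied from the "in_the_wild" example entry in the README.
face_rect = [667, 410, 1438, 1181]
pixel_size = [2016, 1512]

print(rect_size(face_rect))                # (771, 771)
print(rect_inside(face_rect, pixel_size))  # True
```

Under these assumptions the example face region is a 771×771 square that fits inside the 2016×1512 source photo.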
Compiled summary
Dataset introduction
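Construction
FFHQ was built from high-quality face images crawled from Flickr, then automatically aligned and cropped to 1024×1024. The images span a range of ages, ethnicities, and backgrounds, and include accessories such as eyeglasses and hats. The set was pruned with automatic filters, after which Amazon Mechanical Turk was used to remove the occasional statues, paintings, and photos of photos. Only images under permissive licenses were collected.
Features
FFHQ is known for its high resolution and diversity: it contains 70,000 high-quality PNG face images and suits tasks such as training generative adversarial networks (GANs). Alongside the images it provides detailed metadata, including each image's copyright information, original URL, and facial landmarks. It also supplies 128×128 thumbnails and the original in-the-wild images for different applications.
Usage
FFHQ can be obtained with the provided download script or directly from Google Drive. Users can choose to download the full-resolution images, the thumbnails, or the original images, and can use the metadata for further processing. A training/validation split is predefined: the first 60,000 images for training and the last 10,000 for validation. The data is also available in TFRecords format for direct use in training models such as StyleGAN.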
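Background and challenges
Background overview
Flickr-Faces-HQ (FFHQ) is a high-quality face image dataset created by an NVIDIA research team in 2018 as a benchmark for generative adversarial networks (GANs). It consists of 70,000 PNG images at 1024×1024 resolution, with wide variation in age, ethnicity, background, and accessories such as eyeglasses and hats. The images were crawled from Flickr, automatically aligned and cropped with dlib, and screened through Amazon Mechanical Turk. Beyond its role in GAN development, FFHQ has become a resource for research on face generation and related tasks.
Current challenges
Building FFHQ involved several challenges. First, images crawled from Flickr inherit that site's biases, and automatic filtering plus human review were needed to prune unsuitable images. Second, alignment and cropping depend on accurate facial landmark detection to avoid distortion or missing key features. Third, the licensing situation is complex: each image must comply with its license, and the license and author are recorded in the metadata. Finally, the dataset is large (2.56 TB in total), so storage and transfer are practical obstacles that call for an efficient download tool.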
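Common scenarios
Classic use cases
FFHQ is widely used as a benchmark for generative adversarial networks (GANs), especially for training and evaluating the StyleGAN architecture. Its high resolution (1024×1024) lets researchers explore more complex facial feature generation and style transfer tasks. Its variation in age, ethnicity, background, and accessories also makes it a common choice for image enhancement and for training generative models.
Derived and related work
The release of FFHQ led to a body of related research, particularly on GANs and face image generation. The most prominent is the StyleGAN architecture and its later refinements, which were trained on FFHQ and markedly improved the quality and diversity of generated images. FFHQ has also been used in work on automatic extraction and analysis of facial features.
Recent research
Latest research directions
In face generation and related fields, FFHQ serves as a foundation for current research because of its image quality and diversity. Recent work focuses on using FFHQ to improve GAN performance, particularly in style transfer and image generation, including how to make generated faces more stable and realistic. The summary also reports its use in training and evaluating face recognition algorithms, especially with complex backgrounds and varied facial features.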
The content above was collected and summarized by 遇见数据集.