
kaikaiyao/ffhq-image-attribution

Source: Hugging Face. Updated 2026-04-06; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/kaikaiyao/ffhq-image-attribution
Resource description:
---
configs:
- config_name: stylegan2-ffhq-256
  data_files:
  - viewer-stylegan2-ffhq-256.parquet
- config_name: stylegan3-ffhq-256
  data_files:
  - viewer-stylegan3-ffhq-256.parquet
- config_name: r3gan-ffhq-256
  data_files:
  - viewer-r3gan-ffhq-256.parquet
- config_name: cips-ffhq-256
  data_files:
  - viewer-cips-ffhq-256.parquet
- config_name: ganformer-ffhq-256
  data_files:
  - viewer-ganformer-ffhq-256.parquet
- config_name: styleswin-ffhq-256
  data_files:
  - viewer-styleswin-ffhq-256.parquet
- config_name: vqvae-ffhq-256
  data_files:
  - viewer-vqvae-ffhq-256.parquet
- config_name: nvae-ffhq-256
  data_files:
  - viewer-nvae-ffhq-256.parquet
- config_name: vdvae-ffhq-256
  data_files:
  - viewer-vdvae-ffhq-256.parquet
- config_name: adm-ffhq-256
  data_files:
  - viewer-adm-ffhq-256.parquet
- config_name: ldm-ffhq-256
  data_files:
  - viewer-ldm-ffhq-256.parquet
- config_name: ncsnpp-ffhq-256
  data_files:
  - viewer-ncsnpp-ffhq-256.parquet
license: other
language:
- en
task_categories:
- image-classification
tags:
- ffhq
- model-attribution
- generated-images
- face-generation
- gan
- vae
- diffusion
- benchmark
---

# FFHQ Image Attribution

[![Images](https://img.shields.io/badge/images-120,000-blue)](https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution)
[![Models](https://img.shields.io/badge/models-12-2ea44f)](https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution)
[![Families](https://img.shields.io/badge/families-3-8250df)](https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution)
[![Version](https://img.shields.io/badge/version-v2-111827)](https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution)

A public benchmark for **FFHQ model attribution**, built from twelve face generators spanning GAN, VAE, and diffusion families.

![Showcase overview](preview/showcase-hero.png)

> Version `v2` includes `10,000` images from each of `12` FFHQ-trained generators for a total of `120,000` images.

## Why this dataset

- Same image domain across multiple FFHQ generators makes source attribution cleaner and easier to study.
- Public metadata links each image to its source model, family, release, seed, and file integrity hash.
- The dataset viewer uses lightweight embedded thumbnails so you can browse each model subset quickly.
- The release is deliberately simple: only model identity varies, without prompt or text metadata.

## At a glance

| Images | Models | Families | Version |
|---:|---:|---:|---|
| **120,000** | **12** | **3** | **v2** |

## Coverage

All images are `256x256` RGB face images from FFHQ-trained generators. Each subset corresponds to one model family member.

| Model | Family | Checkpoint |
|---|---|---|
| `stylegan2-ffhq-256` | `gan` | `https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/paper-fig7c-training-set-sweeps/ffhq140k-paper256-noaug.pkl` |
| `stylegan3-ffhq-256` | `gan` | `https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-t-ffhqu-256x256.pkl` |
| `r3gan-ffhq-256` | `gan` | `brownvc/R3GAN-FFHQ-256x256` |
| `cips-ffhq-256` | `gan` | `https://drive.google.com/file/d/1JRd4ZpMDmlkbNlxnVvZx77Eyfac53KSq/view?usp=sharing` |
| `ganformer-ffhq-256` | `gan` | `https://drive.google.com/uc?id=1-b0vwevUQs6LI_EybdO8XJ5uYSx63vEa` |
| `styleswin-ffhq-256` | `gan` | `https://drive.google.com/file/d/1OjYZ1zEWGNdiv0RFKv7KhXRmYko72LjO/view` |
| `vqvae-ffhq-256` | `vae` | `kohido/ffhq256_vqvae_mhvq` |
| `nvae-ffhq-256` | `vae` | `https://drive.google.com/uc?id=1lQzywY5O71Z5NqAUJUcWy2Q1K2hPFO6j` |
| `vdvae-ffhq-256` | `vae` | `https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq256-iter-1700000-model-ema.th` |
| `adm-ffhq-256` | `diffusion` | `xutongda/adm_ffhq_256x256` |
| `ldm-ffhq-256` | `diffusion` | `kaayaanil/ldm-ffhq-256` |
| `ncsnpp-ffhq-256` | `diffusion` | `https://drive.google.com/uc?id=1-mtdSwuefIZA0n85QWScQo2WRvJNWwUy` |

## Gallery

![Overview grid](preview/overview-grid.png)

## How to use

Load a viewer subset with `datasets`:

```python
from datasets import load_dataset

ds = load_dataset("kaikaiyao/ffhq-image-attribution", "stylegan2-ffhq-256", split="train")
print(ds[0]["source_id"], ds[0]["seed"])
```

Or read the full metadata table with `pandas`:

```python
import pandas as pd

df = pd.read_parquet(
    "https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution/resolve/main/metadata/all.parquet"
)
print(df[["source_id", "seed", "image_path"]].head())
```

## Metadata

`metadata/all.parquet` is the main table for the release.

- Identity: `source_id`, `family`, `seed`
- Image and integrity: `image_path`, `image_size`, `sha256`
- Release: `release`

Each subset also has:

- `metadata/by_model/<source_id>.parquet`
- `viewer-<source_id>.parquet`

## Uses and limitations

- Intended for research on model attribution, image provenance, model fingerprinting, and generated-face forensics.
- All images are `256x256` and come from FFHQ-trained generators.
- `vqvae-ffhq-256` is reconstruction-based, unlike the other models that directly sample generated images.
- This public release includes `10,000` images per model from the current FFHQ bank snapshot.
- Use of this dataset remains subject to the licenses and terms of the upstream model checkpoints and FFHQ-related resources.

## Citation

If you use this dataset, please cite:

```bibtex
@dataset{yao2026ffhq_image_attribution,
  author = {{Kai Yao}},
  title = {{FFHQ Image Attribution}},
  year = {2026},
  publisher = {{Hugging Face}},
  url = {https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution},
  note = {Version: v2}
}
```
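
The metadata table lists a `sha256` column next to `image_path`, which suggests a local copy of the images can be checked for corruption. The sketch below is one way to do that, under assumptions the README does not confirm: that `sha256` is the lowercase or uppercase hex digest of the raw image file bytes, and that `image_path` is relative to the root of your local copy. The helper names `sha256_of_file` and `find_mismatches` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(rows, root):
    """Return the image_path of every row whose file digest differs.

    rows: iterable of dicts with "image_path" and "sha256" keys
          (assumed to match the columns of metadata/all.parquet).
    root: directory that the image_path values are assumed to be relative to.
    """
    root = Path(root)
    mismatched = []
    for row in rows:
        actual = sha256_of_file(root / row["image_path"])
        if actual != row["sha256"].lower():
            mismatched.append(row["image_path"])
    return mismatched
```

With the `pandas` table loaded as `df`, a call such as `find_mismatches(df.to_dict("records"), "path/to/local/copy")` would return the paths that fail the check; an empty list means every file matched its recorded hash.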

Provider: kaikaiyao