kaikaiyao/ffhq-image-attribution
Source: Hugging Face. Updated 2026-04-06; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/kaikaiyao/ffhq-image-attribution
Description:
---
configs:
- config_name: stylegan2-ffhq-256
data_files:
- viewer-stylegan2-ffhq-256.parquet
- config_name: stylegan3-ffhq-256
data_files:
- viewer-stylegan3-ffhq-256.parquet
- config_name: r3gan-ffhq-256
data_files:
- viewer-r3gan-ffhq-256.parquet
- config_name: cips-ffhq-256
data_files:
- viewer-cips-ffhq-256.parquet
- config_name: ganformer-ffhq-256
data_files:
- viewer-ganformer-ffhq-256.parquet
- config_name: styleswin-ffhq-256
data_files:
- viewer-styleswin-ffhq-256.parquet
- config_name: vqvae-ffhq-256
data_files:
- viewer-vqvae-ffhq-256.parquet
- config_name: nvae-ffhq-256
data_files:
- viewer-nvae-ffhq-256.parquet
- config_name: vdvae-ffhq-256
data_files:
- viewer-vdvae-ffhq-256.parquet
- config_name: adm-ffhq-256
data_files:
- viewer-adm-ffhq-256.parquet
- config_name: ldm-ffhq-256
data_files:
- viewer-ldm-ffhq-256.parquet
- config_name: ncsnpp-ffhq-256
data_files:
- viewer-ncsnpp-ffhq-256.parquet
license: other
language:
- en
task_categories:
- image-classification
tags:
- ffhq
- model-attribution
- generated-images
- face-generation
- gan
- vae
- diffusion
- benchmark
---
# FFHQ Image Attribution
A public benchmark for **FFHQ model attribution**, built from twelve face generators spanning GAN, VAE, and diffusion families.

> Version `v2` includes `10,000` images from each of `12` FFHQ-trained generators for a total of `120,000` images.
## Why this dataset
- All twelve generators share the FFHQ image domain, so differences between subsets come from the generator rather than the content, which makes source attribution cleaner and easier to study.
- Public metadata links each image to its source model, family, release, seed, and file integrity hash.
- The dataset viewer uses lightweight embedded thumbnails so you can browse each model subset quickly.
- The release is deliberately simple: only the model identity varies, and there is no prompt or text metadata.
## At a glance
| Images | Models | Families | Version |
|---:|---:|---:|---|
| **120,000** | **12** | **3** | **v2** |
## Coverage
All images are `256x256` RGB face images from FFHQ-trained generators. Each subset corresponds to a single generator from one of the three families.
| Model | Family | Checkpoint |
|---|---|---|
| `stylegan2-ffhq-256` | `gan` | `https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/paper-fig7c-training-set-sweeps/ffhq140k-paper256-noaug.pkl` |
| `stylegan3-ffhq-256` | `gan` | `https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-t-ffhqu-256x256.pkl` |
| `r3gan-ffhq-256` | `gan` | `brownvc/R3GAN-FFHQ-256x256` |
| `cips-ffhq-256` | `gan` | `https://drive.google.com/file/d/1JRd4ZpMDmlkbNlxnVvZx77Eyfac53KSq/view?usp=sharing` |
| `ganformer-ffhq-256` | `gan` | `https://drive.google.com/uc?id=1-b0vwevUQs6LI_EybdO8XJ5uYSx63vEa` |
| `styleswin-ffhq-256` | `gan` | `https://drive.google.com/file/d/1OjYZ1zEWGNdiv0RFKv7KhXRmYko72LjO/view` |
| `vqvae-ffhq-256` | `vae` | `kohido/ffhq256_vqvae_mhvq` |
| `nvae-ffhq-256` | `vae` | `https://drive.google.com/uc?id=1lQzywY5O71Z5NqAUJUcWy2Q1K2hPFO6j` |
| `vdvae-ffhq-256` | `vae` | `https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq256-iter-1700000-model-ema.th` |
| `adm-ffhq-256` | `diffusion` | `xutongda/adm_ffhq_256x256` |
| `ldm-ffhq-256` | `diffusion` | `kaayaanil/ldm-ffhq-256` |
| `ncsnpp-ffhq-256` | `diffusion` | `https://drive.google.com/uc?id=1-mtdSwuefIZA0n85QWScQo2WRvJNWwUy` |
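
For classification experiments, the table above can be turned into label maps. The sketch below copies the twelve subset names and their families from the table; the helper names (`MODEL_FAMILY`, `model_label`, `family_label`) and the alphabetical label ordering are illustrative assumptions, not something the dataset defines.

```python
# Model -> family mapping copied from the coverage table.
MODEL_FAMILY = {
    "stylegan2-ffhq-256": "gan",
    "stylegan3-ffhq-256": "gan",
    "r3gan-ffhq-256": "gan",
    "cips-ffhq-256": "gan",
    "ganformer-ffhq-256": "gan",
    "styleswin-ffhq-256": "gan",
    "vqvae-ffhq-256": "vae",
    "nvae-ffhq-256": "vae",
    "vdvae-ffhq-256": "vae",
    "adm-ffhq-256": "diffusion",
    "ldm-ffhq-256": "diffusion",
    "ncsnpp-ffhq-256": "diffusion",
}

# Hypothetical label encoding: sorted order keeps indices stable across runs.
MODELS = sorted(MODEL_FAMILY)
FAMILIES = sorted(set(MODEL_FAMILY.values()))


def model_label(source_id: str) -> int:
    """Return the 12-way class index for a subset name."""
    return MODELS.index(source_id)


def family_label(source_id: str) -> int:
    """Return the 3-way family index for a subset name."""
    return FAMILIES.index(MODEL_FAMILY[source_id])


print(len(MODELS), FAMILIES)
```

The same mapping supports both a 12-way model attribution task and a coarser 3-way family attribution task.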

## How to use
Load a viewer subset with `datasets`:
```python
from datasets import load_dataset

# Each config name selects one generator subset.
ds = load_dataset("kaikaiyao/ffhq-image-attribution", "stylegan2-ffhq-256", split="train")
print(ds[0]["source_id"], ds[0]["seed"])
```
Or read the full metadata table with `pandas`:
```python
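import pandas as pd

# Read the release-wide metadata table directly from the Hub.
df = pd.read_parquet(
    "https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution/resolve/main/metadata/all.parquet"
)
# Preview the identity and path columns.
print(df[["source_id", "seed", "image_path"]].head())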
```
## Metadata
`metadata/all.parquet` is the main table for the release.
- Identity: `source_id`, `family`, `seed`
- Image and integrity: `image_path`, `image_size`, `sha256`
- Release: `release`
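
The `sha256` column can be used to check a downloaded image against the metadata. The following is a minimal sketch, assuming `sha256` stores the hex digest of the raw image file bytes (the card does not spell this out); `file_sha256` and `verify_image` are hypothetical helper names.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, expected_sha256):
    """Return True if the file's digest matches the metadata value (case-insensitive)."""
    return file_sha256(path) == expected_sha256.strip().lower()
```

A mismatch would indicate a corrupted or modified download, or that the assumption about what was hashed does not hold.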
Each subset also has:
- `metadata/by_model/<source_id>.parquet`
- `viewer-<source_id>.parquet`
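
The file layout above can be turned into download URLs. This sketch assumes the per-model and viewer files are served under the same `resolve/main` prefix as `metadata/all.parquet`, and that viewer files sit at the repository root as the config `data_files` entries suggest; the function names are hypothetical.

```python
REPO_ID = "kaikaiyao/ffhq-image-attribution"
# Prefix taken from the metadata/all.parquet URL used in the pandas example.
BASE_URL = f"https://huggingface.co/datasets/{REPO_ID}/resolve/main"


def all_metadata_url() -> str:
    """URL of the release-wide metadata table."""
    return f"{BASE_URL}/metadata/all.parquet"


def model_metadata_url(source_id: str) -> str:
    """URL of the per-model metadata table (assumed to share the same prefix)."""
    return f"{BASE_URL}/metadata/by_model/{source_id}.parquet"


def viewer_url(source_id: str) -> str:
    """URL of the per-model viewer file (assumed to live at the repo root)."""
    return f"{BASE_URL}/viewer-{source_id}.parquet"


print(model_metadata_url("adm-ffhq-256"))
```

Any of these URLs can be passed to `pd.read_parquet` in the same way as the full metadata table.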
## Uses and limitations
- Intended for research on model attribution, image provenance, model fingerprinting, and generated-face forensics.
- All images are `256x256` and come from FFHQ-trained generators.
- `vqvae-ffhq-256` is reconstruction-based, unlike the other models, which directly sample generated images.
- This public release includes `10,000` images per model from the current FFHQ bank snapshot.
- Use of this dataset remains subject to the licenses and terms of the upstream model checkpoints and FFHQ-related resources.
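
One way to act on the `vqvae-ffhq-256` caveat is to make its exclusion optional and to split each model's rows separately so the classes stay balanced. The sketch below works on plain dictionaries carrying the documented `source_id` and `seed` fields; the function name, the sort-by-seed rule, and the default 20% test fraction are assumptions for illustration, not a split the dataset prescribes.

```python
from collections import defaultdict

# Per the limitation noted above, this subset is reconstruction-based.
RECONSTRUCTION_BASED = {"vqvae-ffhq-256"}


def split_by_model(rows, test_fraction=0.2, drop_reconstruction=False):
    """Split rows into train/test per model, holding out the highest seeds as test."""
    by_model = defaultdict(list)
    for row in rows:
        if drop_reconstruction and row["source_id"] in RECONSTRUCTION_BASED:
            continue
        by_model[row["source_id"]].append(row)

    train, test = [], []
    for source_id in sorted(by_model):
        # Sorting by seed makes the split deterministic (assumes seeds are comparable).
        group = sorted(by_model[source_id], key=lambda r: r["seed"])
        n_test = int(len(group) * test_fraction)
        cut = len(group) - n_test
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test
```

Rows from `metadata/all.parquet` could be fed in via `df.to_dict("records")`.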
## Citation
If you use this dataset, please cite:
```bibtex
@dataset{yao2026ffhq_image_attribution,
author = {{Kai Yao}},
title = {{FFHQ Image Attribution}},
year = {2026},
publisher = {{Hugging Face}},
url = {https://huggingface.co/datasets/kaikaiyao/ffhq-image-attribution},
note = {Version: v2}
}
```
Provided by: kaikaiyao



