tglcourse/latent_afhqv2_256px
Source: Hugging Face · Updated: 2022-10-28 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/tglcourse/latent_afhqv2_256px
Resource description:
---
dataset_info:
  features:
  - name: label
    dtype:
      class_label:
        names:
          0: cat
          1: dog
          2: wild
  - name: latent
    sequence:
      sequence:
        sequence: float32
  splits:
  - name: train
    num_bytes: 267449972
    num_examples: 15803
  download_size: 260672854
  dataset_size: 267449972
---
# Dataset Card for "latent_afhqv2_256px"
Each image is cropped to a 256px square and encoded to a 4x32x32 latent representation using the same VAE as the one used by Stable Diffusion.
Decoding
```python
from diffusers import AutoencoderKL
from datasets import load_dataset
from PIL import Image
import torch

# Load the dataset
dataset = load_dataset('tglcourse/latent_afhqv2_256px')

# Load the VAE (requires access - see repo model card for info)
vae = AutoencoderKL.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="vae")

latent = torch.tensor([dataset['train'][0]['latent']])  # To tensor (bs, 4, 32, 32)
latent = (1 / 0.18215) * latent  # Scale to match SD implementation
with torch.no_grad():
    image = vae.decode(latent).sample[0]  # Decode
image = (image / 2 + 0.5).clamp(0, 1)  # To (0, 1)
image = image.detach().cpu().permute(1, 2, 0).numpy()  # To numpy, channels last
image = (image * 255).round().astype("uint8")  # (0, 255) and type uint8
image = Image.fromarray(image)  # To PIL
image  # The resulting PIL image
```
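For reference, the reverse direction (image to latent) might look roughly like the sketch below. It is not the script used to build this dataset: the center crop, the bicubic resize, the use of the posterior mode rather than a random sample, and the helper names `preprocess` and `encode_to_latent` are all assumptions. The only detail taken from the card is the 0.18215 factor, which the decoding snippet divides by.

```python
import numpy as np
import torch
from PIL import Image

SCALE = 0.18215  # the decoding snippet divides by this, so encoding multiplies by it


def preprocess(image, size=256):
    """Center-crop to a square, resize to `size`, and map pixels to [-1, 1].

    Assumption: the card only says "cropped to 256px square"; the exact
    crop and resize used for the dataset are not documented.
    """
    image = image.convert("RGB")
    side = min(image.size)
    left = (image.width - side) // 2
    top = (image.height - side) // 2
    image = image.crop((left, top, left + side, top + side))
    image = image.resize((size, size), Image.BICUBIC)
    array = np.asarray(image, dtype=np.float32) / 255.0  # (H, W, 3) in [0, 1]
    tensor = torch.from_numpy(array).permute(2, 0, 1)     # (3, H, W)
    return (tensor * 2 - 1).unsqueeze(0)                  # (1, 3, H, W) in [-1, 1]


def encode_to_latent(vae, image):
    """Encode a PIL image to a scaled latent of shape (1, 4, 32, 32).

    `vae` is expected to behave like diffusers' AutoencoderKL.
    Assumption: the posterior mode is used; sampling is the other option.
    """
    with torch.no_grad():
        posterior = vae.encode(preprocess(image)).latent_dist
    return posterior.mode() * SCALE
```

With the `vae` object from the decoding snippet, `encode_to_latent(vae, Image.open("cat.jpg"))` would return a tensor in the same layout as the dataset's `latent` feature; the file name here is only a placeholder.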
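Provider:
tglcourse
Summary of original information
Dataset overview
Dataset name
- Name: latent_afhqv2_256px
Dataset features
- Feature 1: label
  - Data type: class label
  - Class names:
    - 0: cat
    - 1: dog
    - 2: wild
- Feature 2: latent
  - Data type: float32 (nested sequence)
Dataset splits
- Split name: train
  - Number of examples: 15803
  - Data size: 267449972 bytes
Dataset size
- Download size: 260672854 bytes
- Total dataset size: 267449972 bytes
Image processing
- Image size: 256px square
- Encoding: 4x32x32 latent representation, produced with the same VAE as Stable Diffusion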
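The schema above can be checked against a single record before any VAE is loaded. The following is a minimal sketch: the helper name `describe_example` is hypothetical, and the label mapping and latent shape are copied from the card. It also computes the raw float32 payload implied by 15803 latents of shape 4x32x32, which comes out somewhat below the listed dataset size.

```python
import numpy as np

# Values copied from the dataset card
LABEL_NAMES = {0: "cat", 1: "dog", 2: "wild"}
LATENT_SHAPE = (4, 32, 32)
NUM_EXAMPLES = 15803


def describe_example(example):
    """Return (class name, latent array) for one record, validating both fields.

    `example` is a dict with "label" and "latent" keys, such as
    dataset["train"][0] after loading the dataset with `datasets`.
    """
    latent = np.asarray(example["latent"], dtype=np.float32)
    if latent.shape != LATENT_SHAPE:
        raise ValueError(f"expected latent shape {LATENT_SHAPE}, got {latent.shape}")
    label = example["label"]
    if label not in LABEL_NAMES:
        raise ValueError(f"unknown label id: {label}")
    return LABEL_NAMES[label], latent


# Raw float32 payload: 4 * 32 * 32 values per example, 4 bytes per value
raw_latent_bytes = NUM_EXAMPLES * int(np.prod(LATENT_SHAPE)) * 4
print(raw_latent_bytes)
```

For example, `describe_example(dataset["train"][0])` would return the class name and a `(4, 32, 32)` float32 array, assuming `dataset` was loaded as in the decoding snippet.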



