tglcourse/latent_afhqv2_256px

Name: tglcourse/latent_afhqv2_256px
Creator: tglcourse
Published: 2022-10-28 11:51:36
License: No description available

Source: Hugging Face. Updated: 2022-10-28. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/tglcourse/latent_afhqv2_256px

Resource description:

---
dataset_info:
  features:
  - name: label
    dtype:
      class_label:
        names:
          0: cat
          1: dog
          2: wild
  - name: latent
    sequence:
      sequence:
        sequence: float32
  splits:
  - name: train
    num_bytes: 267449972
    num_examples: 15803
  download_size: 260672854
  dataset_size: 267449972
---

# Dataset Card for "latent_afhqv2_256px"

Each image is cropped to 256px square and encoded to a 4x32x32 latent representation using the same VAE as that employed by Stable Diffusion.

## Decoding

```python
from diffusers import AutoencoderKL
from datasets import load_dataset
from PIL import Image
import numpy as np
import torch

# Load the dataset
dataset = load_dataset('tglcourse/latent_afhqv2_256px')

# Load the VAE (requires access - see repo model card for info)
vae = AutoencoderKL.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="vae")

latent = torch.tensor([dataset['train'][0]['latent']])  # To tensor (bs, 4, 32, 32)
latent = (1 / 0.18215) * latent  # Scale to match SD implementation
with torch.no_grad():
    image = vae.decode(latent).sample[0]  # Decode
image = (image / 2 + 0.5).clamp(0, 1)  # To (0, 1)
image = image.detach().cpu().permute(1, 2, 0).numpy()  # To numpy, channels last
image = (image * 255).round().astype("uint8")  # (0, 255) and type uint8
image = Image.fromarray(image)  # To PIL image
# `image` is now the resulting PIL image
```
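
The final steps of the decoding snippet map decoder output, nominally in [-1, 1], to 8-bit pixel values. As a minimal sketch of that per-value mapping in plain Python (the helper name `to_uint8` is hypothetical and is not part of the dataset card or any library):

```python
def to_uint8(value: float) -> int:
    """Map one decoder output value (nominally in [-1, 1]) to a pixel in 0..255."""
    scaled = value / 2 + 0.5               # shift [-1, 1] to [0, 1]
    clamped = min(max(scaled, 0.0), 1.0)   # same effect as .clamp(0, 1)
    # Python's round() sends exact .5 ties to the even integer, as NumPy's does
    return round(clamped * 255)


print([to_uint8(v) for v in (-1.0, 0.0, 1.0, 1.7)])
```

Values outside [-1, 1], such as 1.7 here, are clipped rather than wrapped, which is why the clamp comes before the conversion to `uint8`.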

Provider:

tglcourse

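Summary of original information

Dataset overview

Dataset name

Name: latent_afhqv2_256px

Dataset features

Feature 1: label
- Data type: class label
- Class names:
  - 0: cat
  - 1: dog
  - 2: wild
Feature 2: latent
- Data type: three-level nested sequence of float32 (the 4x32x32 latent)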

Dataset splits

Split name: train
- Number of examples: 15803
- Data size: 267449972 bytes

Dataset size

Download size: 260672854 bytes
Total data size: 267449972 bytes
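
As a rough sanity check on these figures, the sketch below compares the reported dataset size with the raw float32 payload of 15803 latents of shape 4x32x32. The constant names are illustrative, and attributing the difference to labels and storage overhead is an assumption, not something the card states.

```python
NUM_EXAMPLES = 15803              # from the train split
DATASET_SIZE_BYTES = 267449972    # reported total data size
LATENT_SHAPE = (4, 32, 32)        # per the dataset card
BYTES_PER_FLOAT32 = 4

# Raw payload of one latent: 4 * 32 * 32 float32 values
values_per_latent = LATENT_SHAPE[0] * LATENT_SHAPE[1] * LATENT_SHAPE[2]
raw_bytes_per_latent = values_per_latent * BYTES_PER_FLOAT32
raw_total_bytes = raw_bytes_per_latent * NUM_EXAMPLES

# Reported size per example, and what is left after the raw latent payload
bytes_per_example, remainder = divmod(DATASET_SIZE_BYTES, NUM_EXAMPLES)
extra_per_example = bytes_per_example - raw_bytes_per_latent

print(raw_bytes_per_latent, raw_total_bytes)
print(bytes_per_example, remainder, extra_per_example)
```

The reported size divides evenly by the example count, so every row occupies the same number of bytes, most of which is the latent itself.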

Image processing

Image size: 256px square
Encoding: each image is encoded to a 4x32x32 latent representation using the same VAE as Stable Diffusion
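
The 4x32x32 shape follows from the image size if the VAE downsamples each spatial dimension by 8 and produces 4 latent channels. Both values are assumptions here, chosen to be consistent with the 256px to 32 mapping described above. A minimal sketch, with the hypothetical helper `latent_shape`:

```python
def latent_shape(image_size: int, downsample: int = 8, channels: int = 4) -> tuple:
    """Latent shape for a square image, assuming a fixed spatial downsampling factor."""
    if image_size % downsample != 0:
        raise ValueError("image_size must be a multiple of the downsampling factor")
    side = image_size // downsample
    return (channels, side, side)


shape = latent_shape(256)
rgb_values = 3 * 256 * 256                      # values in one RGB image
latent_values = shape[0] * shape[1] * shape[2]  # values in one latent
compression = rgb_values / latent_values

print(shape, compression)
```

Under these assumptions each latent holds 48 times fewer values than the RGB image it encodes.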
