Birchlabs/wds-dataset-viewer-test
Dataset overview
Dataset name
OpenAI guided-diffusion 256px class-conditional unguided samples (20 samples)
Dataset size
n<1K
License
apache-2.0
Dataset reading examples
Reading from WebDataset
    from webdataset import WebDataset
    from typing import TypedDict, Iterable
    from PIL import Image
    from PIL.PngImagePlugin import PngImageFile
    from io import BytesIO
    from os import makedirs

    # Each sample carries WebDataset's bookkeeping keys plus the raw PNG bytes.
    Example = TypedDict('Example', {
        '__key__': str,
        '__url__': str,
        'img.png': bytes,
    })

    # Brace expansion selects the shards 00000.tar and 00001.tar.
    dataset = WebDataset('./wds-dataset-viewer-test/{00000..00001}.tar')

    out_root = 'out'
    makedirs(out_root, exist_ok=True)

    it: Iterable[Example] = iter(dataset)
    for ix, item in enumerate(it):
        with BytesIO(item['img.png']) as stream:
            img: PngImageFile = Image.open(stream)
            # Force a full decode while the stream is still open.
            img.load()
            img.save(f'{out_root}/{ix}.png')
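The `img.png` key in the example above comes from how WebDataset groups the files inside a tar shard: files that share a name up to the first dot form one sample, and the remainder of the name becomes the field key. The following is a minimal standard-library sketch of that grouping, not WebDataset's actual implementation. The member names `00000.img.png` and `00001.img.png` and the helpers `group_tar_samples` and `make_demo_shard` are hypothetical, chosen only for illustration; the real file names inside this dataset's shards are not stated here.

```python
import io
import tarfile
from collections import defaultdict


def group_tar_samples(tar_bytes: bytes) -> dict[str, dict[str, bytes]]:
    """Group the files of one tar shard into samples keyed by name prefix."""
    samples: dict[str, dict[str, bytes]] = defaultdict(dict)
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Split the file name at its first dot:
            # "00003.img.png" -> key "00003", field "img.png".
            directory, _, filename = member.name.rpartition("/")
            stem, _, extension = filename.partition(".")
            key = f"{directory}/{stem}" if directory else stem
            samples[key][extension] = tar.extractfile(member).read()
    return dict(samples)


def make_demo_shard() -> bytes:
    """Build a tiny in-memory shard with two hypothetical samples."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, payload in [
            ("00000.img.png", b"fake-png-0"),
            ("00001.img.png", b"fake-png-1"),
        ]:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    shard = make_demo_shard()
    for key, fields in group_tar_samples(shard).items():
        print(key, sorted(fields))
```

The payloads here are placeholder bytes rather than real PNG data, so the sketch shows only the grouping step and not image decoding.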
Reading from an HF dataset
    from datasets import load_dataset
    from datasets.dataset_dict import DatasetDict
    from datasets.arrow_dataset import Dataset
    from PIL.PngImagePlugin import PngImageFile
    from typing import TypedDict, Iterable
    from os import makedirs

    # Shape of one row; the img column arrives already decoded as a PIL image.
    class Item(TypedDict):
        index: int
        tar: str
        tar_path: str
        img: PngImageFile

    dataset: DatasetDict = load_dataset('Birchlabs/wds-dataset-viewer-test')
    train: Dataset = dataset['train']

    out_root = 'out'
    makedirs(out_root, exist_ok=True)

    it: Iterable[Item] = iter(train)
    for item in it:
        item['img'].save(f'{out_root}/{item["index"]}.png')



