quchenyuan/text-to-art-database
Source: Hugging Face · Updated 2026-03-27 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/quchenyuan/text-to-art-database
Resource description:
---
pretty_name: Vieutopia T2A Privacy Train v1
language:
- en
license: other
size_categories:
- 100K<n<1M
task_categories:
- text-to-image
task_ids:
- text-to-image-generation
annotations_creators:
- machine-generated
source_datasets:
- original
tags:
- text-to-image
- image-generation
- diffusion
- privacy-preserving
- synthetic
configs:
- config_name: samples
  data_files:
  - split: train
    path: data/samples/train/*.parquet
  - split: validation
    path: data/samples/validation/*.parquet
  - split: test
    path: data/samples/test/*.parquet
- config_name: iterations
  data_files:
  - split: train
    path: data/iterations/train/*.parquet
  - split: validation
    path: data/iterations/validation/*.parquet
  - split: test
    path: data/iterations/test/*.parquet
---
# Vieutopia T2A Privacy Train v1
## Dataset Summary
Privacy-safe text-to-image dataset repacked into Parquet shards with embedded image bytes.
- Scope: text-to-image outputs only
- Excluded: image-to-image pipelines (`pix2pix_*`, `pst_*`)
- Privacy: no raw task UUIDs, no user/device fields
- Storage format: Parquet shards with `image` stored as binary bytes; no `image_path` dependency
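
Because `image` holds encoded bytes rather than a file path, a consumer may want to identify the encoding before writing a row to disk. This card does not state which image format is embedded, so the following is only a sketch that checks the standard PNG, JPEG, and WebP file signatures. `sniff_image_format` is a hypothetical helper, not part of the dataset tooling.

```python
from typing import Optional

# Standard file signatures (magic bytes) for common image containers.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SOI = b"\xff\xd8\xff"


def sniff_image_format(data: bytes) -> Optional[str]:
    """Guess the image container from its leading bytes.

    Returns "png", "jpeg", "webp", or None when the signature is not recognized.
    Hypothetical helper: the dataset card does not document the stored format.
    """
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SOI):
        return "jpeg"
    # WebP is a RIFF container whose form type at offset 8 is "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

The result can be used to pick a file extension when exporting the raw bytes of a row, before any cast to a decoded image type.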
## Splits
### samples
- train: 117572
- validation: 6532
- test: 6532
- total: 130636
### iterations
- train: 477440
- validation: 52256
- test: 52256
- total: 581952
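
The totals above can be re-derived from the per-split counts, which also gives the share of each split. The sketch below uses only the numbers listed in this card; `SPLITS` and `split_summary` are illustrative names.

```python
# Per-split row counts copied from the "Splits" section of this card.
SPLITS = {
    "samples": {"train": 117572, "validation": 6532, "test": 6532},
    "iterations": {"train": 477440, "validation": 52256, "test": 52256},
}


def split_summary(counts):
    """Return (total rows, {split name: fraction of total}) for one config."""
    total = sum(counts.values())
    fractions = {name: n / total for name, n in counts.items()}
    return total, fractions


for config, counts in SPLITS.items():
    total, fractions = split_summary(counts)
    shares = ", ".join(f"{name}={frac:.1%}" for name, frac in fractions.items())
    print(f"{config}: total={total} ({shares})")
```

With these counts the `samples` config is split roughly 90/5/5 and the `iterations` config roughly 82/9/9 across train, validation, and test.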
## Schema
### samples config fields
- `sample_id` (string)
- `task_id_hash` (string)
- `prompt` (string)
- `negative_prompt` (string)
- `width` (int32)
- `height` (int32)
- `has_iteration` (bool)
- `iteration_count` (int32)
- `file_sha256` (string)
- `split` (string)
- `created_date` (string)
- `image` (binary image bytes)
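
The `file_sha256` field suggests that each image can be integrity-checked after download. The card does not define what the hash covers, so the sketch below assumes it is the hexadecimal SHA-256 digest of the embedded `image` bytes. If it was computed over a different file (for example, before repacking), the check will report mismatches. `matches_file_sha256` is a hypothetical helper.

```python
import hashlib


def matches_file_sha256(row: dict) -> bool:
    """Check one `samples` row against its recorded hash.

    ASSUMPTION (not stated in the card): `file_sha256` is the hex SHA-256
    digest of the raw bytes stored in `image`.
    """
    digest = hashlib.sha256(row["image"]).hexdigest()
    # Compare case-insensitively in case the stored digest is uppercase.
    return digest == row["file_sha256"].lower()


# Toy row standing in for a real record; real rows carry encoded image bytes.
example = {"image": b"abc", "file_sha256": hashlib.sha256(b"abc").hexdigest()}
print(matches_file_sha256(example))
```

Run the check on rows whose `image` column still holds raw bytes, that is, before casting the column to a decoded image type.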
### iterations config fields
- `sample_id` (string)
- `frame_idx` (int32)
- `split` (string)
- `image` (binary image bytes)
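
Both configs carry `sample_id`, which suggests that iteration frames can be attached to their parent sample. The sketch below assumes that `sample_id` is the join key and that `frame_idx` gives the order of frames within a sample; the card does not spell out either point. `group_frames` is an illustrative helper that works on any iterable of row dicts.

```python
from collections import defaultdict


def group_frames(iteration_rows):
    """Group iteration rows by `sample_id`, each group sorted by `frame_idx`.

    ASSUMPTION: `sample_id` links the two configs and `frame_idx` orders
    the frames of one sample.
    """
    frames = defaultdict(list)
    for row in iteration_rows:
        frames[row["sample_id"]].append(row)
    return {
        sample_id: sorted(rows, key=lambda r: r["frame_idx"])
        for sample_id, rows in frames.items()
    }


# Toy rows standing in for records from the `iterations` config.
rows = [
    {"sample_id": "a", "frame_idx": 1, "image": b"..."},
    {"sample_id": "b", "frame_idx": 0, "image": b"..."},
    {"sample_id": "a", "frame_idx": 0, "image": b"..."},
]
grouped = group_frames(rows)
print({sid: [r["frame_idx"] for r in frames] for sid, frames in grouped.items()})
```

Holding every row in memory is only practical for small subsets. For a full split, a columnar join or a filter on `sample_id` is likely a better fit.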
## Loading Example
```python
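from datasets import Image, load_dataset

# Load the per-sample config; the result is a DatasetDict with train/validation/test splits.
samples = load_dataset("quchenyuan/text-to-art-database", "samples")
samples = samples.cast_column("image", Image())  # decode raw bytes to PIL images on access

# Load the per-frame iterations config and decode its image column the same way.
iterations = load_dataset("quchenyuan/text-to-art-database", "iterations")
iterations = iterations.cast_column("image", Image())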
```
Provider:
quchenyuan



