Kinseong/LITORA-bases
Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link: https://hf-mirror.com/datasets/Kinseong/LITORA-bases
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- image-to-image
- text-to-image
tags:
- portrait-relighting
- HDRI
- diffusion
- multimodal
- quality-metadata
- computational-photography
language:
- en
pretty_name: "LITORA: A Large-Scale Open Portrait Relighting Dataset"
size_categories:
- 100K<n<1M
configs:
- config_name: default
  data_files:
  - split: train
    path: metadata.jsonl
---
# LITORA: A Large-Scale Open Portrait Relighting Dataset with AnyLight HDRI-Grounded Synthesis and Multimodal Annotations
**LITORA** is a 140K-pair portrait relighting dataset constructed through fully automated synthesis, combining 70K FFHQ portraits with 2,698 real-world HDRI environment maps — providing two orders of magnitude more subject diversity than any existing open relighting dataset.
> **Paper**: *LITORA: A Large-Scale Open Portrait Relighting Dataset with AnyLight HDRI-Grounded Synthesis and Multimodal Annotations* (ACM MM 2026 Dataset Track)
## Key Features
| Property | Value |
|---|---|
| Total relighting pairs | 140,036 |
| Unique subjects (FFHQ) | 70,000 |
| HDRI environments | 2,698 (AmbientCG + Poly Haven) |
| Resolution | 1024 × 1024 |
| VLM text annotations | 100% |
| Per-sample quality scores | 100% (6 dimensions) |
## Dataset Structure
Data is distributed as **tar shards** for efficient downloading and Hugging Face compatibility (54 shards instead of roughly 281K individual files):
```
LITORA/
├── shards/
│   ├── images-00000.tar ... images-00017.tar   # 18 shards (source portraits)
│   ├── masks-00000.tar                         # 1 shard (foreground masks)
│   ├── relit-00000.tar ... relit-00033.tar     # 34 shards (relit targets)
│   └── backgrounds-00000.tar                   # 1 shard (HDRI viewports)
├── metadata.jsonl
└── README.md
```
### Shard Inventory
| Shard prefix | Count | Per shard | Total | Contents |
|---|---|---|---|---|
| `images-*` | 18 | ~5 GB | ~90 GB | 70K source portraits (FFHQ 1024×1024) |
| `masks-*` | 1 | ~2.5 GB | ~2.5 GB | 70K foreground alpha masks |
| `relit-*` | 34 | ~5 GB | ~167 GB | 140K HDRI-conditioned relit targets |
| `backgrounds-*` | 1 | ~1.2 GB | ~1.2 GB | 1,349 HDRI-derived background viewports |
| **Total** | **54** | | **~260 GB** | |
### Shard ↔ Metadata Path Mapping
`metadata.jsonl` references original directory paths. Map them to shard prefixes as follows:
| Metadata field | Path prefix | Shard prefix |
|---|---|---|
| `image_path` | `images/` | `images-*.tar` |
| `mask_path` | `masks/` | `masks-*.tar` |
| `adv_image_path` | `image_adv_fbc/` | `relit-*.tar` |
| `bg_path` | `bg_env/` | `backgrounds-*.tar` |
Files are sorted by name and split into shards sequentially. Each tar stores files with their **base filename only** (no directory prefix).
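
The mapping above can be written as a small lookup. The sketch below is illustrative only: `SHARD_PREFIX` and `locate` are hypothetical names, not part of any released tooling. It turns a path from `metadata.jsonl` into the shard glob to search and the member name inside the tar. This card does not publish a file-to-shard index, so the shards matching the glob still have to be scanned to find the one holding a given file.

```python
from pathlib import PurePosixPath

# Directory prefix used in metadata.jsonl -> shard filename prefix (from the table above).
SHARD_PREFIX = {
    "images": "images",
    "masks": "masks",
    "image_adv_fbc": "relit",
    "bg_env": "backgrounds",
}


def locate(metadata_path: str) -> tuple:
    """Return (shard glob, tar member name) for a path taken from metadata.jsonl."""
    path = PurePosixPath(metadata_path)
    directory = path.parts[0]
    if directory not in SHARD_PREFIX:
        raise ValueError(f"unknown path prefix: {directory!r}")
    # Tar members carry the base filename only, with no directory prefix.
    return f"shards/{SHARD_PREFIX[directory]}-*.tar", path.name


print(locate("image_adv_fbc/00000__wooden_studio_02_8k__r0.png"))
```

When the shards have already been extracted as in the Quick Start, this lookup is unnecessary because the metadata paths resolve directly on disk.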
### Download Size
The full LITORA release referenced in the paper (~600 GB) comprises three components:
| Component | Size | Included here | Notes |
|---|---|---|---|
| Generated data (relit targets, masks, backgrounds, metadata) | ~171 GB | Yes | Novel data produced by the AnyLight pipeline |
| Source portraits (FFHQ 1024×1024) | ~90 GB | Yes | Bundled for convenience; also available via [FFHQ on HF](https://huggingface.co/datasets/nhkrd/FFHQ) |
| HDRI source maps (AmbientCG + Poly Haven) | ~224 GB | No | CC0-licensed; download from source to avoid duplication |
**This repository: ~260 GB** (generated data + source portraits — everything needed for training).
To reproduce the full pipeline or access the raw HDRI environment maps (~224 GB additional),
download them from [AmbientCG](https://ambientcg.com/) and [Poly Haven](https://polyhaven.com/) directly,
or use the download script in our [pipeline repository](https://github.com/AnyLight-Dataset/LITORA).
## Metadata Schema
Each line in `metadata.jsonl` is a JSON object with the following fields:
```json
{
  "image_path": "images/00000.png",
  "mask_path": "masks/00000.png",
  "adv_image_path": "image_adv_fbc/00000__wooden_studio_02_8k__r0.png",
  "bg_path": "bg_env/wooden_studio_02_8k__r0.png",
  "hdri_id": "wooden_studio_02_8k",
  "hdri_description": "wooden studio 02 8k",
  "env_rotation": [0.157, -0.135],
  "width": 1024,
  "height": 1024,
  "text_foreground": "A baby with short dark hair wearing a ...",
  "text_background": "A dark wooden studio room with large windows ...",
  "transformation_instruction": "Apply a warm key light from the right side ...",
  "quality_metrics": {
    "dino_cosine": 0.933,
    "lpips": 0.134,
    "clip_quality": 0.498,
    "clip_score": null,
    "face_detected_src": true,
    "face_detected_relit": true,
    "landmark_shift": 0.004,
    "dino_face_cosine": 0.766,
    "boundary_gradient_ratio": 2.416,
    "boundary_score": 0.646,
    "highlight_color_sim": null,
    "composite_score": 0.731,
    "retries": 0
  },
  "gen_params": {
    "preset": "balanced",
    "steps": 25,
    "cfg": 2.0,
    "highres_scale": 1.5,
    "highres_denoise": 0.45,
    "highres_steps": 18
  },
  "seed": 42
}
```
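
Some metric fields can be `null`, as `clip_score` and `highlight_color_sim` are in the example above, so code that reads them should handle missing values. A minimal sketch follows; `metric` is a hypothetical helper, and the inline record is an abbreviated copy of the example rather than a full metadata line.

```python
import json
from typing import Optional

# Abbreviated copy of the example record above; real lines carry more fields.
line = (
    '{"image_path": "images/00000.png", "hdri_id": "wooden_studio_02_8k", '
    '"quality_metrics": {"dino_cosine": 0.933, "clip_score": null, '
    '"composite_score": 0.731}}'
)


def metric(record: dict, name: str, default: Optional[float] = None) -> Optional[float]:
    """Read one quality metric, treating JSON null and absent keys alike."""
    value = record.get("quality_metrics", {}).get(name)
    return default if value is None else value


record = json.loads(line)
print(metric(record, "dino_cosine"))      # present: returned as stored
print(metric(record, "clip_score"))       # null in the JSON: falls back to None
print(metric(record, "clip_score", 0.0))  # null in the JSON: falls back to 0.0
```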
## Training Paradigms
LITORA supports two complementary training paradigms:
- **Image-conditioned**: (source portrait, mask, background viewport) → relit target
- **Text-conditioned**: (source portrait, transformation instruction) → relit target
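
The two paradigms draw on different fields of the same metadata record. The sketch below shows one possible way to assemble a training sample for each, assuming the shards have been extracted into the directories used in the Quick Start. The function names and the returned dictionary keys are hypothetical; only the metadata field names come from the schema.

```python
from pathlib import Path


def image_conditioned(record: dict, root: Path) -> dict:
    """(source portrait, mask, background viewport) -> relit target."""
    return {
        "source": root / record["image_path"],
        "mask": root / record["mask_path"],
        "background": root / record["bg_path"],
        "target": root / record["adv_image_path"],
    }


def text_conditioned(record: dict, root: Path) -> dict:
    """(source portrait, transformation instruction) -> relit target."""
    return {
        "source": root / record["image_path"],
        "instruction": record["transformation_instruction"],
        "target": root / record["adv_image_path"],
    }


# Fields copied from the example record in the Metadata Schema section.
record = {
    "image_path": "images/00000.png",
    "mask_path": "masks/00000.png",
    "adv_image_path": "image_adv_fbc/00000__wooden_studio_02_8k__r0.png",
    "bg_path": "bg_env/wooden_studio_02_8k__r0.png",
    "transformation_instruction": "Apply a warm key light from the right side ...",
}
print(image_conditioned(record, Path("LITORA")))
print(text_conditioned(record, Path("LITORA")))
```

Both functions return paths rather than decoded images, so the choice of image loader and any resizing or augmentation stays with the training code.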
## Quality Metrics
All samples include per-sample quality scores across six dimensions:
1. **Identity Preservation**: DINOv3 cosine similarity, LPIPS perceptual distance, masked SSIM
2. **Perceptual Quality**: CLIP zero-shot quality scoring
3. **Text–Image Alignment**: CLIPScore between instruction and relit image
4. **Face Integrity**: MediaPipe landmark shift and face-crop DINOv3 similarity
5. **Boundary Artifacts**: Sobel gradient analysis along mask contour
6. **Lighting Consistency**: Chrominance comparison with HDRI dominant light
Composite score weights: DINO (0.20), CLIP quality (0.20), CLIP score (0.15), boundary (0.15), face (0.15), LPIPS⁻¹ (0.10), SSIM (0.05). Default gate threshold: τ = 0.55.
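
The weighting can be sketched as a weighted mean. This card does not spell out the exact formula, so the code below rests on several assumptions: the face term is taken to be `dino_face_cosine`, LPIPS⁻¹ is taken to be `1 - lpips`, the `ssim` key is hypothetical (the example record has no SSIM field), terms that are `null` or absent are dropped with the remaining weights renormalized, and the gate is read as score ≥ τ. Under these assumptions the example record from the Metadata Schema section evaluates to roughly 0.731, which agrees with its stored `composite_score`, but agreement on a single record does not confirm the formula.

```python
# Weights from the text above. Which metadata key feeds each term is an assumption.
WEIGHTS = {
    "dino": 0.20,
    "clip_quality": 0.20,
    "clip_score": 0.15,
    "boundary": 0.15,
    "face": 0.15,
    "lpips_inv": 0.10,
    "ssim": 0.05,
}
TAU = 0.55  # default gate threshold


def composite(metrics: dict) -> float:
    """Weighted mean over the terms that are present (assumed handling of nulls)."""
    lpips = metrics.get("lpips")
    terms = {
        "dino": metrics.get("dino_cosine"),
        "clip_quality": metrics.get("clip_quality"),
        "clip_score": metrics.get("clip_score"),
        "boundary": metrics.get("boundary_score"),
        "face": metrics.get("dino_face_cosine"),              # assumed face term
        "lpips_inv": None if lpips is None else 1.0 - lpips,  # assumed inversion
        "ssim": metrics.get("ssim"),                          # hypothetical key
    }
    available = {name: value for name, value in terms.items() if value is not None}
    if not available:
        raise ValueError("no usable metrics in record")
    total_weight = sum(WEIGHTS[name] for name in available)
    return sum(WEIGHTS[name] * value for name, value in available.items()) / total_weight


# Quality metrics copied from the example record in the Metadata Schema section.
example = {
    "dino_cosine": 0.933,
    "lpips": 0.134,
    "clip_quality": 0.498,
    "clip_score": None,
    "dino_face_cosine": 0.766,
    "boundary_score": 0.646,
}
score = composite(example)
print(f"{score:.3f}", score >= TAU)
```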
## Quick Start
### Download & Extract
```bash
# Install the HF CLI
pip install -U huggingface_hub
# Download the full dataset (~260 GB)
huggingface-cli download Kinseong/LITORA-bases --repo-type=dataset --local-dir ./LITORA
# Extract all shards into working directories
cd LITORA
mkdir -p images masks image_adv_fbc bg_env
for f in shards/images-*.tar; do tar xf "$f" -C images/; done
for f in shards/masks-*.tar; do tar xf "$f" -C masks/; done
for f in shards/relit-*.tar; do tar xf "$f" -C image_adv_fbc/; done
tar xf shards/backgrounds-00000.tar -C bg_env/
```
### Load Metadata
```python
import json

# Read one JSON object per line; run this from the dataset root (after `cd LITORA`).
with open("metadata.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]
print(f"Total pairs: {len(records):,}")

# Filter by quality threshold
high_quality = [r for r in records if r["quality_metrics"]["composite_score"] > 0.7]
print(f"High-quality pairs (composite > 0.7): {len(high_quality):,}")
```
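
For machines where holding every record in memory is inconvenient, the file can also be filtered line by line. The sketch below uses a hypothetical generator, `iter_gated`, with the default gate threshold τ = 0.55 from the Quality Metrics section. Skipping records whose score is missing and reading the gate as score ≥ τ are assumptions. The two demo records are made up for illustration and stand in for the real file.

```python
import io
import json
from typing import Iterable, Iterator


def iter_gated(lines: Iterable[str], threshold: float = 0.55) -> Iterator[dict]:
    """Yield records whose composite_score meets the threshold, one line at a time."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        score = record.get("quality_metrics", {}).get("composite_score")
        if score is not None and score >= threshold:
            yield record


# Stand-in for open("metadata.jsonl"); these two records are invented for the demo.
demo = io.StringIO(
    '{"image_path": "images/a.png", "quality_metrics": {"composite_score": 0.731}}\n'
    '{"image_path": "images/b.png", "quality_metrics": {"composite_score": 0.41}}\n'
)
kept = list(iter_gated(demo))
print([r["image_path"] for r in kept])
```

Passing an open file handle in place of `demo` streams the real metadata with the same logic.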
### Selective Download
```bash
# Download only metadata (lightweight, ~200 MB)
huggingface-cli download Kinseong/LITORA-bases metadata.jsonl --repo-type=dataset
# Download only relit targets
huggingface-cli download Kinseong/LITORA-bases --repo-type=dataset --include "shards/relit-*"
# Download only source portraits
huggingface-cli download Kinseong/LITORA-bases --repo-type=dataset --include "shards/images-*"
```
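
The same selective download can be scripted from Python with `huggingface_hub.snapshot_download`, which accepts `allow_patterns` for file filtering. This is a sketch rather than an official loader: `PATTERNS`, `patterns_for`, and `download` are hypothetical names, and the patterns simply mirror the shard layout described under Dataset Structure.

```python
REPO_ID = "Kinseong/LITORA-bases"

# Component name -> file patterns inside the repository (hypothetical grouping).
PATTERNS = {
    "metadata": ["metadata.jsonl"],
    "images": ["shards/images-*"],
    "masks": ["shards/masks-*"],
    "relit": ["shards/relit-*"],
    "backgrounds": ["shards/backgrounds-*"],
}


def patterns_for(components: list) -> list:
    """Collect the allow-patterns for the requested components."""
    unknown = [name for name in components if name not in PATTERNS]
    if unknown:
        raise ValueError(f"unknown components: {unknown}")
    return [pattern for name in components for pattern in PATTERNS[name]]


def download(components: list, local_dir: str = "./LITORA") -> str:
    """Download only the selected components and return the local directory path."""
    # Imported here so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=patterns_for(components),
        local_dir=local_dir,
    )


print(patterns_for(["metadata", "masks", "backgrounds"]))
```

Calling `download(["metadata", "masks", "backgrounds"])` would then fetch the metadata and the two small shard groups while skipping the large `images-*` and `relit-*` shards.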
## Citation
```bibtex
@inproceedings{litora2026,
title = {LITORA: A Large-Scale Open Portrait Relighting Dataset
with AnyLight HDRI-Grounded Synthesis and Multimodal Annotations},
author = {Un, Kinseong and Li, Yanfeng and Gao, Qinquan and Deng, Wei
and Tang, Su-Kit and Chan, Ka-Hou and Tan, Tao},
booktitle = {Proceedings of the ACM International Conference on Multimedia (MM)},
year = {2026}
}
```
## License
- **LITORA dataset**: CC-BY-NC-SA 4.0 (compatible with FFHQ license)
- **Source HDRI maps**: CC0 (AmbientCG, Poly Haven)
- **Pipeline code**: See repository license
## Ethical Considerations
This dataset contains human face images derived from FFHQ. Users should:
- Use the dataset for research purposes in accordance with the license
- Be aware of potential biases inherited from the FFHQ source distribution
- Follow responsible AI practices when training models on face data
Provided by: Kinseong



