Kinseong/LITORA-bases

Name: Kinseong/LITORA-bases
Creator: Kinseong
Published: 2026-04-07 16:39:48
License: CC-BY-NC-SA 4.0

Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/Kinseong/LITORA-bases

Description:

---
license: cc-by-nc-sa-4.0
task_categories:
- image-to-image
- text-to-image
tags:
- portrait-relighting
- HDRI
- diffusion
- multimodal
- quality-metadata
- computational-photography
language:
- en
pretty_name: "LITORA: A Large-Scale Open Portrait Relighting Dataset"
size_categories:
- 100K<n<1M
configs:
- config_name: default
  data_files:
  - split: train
    path: metadata.jsonl
---

# LITORA: A Large-Scale Open Portrait Relighting Dataset with AnyLight HDRI-Grounded Synthesis and Multimodal Annotations

**LITORA** is a 140K-pair portrait relighting dataset constructed through fully automated synthesis, combining 70K FFHQ portraits with 2,698 real-world HDRI environment maps, providing two orders of magnitude more subject diversity than any existing open relighting dataset.

> **Paper**: *LITORA: A Large-Scale Open Portrait Relighting Dataset with AnyLight HDRI-Grounded Synthesis and Multimodal Annotations* (ACM MM 2026 Dataset Track)

## Key Features

| Property | Value |
|---|---|
| Total relighting pairs | 140,036 |
| Unique subjects (FFHQ) | 70,000 |
| HDRI environments | 2,698 (AmbientCG + Poly Haven) |
| Resolution | 1024 × 1024 |
| VLM text annotations | 100% |
| Per-sample quality scores | 100% (6 dimensions) |

## Dataset Structure

Data is distributed as **tar shards** for efficient downloading and HuggingFace compatibility (54 shards instead of 281K individual files):

```
LITORA/
├── shards/
│   ├── images-00000.tar ... images-00017.tar    # 18 shards (source portraits)
│   ├── masks-00000.tar                          # 1 shard (foreground masks)
│   ├── relit-00000.tar ... relit-00033.tar      # 34 shards (relit targets)
│   └── backgrounds-00000.tar                    # 1 shard (HDRI viewports)
├── metadata.jsonl
└── README.md
```

### Shard Inventory

| Shard prefix | Count | Per shard | Total | Contents |
|---|---|---|---|---|
| `images-*` | 18 | ~5 GB | ~90 GB | 70K source portraits (FFHQ 1024×1024) |
| `masks-*` | 1 | ~2.5 GB | ~2.5 GB | 70K foreground alpha masks |
| `relit-*` | 34 | ~5 GB | ~167 GB | 140K HDRI-conditioned relit targets |
| `backgrounds-*` | 1 | ~1.2 GB | ~1.2 GB | 1,349 HDRI-derived background viewports |
| **Total** | **54** | | **~260 GB** | |

### Shard ↔ Metadata Path Mapping

`metadata.jsonl` references original directory paths. Map them to shard prefixes as follows:

| Metadata field | Path prefix | Shard prefix |
|---|---|---|
| `image_path` | `images/` | `images-*.tar` |
| `mask_path` | `masks/` | `masks-*.tar` |
| `adv_image_path` | `image_adv_fbc/` | `relit-*.tar` |
| `bg_path` | `bg_env/` | `backgrounds-*.tar` |

Files are sorted by name and split into shards sequentially. Each tar stores files with their **base filename only** (no directory prefix).

### Download Size

The full LITORA release referenced in the paper (~600 GB) comprises three components:

| Component | Size | Included here | Notes |
|---|---|---|---|
| Generated data (relit targets, masks, backgrounds, metadata) | ~171 GB | Yes | Novel data produced by the AnyLight pipeline |
| Source portraits (FFHQ 1024×1024) | ~90 GB | Yes | Bundled for convenience; also available via [FFHQ on HF](https://huggingface.co/datasets/nhkrd/FFHQ) |
| HDRI source maps (AmbientCG + Poly Haven) | ~224 GB | No | CC0-licensed; download from source to avoid duplication |

**This repository: ~260 GB** (generated data + source portraits, which is everything needed for training).
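
The path mapping table above can be sketched in a few lines of Python. This is only an illustration: the helper name `shard_member` and the dictionary `PREFIX_TO_SHARD` are hypothetical and not part of the dataset tooling. It relies on two facts stated above: each metadata directory prefix corresponds to one shard prefix, and tar members are stored under their base filename.

```python
from pathlib import PurePosixPath

# Directory prefix used in metadata.jsonl -> shard prefix under shards/.
# Values come from the mapping table; the names here are hypothetical.
PREFIX_TO_SHARD = {
    "images": "images",
    "masks": "masks",
    "image_adv_fbc": "relit",
    "bg_env": "backgrounds",
}


def shard_member(metadata_path: str) -> tuple[str, str]:
    """Return (shard glob, tar member name) for a path from metadata.jsonl."""
    path = PurePosixPath(metadata_path)
    shard_prefix = PREFIX_TO_SHARD[path.parts[0]]
    # Tars store base filenames only, so the directory prefix is dropped.
    return f"shards/{shard_prefix}-*.tar", path.name


print(shard_member("image_adv_fbc/00000__wooden_studio_02_8k__r0.png"))
# → ('shards/relit-*.tar', '00000__wooden_studio_02_8k__r0.png')
```

The sketch returns a glob rather than a specific shard because which shard holds a given file depends on the sorted, sequential split, which is not recoverable from the path alone.
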
To reproduce the full pipeline or access the raw HDRI environment maps (~224 GB additional), download them from [AmbientCG](https://ambientcg.com/) and [Poly Haven](https://polyhaven.com/) directly, or use the download script in our [pipeline repository](https://github.com/AnyLight-Dataset/LITORA).

## Metadata Schema

Each line in `metadata.jsonl` is a JSON object with the following fields:

```json
{
  "image_path": "images/00000.png",
  "mask_path": "masks/00000.png",
  "adv_image_path": "image_adv_fbc/00000__wooden_studio_02_8k__r0.png",
  "bg_path": "bg_env/wooden_studio_02_8k__r0.png",
  "hdri_id": "wooden_studio_02_8k",
  "hdri_description": "wooden studio 02 8k",
  "env_rotation": [0.157, -0.135],
  "width": 1024,
  "height": 1024,
  "text_foreground": "A baby with short dark hair wearing a ...",
  "text_background": "A dark wooden studio room with large windows ...",
  "transformation_instruction": "Apply a warm key light from the right side ...",
  "quality_metrics": {
    "dino_cosine": 0.933,
    "lpips": 0.134,
    "clip_quality": 0.498,
    "clip_score": null,
    "face_detected_src": true,
    "face_detected_relit": true,
    "landmark_shift": 0.004,
    "dino_face_cosine": 0.766,
    "boundary_gradient_ratio": 2.416,
    "boundary_score": 0.646,
    "highlight_color_sim": null,
    "composite_score": 0.731,
    "retries": 0
  },
  "gen_params": {
    "preset": "balanced",
    "steps": 25,
    "cfg": 2.0,
    "highres_scale": 1.5,
    "highres_denoise": 0.45,
    "highres_steps": 18
  },
  "seed": 42
}
```

## Training Paradigms

LITORA supports two complementary training paradigms:

- **Image-conditioned**: (source portrait, mask, background viewport) → relit target
- **Text-conditioned**: (source portrait, transformation instruction) → relit target

## Quality Metrics

All samples include per-sample quality scores across six dimensions:

1. **Identity Preservation**: DINOv3 cosine similarity, LPIPS perceptual distance, masked SSIM
2. **Perceptual Quality**: CLIP zero-shot quality scoring
3. **Text–Image Alignment**: CLIPScore between instruction and relit image
4. **Face Integrity**: MediaPipe landmark shift and face-crop DINOv3 similarity
5. **Boundary Artifacts**: Sobel gradient analysis along mask contour
6. **Lighting Consistency**: Chrominance comparison with HDRI dominant light

Composite score weights: DINO (0.20), CLIP quality (0.20), CLIP score (0.15), boundary (0.15), face (0.15), LPIPS⁻¹ (0.10), SSIM (0.05). Default gate threshold: τ = 0.55.

## Quick Start

### Download & Extract

```bash
# Install the HF CLI
pip install -U huggingface_hub

# Download the full dataset (~260 GB)
huggingface-cli download Kinseong/LITORA-bases --repo-type=dataset --local-dir ./LITORA

# Extract all shards into working directories
cd LITORA
mkdir -p images masks image_adv_fbc bg_env
for f in shards/images-*.tar; do tar xf "$f" -C images/; done
for f in shards/masks-*.tar; do tar xf "$f" -C masks/; done
for f in shards/relit-*.tar; do tar xf "$f" -C image_adv_fbc/; done
tar xf shards/backgrounds-00000.tar -C bg_env/
```

### Load Metadata

```python
import json
from pathlib import Path

with open("metadata.jsonl") as f:
    records = [json.loads(line) for line in f]

print(f"Total pairs: {len(records):,}")

# Filter by quality threshold
high_quality = [r for r in records if r["quality_metrics"]["composite_score"] > 0.7]
print(f"High-quality pairs (composite > 0.7): {len(high_quality):,}")
```

### Selective Download

```bash
# Download only metadata (lightweight, ~200 MB)
huggingface-cli download Kinseong/LITORA-bases metadata.jsonl --repo-type=dataset

# Download only relit targets
huggingface-cli download Kinseong/LITORA-bases --repo-type=dataset --include "shards/relit-*"

# Download only source portraits
huggingface-cli download Kinseong/LITORA-bases --repo-type=dataset --include "shards/images-*"
```

## Citation

```bibtex
@inproceedings{litora2026,
  title     = {LITORA: A Large-Scale Open Portrait Relighting Dataset with AnyLight HDRI-Grounded Synthesis and Multimodal Annotations},
  author    = {Un, Kinseong and Li, Yanfeng and Gao, Qinquan and Deng, Wei and Tang, Su-Kit and Chan, Ka-Hou and Tan, Tao},
  booktitle = {Proceedings of the ACM International Conference on Multimedia (MM)},
  year      = {2026}
}
```

## License

- **LITORA dataset**: CC-BY-NC-SA 4.0 (compatible with FFHQ license)
- **Source HDRI maps**: CC0 (AmbientCG, Poly Haven)
- **Pipeline code**: See repository license

## Ethical Considerations

This dataset contains human face images derived from FFHQ. Users should:

- Use the dataset for research purposes in accordance with the license
- Be aware of potential biases inherited from the FFHQ source distribution
- Follow responsible AI practices when training models on face data
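
As a rough illustration of the two training paradigms and the gate threshold described above, the sketch below turns one metadata record into (inputs, target) tuples. The function names, the `root` argument, and the trimmed sample record are hypothetical. Applying the gate as `composite_score >= tau` is an assumption, since the card gives only the default value τ = 0.55 and not how the comparison is made.

```python
from pathlib import Path

# Trimmed copy of the schema example; real records carry more fields.
record = {
    "image_path": "images/00000.png",
    "mask_path": "masks/00000.png",
    "adv_image_path": "image_adv_fbc/00000__wooden_studio_02_8k__r0.png",
    "bg_path": "bg_env/wooden_studio_02_8k__r0.png",
    "transformation_instruction": "Apply a warm key light from the right side ...",
    "quality_metrics": {"composite_score": 0.731},
}


def image_conditioned(rec, root):
    """(source portrait, mask, background viewport) -> relit target."""
    root = Path(root)
    inputs = (root / rec["image_path"], root / rec["mask_path"], root / rec["bg_path"])
    return inputs, root / rec["adv_image_path"]


def text_conditioned(rec, root):
    """(source portrait, transformation instruction) -> relit target."""
    root = Path(root)
    inputs = (root / rec["image_path"], rec["transformation_instruction"])
    return inputs, root / rec["adv_image_path"]


def passes_gate(rec, tau=0.55):
    # Assumption: a sample passes when its composite score is at least tau.
    return rec["quality_metrics"]["composite_score"] >= tau


if passes_gate(record):
    print(image_conditioned(record, "LITORA"))
    print(text_conditioned(record, "LITORA"))
```

The paths resolve against the directories created in the extraction step of the Quick Start, so `root` would be the `LITORA` download directory in that layout.
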
