weixiny0408/vtonqa
Source: Hugging Face. Updated 2026-03-24; indexed 2026-03-29.
Download link: https://hf-mirror.com/datasets/weixiny0408/vtonqa
Resource description:
---
license: cc-by-4.0
---
# VTONQA: A Multi-Dimensional Quality Assessment Dataset for Virtual Try-On
## 📌 Overview
VTONQA is a benchmark dataset for evaluating the quality of virtual try-on (VTON) results.
It provides multi-dimensional human annotations to assess realism, alignment, and perceptual quality of synthesized try-on images.
Compared with existing VTON datasets, VTONQA focuses on **quality assessment** rather than generation, enabling standardized evaluation across different methods.
---
## 🎯 Key Features
- Multi-dimensional quality scores (the released splits cover clothing fit, body compatibility, and overall quality)
- Human-annotated evaluation labels
- Designed for benchmarking VTON models
- Supports both regression and ranking tasks
---
## 🧾 Data Description
### Input Data (`dataset/`)
The input data follows a standard virtual try-on (VTON) pipeline and consists of multiple modalities describing human appearance, clothing, and structural information.
#### Directory Structure
```
dataset/
├── image/ # original person images
├── cloth/ # clothing images
├── cloth_mask/ # binary masks for clothing
├── agnostic/ # person images with clothing removed
├── agnostic_mask/ # masks for removed clothing regions
├── image-parse/ # human parsing (semantic segmentation)
├── densepose_gray/ # DensePose representation
├── openpose_img/ # visualized human pose
├── openpose_json/ # pose keypoints (JSON format)
└── test_pairs_unpaired.txt
```
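As a quick sanity check after downloading, the layout above can be verified with a few lines of Python. This is a minimal sketch: the local path `dataset` and the helper name `missing_entries` are assumptions for illustration, and only the directory and file names come from the tree above.

```python
from pathlib import Path

# Sub-directories and files named in the directory tree above.
EXPECTED_DIRS = [
    "image", "cloth", "cloth_mask", "agnostic", "agnostic_mask",
    "image-parse", "densepose_gray", "openpose_img", "openpose_json",
]
EXPECTED_FILES = ["test_pairs_unpaired.txt"]


def missing_entries(root):
    """Return the expected directories/files that are absent under `root`."""
    root = Path(root)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # "dataset" is an assumed local path to the downloaded folder.
    print(missing_entries("dataset"))
```

An empty list means every entry from the tree was found; anything else names what is missing.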
---
#### Components Description
- `image/`:
Original images of persons wearing their initial clothing.
These provide identity, pose, and background information.
- `cloth/`:
Standalone images of garments to be transferred onto the person.
- `cloth_mask/`:
Binary masks for clothing images, where foreground indicates the clothing region.
Used to improve garment alignment and shape modeling.
- `agnostic/`:
Person images with original clothing removed while preserving identity-related regions such as face and hair.
Serves as a clean input for try-on generation.
- `agnostic_mask/`:
Masks indicating the regions where clothing has been removed.
Helps guide the generation of new garments.
- `image-parse/`:
Human parsing maps that segment the body into semantic regions (e.g., upper body, arms, background).
Provides structural guidance for clothing placement.
- `densepose_gray/`:
DensePose representations encoding fine-grained human body geometry.
Enables better alignment between clothing and body surface.
- `openpose_img/`:
Visualized human skeletons representing pose structure.
- `openpose_json/`:
2D keypoint coordinates of human pose in JSON format.
Provides precise numerical pose information.
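- `test_pairs_unpaired.txt`:
Defines the pairing between person images and clothing items. Each line lists a person image filename followed by a clothing image filename, separated by a space.
Example: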
```
167_whole.jpg d1.jpg
283_whole.jpg d1.jpg
85_whole.jpg d1.jpg
195_whole.jpg d1.jpg
```
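The pairing file can be read with a small parser. The sketch below assumes that each line holds exactly two whitespace-separated filenames (person first, garment second), as in the example above, and that filenames contain no spaces. The function name `parse_pairs` is illustrative.

```python
def parse_pairs(lines):
    """Parse whitespace-separated `person cloth` lines into tuples."""
    pairs = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            raise ValueError(f"expected 2 fields, got {len(parts)}: {line!r}")
        pairs.append((parts[0], parts[1]))
    return pairs


# The four example lines shown above.
sample = """167_whole.jpg d1.jpg
283_whole.jpg d1.jpg
85_whole.jpg d1.jpg
195_whole.jpg d1.jpg
"""
pairs = parse_pairs(sample.splitlines())
print(pairs[0])
```

To read the real file, pass an open file object instead of `sample.splitlines()`, since iterating over a text file also yields one line at a time.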
---
### Generated Results (`result/`)
The `result/` directory contains outputs generated by multiple state-of-the-art virtual try-on (VTON) methods, as well as predefined evaluation splits for different quality dimensions.
---
#### 📂 Directory Structure
```
result/
├── CAT-DM/
├── CatV2TON/
├── DS-VITON/
├── FS-VTON/
├── keling/
├── ladivton/
├── LINKFOX/
├── OOTDiffusion/
├── StableVITON/
├── TPD/
├── VITON-HD/
│
├── body_compatibility_test.txt
├── body_compatibility_train.txt
├── clothing_fit_test.txt
├── clothing_fit_train.txt
├── overall_quality_test.txt
└── overall_quality_train.txt
```
---
#### 🧾 Method Outputs
Each subfolder corresponds to one VTON method and contains the try-on images that method generated for the same set of input pairs:
- `CAT-DM/`, `CatV2TON/`, `DS-VITON/`, `FS-VTON/`, `keling/`, `ladivton/`, `LINKFOX/`, `OOTDiffusion/`, `StableVITON/`, `TPD/`, `VITON-HD/`
---
#### 📊 Evaluation Splits
The dataset provides predefined splits for different evaluation dimensions:
- `body_compatibility_*.txt`
→ Evaluates how well the clothing matches the human body structure
- `clothing_fit_*.txt`
→ Evaluates how naturally the clothing fits the person
- `overall_quality_*.txt`
→ Evaluates overall perceptual quality
Each split includes:
- `*_train.txt`: training subset (for learning-based evaluators)
- `*_test.txt`: testing subset (for evaluation)
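One way to score a learning-based evaluator on a test split is to correlate its predictions with the human scores. The sketch below computes the Pearson linear correlation (PLCC) and the Spearman rank correlation (SRCC) in plain Python. These two metrics are common in quality assessment work, but their use here is an assumption: the dataset card does not say which metrics the authors report. The numbers are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up numbers for illustration only, not taken from the dataset.
human = [54.4, 37.6, 56.5, 41.0]
predicted = [0.61, 0.30, 0.70, 0.45]
print(spearman(human, predicted), pearson(human, predicted))
```

SRCC depends only on the ordering of the predictions, which suits the ranking use case, while PLCC also rewards a linear relationship with the human scores, which suits regression.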
---
#### 📄 Split File Format
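Each line of a split `.txt` file contains the filename of a generated image and its quality score for that dimension, separated by a comma. The filenames appear to end with the name of the method that produced the image (for example, `DS-VITON`).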
Example:
```
523_lower_tr10_DS-VITON.png, 54.447846865424935
85_lower_tr6_OOTDiffusion.png, 37.64992522777412
510_upper_shirt10_ladivton.png, 56.459661598182684
```
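A split file in this format can be parsed line by line. The sketch below assumes the `filename, score` layout shown in the example. It also assumes that the method name is the part of the filename after the last underscore, which matches the three example lines but is not documented by the dataset card. The function name `parse_split_line` is illustrative.

```python
def parse_split_line(line):
    """Split `name.png, score` into (filename, method, score)."""
    filename, score_text = line.rsplit(",", 1)
    filename = filename.strip()
    stem = filename.rsplit(".", 1)[0]  # drop the file extension
    # Assumption: the method name is the part after the last underscore.
    method = stem.rsplit("_", 1)[1]
    return filename, method, float(score_text)


# The three example lines shown above.
sample = """523_lower_tr10_DS-VITON.png, 54.447846865424935
85_lower_tr6_OOTDiffusion.png, 37.64992522777412
510_upper_shirt10_ladivton.png, 56.459661598182684
"""
rows = [parse_split_line(l) for l in sample.splitlines() if l.strip()]
for filename, method, score in rows:
    print(method, round(score, 2), filename)
```

Splitting from the right keeps the parse robust to underscores earlier in the filename, since none of the method folder names listed in `result/` contains an underscore.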
---
Provider: weixiny0408



