weixiny0408/vtonqa

Name: weixiny0408/vtonqa
Creator: weixiny0408
Published: 2026-03-24 08:59:12
License: cc-by-4.0 (as declared in the dataset card)

Source: Hugging Face. Updated 2026-03-24; indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/weixiny0408/vtonqa


Description:

---
license: cc-by-4.0
---

# VTONQA: A Multi-Dimensional Quality Assessment Dataset for Virtual Try-On

## 📌 Overview

VTONQA is a benchmark dataset for evaluating the quality of virtual try-on (VTON) results. It provides multi-dimensional human annotations to assess the realism, alignment, and perceptual quality of synthesized try-on images.

Compared with existing VTON datasets, VTONQA focuses on **quality assessment** rather than generation, enabling standardized evaluation across different methods.

---

## 🎯 Key Features

- Multi-dimensional quality scores (e.g., realism, alignment, overall)
- Human-annotated evaluation labels
- Designed for benchmarking VTON models
- Supports both regression and ranking tasks

---

## 🧾 Data Description

### Input Data (`dataset/`)

The input data follows a standard virtual try-on (VTON) pipeline and consists of multiple modalities describing human appearance, clothing, and structural information.

#### Directory Structure

```
dataset/
├── image/                  # original person images
├── cloth/                  # clothing images
├── cloth_mask/             # binary masks for clothing
├── agnostic/               # person images with clothing removed
├── agnostic_mask/          # masks for removed clothing regions
├── image-parse/            # human parsing (semantic segmentation)
├── densepose_gray/         # DensePose representation
├── openpose_img/           # visualized human pose
├── openpose_json/          # pose keypoints (JSON format)
└── test_pairs_unpaired.txt
```

---

#### Components Description

- `image/`: Original images of persons wearing their initial clothing. These provide identity, pose, and background information.
- `cloth/`: Standalone images of garments to be transferred onto the person.
- `cloth_mask/`: Binary masks for clothing images, where foreground indicates the clothing region. Used to improve garment alignment and shape modeling.
- `agnostic/`: Person images with original clothing removed while preserving identity-related regions such as face and hair. Serves as a clean input for try-on generation.
- `agnostic_mask/`: Masks indicating the regions where clothing has been removed. Helps guide the generation of new garments.
- `image-parse/`: Human parsing maps that segment the body into semantic regions (e.g., upper body, arms, background). Provides structural guidance for clothing placement.
- `densepose_gray/`: DensePose representations encoding fine-grained human body geometry. Enables better alignment between clothing and body surface.
- `openpose_img/`: Visualized human skeletons representing pose structure.
- `openpose_json/`: 2D keypoint coordinates of human pose in JSON format. Provides precise numerical pose information.
- `test_pairs_unpaired.txt`: Defines the pairing between person images and clothing items. Example:

```
167_whole.jpg d1.jpg
283_whole.jpg d1.jpg
85_whole.jpg d1.jpg
195_whole.jpg d1.jpg
```

---

### Generated Results (`result/`)

The `result/` directory contains outputs generated by multiple state-of-the-art virtual try-on (VTON) methods, as well as predefined evaluation splits for different quality dimensions.

---

#### 📂 Directory Structure

```
result/
├── CAT-DM/
├── CatV2TON/
├── DS-VITON/
├── FS-VTON/
├── keling/
├── ladivton/
├── LINKFOX/
├── OOTDiffusion/
├── StableVITON/
├── TPD/
├── VITON-HD/
│
├── body_compatibility_test.txt
├── body_compatibility_train.txt
├── clothing_fit_test.txt
├── clothing_fit_train.txt
├── overall_quality_test.txt
└── overall_quality_train.txt
```

---

#### 🧾 Method Outputs

Each subfolder corresponds to a specific VTON method:

- `CAT-DM/`, `CatV2TON/`, `DS-VITON/`, `FS-VTON/`, `VITON-HD/`, etc.

Each folder contains generated try-on images for the same set of input pairs.
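The input pairs come from `dataset/test_pairs_unpaired.txt`. The following is a minimal sketch of reading that file, assuming each line holds a person image name and a garment image name separated by whitespace, as in the example listing. The helper names `read_pairs` and `group_by_cloth` are hypothetical and not part of the dataset.

```python
from collections import defaultdict
from pathlib import Path


def read_pairs(path):
    """Return a list of (person_image, cloth_image) tuples from a pairs file.

    ASSUMPTION: one pair per line, two whitespace-separated file names.
    """
    pairs = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if len(parts) != 2:  # skip blank or malformed lines
            continue
        pairs.append((parts[0], parts[1]))
    return pairs


def group_by_cloth(pairs):
    """Map each garment image to the person images it is paired with."""
    groups = defaultdict(list)
    for person, cloth in pairs:
        groups[cloth].append(person)
    return dict(groups)
```

Grouping by garment is one convenient view when comparing how a single clothing item is rendered on different persons; grouping by person would work the same way with the tuple fields swapped.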
#### 📊 Evaluation Splits

The dataset provides predefined splits for different evaluation dimensions:

- `body_compatibility_*.txt`: evaluates how well the clothing matches the human body structure
- `clothing_fit_*.txt`: evaluates how naturally the clothing fits the person
- `overall_quality_*.txt`: evaluates overall perceptual quality

Each split includes:

- `*_train.txt`: training subset (for learning-based evaluators)
- `*_test.txt`: testing subset (for evaluation)

---

#### 📄 Split File Format

Each `.txt` file lists one sample per line: the generated image filename, a comma, and a numeric score for the corresponding quality dimension.

Example:

```
523_lower_tr10_DS-VITON.png, 54.447846865424935
85_lower_tr6_OOTDiffusion.png, 37.64992522777412
510_upper_shirt10_ladivton.png, 56.459661598182684
```
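A minimal sketch of loading one split file, assuming each non-empty line has the `filename, score` form shown in the example. The function names are hypothetical. The rule used to recover the method name (the text after the last underscore in the filename) is an assumption inferred from the three example filenames and is not documented by the dataset.

```python
from pathlib import Path


def parse_split_line(line):
    """Split a 'name.png, score' line into (filename, float score)."""
    name, score = line.rsplit(",", 1)
    return name.strip(), float(score)


def method_from_filename(filename):
    """Guess the VTON method from a result filename.

    ASSUMPTION: the method name is the text after the last underscore,
    e.g. '523_lower_tr10_DS-VITON.png' -> 'DS-VITON'.
    """
    stem = filename.rsplit(".", 1)[0]
    return stem.rsplit("_", 1)[-1]


def read_split(path):
    """Read a split file into a list of (filename, score) tuples."""
    records = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.strip():  # ignore blank lines
            records.append(parse_split_line(line))
    return records
```

For instance, `read_split("result/overall_quality_train.txt")` would return the training records for the overall-quality dimension, which could then be grouped by `method_from_filename` to compare methods.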


Provider:

weixiny0408
