淘宝星辰试衣评测集
收藏魔搭社区2026-05-16 更新2026-05-03 收录
下载链接:
https://modelscope.cn/datasets/TaoTianGroup/Tstars-VTON
下载链接
链接失效反馈官方服务:
资源简介:
数据集文件元信息以及数据文件,请浏览“数据集文件”页面获取。
当前数据集卡片使用的是默认模版,数据集的贡献者未提供更加详细的数据集介绍,但是您可以通过如下GIT Clone命令,或者ModelScope SDK来下载数据集
#### 下载方法
:modelscope-code[]{type="sdk"}
:modelscope-code[]{type="git"}
# Tstars-Tryon 1.0
<p align="center">
<a href="https://huggingface.co/datasets/TaobaoTmall-AlgorithmProducts/Tstars-VTON">
<img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Tstars--VTON-ffc107?logoColor=white" alt="Hugging Face Dataset" style="vertical-align: middle;">
</a>
<a href="https://modelscope.cn/datasets/TaoTianGroup/Tstars-VTON">
<img src="https://img.shields.io/badge/ModelScope-Tstars--VTON-blue" alt="ModelScope Dataset" style="vertical-align: middle;">
</a>
<a href="https://arxiv.org/abs/2604.19748">
<img src="https://img.shields.io/badge/Report-arXiv-b5212f.svg?logo=arxiv" alt="Technical Report" style="vertical-align: middle;">
</a>
<a href="https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode">
<img src="https://img.shields.io/badge/License-CC%20BY--NC--ND%204.0-lightgrey.svg" alt="License" style="vertical-align: middle;">
</a>
</p>
## Commercial Applications
Our virtual try-on model, Tstars-Tryon 1.0, is now deployed on the Taobao App.
Simply scan the QR code below with the Taobao app to instantly try on your favorite looks.
We hope you enjoy a seamless and delightful shopping experience!

## Tstars-VTON - MetaInfo
### Introduction
Tstars-VTON is a comprehensive benchmark designed to evaluate whether a virtual try-on model is truly capable of functioning in real-world scenarios.
In total, Tstars-VTON comprises 1780 random paired samples across 5 garment categories(up, coat, pant, skirt, dress) and 3 accessory categories(shoes, bag, hat),
covering a diverse range of 465 fine-grained subcategories.
## ✨ Key Features
- **High-Complexity Multi-Garment/Accessory Try-On Scenarios**: Introduces multi-garment combinations with layered outfits and accessories, covering free-combination scenarios with 1-6 items.
- **Diverse Data Coverage and Fine-Grained Attributes**: Fine-gined attributes of both model and garment images through VLM-based generation and manual check.
- **Privacy-Preserving Mechanism**: All portraits collected from open-source data are matched to a similar face in licensed database and anonymized through face swapping.
- **Flexible Unpaired Settings**: Supports a fully unpaired evaluation setting, decouplin the model and garment database
- **Comprehensive Evaluation Paradigm Aligned with Human Preferences**: Proposes a VLM-driven evaluation paradigm that Decomposes virtual try-on quality into five rigorous dimensions.
## ✨ Key Attributes
Key attributes in our Tstars-VTON
| Fields | Description |
|----------|-------|
| model | Reference image of the person (target identity) to be dressed |
| up | Source image of the upper-body garment (e.g., T-shirt, blouse) |
| coat | Source image of the outerwear garment (e.g., jacket, coat) |
| pant | Source image of the lower-body garment (e.g., trousers, jeans) |
| skirt | Source image of the lower-body garment (skirt) |
| dress | Source image of the full-body garment (one-piece dress) |
| shoes | Source image of shoes (e.g., Sneakers, High Heels) |
| bag | Source image of bags (e.g., handbag, shoulder bag) |
| hat | Source image of hats (e.g., Peaked Cap, Beret) |
| short_caption | English short instruction generated by gemini-3.1-flash-lite-preview |
| model_info | Model attributes(VLM-based + manual check): Gender, Age, Skin_tone, Body_type, Pose, Clothing_occlusion, Lighting_condition, Shooting_angle, Background_complexity, Clarity, Portrait_size |
| up_info | Cloth attributes(VLM-based + manual check): Clothing-Category, Clothing-Style, Clothing-Length, Clothing-Sleeve_length, Clothing-Fit, Clothing-Material, Clothing-Texture, Clothing-Design_detail, Clothing-Display_format, Gender_fit |
| coat_info | Same as up_info |
| pant_info | Same as up_info |
| skirt_info | Same as up_info |
| dress_info | Same as up_info |
| shoes_info | Accessory attributes(VLM-based + manual check): Accessory-Category, Accessory-Style, Accessory-Display_format, Gender_fit |
| bag_info | Same as shoes_info |
| hat_info | Same as shoes_info |
### Distribution
The Distribution of Multi-Garment/Accessory Try-On Combinations
| Garment nums | Count |
|----------|-------|
| 1 | 799 |
| 2 | 320 |
| 3 | 286 |
| 4 | 173 |
| 5 | 104 |
| 6 | 98 |
## Tstars-VTON - Evaluation Toolkit
An automated evaluation toolkit for virtual try-on models, powered by VLM-as-Judge (e.g., Gemini). Given a set of try-on results, the toolkit scores each sample across **four quality dimensions** and produces an aggregated report.
### Scoring Dimensions
Each dimension is scored on a **1.0–10.0** scale:
| Dimension | What it measures |
|---|---|
| **identity_consistency** | Face, body shape, pose, scale and skin tone preservation |
| **garment_fidelity** | Silhouette, style details, color, pattern and layering correctness |
| **background_preservation** | Background content unchanged, no cropping or expansion |
| **physical_logic** | Limb anatomy, mesh clipping and object interpenetration artifacts |
The **overall score** is the geometric mean of the four dimension scores.
### How It Works
The toolkit uses a **split-call strategy** — each sample is evaluated via two VLM API calls:
- **Call 1** (identity + garment): sends `[person, garment(s)..., result]` images
- **Call 2** (background + physics): sends `[person, result]` images
This split design improves scoring accuracy by letting the VLM focus on related aspects together.
### Prerequisites
- **Benchmark dataset**: the Tstars-VTON Benchmark parquet files (`Tstars-VTON-*.parquet`)
- **API access**: an OpenAI-compatible VLM endpoint (default: Gemini API)
### Input Format
Prepare a **JSONL file** where each line maps a benchmark sample index to your model's try-on result image:
```jsonl
{"sample_index": 0, "result": "/path/to/result_0.png"}
{"sample_index": 1, "result": "/path/to/result_1.png"}
{"sample_index": 2, "result": "/path/to/result_2.png"}
```
- `sample_index`: the 0-based index into the benchmark dataset
- `result`: path to the generated try-on image
### Usage
```bash
python Evaluation_Toolkit/eval.py \
--dataset_path "/path/to/Tstars-VTON-*.parquet" \
--result_jsonl "/path/to/my_results.jsonl" \
--output_dir eval_output/my_model \
--api_key "YOUR_API_KEY" \
--workers 8
```
### Output
The toolkit produces two files in `--output_dir`:
**`cases.jsonl`** — per-sample scoring details
**`summary.json`** — aggregated scores with breakdowns
Results are broken down into **overall**, **single-garment** (1 item), and **multi-garment** (2+ items) subsets.
### Features
- **Resume support**: if evaluation is interrupted, re-running the same command will skip already-scored samples and continue from where it left off.
- **Multi-key rotation**: pass multiple API keys (comma-separated) to distribute requests across keys and avoid rate limits.
- **Concurrent scoring**: use `--workers` to control parallelism for faster evaluation.
- **Custom VLM endpoint**: any OpenAI-compatible API can be used via `--api_base_url` and `--model_name`.
### File Structure
```
Evaluation_Toolkit/
├── assets/
│ └── sample_index0.png
├── eval.py # Main evaluation script
├── tryon_prompts.py # VLM prompt templates for all scoring dimensions
├── run.sh # Example run script
└── test.jsonl # Test file
```
## License
**Tstars-VTON** is released under the [Creative Commons Attribution–NonCommercial–NoDerivatives (CC BY-NC-ND 4.0)](https://creativecommons.org/licenses/by-nc-nd/4.0/) license.
- ✅ **Free for academic research purposes only**
- ❌ **Commercial use is prohibited**
**Data Source:** All images included in Tstars-VTON were legally purchased and obtained through official channels to ensure copyright compliance.
*By using this dataset, you agree to comply with the applicable license terms.*
## 🖊️ Citation
We kindly encourage citation of our work if you find it useful.
```bibtex
@misc{chen2026tstarstryon10robustrealistic,
title={Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items},
author={Mengting Chen and Zhengrui Chen and Yongchao Du and Zuan Gao and Taihang Hu and Jinsong Lan and Chao Lin and Yefeng Shen and Xingjian Wang and Zhao Wang and Zhengtao Wu and Xiaoli Xu and Zhengze Xu and Hao Yan and Mingzhou Zhang and Jun Zheng and Qinye Zhou and Xiaoyong Zhu and Bo Zheng},
year={2026},
eprint={2604.19748},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.19748},
}
```
数据集文件元数据与数据文件可通过「数据集文件」页面获取。本数据集卡片采用默认模板生成,数据集贡献者未提供更详细的介绍,但您可通过下述GIT Clone命令或ModelScope SDK下载该数据集。
#### 下载方法
:modelscope-code[]{type="sdk"}
:modelscope-code[]{type="git"}
# Tstars-Tryon 1.0
<p align="center">
<a href="https://huggingface.co/datasets/TaobaoTmall-AlgorithmProducts/Tstars-VTON">
<img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Tstars--VTON-ffc107?logoColor=white" alt="Hugging Face 数据集" style="vertical-align: middle;">
</a>
<a href="https://modelscope.cn/datasets/TaoTianGroup/Tstars-VTON">
<img src="https://img.shields.io/badge/ModelScope-Tstars--VTON-blue" alt="ModelScope 数据集" style="vertical-align: middle;">
</a>
<a href="https://arxiv.org/abs/2604.19748">
<img src="https://img.shields.io/badge/Report-arXiv-b5212f.svg?logo=arxiv" alt="技术报告" style="vertical-align: middle;">
</a>
<a href="https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode">
<img src="https://img.shields.io/badge/License-CC%20BY--NC--ND%204.0-lightgrey.svg" alt="许可证" style="vertical-align: middle;">
</a>
</p>
## 商业应用
我们的虚拟试穿模型Tstars-Tryon 1.0现已部署于淘宝App。您只需使用淘宝App扫描下方二维码,即可即刻试穿心仪穿搭。祝您拥有流畅愉悦的购物体验!

## Tstars-VTON - 元数据信息
### 简介
Tstars-VTON是一款综合性基准测试集,旨在评估虚拟试穿模型是否真正具备在真实场景中落地应用的能力。该数据集总计包含1780组随机配对样本,涵盖5类服装(上衣、外套、长裤、半身裙、连衣裙)与3类配饰(鞋履、箱包、帽饰),覆盖465个细分品类。
## ✨ 核心特性
- **高复杂度多服装/配饰试穿场景**:引入包含分层穿搭与配饰的多服装组合,覆盖1至6件单品的自由搭配场景。
- **多样化数据覆盖与细粒度属性标注**:通过基于大语言模型(Large Language Model, LLM)的生成技术与人工核验,获取人物与服装图像的细粒度属性信息。
- **隐私保护机制**:所有从开源数据中采集的人像,均会与授权数据库中的相似人脸进行匹配,并通过人脸替换技术实现匿名化处理。
- **灵活的非配对设置**:支持完全非配对的评估模式,将人物数据库与服装数据库解耦。
- **贴合人类偏好的综合评估范式**:提出基于视觉语言模型(Vision-Language Model, VLM)的评估框架,将虚拟试穿质量拆解为五个严谨的评估维度。
## ✨ 核心属性
Tstars-VTON数据集的核心属性如下:
| 字段 | 描述 |
|----------|-------|
| model | 待试穿的人物参考图像(目标身份) |
| up | 上身服装的源图像(如T恤、衬衫) |
| coat | 外套类服装的源图像(如夹克、大衣) |
| pant | 下装类服装的源图像(如西裤、牛仔裤) |
| skirt | 半身裙类服装的源图像 |
| dress | 全身连衣裙类服装的源图像 |
| shoes | 鞋履类配饰的源图像(如运动鞋、高跟鞋) |
| bag | 箱包类配饰的源图像(如手提包、肩背包) |
| hat | 帽饰类配饰的源图像(如棒球帽、贝雷帽) |
| short_caption | 由gemini-3.1-flash-lite-preview生成的英文简短指令 |
| model_info | 人物属性(基于视觉语言模型与人工核验):性别、年龄、肤色、体型、姿态、衣物遮挡情况、光照条件、拍摄角度、背景复杂度、清晰度、人像占比 |
| up_info | 服装属性(基于视觉语言模型与人工核验):服装品类、服装风格、衣长、袖长、合身度、面料材质、纹理细节、设计细节、展示形式、适配性别 |
| coat_info | 与up_info一致 |
| pant_info | 与up_info一致 |
| skirt_info | 与up_info一致 |
| dress_info | 与up_info一致 |
| shoes_info | 配饰属性(基于视觉语言模型与人工核验):配饰品类、配饰风格、展示形式、适配性别 |
| bag_info | 与shoes_info一致 |
| hat_info | 与shoes_info一致 |
### 样本分布
多服装/配饰试穿组合的样本分布如下:
| 服装件数 | 样本数量 |
|----------|-------|
| 1 | 799 |
| 2 | 320 |
| 3 | 286 |
| 4 | 173 |
| 5 | 104 |
| 6 | 98 |
## Tstars-VTON - 评估工具包
本工具包为虚拟试穿模型的自动化评估工具,基于「大语言模型作为评判者(VLM-as-Judge)」技术(如Gemini)实现。针对一组虚拟试穿结果,工具包将从**四个质量维度**对每个样本进行评分,并生成汇总报告。
### 评分维度
每个维度的评分范围为**1.0–10.0**:
| 评估维度 | 评估内容 |
|---|---|
| **身份一致性** | 人脸、体型、姿态、比例与肤色的保留程度 |
| **服装保真度** | 服装轮廓、风格细节、色彩、图案与搭配层次的正确性 |
| **背景保留度** | 背景内容无改动,未出现裁剪或扩展 |
| **物理合理性** | 肢体解剖结构、网格裁剪与物体穿模瑕疵 |
**综合得分**为四个维度得分的几何平均值。
### 工作原理
工具包采用**分段调用策略**——每个样本通过两次视觉语言模型API调用完成评估:
- **调用1**(身份与服装维度):传入`[人物、服装(多件)、试穿结果]`图像
- **调用2**(背景与物理合理性维度):传入`[人物、试穿结果]`图像
该分段设计可让视觉语言模型聚焦于相关联的评估维度,从而提升评分准确性。
### 前置依赖
- **基准数据集**:Tstars-VTON基准测试集的Parquet格式文件(`Tstars-VTON-*.parquet`)
- **API访问权限**:兼容OpenAI格式的视觉语言模型接口(默认使用Gemini API)
### 输入格式
请准备一个**JSONL格式文件**,其中每一行将基准测试集的样本索引与您的模型生成的试穿结果图像路径进行映射:
jsonl
{"sample_index": 0, "result": "/path/to/result_0.png"}
{"sample_index": 1, "result": "/path/to/result_1.png"}
{"sample_index": 2, "result": "/path/to/result_2.png"}
- `sample_index`:基准数据集中的0基索引
- `result`:生成的虚拟试穿结果图像的路径
### 使用方法
bash
python Evaluation_Toolkit/eval.py
--dataset_path "/path/to/Tstars-VTON-*.parquet"
--result_jsonl "/path/to/my_results.jsonl"
--output_dir eval_output/my_model
--api_key "YOUR_API_KEY"
--workers 8
### 输出结果
工具包将在`--output_dir`指定的目录下生成两个文件:
**`cases.jsonl`**:单样本的详细评分结果
**`summary.json`**:包含细分维度的汇总评分结果
评估结果将分为**综合得分**、**单服装试穿**(1件单品)与**多服装试穿**(2件及以上单品)三个子集。
### 工具特性
- **断点续评支持**:若评估过程中断,重新运行相同命令将跳过已完成评分的样本,从断点处继续执行。
- **多密钥轮换**:传入多个以逗号分隔的API密钥,可将请求分发至不同密钥,避免触发调用限制。
- **并发评分**:通过`--workers`参数控制并行进程数,加速评估流程。
- **自定义视觉语言模型接口**:可通过`--api_base_url`与`--model_name`参数使用任意兼容OpenAI格式的API接口。
### 文件结构
Evaluation_Toolkit/
├── assets/
│ └── sample_index0.png
├── eval.py # 主评估脚本
├── tryon_prompts.py # 各评估维度的视觉语言模型提示模板
├── run.sh # 示例运行脚本
└── test.jsonl # 测试文件
## 许可协议
**Tstars-VTON**采用[知识共享署名-非商业性使用-禁止演绎 4.0 国际许可协议(CC BY-NC-ND 4.0)](https://creativecommons.org/licenses/by-nc-nd/4.0/)进行发布。
- ✅ **仅可用于学术研究用途**
- ❌ **禁止商业使用**
**数据来源**:Tstars-VTON中的所有图像均通过官方渠道合法购买获取,以确保版权合规。
*使用本数据集即表示您同意遵守相关许可协议条款。*
## 🖊️ 引用声明
若本数据集对您的研究有所帮助,恳请引用我们的工作:
bibtex
@misc{chen2026tstarstryon10robustrealistic,
title={Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items},
author={Mengting Chen and Zhengrui Chen and Yongchao Du and Zuan Gao and Taihang Hu and Jinsong Lan and Chao Lin and Yefeng Shen and Xingjian Wang and Zhao Wang and Zhengtao Wu and Xiaoli Xu and Zhengze Xu and Hao Yan and Mingzhou Zhang and Jun Zheng and Qinye Zhou and Xiaoyong Zhu and Bo Zheng},
year={2026},
eprint={2604.19748},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.19748},
}
提供机构:
maas
创建时间:
2026-04-20
搜集汇总
数据集介绍

背景与挑战
背景概述
淘宝星辰试衣评测集(Tstars-VTON)是一个用于评估虚拟试穿模型在真实场景中性能的基准数据集,包含1780个样本,覆盖5类服装和3类配饰,共计465个细分子类别。该数据集支持多服装/配饰组合、细粒度属性标注和隐私保护机制,并提供了一个基于视觉语言模型的自动评估工具包,用于从身份一致性、服装保真度、背景保持和物理逻辑四个维度进行评分。
以上内容由遇见数据集搜集并总结生成



