
jasonfan/youtube-fashion-vton

Source: Hugging Face | Updated: 2026-04-10 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/jasonfan/youtube-fashion-vton
Description:
# YouTube Fashion Analyzer + Multi-Agent Virtual Try-On

**End-to-end pipeline**: YouTube video → outfit identification → product matching → AI virtual try-on, iteratively improved by a 120-round multi-agent debate (kimi-k2.5 × gpt-5.3-codex × gpt-5.4 × Claude Opus arbiter).

## What This Is

This project analyzes the YouTube video ["The Most UNHINGED Asian Reality Dating Show"](https://www.youtube.com/watch?v=IU8mjG5xtvU) by Jimmy Zhang, identifies the 4 male contestants' outfits, finds matching Uniqlo/New Balance products online, and generates AI virtual try-on images using [FASHN VTON v1.5](https://github.com/fashn-AI/fashn-vton-1.5).

The entire design was then debated and refined by 3 AI agents over 120 rounds, with Claude Opus serving as arbiter.

## Pipeline Overview

```
YouTube Video
    │
    ▼  yt-dlp sb0 storyboard
324 frames (9.7s interval, 320×180px)
    │
    ▼  Claude Vision (claude-sonnet-4-6)
4 male contestants identified + outfit analysis
    │
    ▼  WebSearch + Uniqlo image URL scraping
Product matches (tops/bottoms/shoes)
    │
    ▼  FASHN VTON v1.5 (local CPU, nice=20, 4 threads, 15 steps)
Virtual try-on images
    │
    ▼  PIL composite cards
[Video frame | Product image | AI try-on]
```

## Outputs

| File | Description |
|------|-------------|
| `outputs/tryon_4males_final.jpg` | 4-row card: all males side-by-side |
| `outputs/tryon_A_black_shirt.jpg` | Male A (tattooed streetwear) wearing Uniqlo black linen open collar shirt |
| `outputs/tryon_B_blazer.jpg` | Male B (all-black suit) wearing Uniqlo AirSense blazer |
| `outputs/tryon_C_white_linen.jpg` | Male C (clean white tee) wearing Uniqlo premium linen oversized shirt |
| `outputs/tryon_D_boxy.jpg` | Male D (chain necklace, Korean style) wearing Uniqlo boxy short sleeve shirt |
| `outputs/tryon_C_beige_pants.jpg` | Uniqlo pleated wide ankle pants try-on |

## The 4 Outfits (Budget ≤ $300 each, purchasable online)

### Male A: tattooed streetwear, minimalist dark palette ($294)

- [Linen Blend Open Collar SS Shirt Black $39.90](https://www.uniqlo.com/us/en/products/E464739-000/00)
- [U Ribbed Tank Top White $14.90](https://www.uniqlo.com/us/en/products/E455359-000/00)
- [Slim Fit Jeans Indigo Blue $49.90](https://www.uniqlo.com/us/en/products/E461589-000/00)
- New Balance 2002R Black ~$130
- Gold chain necklace ~$20, Casio MTP-V004G ~$40

### Male B: all-black suit with white shirt ($289)

- [Men's AirSense Blazer Black $149.90](https://www.uniqlo.com/us/en/products/E448034-000/00)
- [Extra Fine Cotton Broadcloth Shirt White $29.90](https://www.uniqlo.com/us/en/products/E460980-000/00)
- [Satin Slim Tie Black $19.90](https://www.uniqlo.com/us/en/products/E474985-000/00)
- [AirSense Slim Pants Black $79.90](https://www.uniqlo.com/us/en/products/E467538-000/00)
- Silver stud earring (single) ~$10

### Male C: crisp white linen × beach style ($229)

- [Premium Linen Oversized SS Shirt White $49.90](https://www.uniqlo.com/us/en/products/E484876-000/00)
- [Pleated Wide Ankle Pants Beige $49.90](https://www.uniqlo.com/us/en/products/E457967-000/00)
- [New Balance 550 White $99.99](https://www.newbalance.com/pd/550/BB550V1-34466.html)
- Square sunglasses ~$20, braided bracelet ~$10

### Male D: Korean-style boxy minimalism ($274)

- [Boxy Shirt SS Black $39.90](https://www.uniqlo.com/us/en/products/E482497-000/00)
- [Relaxed Fit Straight Jeans Black $49.90](https://www.uniqlo.com/us/en/products/E471530-000/00)
- [New Balance 9060 Rain Cloud $139.99](https://www.newbalance.com/pd/9060/U9060GRY.html)
- Silver Cuban Link Chain 6mm ~$30, minimalist ring ~$15

## Code

| File | Purpose |
|------|---------|
| `code/youtube_fashion_analyzer.py` | Main pipeline: storyboard extraction + Claude Vision analysis (supports `--v2` for 720p keyframes) |
| `code/catalog.py` | **[NEW]** Closed product catalog: Uniqlo/NB SKUs with `asset_score` gating (Opus verdict #1) |
| `code/keyframe_extractor.py` | **[NEW]** 720p keyframe mining with quality scoring + scene diversity (Opus verdict #2) |
| `code/intent_gate.py` | **[NEW]** Intent-gated rendering: shoppable card by default, VTON on demand (Opus verdict #3) |
| `code/run_tryon_optimized.py` | CPU-optimized FASHN VTON runner (4 threads, 15 steps, nice=20) |
| `code/create_comparisons.py` | PIL composite card generator (legacy, superseded by `intent_gate.py`) |
| `code/debate_orchestrator.py` | 20-round multi-agent debate (kimi × codex × gpt-5.4 × Opus) |
| `code/debate_continue.py` | 100-round continuation with Build/Refine/Harden phases |
| `code/fashn_basic_inference.py` | FASHN VTON basic CLI inference |

## Multi-Agent Debate (120 Rounds)

Three agents debated the design for 120 rounds:

- **Agent-K** (`kimi-k2.5`): Fashion/UX/product perspective (Chinese)
- **Agent-C** (`gpt-5.3-codex`): ML engineering / inference optimization
- **Agent-G** (`gpt-5.4`): Architecture / devil's advocate

Claude Opus served as arbiter at Round 20, Round 70 (mid-point), and Round 120 (final).

Full debate log: `debate/rounds_1_20_debate.md`

### Opus Final Verdict (Round 20 summary)

> *"The system proves the video→identify→try-on pipeline can work, but it's overconfident where it shouldn't be. Priority: build product truth first, fix input quality second, conditional rendering last. Honesty builds more trust than showing off."*

**Opus scores**: Technical 6.5/10 | Fashion accuracy 4.5/10 | UX 4/10 | Cost 5.5/10

**Top 3 improvements** (Opus ordered):

1. ✅ Build closed catalog (Uniqlo/NB SKUs, asset_score ≥ 0.85) → `code/catalog.py`
2. ✅ Replace sb0 storyboard with keyframe mining (yt-dlp 720p + person Re-ID) → `code/keyframe_extractor.py`
3. ✅ Intent-gated rendering (shoppable card by default, VTON only on user click) → `code/intent_gate.py`

## Setup

```bash
# 1. Clone FASHN VTON
git clone https://github.com/fashn-AI/fashn-vton-1.5.git
cd fashn-vton-1.5

# 2. Patch for Mac CPU (no onnxruntime-gpu)
sed -i '' 's/onnxruntime-gpu/onnxruntime/' pyproject.toml

# 3. Install (Python 3.10+)
python3 -m venv .venv && source .venv/bin/activate
pip install -e .

# 4. Download weights (~2GB)
python scripts/download_weights.py --weights-dir ./weights

# 5. Run optimized try-on
python run_tryon_optimized.py
# ~10 min/garment on CPU (M-series Mac), ~30s on GPU
```

## Target User Profile

- Height: 174cm, Weight: 65kg, Asian, slim build
- Location: San Jose CA 95124
- Nearest stores: UNIQLO Oakridge (~2mi), New Balance Union Ave (~1.5mi)

## Related Projects

- [FASHN VTON v1.5](https://github.com/fashn-AI/fashn-vton-1.5): virtual try-on model used
- [OOTDiffusion](https://github.com/levihsu/OOTDiffusion): alternative (6.5k stars)
- [IDM-VTON](https://github.com/yisol/IDM-VTON): ECCV 2024, best quality
- [CatVTON](https://github.com/Zheng-Chong/CatVTON): ICLR 2025, most lightweight

## License

Code: MIT | FASHN VTON model weights: Apache-2.0 | Debate logs: CC BY 4.0
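
As a quick check on the figures in "The 4 Outfits" section, the per-outfit totals and the ≤ $300 budget can be recomputed from the listed prices. The sketch below is not part of the repository: the names `OUTFITS`, `outfit_total`, and `budget_report` are hypothetical, and the approximate ("~") prices are taken at face value.

```python
from decimal import Decimal

# Budget ceiling stated in the section heading.
BUDGET = Decimal("300")

# Prices copied from the outfit lists; "~" prices are used as written.
# The dict layout is an illustration, not the format used by code/catalog.py.
OUTFITS = {
    "A": ["39.90", "14.90", "49.90", "130", "20", "40"],
    "B": ["149.90", "29.90", "19.90", "79.90", "10"],
    "C": ["49.90", "49.90", "99.99", "20", "10"],
    "D": ["39.90", "49.90", "139.99", "30", "15"],
}


def outfit_total(prices):
    """Sum price strings exactly, avoiding float rounding."""
    return sum((Decimal(p) for p in prices), Decimal("0"))


def budget_report(outfits, budget=BUDGET):
    """Map each outfit name to (total, is_within_budget)."""
    report = {}
    for name, prices in outfits.items():
        total = outfit_total(prices)
        report[name] = (total, total <= budget)
    return report


if __name__ == "__main__":
    for name, (total, ok) in budget_report(OUTFITS).items():
        status = "within" if ok else "over"
        print(f"Male {name}: ${total} ({status} budget)")
```

The sums come to $294.70, $289.60, $229.79, and $274.79, so the heading figures ($294, $289, $229, $274) match once the cents are dropped, and all four outfits stay under the $300 limit.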

Provider:
jasonfan