
UniWorld-V1

ModelScope community. Updated 2025-12-12; indexed 2025-06-07.
Download link:
https://modelscope.cn/datasets/PKU-YuanLab/UniWorld-V1
Resource overview:
<p style="color:red; font-size:25px"> The Geneval-style dataset is sourced from <a href="https://huggingface.co/datasets/BLIP3o/BLIP3o-60k" style="color:red">BLIP3o-60k</a>. </p>

This dataset is presented in the paper [UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation](https://huggingface.co/papers/2506.03147).

More details can be found in [UniWorld-V1](https://github.com/PKU-YuanGroup/UniWorld-V1).

### Data preparation

Download the data from [LanguageBind/UniWorld-V1](https://huggingface.co/datasets/LanguageBind/UniWorld-V1). The dataset consists of two parts: source images and annotation JSON files.

Prepare a `data.txt` file with one comma-separated entry per line, in the following format:

1. The first column is the root path to the images.
2. The second column is the corresponding annotation JSON file.
3. The third column indicates whether to enable the region-weighting strategy. We recommend setting it to `true` for editing data and `false` for everything else.

```
data/BLIP3o-60k,json/blip3o_t2i_58859.json,false
data/coco2017_caption_canny-236k,coco2017_canny_236574.json,false
data/imgedit,json/imgedit/laion_add_part0_edit.json,true
```

We have prepared a `data.txt` file for ImgEdit for your reference.

```
data/imgedit/action/action,json/imgedit/pandam_action_edit.json,true
data/imgedit/action/action_part2,json/imgedit/pandam2_action_edit.json,true
data/imgedit/action/action_part3,json/imgedit/pandam3_action_edit.json,true
data/imgedit/action/action_part4,json/imgedit/pandam4_action_edit.json,true
data/imgedit/add/add_part0,json/imgedit/laion_add_part0_edit.json,true
data/imgedit/add/add_part1,json/imgedit/laion_add_part1_edit.json,true
data/imgedit/add/add_part4,json/imgedit/results_add_laion_part4_edit.json,true
data/imgedit/add/add_part5,json/imgedit/results_add_laion_part5_edit.json,true
data/imgedit/adjust/adjust_part0,json/imgedit/results_adjust_canny_laion_part0_edit.json,true
data/imgedit/adjust/adjust_part2,json/imgedit/results_adjust_canny_laion_part2_edit.json,true
data/imgedit/adjust/adjust_part3,json/imgedit/results_adjust_canny_laion_part3_edit.json,true
data/imgedit/adjust/adjust_part4,json/imgedit/laion_adjust_canny_part4_edit.json,true
data/imgedit/background/background_part0,json/imgedit/results_background_laion_part0_edit.json,true
data/imgedit/background/background_part2,json/imgedit/results_background_laion_part2_edit.json,true
data/imgedit/background/background_part3,json/imgedit/laion_background_part3_edit.json,true
data/imgedit/background/background_part5,json/imgedit/laion_background_part5_edit.json,true
data/imgedit/background/background_part7,json/imgedit/laion_background_part7_edit.json,true
data/imgedit/compose/compose_part0,json/imgedit/results_compose_part0_edit.json,false
data/imgedit/compose/compose_part2,json/imgedit/results_compose_part2_edit.json,false
data/imgedit/compose/compose_part6,json/imgedit/results_compose_part6_fix_edit.json,false
data/imgedit/refine_replace/refine_replace_part1,json/imgedit/results_extract_ref_part1_refimg_edit.json,true
data/imgedit/remove/remove_part0,json/imgedit/laion_remove_part0_edit.json,true
data/imgedit/remove/remove_part1,json/imgedit/results_remove_laion_part1_edit.json,true
data/imgedit/remove/remove_part4,json/imgedit/results_remove_laion_part4_edit.json,true
data/imgedit/remove/remove_part5,json/imgedit/results_remove_laion_part5_edit.json,true
data/imgedit/replace/replace_part0,json/imgedit/laion_replace_part0_edit.json,true
data/imgedit/replace/replace_part1,json/imgedit/laion_replace_part1_edit.json,true
data/imgedit/replace/replace_part4,json/imgedit/results_replace_laion_part4_edit.json,true
data/imgedit/replace/replace_part5,json/imgedit/results_replace_laion_part5_edit.json,true
data/imgedit/transfer/transfer,json/imgedit/results_style_transfer_edit.json,false
data/imgedit/transfer/transfer_part0,json/imgedit/results_style_transfer_part0_cap36472_edit.json,false
```

### Data details

Text-to-Image Generation

- [BLIP3o-60k](https://huggingface.co/datasets/BLIP3o/BLIP3o-60k): We add text-to-image instructions to half of the data. [108 GB storage usage.]
- [OSP1024-286k](https://huggingface.co/datasets/LanguageBind/UniWorld-V1/tree/main/data/OSP1024-286k): Sourced from internal data of the [Open-Sora Plan](https://github.com/PKU-YuanGroup/Open-Sora-Plan), with captions generated using [Qwen2-VL-72B](https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct). Images have an aspect ratio between 3:4 and 4:3, an aesthetic score ≥ 6, and a short side ≥ 1024 pixels. [326 GB storage usage.]

Image Editing

- [imgedit-724k](https://huggingface.co/datasets/sysuyy/ImgEdit/tree/main): Data is filtered using GPT-4o, retaining approximately half. [2.8 TB storage usage.]
- [OmniEdit-368k](https://huggingface.co/datasets/TIGER-Lab/OmniEdit-Filtered-1.2M): For image editing data, samples with edited regions smaller than 1/100 were filtered out. Images have a short side ≥ 1024 pixels. [204 GB storage usage.]
- [SEED-Data-Edit-Part1-Openimages-65k](https://huggingface.co/datasets/AILab-CVC/SEED-Data-Edit-Part1-Openimages): For image editing data, samples with edited regions smaller than 1/100 were filtered out. Images have a short side ≥ 1024 pixels. [10 GB storage usage.]
- [SEED-Data-Edit-Part2-3-12k](https://huggingface.co/datasets/AILab-CVC/SEED-Data-Edit-Part2-3): For image editing data, samples with edited regions smaller than 1/100 were filtered out. Images have a short side ≥ 1024 pixels. [10 GB storage usage.]
- [PromptfixData-18k](https://huggingface.co/datasets/yeates/PromptfixData): For image restoration data and some editing data, samples with edited regions smaller than 1/100 were filtered out. Images have a short side ≥ 1024 pixels. [9 GB storage usage.]
- [StyleBooth-11k](https://huggingface.co/scepter-studio/stylebooth): For style-transfer data, images have a short side ≥ 1024 pixels. [4 GB storage usage.]
- [Ghibli-36k](https://huggingface.co/datasets/LanguageBind/UniWorld-V1/tree/main/data/Ghibli-36k): For style-transfer data, images have a short side ≥ 1024 pixels. **Warning: This data has not been quality filtered.** [170 GB storage usage.]

Extract & Try-on

- [viton_hd-23k](https://huggingface.co/datasets/forgeml/viton_hd): Converted from the source data into an instruction dataset for product extraction. [1 GB storage usage.]
- [deepfashion-27k](https://huggingface.co/datasets/lirus18/deepfashion): Converted from the source data into an instruction dataset for product extraction. [1 GB storage usage.]
- [shop_product-23k](https://huggingface.co/datasets/LanguageBind/UniWorld-V1/tree/main/data/shop_product-23k): Sourced from internal data of the [Open-Sora Plan](https://github.com/PKU-YuanGroup/Open-Sora-Plan), focusing on product extraction and virtual try-on, with images having a short side ≥ 1024 pixels. [12 GB storage usage.]

Image Perception

- [coco2017_caption_canny-236k](https://huggingface.co/datasets/gebinhui/coco2017_caption_canny): img->canny & canny->img [25 GB storage usage.]
- [coco2017_caption_depth-236k](https://huggingface.co/datasets/gebinhui/coco2017_caption_depth): img->depth & depth->img [8 GB storage usage.]
- [coco2017_caption_hed-236k](https://huggingface.co/datasets/gebinhui/coco2017_caption_hed): img->hed & hed->img [13 GB storage usage.]
- [coco2017_caption_mlsd-236k](https://huggingface.co/datasets/gebinhui/coco2017_caption_mlsd): img->mlsd & mlsd->img [ GB storage usage.]
- [coco2017_caption_normal-236k](https://huggingface.co/datasets/gebinhui/coco2017_caption_normal): img->normal & normal->img [10 GB storage usage.]
- [coco2017_caption_openpose-62k](https://huggingface.co/datasets/wangherr/coco2017_caption_openpose): img->pose & pose->img [2 GB storage usage.]
- [coco2017_caption_sketch-236k](https://huggingface.co/datasets/wangherr/coco2017_caption_sketch): img->sketch & sketch->img [15 GB storage usage.]
- [unsplash_canny-20k](https://huggingface.co/datasets/wtcherr/unsplash_10k_canny): img->canny & canny->img [2 GB storage usage.]
- [open_pose-40k](https://huggingface.co/datasets/raulc0399/open_pose_controlnet): img->pose & pose->img [4 GB storage usage.]
- [mscoco-controlnet-canny-less-colors-236k](https://huggingface.co/datasets/hazal-karakus/mscoco-controlnet-canny-less-colors): img->canny & canny->img [13 GB storage usage.]
- [coco2017_seg_box-448k](https://huggingface.co/datasets/LanguageBind/UniWorld-V1/tree/main/data/coco2017_seg_box-448k): img->detection & img->segmentation (mask). Instances with regions smaller than 1/100 were filtered out. We visualise masks on the original image as the gt-image. [39 GB storage usage.]
- [viton_hd-11k](https://huggingface.co/datasets/forgeml/viton_hd): img->pose [1 GB storage usage.]
- [deepfashion-13k](https://huggingface.co/datasets/lirus18/deepfashion): img->pose [1 GB storage usage.]
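
To sanity-check a `data.txt` before training, the three-column format described under "Data preparation" can be parsed with a few lines of Python. This is a minimal sketch and not part of the UniWorld-V1 codebase: the names `DataEntry`, `parse_data_txt`, and `missing_paths` are hypothetical, and it assumes that each entry sits on its own line, that blank lines can be skipped, and that the third column is always the literal `true` or `false` (case-insensitive).

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DataEntry:
    """One line of data.txt (hypothetical helper type, not from the repo)."""
    image_root: Path        # column 1: root path to the images
    annotation_json: Path   # column 2: annotation JSON file
    region_weighting: bool  # column 3: enable the region-weighting strategy


def parse_data_txt(text: str) -> list[DataEntry]:
    """Parse the comma-separated data.txt content into DataEntry records."""
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            raise ValueError(f"line {lineno}: expected 3 columns, got {len(parts)}")
        root, annotation, flag = parts
        if flag.lower() not in ("true", "false"):
            raise ValueError(f"line {lineno}: third column must be true or false")
        entries.append(DataEntry(Path(root), Path(annotation), flag.lower() == "true"))
    return entries


def missing_paths(entries: list[DataEntry], base: Path) -> list[Path]:
    """Return every image root or annotation file that does not exist under base."""
    missing = []
    for entry in entries:
        for path in (entry.image_root, entry.annotation_json):
            if not (base / path).exists():
                missing.append(path)
    return missing


if __name__ == "__main__":
    sample = (
        "data/BLIP3o-60k,json/blip3o_t2i_58859.json,false\n"
        "data/imgedit,json/imgedit/laion_add_part0_edit.json,true\n"
    )
    for entry in parse_data_txt(sample):
        print(entry)
```

Calling `missing_paths(parse_data_txt(Path("data.txt").read_text()), Path("."))` from the download directory lists any image folder or JSON file that has not been fetched yet, which is useful when only some subsets were downloaded.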

Provider:
maas
Created:
2025-06-05