Dataset Diffusion

Name: Dataset Diffusion
Creator: VinAI Research
Published: 2023-11-13 13:11:52
License: No description available

arXiv | Updated: 2023-11-13 | Indexed: 2024-06-21

Download link:

https://github.com/VinAIResearch/Dataset-Diffusion

Resource overview:

Dataset Diffusion is a synthetic dataset developed by VinAI Research for training semantic segmentation models. It contains 40,000 pairs of images and segmentation masks, each a high-fidelity generated image with pixel-level semantic labels, and it targets applications such as autonomous driving, scene understanding, and object recognition. The images are produced with the text-to-image Stable Diffusion model, which yields complex, realistic scenes together with their segmentation masks and so reduces the dependence on manual annotation. Generation is guided by specific text prompts and by the model's attention mechanisms, which steer both the images and the masks; the resulting pairs are then used to train and evaluate semantic segmentation models.
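As a rough illustration of how attention scores could be turned into a segmentation mask, the sketch below assigns each pixel the class with the highest score and falls back to background when no score is strong enough. This is not the authors' pipeline: the function name attention_to_mask, the threshold value, and the toy per-class maps are all assumptions made for the example, and a real system would read its attention maps from the diffusion model rather than from hand-written lists.

```python
from typing import List

Grid = List[List[float]]


def attention_to_mask(class_maps: List[Grid], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-class attention maps into a label mask (hypothetical helper).

    class_maps[k][y][x] is an assumed normalized score in [0, 1] for
    class k + 1 at pixel (y, x). Label 0 is reserved for background.
    """
    if not class_maps:
        raise ValueError("need at least one class map")
    height, width = len(class_maps[0]), len(class_maps[0][0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            scores = [m[y][x] for m in class_maps]
            # Index of the strongest class at this pixel.
            best = max(range(len(scores)), key=scores.__getitem__)
            # Keep the class only if its score clears the (assumed) threshold.
            if scores[best] >= threshold:
                mask[y][x] = best + 1
    return mask


# Toy 2x3 maps for two made-up classes (1 = "cat", 2 = "dog").
cat_map = [[0.9, 0.8, 0.1], [0.7, 0.2, 0.1]]
dog_map = [[0.1, 0.1, 0.2], [0.2, 0.6, 0.9]]
print(attention_to_mask([cat_map, dog_map]))
```

Raising the threshold leaves more pixels as background, trading coverage for fewer spurious labels.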

Provided by:

VinAI Research

Created:

2023-09-26
