hamnaanaa/Duckietown-Multiclass-Semantic-Segmentation-Dataset

Name: hamnaanaa/Duckietown-Multiclass-Semantic-Segmentation-Dataset
Creator: hamnaanaa
Published: 2023-01-25 16:03:13
License: no description provided

Hugging Face · Updated 2023-01-25 · Indexed 2024-03-04

Download link:

https://hf-mirror.com/datasets/hamnaanaa/Duckietown-Multiclass-Semantic-Segmentation-Dataset


Resource description:

---
license: openrail
task_categories:
- image-segmentation
tags:
- Duckietown
- Lane Following
- Autonomous Driving
pretty_name: Duckietown Multiclass Semantic Segmentation Dataset
size_categories:
- n<1K
---

# Multiclass Semantic Segmentation Duckietown Dataset

A dataset of multiclass semantic segmentation image annotations for the first 250 images of the ["Duckietown Object Detection Dataset"](https://docs.duckietown.org/daffy/AIDO/out/object_detection_dataset.html).

| Raw Image | Segmented Image |
| --- | --- |
| <img width="915" alt="raw_image" src="https://user-images.githubusercontent.com/42655977/211690204-301193c3-a651-4a3a-bd66-6458cf3a8778.png"> | <img width="915" alt="segmentation_mask" src="https://user-images.githubusercontent.com/42655977/211690212-2c9ca63a-f3ae-4d65-a4e0-ea76b20a616f.png"> |

# Semantic Classes

This dataset defines 8 semantic classes (7 distinct classes + implicit background class):

| Class | XML Label | Description | Color (RGB) |
| --- | --- | --- | --- |
| Ego Lane | `Ego Lane` | The lane the agent is supposed to be driving in (default right-hand traffic assumed) | `[102,255,102]` |
| Opposite Lane | `Opposite Lane` | The lane opposite to the one the agent is supposed to be driving in (default right-hand traffic assumed) | `[245,147,49]` |
| Road End | `Road End` | Perpendicular red indicator found in Duckietown indicating the end of the road or the beginning of an intersection | `[184,61,245]` |
| Intersection | `Intersection` | Road tile with no lane markings that has either 3 (T-intersection) or 4 (X-intersection) adjacent road tiles | `[50,183,250]` |
| Middle Lane | `Middle Lane` | Broken yellow lane in the middle of the road separating lanes | `[255,255,0]` |
| Side Lane | `Side Lane` | Solid white lane marking the road boundary | `[255,255,255]` |
| Background | `Background` | Unclassified | - (implicit class) |

### **Notice**:

(1) The color assignment is purely a suggestion: the color information encoded in the annotation file is not used by `cvat_preprocessor.py` and can therefore be overwritten by any other mapping. The mapping is listed here for explanation and consistency, as it is the one used in `dataloader.py` (see [Usage](#usage) for more information).

(2) `[Ego Lane, Opposite Lane, Intersection]` are three semantic classes for essentially the same road tiles. The three classes were added to provide more information for some use cases. Keep in mind that some semantic segmentation neural networks have a hard time learning the difference between these classes, which leads to poor performance on detecting them. In that case, treating these three classes as one *"Road"* class helps improve segmentation performance.

(3) The `Middle Lane` and `Side Lane` classes were added later, so only the first 125 images are annotated with them. If you want to use these classes, use the `segmentation_annotation.xml` annotation file. Otherwise, `segmentation_annotation_old.xml` stores 250 images (including the 125 images from the other annotation file) but without these two classes.

(4) `Background` is a special semantic class because it is not stored in the annotation file. This class is assigned to all pixels that don't have any other class (see `dataloader.py` for a reference solution).

# Usage

Due to the rather large size of the original dataset *(~750MB)*, this repository only contains the annotation file stored in `CVAT for Images 1.1` format as well as two Python files:

- `cvat_preprocessor.py`: A collection of helper functions to read the annotation file and extract the annotation masks stored as polygons.
- `dataloader.py`: A [_PyTorch_](https://pytorch.org)-specific example implementation of a wrapper-dataset to use with PyTorch machine learning models.
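To make the `CVAT for Images 1.1` layout more concrete, here is a minimal sketch of reading polygon annotations with the Python standard library. It is not the repository's `cvat_preprocessor.py`. The function name `parse_polygons` and the inline sample XML are hypothetical, and the sketch assumes each `<image>` element holds `<polygon>` children with a `label` attribute and a `points` attribute of the form `x1,y1;x2,y2;...`.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample in the CVAT for Images 1.1 layout (assumption:
# real annotation files follow the same <image>/<polygon> structure).
SAMPLE_XML = """<annotations>
  <image id="0" name="frame_0.png" width="8" height="6">
    <polygon label="Ego Lane" points="4.0,0.0;8.0,0.0;8.0,6.0;4.0,6.0" occluded="0"/>
    <polygon label="Opposite Lane" points="0.0,0.0;4.0,0.0;4.0,6.0;0.0,6.0" occluded="0"/>
  </image>
</annotations>"""


def parse_polygons(xml_text):
    """Return {image_name: [(label, [(x, y), ...]), ...]} from CVAT XML text."""
    root = ET.fromstring(xml_text)
    result = {}
    for image in root.iter("image"):
        shapes = []
        for poly in image.iter("polygon"):
            # "x1,y1;x2,y2" -> [(x1, y1), (x2, y2)]
            points = [
                tuple(float(v) for v in pair.split(","))
                for pair in poly.get("points").split(";")
            ]
            shapes.append((poly.get("label"), points))
        result[image.get("name")] = shapes
    return result


polygons = parse_polygons(SAMPLE_XML)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`; the rest of the traversal would stay the same.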


Provider:

hamnaanaa

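Summary of original information

Multiclass Semantic Segmentation Duckietown Dataset overview

Basic dataset information

License: openrail
Task category: image segmentation
Tags: Duckietown, Lane Following, Autonomous Driving
Pretty name: Duckietown Multiclass Semantic Segmentation Dataset
Size category: n<1K

Dataset contents

Number of images: 250
Data source: the first 250 images of the "Duckietown Object Detection Dataset"
Image types: raw images and segmented images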

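Semantic classes

Number of classes: 8 (7 explicit classes + implicit background class)

Class details:

Class	XML label	Description	Color (RGB)
Ego Lane	`Ego Lane`	The lane the agent should drive in (right-hand traffic assumed by default)	`[102,255,102]`
Opposite Lane	`Opposite Lane`	The lane opposite to the agent's lane (right-hand traffic assumed by default)	`[245,147,49]`
Road End	`Road End`	Perpendicular red indicator marking the end of the road or the start of an intersection	`[184,61,245]`
Intersection	`Intersection`	Road tile with no lane markings, adjacent to 3 or 4 road tiles	`[50,183,250]`
Middle Lane	`Middle Lane`	Broken yellow line in the middle of the road, separating lanes	`[255,255,0]`
Side Lane	`Side Lane`	Solid white line marking the road boundary	`[255,255,255]`
Background	`Background`	Unclassified	- (implicit class)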
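The color column in the table above can be turned into a lookup for visualizing masks. The sketch below is one way to do that in plain Python. The index order, the black color for `Background`, and the helper name `colorize` are assumptions of this example, not something the dataset prescribes.

```python
# RGB colors copied from the table above. The index order (dict order) is an
# assumption of this sketch, and so is the black color for Background, since
# the table lists no color for the implicit class.
CLASS_COLORS = {
    "Background": (0, 0, 0),
    "Ego Lane": (102, 255, 102),
    "Opposite Lane": (245, 147, 49),
    "Road End": (184, 61, 245),
    "Intersection": (50, 183, 250),
    "Middle Lane": (255, 255, 0),
    "Side Lane": (255, 255, 255),
}
CLASS_NAMES = list(CLASS_COLORS)  # position in this list = class index


def colorize(index_mask):
    """Map a 2D list of class indices to a 2D list of RGB tuples."""
    palette = [CLASS_COLORS[name] for name in CLASS_NAMES]
    return [[palette[idx] for idx in row] for row in index_mask]


# Tiny 2x2 index mask: Background, Ego Lane / Middle Lane, Side Lane.
colored = colorize([[0, 1], [5, 6]])
```

Because the colors are only a suggestion, any other mapping can be swapped in by editing `CLASS_COLORS`.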

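Notes

Color assignment: a suggestion only; it can be overridden in practice.
Class distinction: Ego Lane, Opposite Lane, and Intersection can in some cases be treated as a single "Road" class to improve segmentation performance.
Middle Lane and Side Lane: only the first 125 images carry annotations for these two classes.
Background class: not stored in the annotation file; it is assigned automatically to pixels that have no other class.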
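The class-merging and background notes above can be sketched as a simple label remapping. This is an illustration only, not the logic in `dataloader.py`. The helper names are hypothetical, and the sketch assumes unlabeled pixels are represented as `None` before remapping.

```python
# Classes that the notes suggest may be merged into one "Road" class.
ROAD_MEMBERS = {"Ego Lane", "Opposite Lane", "Intersection"}


def merge_label(label):
    """Collapse road-surface classes into 'Road'; unlabeled (None) becomes 'Background'."""
    if label is None:
        return "Background"
    return "Road" if label in ROAD_MEMBERS else label


def merge_mask(label_mask):
    """Apply merge_label to every pixel of a 2D list of labels."""
    return [[merge_label(label) for label in row] for row in label_mask]


# Hypothetical 2x3 label mask; None marks pixels not covered by any polygon.
merged = merge_mask([
    ["Ego Lane", "Opposite Lane", None],
    ["Intersection", "Road End", "Side Lane"],
])
```

Keeping the merge as a separate step leaves the parsed annotations untouched, so the same data can be used with either the detailed or the merged class set.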

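Tools

cvat_preprocessor.py: a collection of helper functions for reading the annotation file and extracting the polygon annotation masks.
dataloader.py: a PyTorch-specific example implementation of a wrapper dataset for use with PyTorch machine learning models.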

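Collected summary

Dataset introduction

Construction

The Duckietown multiclass semantic segmentation dataset is built on the first 250 images of the Duckietown Object Detection Dataset. Annotations are stored in the CVAT for Images 1.1 format, with each semantic region outlined as a polygon. Annotation proceeded in two stages. The first stage labeled Ego Lane, Opposite Lane, Road End, and Intersection across all 250 images, with Background left implicit. A later stage added Middle Lane and Side Lane, but only for the first 125 images. As a result, one annotation file covers the full image set and the other covers the full class set.

Features

The dataset's main feature is its fine-grained class definition: eight semantic classes, seven of them explicit and Background implicit. Ego Lane, Opposite Lane, and Intersection all describe road surface, but keeping them separate provides extra information for applications that need it. Two annotation files are provided: the old version, which covers all 250 images but omits Middle Lane and Side Lane, and the new version, which covers the first 125 images with all classes. Researchers can pick the version that suits their model complexity or task.

Usage

The dataset ships with two Python helper scripts. cvat_preprocessor.py contains helper functions that read the CVAT-format annotation file and extract the annotation masks stored as polygons. dataloader.py is a PyTorch-based example that shows how to wrap the data as a dataset for machine learning models. Users can adjust the color mapping or the class-merging strategy as needed, for example by merging Ego Lane, Opposite Lane, and Intersection into a single Road class when a model struggles to separate them. The repository does not include the raw images, so the annotation files must be paired with the original dataset to build a complete data pipeline.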
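Turning polygon annotations into a per-pixel class mask, with uncovered pixels falling back to Background, can be sketched in pure Python as follows. This is not the code in `cvat_preprocessor.py` or `dataloader.py`. The function names are hypothetical, pixels are sampled at their centers, later polygons overwrite earlier ones, and index 0 is assumed to mean Background. A real pipeline would more likely use an image library for rasterization.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(shapes, width, height, class_index):
    """Build a height x width mask of class indices; 0 is the implicit Background."""
    mask = [[0] * width for _ in range(height)]
    for label, polygon in shapes:
        idx = class_index[label]
        for row in range(height):
            for col in range(width):
                # Sample each pixel at its center.
                if point_in_polygon(col + 0.5, row + 0.5, polygon):
                    mask[row][col] = idx
    return mask


# Hypothetical example: one square Ego Lane polygon on a 4x3 image.
shapes = [("Ego Lane", [(2, 0), (4, 0), (4, 2), (2, 2)])]
mask = rasterize(shapes, width=4, height=3, class_index={"Ego Lane": 1})
```

The nested loops cost O(width × height × polygons), which is fine for illustration but slow for full-size images.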

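Background and challenges

Background overview

The Duckietown-Multiclass-Semantic-Segmentation-Dataset was published on Hugging Face by hamnaanaa in January 2023 to give small-scale autonomous driving platforms fine-grained semantic segmentation data. It builds on the first 250 images of the Duckietown Object Detection Dataset and uses multiclass annotation to address environment perception, in particular the recognition of lanes, intersections, and road markings in structured road scenes. Its core aim is to improve an autonomous driving system's semantic understanding of the road, and it serves as an experimental resource for training and validating algorithms in low-cost, education-oriented autonomous driving research.

Current challenges

The dataset's challenges fall into two groups. At the problem level, semantic segmentation for driving must cope with high inter-class similarity: Ego Lane, Opposite Lane, and Intersection cover essentially the same road surface, which makes them hard for neural networks to tell apart and lowers segmentation accuracy. At the construction level, Middle Lane and Side Lane were added later, so only part of the images carry the full class set, which introduces inconsistency. In addition, masks must be extracted from polygons, and the implicit Background class needs separate handling, both of which add steps to the data pipeline.

Common scenarios

Typical use

Semantic segmentation is a core part of environment perception in autonomous driving. With its eight annotated semantic classes, this dataset can serve as a test bed for training and evaluating deep learning models, convolutional neural networks in particular, on recognizing key road elements such as the ego lane, the opposite lane, intersections, and lane markings. The polygon annotations can be converted to pixel-level masks, which lets a model learn to parse road structure from raw images and supply visual input for downstream path planning and decision making.

Academic problems addressed

The dataset targets several challenges in the perception module of autonomous driving research. Its multiclass, fine-grained road annotations support work on semantic understanding of road scenes. Specifically, it lets researchers study how well models separate similar classes (Ego Lane, Opposite Lane, and Intersection) and how to learn from small samples or imbalanced classes. This supports research on the generalization and robustness of segmentation models in specific, structured scenes.

Derived work

The aggregator's summary states that the dataset has given rise to follow-up work, without citing specific publications. According to that summary, such work focuses on improving segmentation network architectures, for example with loss functions or data augmentation strategies suited to high class similarity and limited sample size, and on using the dataset as a benchmark to compare the efficiency and accuracy of segmentation models in structured road scenes, with a trend toward lighter and more efficient models.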

The content above was collected and summarized by 遇见数据集.
