PalLoc6D - Estimating the Pose of a Euro Pallet with an RGB Camera based on Synthetic Training Data

Name: PalLoc6D - Estimating the Pose of a Euro Pallet with an RGB Camera based on Synthetic Training Data
Creator: TUHH Universitätsbibliothek
Published: 2022-09-02 10:17:21
License: No description available

Source: DataCite Commons. Updated: 2022-09-02. Indexed: 2024-07-13.

Download link:

https://tore.tuhh.de/handle/11420/13164

Description:

PalLoc6D contains 50,000 synthetically generated images of a photorealistic pallet in a domain-randomized environment, each annotated with the pallet's 6D pose. The data was created using the NVIDIA Dataset Synthesizer (NDDS, https://github.com/NVIDIA/Dataset_Synthesizer). Additionally, a photorealistic 3D model of a Euro pallet is provided. PalLoc6D can be used to train neural networks for RGB camera-based 6D pallet pose estimation, such as NVIDIA's "Deep Object Pose Estimation" (DOPE, https://github.com/NVlabs/Deep_Object_Pose). The weights of the DOPE network trained on the annotated images are also included, allowing a quick start for experimenting with 6D pose estimation. PalLoc6D was published as part of the paper "Estimating the Pose of a Euro Pallet with an RGB Camera based on Synthetic Training Data", which was presented at the WGTL Fachkolloquium 2022 in Bremen and will subsequently be published in the Logistics Journal. The purpose, creation, and validation of the dataset are further elaborated in the publication.

Paper abstract: "Estimating the pose of a pallet and other logistics objects is crucial for various use cases, such as automatized material handling or tracking. Innovations in computer vision, computing power, and machine learning open up new opportunities for device-free localization based on cameras and neural networks. Large image datasets with annotated poses are required for training the network. Manual annotation, especially of 6D poses, is an extremely labor-intensive process. Hence, newer approaches often leverage synthetic training data to automatize the process of generating annotated image datasets. In this work, the generation of synthetic training data for 6D pose estimation of pallets is presented. The data is then used to train the Deep Object Pose Estimation (DOPE) algorithm. The experimental validation of the algorithm proves that the 6D pose estimation of a standardized Euro pallet with a Red-Green-Blue (RGB) camera is feasible. The comparison of the results from three varying datasets under different lighting conditions shows the relevance of an appropriate dataset design to achieve an accurate and robust localization. The quantitative evaluation shows an average position error of less than 20 cm for the preferred dataset. The validated training dataset and a photorealistic model of a Euro pallet are publicly provided."
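The abstract reports an average position error but this listing does not say how it was computed. As a minimal sketch, assuming the error is the Euclidean distance between estimated and ground-truth pallet positions (in centimeters) and that orientations are unit quaternions in (x, y, z, w) order, such an evaluation could look like the following. The function names and sample values are hypothetical and are not taken from the paper.

```python
import math


def position_error(p_est, p_gt):
    """Euclidean distance between estimated and ground-truth positions."""
    return math.dist(p_est, p_gt)


def rotation_error_deg(q_est, q_gt):
    """Smallest rotation angle in degrees between two unit quaternions (x, y, z, w)."""
    # abs() treats q and -q as the same rotation; min() guards against rounding above 1.
    dot = min(1.0, abs(sum(a * b for a, b in zip(q_est, q_gt))))
    return math.degrees(2.0 * math.acos(dot))


def mean_position_error(pairs):
    """Average position error over (estimate, ground_truth) position pairs."""
    errors = [position_error(est, gt) for est, gt in pairs]
    return sum(errors) / len(errors)


# Hypothetical positions in centimeters, for illustration only.
pairs = [
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 0.0, 15.0)),
]
print(mean_position_error(pairs))
```

Position and rotation errors are kept separate here because they have different units; whether the publication also reports an orientation error is not stated in this listing.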

Provider:

TUHH Universitätsbibliothek

Created:

2022-08-26

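Summary

Dataset Introduction

Background and Challenges

Background Overview

PalLoc6D is a dataset of 50,000 synthetic images for training RGB camera-based 6D pallet pose estimation models. It also provides a 3D model of the pallet and pretrained weights.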
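To work with the 6D pose annotations, each frame's annotation file has to be read into a position and an orientation. The sketch below assumes an NDDS-style JSON layout with an `objects` list whose entries carry `class`, `location`, and `quaternion_xyzw` fields. That layout is an assumption and should be checked against the actual PalLoc6D files. The sample values and the `read_poses` helper are hypothetical.

```python
import json

# Hypothetical annotation for one frame; the field names are an assumed
# NDDS-style layout and the values are made up for illustration.
SAMPLE_ANNOTATION = """
{
  "objects": [
    {
      "class": "pallet",
      "location": [12.5, -3.0, 250.0],
      "quaternion_xyzw": [0.0, 0.0, 0.0, 1.0]
    }
  ]
}
"""


def read_poses(annotation_text):
    """Return one dict per annotated object with its position and orientation."""
    frame = json.loads(annotation_text)
    poses = []
    for obj in frame.get("objects", []):
        poses.append({
            "class": obj["class"],
            "position": tuple(obj["location"]),
            "quaternion_xyzw": tuple(obj["quaternion_xyzw"]),
        })
    return poses


poses = read_poses(SAMPLE_ANNOTATION)
print(poses[0]["position"])
```

Returning plain tuples keeps the example free of third-party dependencies; a training pipeline would typically convert them to arrays or tensors.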
