YCB-M: A Multi-Camera RGB-D Dataset for Object Recognition and 6DoF Pose Estimation

Mendeley Data · Updated 2024-03-27 · Indexed 2024-06-29

Download link:

https://zenodo.org/record/2579173

Description:

While a great variety of 3D cameras have been introduced in recent years, most publicly available datasets for object recognition and pose estimation focus on one single camera. This dataset consists of 32 scenes that have been captured by 7 different 3D cameras, totaling 49,294 frames. This allows evaluating the sensitivity of pose estimation algorithms to the specifics of the used camera and the development of more robust algorithms that are more independent of the camera model. Vice versa, our dataset enables researchers to perform a quantitative comparison of the data from several different cameras and depth sensing technologies and evaluate their algorithms before selecting a camera for their specific task.

The scenes in our dataset contain 20 different objects from the common benchmark YCB object and model set. We provide full ground truth 6DoF poses for each object, per-pixel segmentation, 2D and 3D bounding boxes and a measure of the amount of occlusion of each object.

If you use this dataset in your research, please cite the following publication:

T. Grenzdörffer, M. Günther, and J. Hertzberg, “YCB-M: A Multi-Camera RGB-D Dataset for Object Recognition and 6DoF Pose Estimation,” in 2020 IEEE International Conference on Robotics and Automation, ICRA 2020, Paris, France, May 31-June 4, 2020. IEEE, 2020.

    @InProceedings{Grenzdoerffer2020ycbm,
      title     = {{YCB-M}: A Multi-Camera {RGB-D} Dataset for Object Recognition and {6DoF} Pose Estimation},
      author    = {Grenzd{\"{o}}rffer, Till and G{\"{u}}nther, Martin and Hertzberg, Joachim},
      booktitle = {2020 {IEEE} International Conference on Robotics and Automation, {ICRA} 2020, Paris, France, May 31-June 4, 2020},
      year      = {2020},
      publisher = {{IEEE}}
    }

This paper is also available on arXiv: https://arxiv.org/abs/2004.11657

To visualize the dataset, follow these instructions (tested on Ubuntu Xenial 16.04):

    # IMPORTANT: the ROS setup.bash must NOT be sourced, otherwise the following error occurs:
    # ImportError: /opt/ros/kinetic/lib/python2.7/dist-packages/cv2.so: undefined symbol: PyCObject_Type

    # nvdu requires Python 3.5 or 3.6
    sudo add-apt-repository -y ppa:deadsnakes/ppa  # to get python3.6 on Ubuntu Xenial
    sudo apt-get update
    sudo apt-get install -y python3.6 libsm6 libxext6 libxrender1 python-virtualenv python-pip

    # create a new virtual environment
    virtualenv -p python3.6 venv_nvdu
    cd venv_nvdu/
    source bin/activate

    # clone our fork of NVIDIA's Dataset Utilities that incorporates some essential fixes
    pip install -e 'git+https://github.com/mintar/Dataset_Utilities.git#egg=nvdu'

    # download and transform the meshes
    # (alternatively, unzip the meshes contained in the dataset
    # to <path to venv_nvdu>/lib/python3.6/site-packages/nvdu/data/ycb/aligned_cm)
    nvdu_ycb -s

    # run nvdu_viz to visualize the dataset
    cd <a subdirectory of the YCB-M dataset with some frames>
    nvdu_viz --name_filters '*.jpg'

For further details, see README.md.
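The visualization instructions warn that the ROS setup.bash must not be sourced. A quick pre-flight check for this could look like the sketch below. It assumes a standard ROS 1 install, where sourcing setup.bash sets `ROS_DISTRO` and adds an `/opt/ros/...` entry to `PYTHONPATH`. The function name `ros_env_active` is hypothetical and not part of nvdu or ROS.

```shell
#!/usr/bin/env bash
# Sketch: detect whether a ROS environment appears to be active in this shell.
# Assumption: sourcing ROS's setup.bash sets ROS_DISTRO and prepends an
# /opt/ros/... directory to PYTHONPATH (standard ROS 1 behaviour).
# ros_env_active is a hypothetical helper name.
ros_env_active() {
    # Indicator 1: ROS_DISTRO is set to a non-empty value.
    if [ -n "${ROS_DISTRO:-}" ]; then
        return 0
    fi
    # Indicator 2: some PYTHONPATH entry starts with /opt/ros/.
    case ":${PYTHONPATH:-}:" in
        *:/opt/ros/*) return 0 ;;
    esac
    return 1
}

if ros_env_active; then
    echo "ROS environment detected; use a fresh shell without setup.bash before running nvdu_viz." >&2
else
    echo "No ROS environment detected."
fi
```

If the check reports a ROS environment, opening a new terminal whose startup files do not source setup.bash is usually simpler than trying to unset the individual variables.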

Created:

2023-06-28

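Collected summary

Dataset introduction

Background and challenges

Background overview

YCB-M is a multi-camera RGB-D dataset designed for object recognition and 6DoF pose estimation. It contains 32 scenes captured by 7 different 3D cameras, totaling 49,294 frames, and is intended for evaluating how sensitive algorithms are to camera characteristics and for improving their robustness. The dataset covers 20 objects from the YCB benchmark and provides full ground truth 6DoF poses, per-pixel segmentation, bounding boxes, and occlusion measures, making it suitable for multi-camera comparison and algorithm development.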
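To sanity-check a local copy against the frame total quoted in this listing, one could count color images per directory. This is a minimal sketch, assuming each frame has exactly one `.jpg` color image. The `--name_filters '*.jpg'` usage suggests this, but the listing does not confirm it. The dataset root `./ycbm` and the helper name `count_frames` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count color frames below a directory.
# Assumption: every frame has exactly one .jpg color image.
# count_frames is a hypothetical helper name.
count_frames() {
    # $1: directory to search recursively; prints the number of .jpg files
    find "$1" -type f -name '*.jpg' | wc -l | tr -d ' '
}

# Hypothetical dataset root; adjust to where the archive was unpacked.
dataset_root="./ycbm"
if [ -d "$dataset_root" ]; then
    for dir in "$dataset_root"/*/; do
        [ -d "$dir" ] || continue   # skip the literal pattern if nothing matched
        printf '%s\t%s\n' "$(count_frames "$dir")" "$dir"
    done
    echo "total: $(count_frames "$dataset_root")"
fi
```

A total that differs from the published figure may simply reflect a partial download or a different file layout, so the count is a hint rather than a validation.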

The content above was collected and summarized by 遇见数据集.