Dataset with RGB-D Videos and 3D Hand Pose Annotations
Payititi (帕依提提), indexed 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-690.html
Resource description:
In this work, we study the use of 3D hand poses to recognize first-person dynamic hand actions that involve interaction with 3D objects. To this end, we collected RGB-D video sequences comprising 100K frames across more than 40 daily hand action categories, involving 26 different objects in several hand configurations. To obtain hand pose annotations, we used our own motion capture (mo-cap) system, which automatically infers the 3D position of each of the 21 joints of a hand model from six magnetic sensors and inverse kinematics. In addition, we recorded 6D object poses and provide 3D object models for a subset of the hand-object interaction sequences. To the best of our knowledge, this is the first benchmark that enables the study of first-person hand actions using 3D hand poses. We present an extensive experimental evaluation of RGB-D and pose-based action recognition with 18 baseline and state-of-the-art methods. We measure the impact of using appearance features, poses, and their combination, and we evaluate different training/testing protocols. Finally, we assess how ready the 3D hand pose estimation field is when hands are severely occluded by objects in egocentric views, and how such occlusion affects action recognition. The results show clear benefits of using hand pose as a cue for action recognition compared with other data modalities. Our dataset and experiments may be of interest to the 3D hand pose estimation, 6D object pose, robotics, and action recognition communities.

Citation:
@InProceedings{FirstPersonAction_CVPR2018,
  title = {First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations},
  author = {Garcia-Hernando, Guillermo and Yuan, Shanxin and Baek, Seungryul and Kim, Tae-Kyun},
  booktitle = {Proceedings of Computer Vision and Pattern Recognition ({CVPR})},
  year = {2018}
}
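The annotation layout described above (21 hand joints with 3D positions per frame, plus an optional 6D object pose) can be pictured with a small sketch. This is only an illustration under stated assumptions: the class `FrameAnnotation`, its field names, the axis-angle rotation encoding, and the label `"example_action"` are all hypothetical and do not reflect the dataset's actual file format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NUM_JOINTS = 21  # joints in the hand model, as stated in the description

Vec3 = Tuple[float, float, float]


@dataclass
class FrameAnnotation:
    """Hypothetical per-frame record; not the dataset's actual file format."""
    action: str                                # action category label
    hand_joints: List[Vec3]                    # 21 joint positions (x, y, z)
    object_rotation: Optional[Vec3] = None     # assumed axis-angle rotation
    object_translation: Optional[Vec3] = None  # assumed translation

    def __post_init__(self) -> None:
        # Reject records that do not carry exactly 21 joints.
        if len(self.hand_joints) != NUM_JOINTS:
            raise ValueError(
                f"expected {NUM_JOINTS} joints, got {len(self.hand_joints)}"
            )

    def has_object_pose(self) -> bool:
        # Assumption: the object pose may be absent for some frames.
        return (
            self.object_rotation is not None
            and self.object_translation is not None
        )

    def pose_vector(self) -> List[float]:
        # Flatten 21 x 3 coordinates into one 63-value feature vector.
        return [c for joint in self.hand_joints for c in joint]


frame = FrameAnnotation(
    action="example_action",
    hand_joints=[(0.0, 0.0, 0.0)] * NUM_JOINTS,
)
print(len(frame.pose_vector()))  # 63
print(frame.has_object_pose())   # False
```

A flattened 63-value vector of this kind is one plausible input for a pose-based action recognizer; the six degrees of freedom of the object pose are split here into three rotation and three translation values.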
Provider:
Payititi (帕依提提)
Collected summary
Dataset introduction

Background and challenges
Background overview
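This is an RGB-D video dataset focused on first-person dynamic hand actions that involve interaction with 3D objects. It contains 100K frames across more than 40 daily action categories, involving 26 different objects and multiple hand configurations. It provides 3D hand pose annotations (21 joint positions inferred automatically by a mo-cap system), 6D object poses, and 3D object models for part of the data. It is described as the first benchmark for studying first-person hand actions with 3D hand poses, and it is suited to action recognition, 3D hand pose estimation, and robotics.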
The summary above was collected and generated by 遇见数据集.



