HANDS: a dataset of static Hand-Gestures for Human-Robot Interaction

Mendeley Data, indexed 2026-04-18

Download link:

https://data.mendeley.com/datasets/ndrczc35bt

Description:

The HANDS dataset was created for human-robot interaction research and consists of spatially and temporally aligned RGB and depth frames. It contains 12 static single-hand gestures, each performed with both the right hand and the left hand, and 3 static two-hand gestures, for a total of 29 unique classes. Five subjects (2 females and 3 males) performed the gestures, each against a different background and under different lighting conditions. For each subject, 150 RGB frames and the corresponding 150 depth frames were collected per gesture, for a total of 2400 RGB frames and 2400 depth frames per subject.

The data was collected with a Kinect v2 camera that was intrinsically calibrated to spatially align the RGB data with the depth data. Temporal alignment was performed offline in MATLAB, pairing frames with a maximum temporal distance of 66 ms. We provide our MATLAB scripts to process similar rosbags and align the streams, process a MATLAB Labeling Session, and create the same Annotation files we provide. For users who want to use the annotated data for research, we also provide a Python script showing how to convert the Annotation files into a TensorFlow record.

The data is valuable for the field of computer vision, especially for hand-gesture recognition, human-machine interaction, and hand-pose recognition. The dataset can be used to train deep learning models to recognize the gestures using a single modality (RGB or depth) or both at the same time. It is also useful as a reference dataset for benchmarking models.

If you use this dataset in your work, please cite the related papers:
- https://doi.org/10.1016/j.dib.2021.106791
- https://doi.org/10.1016/j.rcim.2020.102085
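
The temporal alignment mentioned in the description (pairing RGB and depth frames that are at most 66 ms apart) can be illustrated with a small Python sketch. This is not the authors' MATLAB script: the function name `align_streams`, the nearest-neighbour pairing rule, and the assumption that timestamps arrive as sorted lists of seconds are all hypothetical choices made for illustration. Only the 66 ms threshold comes from the description.

```python
from bisect import bisect_left

# 66 ms is the maximum temporal distance stated in the dataset description.
MAX_GAP_S = 0.066


def align_streams(rgb_stamps, depth_stamps, max_gap=MAX_GAP_S):
    """Pair each RGB timestamp with its nearest depth timestamp.

    Both inputs are assumed to be sorted lists of timestamps in seconds
    (an assumption of this sketch, not a documented format).
    Returns (rgb_index, depth_index) pairs whose timestamps differ by
    at most max_gap; RGB frames with no close enough depth frame are dropped.
    """
    pairs = []
    for i, t in enumerate(rgb_stamps):
        pos = bisect_left(depth_stamps, t)
        # The nearest depth frame is either just before or just after t.
        candidates = [j for j in (pos - 1, pos) if 0 <= j < len(depth_stamps)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(depth_stamps[j] - t))
        if abs(depth_stamps[best] - t) <= max_gap:
            pairs.append((i, best))
    return pairs


if __name__ == "__main__":
    rgb = [0.000, 0.033, 0.200]
    depth = [0.010, 0.040, 0.100]
    # The third RGB frame is 100 ms from the closest depth frame, so it is dropped.
    print(align_streams(rgb, depth))
```

Note that this simple rule may assign the same depth frame to more than one RGB frame; a stricter one-to-one pairing would need an extra check, which is left out here to keep the sketch short.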

Created:

2021-03-08
