Hand Gesture Recognition Dataset: Static & Dynamic Landmarks
Source: DataCite Commons. Updated 2026-05-05; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20032887
Description:
Overview
This repository contains a dataset for training and evaluating hand gesture recognition models that operate on anatomical hand landmarks. It targets workflows that combine hybrid neural network architectures with real-time motion capture, for human-computer interaction in 3D virtual environments.
Data Structure
The dataset comprises 30,027 records in total: 15,028 static gestures (postures) and 14,999 dynamic gestures (motion sequences). It is distributed across the following four files:
1. Static Gestures
Ideal for posture classification tasks using dense neural networks, such as Multilayer Perceptron (MLP) architectures.
static_gestures_v3.csv: Contains 15,028 samples. Each row consists of 43 columns; the first corresponds to the numerical index of the label, and the remaining 42 represent the normalized spatial coordinates of the hand landmarks.
static_gestures_label.csv: A dictionary mapping numerical indices to the 5 available static gesture classes:
Open
Close
Pointer
Ok
Nice
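As a rough illustration of the row layout described above, the following sketch parses static samples into labels and 42-value feature vectors using only the Python standard library. Several details are assumptions rather than facts from the dataset: that the CSV has no header row, that the label file holds one class name per line, and that a name's line position is its numeric index. The helper names `load_static_rows` and `load_label_names` are hypothetical.

```python
import csv
import io

NUM_FEATURES = 42  # 43 columns per row: 1 label index + 42 coordinates


def load_static_rows(handle):
    """Parse rows of `label, 42 coordinates` into (labels, features).

    Assumption: the file has no header row.
    """
    labels, features = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if len(row) != NUM_FEATURES + 1:
            raise ValueError(
                f"expected {NUM_FEATURES + 1} columns, got {len(row)}"
            )
        labels.append(int(row[0]))
        features.append([float(value) for value in row[1:]])
    return labels, features


def load_label_names(handle):
    """Assumption: one class name per line; line position = class index."""
    return [row[0].strip() for row in csv.reader(handle) if row]


# Tiny in-memory stand-ins for the real files.
sample = io.StringIO("2," + ",".join(["0.5"] * NUM_FEATURES) + "\n")
names = io.StringIO("Open\nClose\nPointer\nOk\nNice\n")

labels, features = load_static_rows(sample)
class_names = load_label_names(names)
print(class_names[labels[0]], len(features[0]))  # Pointer 42
```

With the real files, the same functions could be called on handles opened with `open("static_gestures_v3.csv", newline="")` and `open("static_gestures_label.csv", newline="")`, provided the assumptions above hold.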
2. Dynamic Gestures
Designed for sequential spatial analysis using recurrent neural networks, such as Long Short-Term Memory (LSTM) architectures.
dynamic_gesturesV3.csv: Contains 14,999 samples. Each row consists of 33 columns; the first indicates the movement class, and the following 32 capture the landmark information associated with the gesture's variation or displacement over time.
dynamic_gestures_label.csv: A dictionary defining the 5 directional and motion-control gesture classes:
Stop
Left
Right
Up
Down
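Recurrent models usually expect each sample as a sequence of time steps rather than a flat vector. The sketch below regroups the 32 feature columns into steps. The grouping is an assumption: the description does not say how the 32 values are ordered, so this example treats them as 16 consecutive (x, y) pairs. The helper names `to_sequence` and `load_dynamic_rows` are hypothetical, and the absence of a header row is also assumed.

```python
import csv
import io

NUM_FEATURES = 32      # 33 columns per row: 1 class index + 32 features
COORDS_PER_STEP = 2    # ASSUMPTION: each time step is one (x, y) pair


def to_sequence(flat, coords_per_step=COORDS_PER_STEP):
    """Regroup a flat feature list into a list of per-step coordinate lists."""
    if len(flat) % coords_per_step != 0:
        raise ValueError("feature count is not a multiple of the step size")
    return [
        flat[i:i + coords_per_step]
        for i in range(0, len(flat), coords_per_step)
    ]


def load_dynamic_rows(handle):
    """Parse rows of `label, 32 features` into (labels, sequences)."""
    labels, sequences = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if len(row) != NUM_FEATURES + 1:
            raise ValueError(
                f"expected {NUM_FEATURES + 1} columns, got {len(row)}"
            )
        labels.append(int(row[0]))
        sequences.append(to_sequence([float(value) for value in row[1:]]))
    return labels, sequences


# Tiny in-memory stand-in for dynamic_gesturesV3.csv.
values = [str(i / 100) for i in range(NUM_FEATURES)]
sample = io.StringIO("1," + ",".join(values) + "\n")

labels, sequences = load_dynamic_rows(sample)
print(labels[0], len(sequences[0]), len(sequences[0][0]))  # 1 16 2
```

If the real column order differs (for example, all X values followed by all Y values), only `to_sequence` would need to change.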
Technical Considerations & Preprocessing
Algorithmic Dimensionality (X, Y): The features in the main files contain only the planar X and Y coordinates of each landmark.
Z-Coordinate Handling: The Z-coordinate (depth) has been deliberately excluded from this dataset because it does not improve the classification models' accuracy in this context. If the resulting models are deployed in a graphics engine (such as Unity), the Z-coordinate captured by the hardware should be isolated and passed only to the user interface layer, where it supports the visual representation of the hand (as a mirror effect). It must not be fed to the neural network as an input feature.
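One way to keep depth out of the model at inference time is to split each captured frame into two streams before anything reaches the network. The sketch below is illustrative only: `split_landmarks` is a hypothetical helper, the input format of (x, y, z) tuples is assumed, and the count of 21 landmarks is inferred from 42 coordinates divided by 2 rather than stated in the description.

```python
from typing import List, Sequence, Tuple


def split_landmarks(
    landmarks: Sequence[Tuple[float, float, float]],
) -> Tuple[List[float], List[float]]:
    """Separate planar model input from depth values.

    Returns (model_input, depth), where model_input is the flat
    [x0, y0, x1, y1, ...] list for the classifier and depth holds the
    z values intended only for the rendering / UI layer.
    """
    model_input: List[float] = []
    depth: List[float] = []
    for x, y, z in landmarks:
        model_input.extend((x, y))  # only X and Y reach the network
        depth.append(z)             # Z is kept apart for visualization
    return model_input, depth


# ASSUMPTION: 21 landmarks per hand (42 coordinates / 2).
hand = [(0.1 * i, 0.2 * i, -0.01 * i) for i in range(21)]
model_input, depth_for_ui = split_landmarks(hand)
print(len(model_input), len(depth_for_ui))  # 42 21
```

Keeping the split in a single function makes it harder for depth values to leak into the feature vector by accident.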
Recommended Use Cases
This dataset has been structured to facilitate the development of:
Hybrid Recognition Systems: Parallel models where an MLP detects static intentions and fixed commands, while an LSTM network processes dynamic navigation commands.
Virtual and Immersive Reality Control: Translating model predictions into interactive events, navigation commands, and object manipulation within interactive 3D scenarios.
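The hybrid arrangement described above can be sketched as a small dispatcher that queries a static model and a dynamic model for the same frame and merges the results into one event. Everything here is a stand-in: `recognize`, `fake_mlp`, and `fake_lstm` are hypothetical names, the models are plain callables rather than trained networks, the two calls run sequentially rather than in parallel, and the class-index order is assumed to follow the order in which the classes are listed in this description.

```python
from typing import Callable, Dict, List, Sequence

# ASSUMPTION: index order matches the listing order in the description.
STATIC_CLASSES = ["Open", "Close", "Pointer", "Ok", "Nice"]
DYNAMIC_CLASSES = ["Stop", "Left", "Right", "Up", "Down"]


def recognize(
    static_input: Sequence[float],
    dynamic_input: Sequence[Sequence[float]],
    static_model: Callable[[Sequence[float]], int],
    dynamic_model: Callable[[Sequence[Sequence[float]]], int],
) -> Dict[str, str]:
    """Query both models and return a combined event for the application."""
    posture = STATIC_CLASSES[static_model(static_input)]
    motion = DYNAMIC_CLASSES[dynamic_model(dynamic_input)]
    return {"posture": posture, "motion": motion}


def fake_mlp(features: Sequence[float]) -> int:
    """Stand-in for a trained MLP; always predicts class index 2."""
    return 2


def fake_lstm(sequence: Sequence[Sequence[float]]) -> int:
    """Stand-in for a trained LSTM; always predicts class index 1."""
    return 1


static_features: List[float] = [0.0] * 42
motion_sequence: List[List[float]] = [[0.0, 0.0] for _ in range(16)]

event = recognize(static_features, motion_sequence, fake_mlp, fake_lstm)
print(event)  # {'posture': 'Pointer', 'motion': 'Left'}
```

In an interactive 3D application, the returned event could then be mapped to navigation or object-manipulation handlers; that mapping is application-specific and is not defined by the dataset.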
Provider:
Zenodo
Created:
2026-05-05



