HPM dataset: a dataset of recorded Hand Palm Motion gestures
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/15020057
Resource description:
Background & motivation
This upload consists of a dataset of recorded Hand Palm Motion (HPM) gestures. The motions in this HPM dataset involve rotational movements, translational movements, or a combination of both, performed by the palm of the hand. These motions were designed for the frame-invariant gesture recognition Proof of Concept (PoC), shown in the following video. In this PoC, the goal is to control the movement of the end-effector and gripper fingers of a manipulator arm through hand palm motion gestures. Specifically, the aim is to maintain high recognition performance under significant variations in both the tracker reference frame and the sensor reference frame.
Gesture design
Seven hand palm motion gestures were designed. These gestures are explained below. A figure (gestures.svg) of these gestures is also included.
a) Go Left: move the hand along a straight line.
b) Go Right: trace a circular path with the hand while allowing a slight rotation of the forearm.
c) Go Up: move the hand upward while rotating the forearm.
d) Go Down: rotate the hand so the palm transitions from facing upward to facing downward.
e) Open Gripper: starting from a downward-facing palm, move the hand forward while rotating it such that the palm faces upward (this generates a screw motion with a positive pitch).
f) Close Gripper: starting from an upward-facing palm, move the hand forward while rotating it such that the palm faces downward (this generates a screw motion with a negative pitch).
g) Go Home: trace an arc by translating the hand.
These hand palm motion gestures are easy to perform, which ensures accessibility for users. Additionally, the gestures were carefully designed such that distinguishing the gestures does not rely on specific coordinate reference frames or directions. That is, the shape of the motion (rectilinear or circular translation, pure rotation, pure screw motion, etc.) contains sufficient information for this distinction.
To reduce transition effects between gestures that are performed successively, all gestures except Go Right were designed to consist of the motion explained above followed by its reverse motion. Hence, after each of these gestures, the hand returns to the pose it started from.
Gesture recording setup
The hand palm motions were recorded using an HTC Vive motion capture system, with the user holding an HTC Vive tracker in the hand. The HTC Vive system recorded the orientation and position of the tracker with an accuracy of a few degrees and a few millimeters, respectively. The orientation and position trajectories of the tracker were retained as quaternion coordinates and 3D position coordinates sampled at a frequency of 50 Hz. For each of the seven gestures, five trials were recorded, resulting in a total of 7 × 5 = 35 recordings.
Dataset augmentation
To introduce the challenge of dealing with contextual variations when recognizing motions, the HPM dataset is augmented to 420 trials (12 contexts × 35 recordings) by artificially transforming and perturbing the recorded trajectory data. Specifically, twelve different contexts were designed:
The context Original 1 consists of the original recordings.
The context Original 2 serves as a baseline, with no artificial transformations applied.
The contexts Slower and Faster were obtained by rescaling the time axis and numerically resampling the trajectory coordinates using Screw Linear Interpolation (ScLERP), a generalization of SLERP to SE(3). The resulting transformed trajectories simulate twice as slow and twice as fast executions of the gestures.
The context First Half includes trajectories that consist of only the first half of the trajectory data. These trajectories hence represent gestures that have not yet been finished. This context allows the evaluation of an approach's ability to recognize trajectories when dealing with incomplete data.
The six contexts Change in body frame 1-3 and Change in world frame 1-3 incorporate reference frame changes. The resulting transformations are the following:
Change in body frame 1: the body frame is translated along its X-axis by 5 cm and rotated about its Z-axis by 180°.
Change in body frame 2: the body frame is translated along its Y-axis by 5 cm and rotated about its X-axis by 90°.
Change in body frame 3: the body frame is translated along its Z-axis by 5 cm and rotated about its Y-axis by -90°. (The origin of the body frame was translated by only 5 cm, so this perturbation stays within reasonable deviations relative to the size of the human hand.)
Change in world frame 1: the world frame is translated along its X-axis by 1 m and rotated about its Z-axis by 180°.
Change in world frame 2: the world frame is translated along its Y-axis by 1 m and rotated about its X-axis by 90°.
Change in world frame 3: the world frame is translated along its Z-axis by 1 m and rotated about its Y-axis by -90°.
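The six reference frame changes listed above can be illustrated with a small pose-composition sketch. It is not the authors' code: it assumes positions in meters, scalar-first unit quaternions, and the common convention that a body frame change right-multiplies each pose by the offset while a world frame change left-multiplies by the inverse offset. All function names (make_offset, change_body_frame, change_world_frame) are hypothetical.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two scalar-first quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def quat_conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def rotate(q, v):
    """Rotate 3D vector v by unit quaternion q."""
    _, x, y, z = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), quat_conj(q))
    return (x, y, z)

def make_offset(translation, axis, angle_deg):
    """Frame offset as (position, quaternion) from a translation and an axis-angle rotation."""
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half)
    return (translation, (math.cos(half), axis[0]*s, axis[1]*s, axis[2]*s))

def compose(a, b):
    """Pose composition T_a * T_b."""
    (pa, qa), (pb, qb) = a, b
    r = rotate(qa, pb)
    return ((pa[0] + r[0], pa[1] + r[1], pa[2] + r[2]), quat_mul(qa, qb))

def invert(pose):
    p, q = pose
    qi = quat_conj(q)
    r = rotate(qi, p)
    return ((-r[0], -r[1], -r[2]), qi)

def change_body_frame(pose, offset):
    # Assumption: offset is the new body frame expressed in the old body frame.
    return compose(pose, offset)

def change_world_frame(pose, offset):
    # Assumption: offset is the new world frame expressed in the old world frame.
    return compose(invert(offset), pose)

# Offsets matching "Change in body frame 1" and "Change in world frame 1" (meters assumed).
BODY_1 = make_offset((0.05, 0.0, 0.0), (0.0, 0.0, 1.0), 180.0)
WORLD_1 = make_offset((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 180.0)
```

Applying change_body_frame or change_world_frame to every sample of a trajectory yields the transformed trial. Whether the dataset applies the offset or its inverse is not stated in the description, so the direction should be checked against the data before relying on it.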
The context Combination incorporates multiple transformations. That is, the motions were simulated to be performed twice as fast, only the first half of the trajectory data was retained, and both the body and world frames were varied.
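The time rescaling behind the Slower and Faster contexts (and reused in Combination) can be sketched as follows. This is an illustrative reconstruction rather than the authors' implementation: it assumes uniformly sampled time stamps, scalar-first unit quaternions, and that the output keeps the original sample period. ScLERP is implemented here by interpolating along the constant screw motion between two consecutive poses; the names sclerp and resample are hypothetical.

```python
import math

def quat_mul(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def quat_conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def rotate(q, v):
    _, x, y, z = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), quat_conj(q))
    return (x, y, z)

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def sclerp(pose0, pose1, s):
    """Pose at fraction s along the constant screw motion from pose0 (s=0) to pose1 (s=1)."""
    (p0, q0), (p1, q1) = pose0, pose1
    # Relative motion expressed in the frame of pose0.
    q_rel = quat_mul(quat_conj(q0), q1)
    if q_rel[0] < 0.0:  # pick the shorter of the two equivalent rotations
        q_rel = tuple(-c for c in q_rel)
    p_rel = rotate(quat_conj(q0), tuple(b - a for a, b in zip(p0, p1)))
    sin_half = math.sqrt(q_rel[1]**2 + q_rel[2]**2 + q_rel[3]**2)
    if sin_half < 1e-9:
        # Negligible rotation: the screw degenerates to a pure translation.
        q_s = (1.0, 0.0, 0.0, 0.0)
        p_s = tuple(s * c for c in p_rel)
    else:
        theta = 2.0 * math.atan2(sin_half, q_rel[0])
        n = tuple(c / sin_half for c in q_rel[1:])            # screw axis direction
        d = sum(a * b for a, b in zip(p_rel, n))              # translation along the axis
        p_perp = tuple(p - d * a for p, a in zip(p_rel, n))
        cot_half = q_rel[0] / sin_half
        c = tuple(0.5 * (a + cot_half * b)                    # point on the screw axis
                  for a, b in zip(p_perp, cross(n, p_perp)))
        half = 0.5 * s * theta
        q_s = (math.cos(half), n[0]*math.sin(half), n[1]*math.sin(half), n[2]*math.sin(half))
        rc = rotate(q_s, c)
        p_s = tuple(ci - ri + s * d * ni for ci, ri, ni in zip(c, rc, n))
    rp = rotate(q0, p_s)
    return (tuple(a + b for a, b in zip(p0, rp)), quat_mul(q0, q_s))

def resample(times, poses, speed):
    """Re-time a trajectory (speed=2.0: twice as fast, 0.5: twice as slow)."""
    dt = times[1] - times[0]               # assumes uniform sampling
    last = len(poses) - 1
    n_out = int(last / speed) + 1
    out_times, out_poses = [], []
    for k in range(n_out):
        u = min(k * speed, float(last))    # fractional index on the original time axis
        i = min(int(u), last - 1)
        out_times.append(times[0] + k * dt)
        out_poses.append(sclerp(poses[i], poses[i + 1], u - i))
    return out_times, out_poses
```

With this sketch, resample(times, poses, 2.0) roughly halves the number of samples and resample(times, poses, 0.5) roughly doubles it, while the geometric path of the motion stays the same.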
To prevent the data samples of the contexts Original 2 and First Half from exactly matching those of the context Original 1, small perturbations were introduced by adding white noise with standard deviations of 1 mm and 1° to the position and orientation trajectories, respectively. For consistency, this noise perturbation was applied to every trial of each context.
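One possible way to apply such a perturbation is sketched below. The description does not say how the orientation noise was parameterized, so the sketch makes several assumptions: positions are in meters, the noise is drawn independently per axis, and the orientation noise is a small random rotation vector (1° standard deviation per component) applied in the body frame. The function name perturb is hypothetical.

```python
import math
import random

def quat_mul(a, b):
    """Hamilton product of two scalar-first quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def perturb(position, quaternion, rng, sigma_pos=0.001, sigma_rot_deg=1.0):
    """Add white noise to one pose sample (assumed units: meters and degrees)."""
    noisy_position = tuple(c + rng.gauss(0.0, sigma_pos) for c in position)
    # Small random rotation vector, converted to a unit quaternion.
    sigma_rot = math.radians(sigma_rot_deg)
    rv = tuple(rng.gauss(0.0, sigma_rot) for _ in range(3))
    angle = math.sqrt(sum(c * c for c in rv))
    if angle < 1e-12:
        dq = (1.0, 0.0, 0.0, 0.0)
    else:
        s = math.sin(angle / 2.0) / angle
        dq = (math.cos(angle / 2.0), rv[0] * s, rv[1] * s, rv[2] * s)
    # Assumption: the noise rotation is applied in the body frame (right-multiplication).
    return noisy_position, quat_mul(quaternion, dq)

rng = random.Random(0)  # fixed seed so the perturbation is reproducible
noisy_p, noisy_q = perturb((0.1, 0.2, 0.3), (1.0, 0.0, 0.0, 0.0), rng)
```

Because the perturbation is a quaternion product of unit quaternions, the noisy orientation remains a unit quaternion and needs no renormalization beyond floating-point drift.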
Data format
Within this dataset, every trial_x.csv file is a Comma-Separated Values (CSV) file. The trailing number x refers to the order in which the trials were performed. The file trial_x.csv has the following columns:
The first column represents the time axis, consisting of the time stamps at a sampling frequency of 50 Hz.
The second to fourth columns contain the xyz-position coordinates of the origin of the body's reference frame.
The fifth to eighth columns contain the quaternion coordinates of the orientation of the body's reference frame. The quaternion coordinates adhere to the scalar-first convention (w, x, y, z).
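The column layout above can be read with Python's standard csv module, as in the following sketch. Whether the files contain a header row is not stated, so the parser simply skips any row that is not fully numeric; that behavior, the function name load_trial, and the two sample rows (made up for illustration, not taken from the dataset) are assumptions.

```python
import csv
import io

def load_trial(lines):
    """Parse the rows of a trial_x.csv file into (time, position, quaternion) tuples.

    `lines` is any iterable of text lines, for example an open file object.
    """
    samples = []
    for row in csv.reader(lines):
        try:
            values = [float(v) for v in row]
        except ValueError:
            continue  # assumption: non-numeric rows (e.g. a header) are skipped
        if len(values) < 8:
            continue  # ignore empty or incomplete rows
        t, x, y, z, qw, qx, qy, qz = values[:8]
        samples.append((t, (x, y, z), (qw, qx, qy, qz)))  # scalar-first quaternion
    return samples

# Illustrative rows only; these numbers are not from the dataset.
demo = io.StringIO(
    "0.00,0.10,0.20,0.30,1.0,0.0,0.0,0.0\n"
    "0.02,0.11,0.20,0.30,1.0,0.0,0.0,0.0\n"
)
trial = load_trial(demo)
```

For a file on disk, the same function can be called as load_trial(open("trial_1.csv", newline="")), ideally inside a with block so the file is closed afterwards.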
Citing
The design of these hand palm motion gestures and the development of this dataset are among the contributions of the work in [link]. This work was submitted to the 2025 IEEE Conference on Automation Science and Engineering (CASE). If you use this HPM dataset, please cite it as follows:
@misc{verduyn2025,
  title={Enhancing Hand Palm Motion Gesture Recognition by Eliminating Reference Frame Bias via Frame-Invariant Similarity Measures},
  author={Arno Verduyn and Maxim Vochten and Joris De Schutter},
  year={2025},
  eprint={2503.11352},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2503.11352},
}
Created:
2025-03-17



