Static Hand Gesture Recognition Data Based on Single-Frame Images

Name: Static Hand Gesture Recognition Data Based on Single-Frame Images
Creator: 正数智慧（温州）科技有限公司
Published: 2025-12-24 09:41:09
License: No description available

Source: Zhejiang Province Data Intellectual Property Registration Platform. Updated 2025-12-24. Indexed 2025-12-25.

Download link:

https://www.zjip.org.cn/home/announce/trends/8418015

Resource overview:

This dataset pairs a large collection of static hand gesture images with category labels (for example "thumbs-up", "victory", "OK", and "fist") and is intended for training classification models. It targets scenarios such as smart home device control, remote slide switching in presentations, emoji triggering in social applications, and drone control. Models trained on the data are meant to recognize a user's static gesture commands quickly and accurately, removing the dependence on physical contact or voice commands and providing an intuitive, silent means of control.

Static gesture recognition classifies the gesture in a single image into one of a set of predefined commands. The process has three steps:

(1) Data collection: capture images covering a range of common gestures, ensuring that every category has sufficient and diverse samples.
(2) Data processing: use a palm detection model to crop the hand region from each image, then feed that region into a feature extraction network. The gesture semantic feature vector is the mapping of the hand region image into a high-dimensional semantic feature space, $F_{\text{gesture}} = \text{Encoder}_{\text{cnn}}(I_{\text{hand}})$, where $I_{\text{hand}}$ is the hand region image and $F_{\text{gesture}}$ is the gesture semantic feature vector.
(3) Model construction: attach a classifier to the extracted feature vector to predict the gesture category, $P_{\text{class}} = \text{Classifier}(F_{\text{gesture}})$, where $P_{\text{class}}$ is the predicted gesture category.

Key evaluation metrics are average classification accuracy (Accuracy), average precision (Precision), and average recall (Recall).
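The two formulas and the three metrics can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the provider's implementation: the function names `predict_gesture` and `evaluate` are hypothetical, the encoder and classifier are passed in as arbitrary callables because the listing does not specify their architecture, and "average" precision and recall are assumed here to mean the unweighted (macro) mean over gesture classes.

```python
from typing import Any, Callable, Dict, Sequence


def predict_gesture(
    i_hand: Any,
    encoder_cnn: Callable[[Any], Any],
    classifier: Callable[[Any], str],
) -> str:
    """Compose the two stages described in the listing (hypothetical helper)."""
    f_gesture = encoder_cnn(i_hand)   # F_gesture = Encoder_cnn(I_hand)
    p_class = classifier(f_gesture)   # P_class = Classifier(F_gesture)
    return p_class


def evaluate(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision and recall.

    Assumption: "average" in the listing means an unweighted mean over classes.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # A class with no predictions (or no true samples) contributes 0.0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)

    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
    }


# Toy labels for illustration only; these are not samples from the dataset.
y_true = ["thumbs-up", "thumbs-up", "OK", "fist"]
y_pred = ["thumbs-up", "OK", "OK", "fist"]
metrics = evaluate(y_true, y_pred)
```

With these toy labels, three of four predictions are correct, so accuracy is 0.75, and macro precision and macro recall both come to 2.5/3 (about 0.833). If the dataset's classes turn out to be imbalanced, a weighted average may be the more appropriate reading of "average".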

Provider:

正数智慧（温州）科技有限公司

Created:

2025-10-17

Collected Summary

Dataset Introduction