Digit Gesture Recognition Data Based on Single-Frame Images

Name: Digit Gesture Recognition Data Based on Single-Frame Images
Creator: 正数智慧（温州）科技有限公司
Published: 2025-12-23 17:07:43
License: No description available

Source: Zhejiang Province Data Intellectual Property Registration Platform. Updated 2025-12-23. Indexed 2025-12-24.

Download link:

https://www.zjip.org.cn/home/announce/trends/8417917

Resource description:

This dataset pairs static hand gesture images for the digits 0 through 9 with their class labels and is intended for training high-accuracy digit classification models. It targets scenarios that need fast, contactless digit input or selection, such as password entry, item quantity selection, or quick menu navigation in controller-free virtual/augmented reality (VR/AR) applications, as well as interactive screens in public places. Models trained on this data can be optimized specifically for the accuracy and robustness of digit recognition. This addresses cases where voice input is susceptible to noise or where touchscreen input is inconvenient or unhygienic, and it provides an efficient, unambiguous, and silent channel for digit commands.

Static digit gesture recognition classifies the gesture in a single image into one of the predefined digit classes. The process includes:

(1) Data collection: capture images covering a variety of common gestures and record the digit class of each.
(2) Data processing: use a palm detection model to extract the hand region image, then feed that region into a feature extraction network that maps it to a high-dimensional feature space. Feature extraction is given by $F_{\text{gesture}} = \text{Encoder}_{\text{cnn}}(I_{\text{hand}})$, where $I_{\text{hand}}$ is the hand region image and $F_{\text{gesture}}$ is the feature vector representing the gesture semantics.
(3) Model construction: attach a classifier to the extracted feature vector and predict the digit class via $P_{\text{class}} = \text{Classifier}(F_{\text{gesture}})$, where $P_{\text{class}}$ is the predicted digit class.

Key evaluation metrics include average classification accuracy.
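The two formulas in the description compose an encoder and a classifier. A minimal sketch of that composition follows, using only the Python standard library. Everything in it is an assumption for illustration: `encode` is a toy average-pooling stand-in for the CNN encoder (not a CNN), `classify` is a plain linear layer with softmax, the weights are random and untrained, and the 32x32 grayscale crop stands in for the output of the palm detection step, which is not modeled here.

```python
import math
import random

NUM_CLASSES = 10  # digits 0-9
GRID = 4          # hypothetical pooling grid, giving GRID * GRID features


def encode(hand_image, grid=GRID):
    """Toy stand-in for Encoder_cnn: average-pool a 2D grayscale crop.

    Assumes the crop is at least grid x grid pixels, so no cell is empty.
    """
    h, w = len(hand_image), len(hand_image[0])
    features = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            cell = [hand_image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cell) / len(cell))
    return features


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Linear classifier + softmax; returns (predicted digit, probabilities)."""
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs


# Illustrative inputs only: a random "hand crop" and random, untrained weights.
rng = random.Random(0)
hand_image = [[rng.random() for _ in range(32)] for _ in range(32)]
weights = [[rng.uniform(-1, 1) for _ in range(GRID * GRID)] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

f_gesture = encode(hand_image)                          # F_gesture = Encoder(I_hand)
p_class, probs = classify(f_gesture, weights, biases)   # P_class = Classifier(F_gesture)
print("predicted digit:", p_class)
```

With untrained weights the predicted digit is arbitrary. In practice the encoder and classifier would be trained on the labeled gesture images, typically with a deep learning framework.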

Provider:

正数智慧（温州）科技有限公司

Created:

2025-10-22

Collected Summary

Dataset Introduction

Background and Challenges

Background Overview

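This is a static gesture image dataset dedicated to the digits 0 through 9. It contains structured data including gesture images, hand region images, digit class labels, and semantic feature vectors, with a listed size of 136.54 records, and is intended for training high-accuracy digit classification models. It suits scenarios such as virtual/augmented reality and interactive screens, where it provides a contactless digit input solution. Gesture recognition is performed with a feature extractor followed by a classifier, and the reported average classification accuracy is 0.98, improving the accuracy and robustness of digit interaction.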
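The listing does not say how "average classification accuracy" is computed. The sketch below shows two common readings, overall accuracy and the mean of per-class accuracies, so the difference is visible. The function names and the toy labels are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted digit matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_per_class_accuracy(y_true, y_pred):
    """Average of the per-class accuracies, weighting every class equally."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy labels for illustration only (three classes, unbalanced).
y_true = [0, 0, 1, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 1, 1, 2]

acc = overall_accuracy(y_true, y_pred)                 # 6 of 7 correct
mean_acc = mean_per_class_accuracy(y_true, y_pred)     # (1/2 + 4/4 + 1/1) / 3
print(f"overall: {acc:.4f}, mean per-class: {mean_acc:.4f}")
```

The two values coincide when every digit has the same number of test samples, and diverge when the classes are unbalanced.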

The content above was collected and summarized by 遇见数据集.