EgoGesture Public Dataset
National Basic Science Data Center (NBSDC), indexed 2026-01-30
Download link:
https://nbsdc.cn/general/dataDetail?id=67d5118b195d260905af9ff7&type=1
Resource description:
The EgoGesture public dataset was created jointly by Beihang University and Microsoft Research Asia in 2017 for research on hand gesture recognition from an egocentric (first-person) view. It targets the recognition challenges posed by complex backgrounds, changing lighting, and the unusual camera perspective. As the first large-scale first-person RGB-D gesture dataset, it provides multimodal data (RGB video and depth) for training models in augmented reality (AR), virtual reality (VR), and human-computer interaction, and supports work on making algorithms more robust in real-world scenes.
In multi-agent systems, the dataset's multimodal nature makes it useful for research on hypergraph representations that fuse multiple sources of knowledge. In this project, RGB video, depth images, and other auxiliary information are fused to build a hypergraph containing multiple agents and their interaction relationships, which strengthens the system's understanding and reasoning in complex interactive scenes. This representation is intended to improve recognition accuracy for single-agent behavior and to better capture dynamic interactions between agents, giving a basis for multi-agent collaboration and behavior prediction.
The data were recorded with a head-mounted camera: 50 participants performed 83 gesture classes (one-handed and two-handed, for example swiping and fist clenching) across 6 indoor and outdoor scenes, with RGB video and depth images captured synchronously. The dataset contains 2,081 RGB-D video clips, 24,161 gesture samples, and nearly 3 million frames. Each sample is annotated with its gesture class, temporal boundaries, and participant ID.
The raw, uncompressed data total roughly 500 GB to 1 TB. The data are divided into training and test sets by scene and participant, which supports evaluation of cross-scene generalization. In 2019 the dataset added depth-image and multimodal support, increasing its value for cross-modal learning such as RGB+depth fusion. EgoGesture is widely used to train models such as 3D convolutional networks (C3D) and spatial-temporal graph convolutional networks (ST-GCN) and has served as a benchmark in more than 150 papers, notably for AR-glasses gesture interaction and smart-home control. Access can be requested through the OpenDataLab platform, and use is subject to an academic-use agreement.
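The annotation fields named in the description (gesture class, temporal boundaries, participant ID) and a split that keeps participants separate can be sketched as follows. This is only an illustration: the class `GestureSample`, its field names, the helper `split_by_subject`, and the toy records are all hypothetical and do not reflect the dataset's actual file format or its official train/test partition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GestureSample:
    """One annotated gesture. Field names are assumed, not the official schema."""
    label: int        # gesture class (the description lists 83 classes)
    start_frame: int  # temporal boundary, inclusive
    end_frame: int    # temporal boundary, inclusive
    subject_id: int   # participant ID (the description lists 50 participants)

    @property
    def num_frames(self) -> int:
        # Length of the annotated segment in frames.
        return self.end_frame - self.start_frame + 1


def split_by_subject(samples, test_subjects):
    """Subject-independent split: no participant appears in both sets."""
    test_subjects = set(test_subjects)
    train = [s for s in samples if s.subject_id not in test_subjects]
    test = [s for s in samples if s.subject_id in test_subjects]
    return train, test


# Toy records standing in for real annotations (all values are made up).
samples = [
    GestureSample(label=1, start_frame=10, end_frame=49, subject_id=1),
    GestureSample(label=7, start_frame=60, end_frame=95, subject_id=1),
    GestureSample(label=1, start_frame=5, end_frame=40, subject_id=2),
    GestureSample(label=83, start_frame=100, end_frame=131, subject_id=3),
]

# Hold out participant 3 for testing (an arbitrary choice for this sketch).
train, test = split_by_subject(samples, test_subjects={3})
print(len(train), len(test))
```

Holding out whole participants rather than individual samples keeps a model from being tested on people it has already seen. A split by scene could be written the same way if a scene field were added to the record.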
Provided by:
Shanxi University
Collected summary
Dataset introduction

Background and challenges
Background overview
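The EgoGesture public dataset is a large-scale multimodal dataset for egocentric (first-person) hand gesture recognition. It contains RGB video and depth data and is suited to research in AR, VR, and human-computer interaction. The dataset consists of 2,081 RGB-D video clips and 24,161 gesture samples and is widely used for algorithm training and benchmarking.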
The above content was collected and summarized by 遇见数据集.



