GIMO

Name: GIMO
Creator: OpenDataLab
Published: 2026-05-24 13:30:31
License: No description available

OpenDataLab, updated 2026-05-24, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/GIMO

Resource description:

Predicting human motion is critical for assistive robotics and AR/VR applications, where interaction with humans must be safe and comfortable. Accurate prediction, in turn, depends on understanding both the scene context and human intentions. Although many works study scene-aware human motion prediction, the role of human intention remains largely under-explored, for two reasons: the lack of egocentric views that reveal human intentions, and the limited diversity of motions and scenes in existing datasets. To bridge this gap, we propose a large-scale human motion dataset that provides high-quality body pose sequences, scene scans, and egocentric views with eye gaze, which serves as a proxy for inferring human intentions. Because motion is captured with inertial sensors, our data collection is not tied to specific scenes, which further enriches the motion dynamics observed from our participants. We perform an extensive study of the benefits of leveraging eye gaze for egocentric human motion prediction across various state-of-the-art architectures. Furthermore, to realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches. Benefiting from the intention information extracted from gaze and from gaze features denoised through modulation by motion, our network achieves the best human motion prediction performance on the proposed dataset. The proposed dataset and our network implementation will be made publicly available.

Provided by:

OpenDataLab

Created:

2022-11-02

Collected summary

Dataset introduction

Background and challenges

Background overview

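GIMO is a large-scale human motion dataset intended for human motion prediction, with applications in assistive robotics and AR/VR. It provides high-quality body pose sequences, scene scans, and egocentric views with eye gaze for inferring human intentions. Motion is captured with inertial sensors, so data collection is scene-agnostic, which increases the diversity of the motion dynamics. The dataset was released in 2022 by Stanford University and Tsinghua University and supports related research and development.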

The summary above was collected and generated by 遇见数据集.
