iReplica EgoHOI dataset
Source: DataCite Commons | Updated: 2025-05-08 | Indexed: 2025-04-16
Download link:
https://edmond.mpg.de/citation?persistentId=doi:10.17617/3.N9HXE2
Description:
Our world is not static: humans naturally change their environments through interactions such as opening doors or moving furniture. Modeling these human-caused changes is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (metaverses) and robotics. For such emerging applications to be widely adopted, the sensor setup used to capture the interactions needs to be inexpensive and easy to use for non-expert users. That is, interactions should be captured and modeled by simple egocentric sensors, such as a combination of cameras and IMU sensors, without relying on any external cameras or object trackers. Yet, to the best of our knowledge, no existing work tackles the challenging problem of modeling human-scene interactions via such an egocentric sensor setup. This paper closes this gap in the literature by developing a novel approach that combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Interestingly, we can show that even without visual observations of the interactions, human-scene contacts and interactions can be realistically predicted from human pose sequences. Our method, iReplica (Interaction Replica), is an essential first step towards the egocentric capture of human interactions and modeling of dynamic scenes, which is required for future AR/VR applications in immersive virtual universes and for training machines to behave like humans.
Provider:
Edmond
Created:
2025-03-11