Learning realistic lip motions for humanoid face robots

NIAID Data Ecosystem2026-05-10 收录

下载链接：

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.j6q573nrc

下载链接

链接失效反馈

官方服务：

资源简介：

Lip motion represents outsized importance in human communication, capturing nearly half of our visual attention during conversation. Yet anthropomorphic robots often fail to achieve lip-audio synchronization, resulting in clumsy and lifeless lip behaviors. Two fundamental barriers underlay this challenge. First, robotic lips typically lack the mechanical complexity required to reproduce nuanced human mouth movements; second, existing synchronization methods depend on manually predefined movements and rules, restricting adaptability and realism. Here, we present a humanoid robot face designed to overcome these limitations, featuring soft silicone lips actuated by a ten-degree-of-freedom (10-DoF) mechanism. To achieve lip synchronization without predefined movements, we use a self-supervised learning pipeline based on a Variational Autoencoder (VAE) combined with a Facial Action Transformer, enabling the robot to autonomously infer more realistic lip trajectories directly from speech audio. Our experimental results suggest that this method outperforms simple heuristics like amplitude-based baselines in achieving more visually coherent lip-audio synchronization. Furthermore, the learned synchronization successfully generalizes across multiple linguistic contexts, enabling robot speech articulation in ten languages unseen during training.

创建时间：

2026-01-07

5,000+

优质数据集

54 个

任务类型

进入经典数据集