ViCo

Name: ViCo
Creator: OpenDataLab
Published: 2026-05-17 12:30:33
License: No description available

OpenDataLab, updated 2026-05-17, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/ViCo

Resource overview:

The ViCo dataset is designed for generating visual facial expressions grounded in scenario understanding; its target application is producing listener feedback (e.g., nodding, smiling) during face-to-face conversations. It covers 92 unique identities (67 speakers and 76 listeners; the role counts exceed 92 in total, so some identities evidently appear in both roles) and 483 paired video and audio clips. The data follows a paired "speaker-listener" pattern: the listener reacts in real time to the speaker's audio and video, and each reaction reflects one of three attitudes (positive, neutral, or negative). Unlike traditional speech-to-gesture or talking-head generation, listening-head generation takes the speaker's audio and visual signals as input and produces real-time non-verbal feedback such as head movements and facial expressions. The dataset supports a wide range of applications, including human-computer interaction, video-to-video translation, and cross-modal understanding and generation.
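The paired speaker-listener structure described above can be pictured as a simple record type. The following is a minimal sketch only: the class name, field names, file paths, and identity strings are all hypothetical and do not reflect the actual file layout or annotation format of ViCo, which this listing does not specify.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Attitude(Enum):
    """The three listener attitudes named in the description."""
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"


@dataclass(frozen=True)
class SpeakerListenerClip:
    """One paired clip. All field names are hypothetical, not ViCo's schema."""
    speaker_id: str
    listener_id: str
    attitude: Attitude
    speaker_video: str   # speaker video path (model input)
    speaker_audio: str   # speaker audio path (model input)
    listener_video: str  # listener reaction video path (generation target)


def attitude_counts(clips):
    """Count clips per attitude label."""
    return Counter(clip.attitude for clip in clips)


def unique_identities(clips):
    """Collect identities across both roles; one person may fill both."""
    return {c.speaker_id for c in clips} | {c.listener_id for c in clips}


# Made-up records for illustration only; these are not real ViCo entries.
demo_clips = [
    SpeakerListenerClip("id_a", "id_b", Attitude.POSITIVE, "s1.mp4", "s1.wav", "l1.mp4"),
    SpeakerListenerClip("id_b", "id_c", Attitude.NEUTRAL, "s2.mp4", "s2.wav", "l2.mp4"),
    SpeakerListenerClip("id_a", "id_c", Attitude.NEGATIVE, "s3.mp4", "s3.wav", "l3.mp4"),
]
```

In this toy data, `id_b` appears once as a listener and once as a speaker, so two speakers and two listeners yield only three unique identities. The same kind of overlap would explain how 67 speakers and 76 listeners can correspond to 92 identities.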

Provider:

OpenDataLab

Created:

2022-10-24

Collected summary

Dataset introduction