Chinese VF Dataset

Source: arXiv · Updated: 2019-12-30 · Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/1911.09338v2
Description:
The Chinese VF Dataset is a large-scale voice-face matching and retrieval dataset built by Renmin University of China. It contains 1.15 million face samples and 0.29 million audio samples from 500 Chinese speakers. The data were collected with a semi-automatic tool developed for this purpose, drawing mainly on MOOC videos, a restricted video type that keeps data quality under control. The dataset targets cross-modal matching and retrieval between voices and faces, with applications such as video synchronization and generating faces from speech. Its scale lets researchers evaluate and improve how well models generalize on large-scale data, and it increases confidence in test results.
Provider:
Renmin University of China
Created:
2019-11-21