VoxCeleb2 Dataset

Indexed from paperswithcode.com on 2025-03-23
Download link:
https://paperswithcode.com/dataset/voxceleb2
Description:
VoxCeleb2 is a large-scale speaker recognition dataset obtained automatically from open-source media. It consists of over a million utterances from over 6k speakers. Because the dataset is collected "in the wild", the speech segments are corrupted with real-world noise, including laughter, cross-talk, channel effects, music and other sounds. The dataset is also multilingual, with speech from speakers of 145 different nationalities, covering a wide range of accents, ages, ethnicities and languages. The dataset is audio-visual, so it is also useful for a number of other applications, for example visual speech synthesis, speech separation, cross-modal transfer from face to voice or vice versa, and training face recognition from video to complement existing face recognition datasets.

Provider:
paperswithcode.com