VoxCeleb2

Name: VoxCeleb2
Creator: OpenDataLab
Published: 2026-05-17 04:30:03
License: No description provided

OpenDataLab. Updated 2026-05-17; indexed 2024-05-09.

Download link:

https://opendatalab.org.cn/OpenDataLab/VoxCeleb2

Description:

VoxCeleb2 is a large-scale speaker recognition dataset obtained automatically from open-source media. It contains over one million utterances from more than 6,000 speakers. Because the dataset was collected "in the wild", the speech segments are corrupted by real-world noise, including laughter, cross-talk, channel effects, music, and other sounds. The dataset is also multilingual, with speakers of 145 different nationalities covering a wide range of accents, ages, ethnicities, and languages. Because it is audiovisual, it is also useful for a number of other applications, such as visual speech synthesis, speech separation, cross-modal transfer from face to voice (or vice versa), and training face recognition models from video to complement existing face recognition datasets.

Provided by:

OpenDataLab

Created:

2022-04-27

Collection Summary

Dataset Introduction