VoxBlink
Source: arXiv. Updated: 2023-12-13. Indexed: 2024-07-24.
Download link:
https://voxblink.github.io/
Overview:
VoxBlink is a large-scale audio-visual speaker verification dataset created by the School of Computer Science, Wuhan University. It contains 1.45 million audio recordings from 38,000 speakers, covers a range of scenarios and languages, and is sourced primarily from short videos uploaded by users to YouTube. The dataset was built with an automated audio-visual data mining pipeline intended to ensure data quality and diversity. VoxBlink is used for speaker verification, speech separation, and multimodal verification, with the goal of addressing real-world speaker recognition problems.
Provider:
School of Computer Science, Wuhan University
Created:
2023-08-14



