
Audio-visual musical voice activity detection corpus with frame-level labels

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/records/6191248
Description:
MAVC100 is a 100-minute musical audio-visual corpus with frame-level labels. Please download the original MAVC100 video files together with the corresponding frame-level audio labels and frame-level audio-visual labels.

The audio labels and the audio-visual labels differ as follows:

Audio labels cover four event classes: Silence, Speech, Singing, and Others. Speech and Singing include all speech and singing voice in the audio stream, without distinguishing whether it comes from the anchor (the target speaker) or from the background.

Audio-visual labels cover the same four event classes: Silence, Speech, Singing, and Others. Here, however, Speech and Singing include only the speech and singing voice of the anchor (the target speaker), unlike Speech and Singing in the audio labels, which may also come from background sounds.

Label explanation

1. For both the audio labels and the audio-visual labels, only three classes are annotated: speech, singing, and silence. The remaining time in a clip is treated as Others. When using the labels, you therefore need to compute the unlabeled time segments yourself and assign them the Others label.
2. Each event is annotated by its start time and end time, and the different classes are represented by different numbers.

Homepage and demos: https://github.com/Yuanbo2020/Audio-Visual-VAD

Please feel free to use the open dataset MAVC100 and consider citing our paper:

@INPROCEEDINGS{9413418,
  author={Hou, Yuanbo and Deng, Yi and Zhu, Bilei and Ma, Zejun and Botteldooren, Dick},
  booktitle={ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title={Rule-Embedded Network for Audio-Visual Voice Activity Detection in Live Musical Video Streams},
  year={2021},
  volume={},
  number={},
  pages={4165-4169},
  doi={10.1109/ICASSP39728.2021.9413418}
}
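
The label explanation states that only speech, singing, and silence are annotated and that the Others label has to be added by the user for the remaining time. A minimal Python sketch of that gap-filling step follows. It assumes the annotations have already been parsed into (start, end, class) tuples in seconds and that the clip duration is known. The function name `fill_others`, the tuple layout, and the class number used for Others are hypothetical, since the description does not specify the file format or the numeric class codes.

```python
from typing import List, Tuple

# (start_sec, end_sec, class_id); this layout is an assumption, not the dataset's file format.
Segment = Tuple[float, float, int]

# Hypothetical numeric code for "Others"; check the dataset's own class numbering.
OTHERS = 3


def fill_others(segments: List[Segment], clip_duration: float,
                others_label: int = OTHERS) -> List[Segment]:
    """Return the annotated segments plus an Others segment for every unlabeled gap."""
    filled: List[Segment] = []
    cursor = 0.0  # end of the time range covered so far
    for start, end, label in sorted(segments):
        if start > cursor:
            # Unlabeled time between the covered range and this event.
            filled.append((cursor, start, others_label))
        filled.append((start, end, label))
        # max() keeps the cursor correct if annotated events overlap.
        cursor = max(cursor, end)
    if cursor < clip_duration:
        # Unlabeled time after the last annotated event.
        filled.append((cursor, clip_duration, others_label))
    return filled


if __name__ == "__main__":
    # Made-up example: two annotated events in a 6-second clip.
    annotated = [(1.0, 2.0, 1), (4.0, 5.0, 2)]
    for seg in fill_others(annotated, clip_duration=6.0):
        print(seg)
```

Overlapping annotated events are passed through unchanged. Only time that no annotated event covers is assigned to Others.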
Created:
2022-02-21