Audio-visual musical voice activity detection corpus with frame-level labels
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/records/6191248
Description:
100-minute musical audio-visual corpus (MAVC100) with frame-level labels.
Please download the original video files of the MAVC100 from here, together with the corresponding frame-level audio labels and frame-level audio-visual labels. The two label sets differ as follows:
Audio labels cover 4 event classes: Silence, Speech, Singing, and Others. Speech and Singing include all speech and singing voice in the audio stream, without distinguishing whether it comes from the anchor (the target speaker) or from the background.
Audio-visual labels cover the same 4 event classes: Silence, Speech, Singing, and Others. Here, however, Speech and Singing include only the speech and singing voice of the anchor (the target speaker), unlike in the audio labels, where Speech and Singing may also come from background sounds.
Label explanation
1. In both the audio labels and the audio-visual labels, only three classes are explicitly annotated: speech, singing, and silence. All remaining time in a clip is treated as Others. When using the labels, you therefore need to compute the unlabeled time segments yourself and assign them the Others label.
2. Each event is annotated by its start time and its end time, and the different classes are represented by different numbers.
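The two points above can be sketched in Python. This is only an illustration under stated assumptions: the listing does not give the label file layout or the class numbering, so the `(start_sec, end_sec, class_id)` tuples, the ids `0`/`1`/`2`/`3`, the `hop` value, and the function names `fill_others` and `to_frame_labels` are all hypothetical and should be checked against the actual label files.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, class_id).
Segment = Tuple[float, float, int]

# Hypothetical numbering; the real ids must be read from the label files.
SILENCE, SPEECH, SINGING, OTHERS = 0, 1, 2, 3


def fill_others(segments: List[Segment], clip_duration: float,
                others_id: int = OTHERS) -> List[Segment]:
    """Cover [0, clip_duration] by marking every unlabeled gap as Others."""
    filled: List[Segment] = []
    cursor = 0.0  # end of the time range covered so far
    for start, end, class_id in sorted(segments):
        if start > cursor:
            # Unlabeled gap before this event: treat it as Others.
            filled.append((cursor, start, others_id))
        filled.append((start, end, class_id))
        cursor = max(cursor, end)
    if cursor < clip_duration:
        # Unlabeled tail of the clip.
        filled.append((cursor, clip_duration, others_id))
    return filled


def to_frame_labels(segments: List[Segment], clip_duration: float,
                    hop: float, others_id: int = OTHERS) -> List[int]:
    """Expand segments into one class id per frame of length `hop` seconds."""
    n_frames = int(round(clip_duration / hop))
    labels = [others_id] * n_frames  # frames default to Others
    for start, end, class_id in segments:
        first = int(round(start / hop))
        last = min(n_frames, int(round(end / hop)))
        for i in range(first, last):
            labels[i] = class_id
    return labels


# Usage with made-up annotations for a 6-second clip.
annotated = [(1.0, 2.5, SPEECH), (4.0, 5.0, SINGING)]
full = fill_others(annotated, clip_duration=6.0)
frames = to_frame_labels(annotated, clip_duration=6.0, hop=0.5)
```

Defaulting every frame to Others and then overwriting the annotated ranges gives the same result as filling the gaps explicitly; the explicit gap list is useful mainly when segment-level output is needed.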
Homepage and demos: https://github.com/Yuanbo2020/Audio-Visual-VAD
Please feel free to use the open dataset MAVC100 and consider citing our paper as
@INPROCEEDINGS{9413418,
  author={Hou, Yuanbo and Deng, Yi and Zhu, Bilei and Ma, Zejun and Botteldooren, Dick},
  booktitle={ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title={Rule-Embedded Network for Audio-Visual Voice Activity Detection in Live Musical Video Streams},
  year={2021},
  volume={},
  number={},
  pages={4165-4169},
  doi={10.1109/ICASSP39728.2021.9413418}
}
Created:
2022-02-21



