AVASpeech-SMAD (AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence)
OpenDataLab | Updated 2026-05-31 | Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/AVASpeech-SMAD
Description:
We present AVASpeech-SMAD, a dataset that supports research on speech and music activity detection (SMAD). It extends the existing AVASpeech dataset, which originally consisted of 45 hours of audio with speech activity labels, by adding frame-level music labels. To the best of our knowledge, AVASpeech-SMAD is the first open-source dataset with strong polyphonic labels for both music and speech. The dataset was manually annotated and verified through an iterative cross-checking process, and simple automatic checks were implemented to further improve label quality. Evaluation results from two state-of-the-art SMAD systems are provided as a baseline for future reference.
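To make "frame-level labels with co-occurrence" concrete, here is a minimal sketch of how segment annotations might be expanded into per-frame speech and music flags. The listing does not describe the dataset's file format, so the `(start, end, label)` tuple layout, the hop size, and the function names `segments_to_frames` and `co_occurrence_ratio` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical annotation layout: (start_sec, end_sec, label).
# The real AVASpeech-SMAD label files may be structured differently.
Segment = Tuple[float, float, str]


def segments_to_frames(
    segments: Sequence[Segment],
    duration: float,
    hop: float = 0.01,
    classes: Sequence[str] = ("speech", "music"),
) -> List[List[int]]:
    """Expand segment annotations into one binary row per frame.

    Each row holds one flag per class, so speech and music can both be
    active in the same frame (label co-occurrence).
    """
    n_frames = int(round(duration / hop))
    frames = [[0] * len(classes) for _ in range(n_frames)]
    for start, end, label in segments:
        col = list(classes).index(label)
        first = max(0, int(round(start / hop)))
        last = min(n_frames, int(round(end / hop)))
        for i in range(first, last):
            frames[i][col] = 1
    return frames


def co_occurrence_ratio(frames: Sequence[Sequence[int]]) -> float:
    """Fraction of frames in which every class is active at once."""
    if not frames:
        return 0.0
    both = sum(1 for row in frames if all(row))
    return both / len(frames)


# Toy example: speech from 0 to 2 s, music from 1 to 3 s, in a 4 s clip.
example = [(0.0, 2.0, "speech"), (1.0, 3.0, "music")]
example_frames = segments_to_frames(example, duration=4.0, hop=0.5)
print(co_occurrence_ratio(example_frames))  # 0.25
```

Keeping one flag per class, rather than a single class index per frame, is what allows overlapping speech and music to be represented at all.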
Provided by:
OpenDataLab
Created:
2022-08-19
Collected Summary
Dataset Introduction

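Background and Challenges
Background Overview
AVASpeech-SMAD is a strongly labelled dataset for speech and music activity detection. It extends the AVASpeech dataset with frame-level music labels, making it the first open-source dataset with label co-occurrence. It was manually annotated and verified, and it provides baseline evaluation results to support related research.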



