AVASpeech-SMAD (AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence)

Name: AVASpeech-SMAD (AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence)
Creator: OpenDataLab
Published: 2026-05-31 09:30:26
License: No description available

OpenDataLab, updated 2026-05-31, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/AVASpeech-SMAD

Resource description:

We present the AVASpeech-SMAD dataset to support research on speech and music activity detection (SMAD). The dataset extends the existing AVASpeech dataset, which originally consisted of 45 hours of audio with speech activity labels, by adding frame-level music labels. To the best of our knowledge, AVASpeech-SMAD is the first open-source dataset with strong polyphonic labels for both music and speech, meaning that the two classes are annotated with time boundaries and may be active at the same time. The dataset was manually annotated and verified through an iterative cross-checking process. Simple automatic checks were also applied to further improve label quality. Evaluation results from two state-of-the-art SMAD systems are provided as a benchmark for future reference.
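To illustrate what strong labels with co-occurrence look like at the frame level, here is a minimal sketch. It assumes a hypothetical segment format of `(onset_s, offset_s, label)` tuples and a hypothetical hop size; the actual annotation file layout of AVASpeech-SMAD is not described on this page, so none of these names come from the dataset itself.

```python
from typing import List, Sequence, Tuple

# Hypothetical segment format: (onset in seconds, offset in seconds, class label).
Segment = Tuple[float, float, str]


def segments_to_frames(
    segments: Sequence[Segment],
    duration_s: float,
    hop_s: float = 0.01,
    classes: Sequence[str] = ("speech", "music"),
) -> List[List[int]]:
    """Convert time-stamped segments into a frame-level multi-label matrix.

    Each row is one frame and each column is one class. Columns are
    independent, so speech and music can both be 1 in the same frame.
    """
    n_frames = int(round(duration_s / hop_s))
    frames = [[0] * len(classes) for _ in range(n_frames)]
    for onset, offset, label in segments:
        col = classes.index(label)
        start = max(0, int(round(onset / hop_s)))
        end = min(n_frames, int(round(offset / hop_s)))
        for i in range(start, end):
            frames[i][col] = 1
    return frames


# Toy example (made-up values, not taken from the dataset):
# speech from 0.0 to 0.5 s, music from 0.3 to 1.0 s.
example = [(0.0, 0.5, "speech"), (0.3, 1.0, "music")]
frames = segments_to_frames(example, duration_s=1.0, hop_s=0.1)

# Frames where both classes are active at once (label co-occurrence).
overlap = sum(1 for f in frames if f == [1, 1])
```

Keeping one independent column per class, rather than a single mutually exclusive class index, is what allows overlapping speech and music to be represented.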

Provider:

OpenDataLab

Created:

2022-08-19

Collected summary

Dataset introduction