GTZAN music/speech collection
Source: www.kaggle.com | Updated: 2017-10-24 | Indexed: 2025-03-24
Download link:
https://www.kaggle.com/lnicalo/gtzan-musicspeech-collection
Resource description:
### Context
The need for music-speech classification is evident in many audio processing tasks which relate to real-life materials such as archives of field recordings, broadcasts and any other contexts which are likely to involve speech and music, concurrent or alternating. Segregating the signal into speech and music segments is an obvious first step before applying speech-specific or music-specific algorithms.
Indeed, speech-music classification has received considerable attention from the research community (for a partial list, see references below) but many of the published algorithms are dataset-specific and are not directly comparable due to non-standardised evaluation.
### Content
The dataset was collected for the purpose of music/speech discrimination. It consists of 120 tracks, each 30 seconds long, with 60 examples per class (music and speech). All tracks are 22050 Hz, mono, 16-bit audio files in .wav format.
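
The stated format can be checked after download. The following is a minimal sketch, not part of the dataset itself, that reads a .wav header with Python's standard `wave` module and compares it against the description above. The names `WavInfo`, `read_wav_info`, and `matches_description` are hypothetical helpers, and the half-second duration tolerance is an assumption.

```python
import wave
from dataclasses import dataclass

# Expected properties, taken from the dataset description.
EXPECTED_RATE_HZ = 22050
EXPECTED_CHANNELS = 1             # mono
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit
EXPECTED_DURATION_S = 30.0


@dataclass
class WavInfo:
    rate_hz: int
    channels: int
    sample_width_bytes: int
    duration_s: float


def read_wav_info(path):
    """Read the header fields of a PCM .wav file at the given string path."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return WavInfo(
            rate_hz=rate,
            channels=wav.getnchannels(),
            sample_width_bytes=wav.getsampwidth(),
            duration_s=wav.getnframes() / rate,
        )


def matches_description(info, tolerance_s=0.5):
    """Return True if the file matches the documented format.

    tolerance_s is an assumed slack on the duration, not a value from the source.
    """
    return (
        info.rate_hz == EXPECTED_RATE_HZ
        and info.channels == EXPECTED_CHANNELS
        and info.sample_width_bytes == EXPECTED_SAMPLE_WIDTH_BYTES
        and abs(info.duration_s - EXPECTED_DURATION_S) <= tolerance_s
    )
```

Calling `matches_description(read_wav_info(path))` on each downloaded file flags any track that deviates from the documented sample rate, channel count, bit depth, or length. The directory layout of the archive is not specified here, so the file paths are left to the caller.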
Provider:
www.kaggle.com



