jpdiazpardo/scream_detection_heavy_metal
Source: Hugging Face. Updated 2023-08-22; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/jpdiazpardo/scream_detection_heavy_metal
Description:
---
task_categories:
- audio-classification
language:
- en
dataset_info:
  features:
  - name: audio
    dtype: audio
  - name: scream_type
    dtype: string
  - name: song_name
    dtype: string
  - name: band_name
    dtype: string
  - name: album_name
    dtype: string
  - name: release_year
    dtype: int64
  - name: video_id
    dtype: string
  - name: timestamp_start
    dtype: float64
  - name: timestamp_end
    dtype: float64
  - name: sample_rate
    dtype: int64
  splits:
  - name: train
    num_bytes: 114577942.825
    num_examples: 1575
  download_size: 119156239
  dataset_size: 114577942.825
license: mit
tags:
- music
size_categories:
- 1K<n<10K
pretty_name: Scream classification in heavy metal music
---
# Dataset card for Scream Detection in Heavy Metal Music
This repository contains the processed dataset used in the paper "Scream Detection in Heavy Metal Music" (Kalbag & Lerch, 2022) from the Georgia Institute of Technology.
It provides annotations for 57 songs by 34 bands, drawn from 47 albums. Vocal events are labelled with one of 5 classes:
* Clean (or sung vocal)
* Low Fry Scream
* Mid Fry Scream
* High Fry Scream
* Layered Vocals
The label "Layered Vocals" is applied when two or more classes are present simultaneously.
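The layering rule can be sketched as a small function. This is only an illustration of the rule as stated above, not the authors' annotation tooling; the function name `resolve_label` is hypothetical, and the lowercase label strings are assumed from the field description in this card.

```python
# Assumed label strings (from the `scream_type` field description).
BASE_CLASSES = {"clean", "lowfry", "midfry", "highfry"}

def resolve_label(active_classes):
    """Collapse the vocal classes heard at one moment into a single label."""
    active = set(active_classes)
    unknown = active - BASE_CLASSES
    if unknown:
        raise ValueError(f"unknown classes: {sorted(unknown)}")
    if not active:
        return None  # no vocal event at this moment
    if len(active) >= 2:
        return "layered"  # two or more classes present simultaneously
    return next(iter(active))

print(resolve_label({"clean"}))            # clean
print(resolve_label({"lowfry", "clean"}))  # layered
```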
**Paper:** [Scream Detection in Heavy Metal Music](https://arxiv.org/pdf/2205.05580.pdf)
Kalbag, V., & Lerch, A. (2022). Scream detection in heavy metal music. arXiv preprint arXiv:2205.05580.
### How to use
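Load the dataset from the Hugging Face Hub in a notebook:
```python
# Notebook cell: install the datasets library with audio decoding support.
!pip install "datasets[audio]"
import datasets

# Downloads the data and returns a DatasetDict with a single "train" split.
dataset = datasets.load_dataset("jpdiazpardo/scream_detection_heavy_metal")
```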
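A common first step after loading is to check the class balance. The sketch below works on any iterable of label strings, so it runs without downloading anything; the helper name `label_counts` and the stand-in labels are hypothetical. With the real data, passing the `scream_type` column (for example `dataset["train"]["scream_type"]`) should give the actual distribution.

```python
from collections import Counter

def label_counts(labels):
    """Count how often each scream_type label occurs."""
    return Counter(labels)

# Stand-in values for illustration only, not real dataset statistics.
example_labels = ["clean", "lowfry", "lowfry", "layered"]
for label, count in label_counts(example_labels).items():
    print(label, count)
```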
### Data Fields
* `audio`: the trimmed audio file from the song.
* `scream_type`: the target variable for classification, one of layered, lowfry, highfry, midfry, clean.
* `song_name`: the name of the song.
* `band_name`: the name of the artist performing the song.
* `album_name`: the name of the album where the song was released.
* `release_year`: the release year of the song.
* `video_id`: the YouTube video id.
* `timestamp_start`: the start time of the snippet from the full audio.
* `timestamp_end`: the end time of the snippet from the full audio.
* `sample_rate`: the sampling rate of the audio.
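The timing fields can be combined to sanity-check a record. The following is a minimal sketch that assumes the timestamps are in seconds and that `sample_rate` applies to the clip itself; neither is stated explicitly in this card. The helper names and the example row are hypothetical.

```python
def clip_duration_seconds(example):
    """Length of a clip, computed from its start and end timestamps."""
    start = example["timestamp_start"]
    end = example["timestamp_end"]
    if end < start:
        raise ValueError(f"end ({end}) precedes start ({start})")
    return end - start

def expected_num_samples(example):
    """Number of audio samples a clip of that length should contain."""
    return round(clip_duration_seconds(example) * example["sample_rate"])

# Hypothetical row with only the fields used here.
row = {"timestamp_start": 12.5, "timestamp_end": 15.0, "sample_rate": 44100}
print(clip_duration_seconds(row))  # 2.5
print(expected_num_samples(row))   # 110250
```

Comparing `expected_num_samples` with the length of the decoded waveform is one way to spot truncated or mislabelled clips.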
### YouTube playlist: [Scream Detection Dataset](https://www.youtube.com/playlist?list=PLnkRJFUtBDzWOEnVOiWTVxGOWD70LDwtC)
### Source Data
| band_name | album_name | song_name | release_year | duration_seconds | video_id | bit_depth | bitrate | channels | sample_rate | 3class_split | 6class_split |
|-------------------|------------------------------|-------------------------------------|--------------|------------------|-------------|-----------|---------|----------|-------------|--------------|--------------|
| Abbath | Abbath | Ashes Of The Damned | 2016 | 238.097415 | K5pMoSECagE | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| After The Burial | Dig Deep | Lost In The Static | 2016 | 271.2787302 | hUNAX1UYeAE | 16 | 1411200 | 2 | 44100 | train | train |
| Amon Amarth | Surtur Rising | Destroyer of the Universe | 2011 | 224.1886621 | 5aaOqUYG8Tw | 16 | 1411200 | 2 | 44100 | train | train |
| Amon Amarth | Twilight of the Thunder God | Live For The Kill | 2008 | 249.7538322 | Bh_5ofa__pY | 16 | 1411200 | 2 | 44100 | train | train |
| Amon Amarth | Twilight of the Thunder God | Twilight Of The Thunder God | 2008 | 265.5898413 | edBYB1VCV0k | 16 | 1411200 | 2 | 44100 | train | train |
| Be'lakor | Stone's Reach | Venator | 2009 | 517.9559375 | ainbICPRV8Y | 16 | 1536000 | 2 | 48000 | train | train |
| Behemoth | I Loved You at Your Darkest | Ecclesia Diabolica Catholica | 2018 | 324.3363265 | HKWqzjQAv14 | 16 | 1411200 | 2 | 44100 | train | train |
| Behemoth | I Loved You at Your Darkest | Bartzabel | 2018 | 320.9462132 | Dhfy9TPga-c | 16 | 1411200 | 2 | 44100 | train | train |
| Behemoth | The Satanist | Blow Your Trumpets Gabriel | 2013 | 297.9352381 | Czx-OIyrQwQ | 16 | 1411200 | 2 | 44100 | train | train |
| Born of Osiris | Angel or Alien | White Nile | 2021 | 229.0300417 | 4ShzP_M7W-k | 16 | 1536000 | 2 | 48000 | train | train |
| Cannibal Corpse | A Skeletal Domain | High Velocity Impact Spatter | 2014 | 246.9442177 | B3F10hXdmQY | 16 | 1411200 | 2 | 44100 | train | train |
| Children of Bodom | Hexed | Under Grass And Clover | 2019 | 213.0663039 | 1gpfzCxiQ-A | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Children of Bodom | Are You Dead Yet? | Living Dead Beat | 2005 | 318.1365986 | gG3JZ5vGJsk | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Children Of Bodom | Are You Dead Yet? | Are You Dead Yet | 2005 | 236.2630385 | aNJXS9X0yY0 | 16 | 1411200 | 2 | 44100 | train | train |
| Children of Bodom | Hate Crew Deathroll | Sixpounder | 2003 | 213.3449433 | 09KScSe4hIc | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Children Of Bodom | Follow the Reaper | Everytime I Die | 2000 | 241.650068 | 5cEK1OLhUKQ | 16 | 1411200 | 2 | 44100 | train | train |
| Children Of Bodom | Are You Dead Yet? | In Your Face | 2005 | 236.2630385 | 5SgN5lvWZwQ | 16 | 1411200 | 2 | 44100 | train | train |
| Dark Tranquillity | Lost to Apathy | Lost to Apathy | 2004 | 240.8838095 | GZqfH1LQEOQ | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Dark Tranquillity | Atoma | Atoma | 2016 | 262.8266667 | C_voh9WFbsM | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Death | Leprosy | Pull the Plug | 1988 | 266.8204989 | _duhhVa-dk8 | 16 | 1411200 | 2 | 44100 | train | train |
| Death | Individual Thought Patterns | The Philosopher | 1993 | 216.3867708 | 8256VJ4hkJU | 16 | 1536000 | 2 | 48000 | train | train |
| Decapitated | Anticult | Kill The Cult | 2017 | 296.1705215 | kQUTQTNChbE | 16 | 1411200 | 2 | 44100 | train | train |
| Decapitated | Blood Mantra | Blood Mantra | 2014 | 305.9693424 | 8gILuUdY2cU | 16 | 1411200 | 2 | 44100 | train | train |
| Ensiferum | Unsung Heroes | In My Sword I Trust | 2012 | 330.6753741 | -2WqQY_xSSM | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Enslaved | Caravans To The Outer Worlds | Caravans To The Outer Worlds | 2021 | 382.3165533 | ErTgN2zoTkA | 16 | 1411200 | 2 | 44100 | train | train |
| Godless | Swarm | Deathcult | 2018 | 250.9844898 | 1CdtbR9JHCA | 16 | 1411200 | 2 | 44100 | train | train |
| Gojira | Magma | Stranded | 2016 | 272.4397279 | FNdC_3LR2AI | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Gojira | Magma | Silvera | 2016 | 214.0647619 | iVvXB-Vwnco | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Immortal | Northern Chaos Gods | Northern Chaos Gods | 2018 | 265.4273016 | c5uP9PlEDro | 16 | 1411200 | 2 | 44100 | train | train |
| In Flames | Reroute to Remain | Cloud Connected | 2002 | 223.3295238 | B7iIS91fMAc | 16 | 1411200 | 2 | 44100 | train | train |
| Lamb of God | Lamb of God | Memento Mori | 2020 | 345.3503958 | hBj0-dIU8HI | 16 | 1536000 | 2 | 48000 | train | train |
| Lamb of God | Ashes of the Wake | Laid to Rest | 2004 | 234.1732426 | HL9kaJZw8iw | 16 | 1411200 | 2 | 44100 | train | train |
| Lamb of God | Ashes of the Wake | Omerta | 2004 | 287.5559184 | -xYZM04JxnQ | 16 | 1411200 | 2 | 44100 | train | train |
| Lamb of God | Ashes of the Wake | Now You've Got Something to Die For | 2004 | 219.8000907 | 0m5fIHHfJTw | 16 | 1411200 | 2 | 44100 | train | train |
| Lamb of God | Ashes of the Wake | The Faded Line | 2004 | 278.8019955 | JuRRnVqv2Vc | 16 | 1411200 | 2 | 44100 | train | train |
| Ne Obliviscaris | Citadel | Pyrrhic | 2014 | 590.1351667 | dCyxGNbBWAk | 16 | 1536000 | 2 | 48000 | test/valid | test/valid |
| Ne Obliviscaris | Portal of I | And Plague Flowers the Kaleidoscope | 2012 | 692.8533333 | BNyYiTdqzAY | 16 | 1536000 | 2 | 48000 | test/valid | test/valid |
| Nevermore | This Godless Endeavor | Born | 2005 | 255.8374603 | impRqn44OCA | 16 | 1411200 | 2 | 44100 | train | train |
| Of Mice & Men | Restoring Force | Bones Exposed | 2014 | 271.0697506 | IO-JbFtgeX4 | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Of Mice & Men | Timeless | Obsolete | 2021 | 270.0712925 | hxu3KXVy48w | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Opeth | Blackwater Park | Blackwater Park | 2001 | 732.8798333 | j4xCb_OU_lM | 16 | 1536000 | 2 | 48000 | train | train |
| Parkway Drive | Horizons | Carrion | 2007 | 188.0119728 | BR2kSva4NT8 | 16 | 1411200 | 2 | 44100 | train | train |
| Rings of Saturn | Lugal Ki En | Senseless Massacre | 2014 | 214.3333333 | F3A_3c882us | 16 | 1536000 | 2 | 48000 | test/valid | test/valid |
| Slayer | Seasons in the Abyss | War Ensemble | 1990 | 302.2541497 | jqnC54vbUbU | 16 | 1411200 | 2 | 44100 | train | train |
| Slayer | South of Heaven | South Of Heaven | 1988 | 298.5333333 | 74nTzbgDGWM | 16 | 1536000 | 2 | 48000 | train | train |
| Slipknot | All Hope Is Gone | Psychosocial | 2008 | 302.1148299 | 5abamRO41fE | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Suffocation | …Of the Dark Light | Clarity Through Deprivation | 2017 | 244.1810431 | HUUBI7RJtr8 | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Suicide Silence | The Cleansing | No Pity for a Coward | 2007 | 191.9361451 | hwxTEcHnC1o | 16 | 1411200 | 2 | 44100 | train | train |
| Suicide Silence | No Time to Bleed | Disengage | 2009 | 246.9442177 | FukeNR1ydOA | 16 | 1411200 | 2 | 44100 | train | train |
| Suicide Silence | The Black Crown | You Only Live Once | 2011 | 192.6095238 | ds9s-pzGD0M | 16 | 1411200 | 2 | 44100 | train | train |
| Suicide Silence | The Black Crown | Slaves To Substance | 2011 | 230.1329705 | k27N-jRofrM | 16 | 1411200 | 2 | 44100 | train | train |
| Tesseract | Odyssey | Nocturne | 2015 | 271.0233107 | get0cXOsSXg | 16 | 1411200 | 2 | 44100 | train | train |
| Textures | Silhouettes | Storm Warning | 2008 | 346.8829025 | 4600fGWcn9o | 16 | 1411200 | 2 | 44100 | train | train |
| Textures | Silhouettes | Old Days Born Anew | 2008 | 337.3627211 | 731QmPnjqe4 | 16 | 1411200 | 2 | 44100 | train | train |
| Thy Art Is Murder | Hate | Reign Of Darkness | 2012 | 236.1004989 | 47Plg93oJ1M | 16 | 1411200 | 2 | 44100 | train | train |
| Veil of Maya | False Idol | Overthrow | 2017 | 237.2847166 | GLu-E42-RmA | 16 | 1411200 | 2 | 44100 | test/valid | test/valid |
| Wintersun | Time I | Time | 2012 | 704.7720635 | ebSxxr726_8 | 16 | 1411200 | 2 | 44100 | train | train |
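
The `bitrate` column is consistent with uncompressed PCM audio, where the bitrate is the product of sample rate, bit depth, and channel count. A quick check, assuming the column is expressed in bits per second (the card does not state the unit):

```python
def pcm_bitrate(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels

# The two (sample_rate, bitrate) pairs that occur in the table above.
for sample_rate_hz, listed_bitrate in [(44100, 1411200), (48000, 1536000)]:
    computed = pcm_bitrate(sample_rate_hz, bit_depth=16, channels=2)
    print(sample_rate_hz, computed, computed == listed_bitrate)
```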
#### Initial Data Collection and Normalization
The data was collected from the YouTube playlist above and trimmed using the timestamps provided in the dataset.
The audio files were passed through the [Spleeter](https://joss.theoj.org/papers/10.21105/joss.02154) (Hennequin et al., 2020) source separation algorithm to separate the vocals from the other components.
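The trimming step amounts to converting timestamps into sample indices and slicing. The sketch below is one way to do that, not the code actually used to build the dataset; it assumes timestamps in seconds, the function name `trim` is hypothetical, and the source separation step is not shown.

```python
def trim(samples, sample_rate, start_s, end_s):
    """Return the samples between start_s and end_s (both in seconds)."""
    if not 0 <= start_s < end_s:
        raise ValueError("need 0 <= start_s < end_s")
    start = round(start_s * sample_rate)
    # Clamp the end index so a timestamp past the end of the audio is safe.
    end = min(round(end_s * sample_rate), len(samples))
    return samples[start:end]

# Toy signal: 10 samples at 2 Hz, i.e. 5 seconds of audio.
signal = list(range(10))
print(trim(signal, sample_rate=2, start_s=1.0, end_s=3.0))  # [2, 3, 4, 5]
```

The same slicing works on a NumPy array or any other sequence that supports `len` and slice indexing.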
### Licensing Information
MIT License
Copyright (c) 2022 Vedant Kalbag
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
### Citation Information
```
@article{Kalbag2022,
title={Scream Detection in Heavy Metal Music},
author={Vedant Kalbag and Alexander Lerch},
journal={ArXiv},
year={2022},
volume={2205.05580}
}
```
```
@article{
Hennequin2020,
doi = {10.21105/joss.02154},
url = {https://doi.org/10.21105/joss.02154},
year = {2020}, publisher = {The Open Journal},
volume = {5}, number = {50}, pages = {2154},
author = {Romain Hennequin and Anis Khlif and Felix Voituret and Manuel Moussallam},
title = {Spleeter: a fast and efficient music source separation tool with pre-trained models},
journal = {Journal of Open Source Software}
}
```
Provider:
jpdiazpardo
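Summary of original information
Dataset overview
Basic information
- Task category: audio classification
- Language: English
- Dataset name: Scream classification in heavy metal music
- Dataset size: 1K<n<10K
Dataset contents
- Features:
  - audio: the audio file
  - scream_type: the classification target, one of Layered, LowFry, HighFry, MidFry, Clean
  - song_name: the song name
  - band_name: the band name
  - album_name: the album name
  - release_year: the release year
  - video_id: the YouTube video ID
  - timestamp_start: the start time of the audio clip
  - timestamp_end: the end time of the audio clip
  - sample_rate: the audio sample rate
Dataset splits
- Training set: 1,575 samples, 114577942.825 bytes in total
License
- MIT License
Citation information
@article{ title={Scream Detection in Heavy Metal Music}, author={Vedant Kalbag and Alexander Lerch}, journal={ArXiv}, year={2022}, volume={2205.05580} }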
Collected summary
Dataset introduction

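Construction
The dataset was built to support research on the distinctive vocal styles of extreme music. The raw audio comes from a public YouTube playlist covering 57 heavy metal tracks by 34 bands. Clips were cut at the annotated timestamps, and the Spleeter source separation algorithm was applied to isolate the vocal track. Each sample carries metadata such as start time, end time, and sample rate, and is manually labelled with one of five predefined vocal classes, giving a structured dataset of 1,575 samples.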
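Characteristics
The dataset focuses on vocal performance in heavy metal, and its central feature is a fine-grained classification of screams. It covers five classes: clean singing, low, mid, and high fry screams, and layered vocals. Each sample comes with metadata (song, band, and album names, and release year) that gives context for musicological analysis. The audio has been source-separated to emphasize the vocal signal. Bit depth is 16 throughout the source table, while the sample rate is either 44,100 Hz or 48,000 Hz depending on the track, so resampling may be needed for consistent feature extraction. The source table assigns each song to train or test/valid, but the Hugging Face release exposes a single train split.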
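Usage
The dataset is intended for audio classification, in particular automatic detection of scream types in heavy metal music. It can be loaded directly with the Hugging Face `datasets` library, which returns records containing the audio waveform and the class label. For modelling, the `audio` field supplies the input for acoustic feature extraction and the `scream_type` field serves as the supervised target. The metadata fields support filtering and analysis along several dimensions. Researchers can train deep learning models on the dataset and evaluate how well they recognize extreme vocals in a complex musical setting, in support of music information retrieval and automatic annotation.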
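Because several clips come from the same song, one use of the metadata is to hold out whole songs so that no song contributes clips to both sides of a split. The sketch below illustrates this with `video_id` as the grouping key; the function name `split_by_song` and the example rows are hypothetical, and this is not necessarily the split procedure used in the paper.

```python
def split_by_song(rows, held_out_video_ids):
    """Split rows so that every clip of a held-out song stays together."""
    held_out_video_ids = set(held_out_video_ids)
    train, held_out = [], []
    for row in rows:
        if row["video_id"] in held_out_video_ids:
            held_out.append(row)
        else:
            train.append(row)
    return train, held_out

# Hypothetical clips; the two video IDs appear in the source table.
rows = [
    {"video_id": "K5pMoSECagE", "scream_type": "midfry"},
    {"video_id": "K5pMoSECagE", "scream_type": "clean"},
    {"video_id": "hUNAX1UYeAE", "scream_type": "lowfry"},
]
train, held_out = split_by_song(rows, {"K5pMoSECagE"})
print(len(train), len(held_out))  # 1 2
```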
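Background and challenges
Background
In music information retrieval, the vocal styles specific to extreme music have received little systematic study. In 2022, Vedant Kalbag and Alexander Lerch of the Georgia Institute of Technology built this scream detection dataset for heavy metal music in order to analyze the acoustic characteristics of extreme vocals and how they can be classified. The dataset covers 57 songs by 34 bands, with vocal events annotated into five classes: clean singing, low, mid, and high fry screams, and layered vocals. It provides experimental material for sound event detection and related work such as music affective computing, and it helps close a gap in the automatic analysis of this genre.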
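Current challenges
The dataset targets automatic classification of extreme vocals in heavy metal. The central difficulty is that screamed vocals show highly nonlinear time-frequency behavior and vary considerably between performers, so conventional acoustic models struggle to capture the fine distinctions between classes. Construction raises further problems. The raw audio has to be extracted from a streaming platform and passed through source separation, which makes consistent audio quality hard to guarantee. Annotation requires trained listeners with genre knowledge, and subjective labelling criteria can introduce bias. The dataset size is also limited by the publicly available material for the genre, so limited sample diversity may hurt model generalization, and the boundary of a layered vocal event is itself semantically ambiguous.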
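Common scenarios
Typical use
In music information retrieval, heavy metal is a special case for audio classification because of its distinctive screamed vocal style. By labelling 1,575 audio clips into five classes (low, mid, and high fry screams, clean vocals, and layered vocals), the dataset offers a standardized basis for training and evaluating scream detection models. Researchers can use it to train deep neural networks that automatically recognize and classify the complex vocal styles found in heavy metal.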
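Research problems addressed
The dataset addresses the lack of quantified acoustic descriptions of screamed vocals in music signal processing. Conventional music classification research concentrates on broad features such as melody and rhythm and lacks systematic annotation of the fine differences between extreme vocal styles. With timestamps and class labels for each vocal event, the dataset lets researchers study the spectral characteristics and temporal patterns of screams and their relation to musical emotion.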
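Derived and related work
According to this summary, the dataset has prompted follow-up research, such as scream emotion analysis models based on multimodal fusion, which combine audio features with lyric text to study expressive intent. Later work is said to extend the dataset's applicability with cross-lingual scream style transfer algorithms that convert singing styles between musical genres. Other work is said to focus on real-time scream detection in interactive music systems, for audio processing during live performance.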
The content above was collected and summarized by 遇见数据集.



