ghermoso/egtzan_plus
Hugging Face | Updated: 2024-04-09 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/ghermoso/egtzan_plus
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': afro
          '1': classical
          '2': country
          '3': disco
          '4': electro
          '5': jazz
          '6': latin
          '7': metal
          '8': pop
          '9': rap
          '10': reggae
          '11': rock
  splits:
  - name: train
    num_bytes: 128963338.5857826
    num_examples: 1697
  - name: test
    num_bytes: 14256351.565217393
    num_examples: 189
  download_size: 143291941
  dataset_size: 143219690.151
license: mit
task_categories:
- image-classification
tags:
- music
size_categories:
- 1K<n<10K
---
# Dataset Card
The egtzan_plus dataset is a GTZAN-like dataset for musical genre classification in the vision domain.
In egtzan_plus, new classes such as Electro and Afro have been added to the original GTZAN dataset. Each 30-second audio track is converted into a Mel-frequency spectrogram using Librosa:
```python
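import librosa
import librosa.display
import numpy as np

# Mel-frequency spectrogram generation (audio_file is the path to one 30 s track)
y, sr = librosa.load(audio_file)
ms = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8000)
log_ms = librosa.power_to_db(ms, ref=np.max)  # dB relative to the peak power
librosa.display.specshow(log_ms)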
```
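To make the `power_to_db(ms, ref=np.max)` step in the snippet above more concrete, here is a minimal pure-Python sketch of the scaling it applies. It assumes librosa's default `amin=1e-10` and `top_db=80.0`. The function name `power_to_db_sketch` and the toy input are hypothetical and not part of the dataset's pipeline.

```python
import math

def power_to_db_sketch(power, amin=1e-10, top_db=80.0):
    """Illustrative re-implementation of the dB scaling, using the peak as reference."""
    # Reference power is the largest value in the spectrogram (ref=np.max).
    ref = max(max(row) for row in power)
    ref_db = 10.0 * math.log10(max(amin, ref))
    # Convert each power value to decibels relative to the reference.
    db = [[10.0 * math.log10(max(amin, v)) - ref_db for v in row] for row in power]
    # Clip everything that lies more than top_db below the peak.
    floor = max(max(row) for row in db) - top_db
    return [[max(v, floor) for v in row] for row in db]

# Hypothetical 2x2 "spectrogram" of power values.
toy = [[1.0, 0.1], [0.01, 0.0]]
log_toy = power_to_db_sketch(toy)
```

Under these assumptions the loudest bin maps to 0 dB and every other bin is non-positive, with silence clipped at 80 dB below the peak. The resulting images therefore share a fixed dynamic range regardless of how loud each track is.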
The dataset contains the following classes:
- Afro
- Classical
- Country
- Disco
- Electro
- Jazz
- Latin
- Metal
- Pop
- Rap
- Reggae
- Rock
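The class list above corresponds to the integer label ids declared in the card metadata (`'0': afro` through `'11': rock`). As a small sketch, the mapping can be written out in plain Python. The variable names `LABEL_NAMES`, `id2label` and `label2id` are illustrative choices, not something the dataset ships.

```python
# Class names in the order of their integer ids, as declared in the card metadata.
LABEL_NAMES = [
    "afro", "classical", "country", "disco", "electro", "jazz",
    "latin", "metal", "pop", "rap", "reggae", "rock",
]

# Integer id -> class name, and the inverse lookup.
id2label = dict(enumerate(LABEL_NAMES))
label2id = {name: idx for idx, name in id2label.items()}
```

Dictionaries of this shape are what image-classification training code typically expects when configuring a model head with 12 output classes.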
The dataset is split into train and test sets as follows:
- Train: 1697 examples
- Test: 189 examples
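A possible way to load the two splits and sanity-check the counts above is sketched below. It assumes the Hugging Face `datasets` package is installed and that the repository id matches the page title (`ghermoso/egtzan_plus`). The helper names `load_egtzan_plus` and `split_fractions` are hypothetical.

```python
# Example counts stated in this card.
EXPECTED_SPLITS = {"train": 1697, "test": 189}

def load_egtzan_plus(repo_id="ghermoso/egtzan_plus"):
    """Download both splits; requires the `datasets` package and network access."""
    # Imported lazily so the pure-Python helper below works without the package.
    from datasets import load_dataset
    return load_dataset(repo_id)

def split_fractions(split_sizes):
    """Return the share of all examples that falls into each split."""
    total = sum(split_sizes.values())
    return {name: count / total for name, count in split_sizes.items()}

fractions = split_fractions(EXPECTED_SPLITS)

# Usage (not executed here, since it downloads the data):
# ds = load_egtzan_plus()
# assert {name: split.num_rows for name, split in ds.items()} == EXPECTED_SPLITS
# example = ds["train"][0]
# genre = ds["train"].features["label"].int2str(example["label"])
```

With 1697 training and 189 test examples (1886 in total), the split is roughly 90/10.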
Provider:
ghermoso
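Summary of original information
Dataset overview
Dataset information
- Name: egtzan_plus
- Description: A GTZAN-like dataset for music genre classification, with added classes such as Electro and Afro. Each 30-second audio track is converted into a Mel-frequency spectrogram with Librosa.
Data structure
- Features:
image: image type, representing the Mel-frequency spectrogram.
label: class label, with the following classes:
- Afro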
- Classical
- Country
- Disco
- Electro
- Jazz
- Latin
- Metal
- Pop
- Rap
- Reggae
- Rock
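Data splits
- Training set:
- Number of examples: 1697
- Size: 128963338.5857826 bytes
- Test set:
- Number of examples: 189
- Size: 14256351.565217393 bytes
Dataset size
- Download size: 143291941 bytes
- Dataset size: 143219690.151 bytes
License
- License: MIT
Task categories
- Image classification
Tags
- Music
Size category
- 1K<n<10K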



