mtg-jamendo
ModelScope community. Updated 2026-05-14; indexed 2025-04-12.
Download link:
https://modelscope.cn/datasets/pengzhendong/mtg-jamendo
Resource description:
# Dataset Card for MTG Jamendo Dataset
## Dataset Description
- **Repository:** [MTG Jamendo dataset repository](https://github.com/MTG/mtg-jamendo-dataset)
### Dataset Summary
The MTG-Jamendo Dataset is a new open dataset for music auto-tagging. It is built from music available on Jamendo under Creative Commons licenses and from tags provided by content uploaders. The dataset contains over 55,000 full audio tracks with 195 tags from the genre, instrument, and mood/theme categories. We provide detailed data splits for researchers and report the performance of a simple baseline approach on five sets of tags: genre, instrument, mood/theme, top-50, and overall.
## Dataset Structure
### Data Fields
- `id`: an integer containing the id of the track
- `artist_id`: an integer containing the id of the artist
- `album_id`: an integer containing the id of the album
- `duration_in_sec`: duration of the track as a float
- `genres`: list of strings, describing genres the track is assigned to
- `instruments`: list of strings for the main instruments of the track
- `moods`: list of strings, describing the moods the track is assigned to
- `audio`: audio of the track
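
The fields above can be sketched as a typed record. This is a minimal illustration, not the dataset's official schema: `TrackRecord` and `count_tags` are hypothetical names, the type of `audio` is left open because the card does not specify it, and the example rows are made up.

```python
from collections import Counter
from typing import Any, Iterable, List, TypedDict


class TrackRecord(TypedDict):
    """One dataset row, mirroring the fields listed in the card."""
    id: int
    artist_id: int
    album_id: int
    duration_in_sec: float
    genres: List[str]
    instruments: List[str]
    moods: List[str]
    audio: Any  # assumption: the card does not say how audio is represented


def count_tags(records: Iterable[TrackRecord], field: str) -> Counter:
    """Count how often each tag occurs in one of the list-valued fields."""
    if field not in ("genres", "instruments", "moods"):
        raise ValueError(f"not a tag field: {field}")
    counts: Counter = Counter()
    for record in records:
        counts.update(record[field])
    return counts


# Illustrative rows only: ids and tags are invented, not taken from the dataset.
example_rows: List[TrackRecord] = [
    {"id": 1, "artist_id": 10, "album_id": 100, "duration_in_sec": 201.5,
     "genres": ["rock", "pop"], "instruments": ["guitar"],
     "moods": ["happy"], "audio": None},
    {"id": 2, "artist_id": 11, "album_id": 101, "duration_in_sec": 187.0,
     "genres": ["rock"], "instruments": ["piano"],
     "moods": [], "audio": None},
]

genre_counts = count_tags(example_rows, "genres")
```

Because `genres`, `instruments`, and `moods` are lists, a track can carry several tags per category or none at all, which is why the helper updates the counter with the whole list for each row.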
### Data Splits
This dataset has two balanced splits: _train_ (90%) and _validation_ (10%).
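
One way to sanity-check the stated ratio after loading the data is to convert per-split row counts into fractions. The sketch below is only an illustration: `split_fractions` is a hypothetical helper, and the row counts are placeholders chosen to show a 90/10 ratio, not the dataset's real sizes.

```python
from typing import Dict


def split_fractions(split_sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's share of the total number of rows."""
    total = sum(split_sizes.values())
    if total == 0:
        raise ValueError("no rows in any split")
    return {name: size / total for name, size in split_sizes.items()}


# Placeholder counts, NOT the real split sizes.
placeholder_sizes = {"train": 900, "validation": 100}
fractions = split_fractions(placeholder_sizes)
```

With real data, the counts would come from the length of each loaded split.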
### Licensing Information
Version 1.0.0 of this dataset is released under the [Apache-2.0 License](http://www.apache.org/licenses/LICENSE-2.0).
### Citation Information
```
@conference {bogdanov2019mtg,
author = "Bogdanov, Dmitry and Won, Minz and Tovstogan, Philip and Porter, Alastair and Serra, Xavier",
title = "The MTG-Jamendo Dataset for Automatic Music Tagging",
booktitle = "Machine Learning for Music Discovery Workshop, International Conference on Machine Learning (ICML 2019)",
year = "2019",
address = "Long Beach, CA, United States",
url = "http://hdl.handle.net/10230/42015"
}
```
Provider:
maas
Created:
2025-04-08



