free-music-archive-medium
Source: ModelScope community (魔搭社区). Updated 2025-12-05; indexed 2025-03-22.
Download link:
https://modelscope.cn/datasets/benjamin-paine/free-music-archive-medium
# FMA: A Dataset for Music Analysis
[Michaël Defferrard](https://deff.ch/), [Kirell Benzi](https://kirellbenzi.com/), [Pierre Vandergheynst](https://people.epfl.ch/pierre.vandergheynst), [Xavier Bresson](https://www.ntu.edu.sg/home/xbresson).
**International Society for Music Information Retrieval Conference (ISMIR), 2017.**
> We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.
Paper: [arXiv:1612.01840](https://arxiv.org/abs/1612.01840) - [latex and reviews](https://github.com/mdeff/paper-fma-ismir2017)
Slides: [doi:10.5281/zenodo.1066119](https://doi.org/10.5281/zenodo.1066119)
Poster: [doi:10.5281/zenodo.1035847](https://doi.org/10.5281/zenodo.1035847)
# This Pack
This is the **medium** dataset: **24,801 samples**, each clipped to **30 seconds**, spanning **16** *unbalanced* genres and totaling **206.6 hours** of audio.
## Repack Notes
- 20 files were unreadable by `libsndfile` / `libmpg123` and were removed.
- 179 files were removed because their licenses did not clearly permit redistribution or the full license text was unavailable.
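The counts above can be cross-checked with a little arithmetic. The sketch below assumes that the upstream `fma_medium` subset holds 25,000 clips (the figure given in the FMA repository) and that no clip exceeds 30 seconds; the variable names are illustrative only.

```python
# Sanity-check the repack counts.
# Assumptions: upstream fma_medium has 25,000 clips, and every clip
# in this pack is at most 30 seconds long.
SAMPLES_IN_PACK = 24_801
REMOVED_UNREADABLE = 20
REMOVED_UNCLEAR_LICENSE = 179
CLIP_SECONDS_MAX = 30

# Clips present before the two removal steps.
original_count = SAMPLES_IN_PACK + REMOVED_UNREADABLE + REMOVED_UNCLEAR_LICENSE

# Total duration if every remaining clip ran the full 30 seconds.
upper_bound_hours = SAMPLES_IN_PACK * CLIP_SECONDS_MAX / 3600

print(f"clips before removal: {original_count}")
print(f"duration upper bound: {upper_bound_hours:.3f} hours")
```

The upper bound lands just above the reported 206.6 hours, which would be consistent with a few clips running slightly under 30 seconds.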
# License
- The [FMA codebase](https://github.com/mdeff/fma) is released under [The MIT License](https://github.com/mdeff/fma/blob/master/LICENSE.txt).
- The FMA metadata is released under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0).
- The individual files are released under various Creative Commons family licenses, plus a small number of additional licenses. **Each file has its license attached, with the important details of that license enumerated.** For the convenience of developers and trainers, a configuration is available that restricts the data to commercially usable files.
Please refer to any of the following URLs for additional details.
| Class Label | License Name | URL |
| ----------- | ------------ | --- |
| 0 | CC-BY 1.0 | https://creativecommons.org/licenses/by/1.0/ |
| 1 | CC-BY 2.0 | https://creativecommons.org/licenses/by/2.0/ |
| 2 | CC-BY 2.5 | https://creativecommons.org/licenses/by/2.5/ |
| 3 | CC-BY 3.0 | https://creativecommons.org/licenses/by/3.0/ |
| 4 | CC-BY 4.0 | https://creativecommons.org/licenses/by/4.0/ |
| 5 | CC-BY-NC 2.0 | https://creativecommons.org/licenses/by-nc/2.0/ |
| 6 | CC-BY-NC 2.1 | https://creativecommons.org/licenses/by-nc/2.1/ |
| 7 | CC-BY-NC 2.5 | https://creativecommons.org/licenses/by-nc/2.5/ |
| 8 | CC-BY-NC 3.0 | https://creativecommons.org/licenses/by-nc/3.0/ |
| 9 | CC-BY-NC 4.0 | https://creativecommons.org/licenses/by-nc/4.0/ |
| 10 | CC-BY-NC-ND 2.0 | https://creativecommons.org/licenses/by-nc-nd/2.0/ |
| 11 | CC-BY-NC-ND 2.1 | https://creativecommons.org/licenses/by-nc-nd/2.1/ |
| 12 | CC-BY-NC-ND 2.5 | https://creativecommons.org/licenses/by-nc-nd/2.5/ |
| 13 | CC-BY-NC-ND 3.0 | https://creativecommons.org/licenses/by-nc-nd/3.0/ |
| 14 | CC-BY-NC-ND 4.0 | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
| 15 | CC-BY-NC-SA 2.0 | https://creativecommons.org/licenses/by-nc-sa/2.0/ |
| 16 | CC-BY-NC-SA 2.1 | https://creativecommons.org/licenses/by-nc-sa/2.1/ |
| 17 | CC-BY-NC-SA 2.5 | https://creativecommons.org/licenses/by-nc-sa/2.5/ |
| 18 | CC-BY-NC-SA 3.0 | https://creativecommons.org/licenses/by-nc-sa/3.0/ |
| 19 | CC-BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| 20 | CC-BY-ND 2.0 | https://creativecommons.org/licenses/by-nd/2.0/ |
| 21 | CC-BY-ND 2.5 | https://creativecommons.org/licenses/by-nd/2.5/ |
| 22 | CC-BY-ND 3.0 | https://creativecommons.org/licenses/by-nd/3.0/ |
| 23 | CC-BY-ND 4.0 | https://creativecommons.org/licenses/by-nd/4.0/ |
| 24 | CC-BY-SA 2.0 | https://creativecommons.org/licenses/by-sa/2.0/ |
| 25 | CC-BY-SA 2.5 | https://creativecommons.org/licenses/by-sa/2.5/ |
| 26 | CC-BY-SA 3.0 | https://creativecommons.org/licenses/by-sa/3.0/ |
| 27 | CC-BY-SA 4.0 | https://creativecommons.org/licenses/by-sa/4.0/ |
| 28 | CC-NC-Sampling+ 1.0 | https://creativecommons.org/licenses/nc-sampling+/1.0/ |
| 29 | CC-Sampling+ 1.0 | https://creativecommons.org/licenses/sampling+/1.0/ |
| 30 | FMA Sound Recording Common Law | https://freemusicarchive.org/Sound_Recording_Common_Law |
| 31 | Free Art License | https://artlibre.org/licence/lal/en |
| 32 | Free Music Philosophy (FMP) | https://irdial.com/free_and_easy.htm |
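As a rough illustration of how the class labels might be used to pick out commercially usable files, the sketch below maps each label to its license name from the table and sorts it into one of three buckets. The bucket names and the `classify` helper are hypothetical and are not part of the dataset's own configuration. Only the CC-BY, CC-BY-SA, and CC-BY-ND families are treated as permitting commercial use here; anything outside those families that lacks an NC clause is flagged for manual review rather than guessed at.

```python
# Class label -> license name, in the order given by the table above.
LICENSE_NAMES = [
    "CC-BY 1.0", "CC-BY 2.0", "CC-BY 2.5", "CC-BY 3.0", "CC-BY 4.0",
    "CC-BY-NC 2.0", "CC-BY-NC 2.1", "CC-BY-NC 2.5", "CC-BY-NC 3.0",
    "CC-BY-NC 4.0",
    "CC-BY-NC-ND 2.0", "CC-BY-NC-ND 2.1", "CC-BY-NC-ND 2.5",
    "CC-BY-NC-ND 3.0", "CC-BY-NC-ND 4.0",
    "CC-BY-NC-SA 2.0", "CC-BY-NC-SA 2.1", "CC-BY-NC-SA 2.5",
    "CC-BY-NC-SA 3.0", "CC-BY-NC-SA 4.0",
    "CC-BY-ND 2.0", "CC-BY-ND 2.5", "CC-BY-ND 3.0", "CC-BY-ND 4.0",
    "CC-BY-SA 2.0", "CC-BY-SA 2.5", "CC-BY-SA 3.0", "CC-BY-SA 4.0",
    "CC-NC-Sampling+ 1.0", "CC-Sampling+ 1.0",
    "FMA Sound Recording Common Law", "Free Art License",
    "Free Music Philosophy (FMP)",
]

# Assumption of this sketch: only these families count as commercial-ok.
COMMERCIAL_OK_FAMILIES = {"CC-BY", "CC-BY-SA", "CC-BY-ND"}


def classify(label: int) -> str:
    """Return a hypothetical usage bucket for a license class label."""
    family = LICENSE_NAMES[label].split(" ")[0]  # e.g. "CC-BY-NC"
    if family in COMMERCIAL_OK_FAMILIES:
        return "commercial-ok"
    if family.startswith("CC-") and "NC" in family.split("-"):
        return "non-commercial"
    return "needs-review"  # non-CC licenses and CC-Sampling+


commercial_labels = [
    label for label in range(len(LICENSE_NAMES))
    if classify(label) == "commercial-ok"
]
print(commercial_labels)
```

A per-file filter would then keep only rows whose license label appears in `commercial_labels`; how the label is stored on each row depends on the dataset schema, which is not described here.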
## Total Duration by License
| License | Total Duration (Percentage) |
| ------- | --------------------------- |
| CC-BY-NC-SA 3.0 | 64.4 hours (31.20%) |
| CC-BY-NC-ND 3.0 | 55.2 hours (26.70%) |
| CC-BY-NC-ND 4.0 | 26.8 hours (12.96%) |
| CC-BY-NC-SA 4.0 | 13.7 hours (6.65%) |
| CC-BY 4.0 | 9.3 hours (4.50%) |
| CC-BY-NC 3.0 | 7.1 hours (3.42%) |
| CC-BY-NC 4.0 | 6.4 hours (3.11%) |
| CC-BY 3.0 | 4.7 hours (2.28%) |
| CC-BY-SA 3.0 | 3.8 hours (1.84%) |
| FMA Sound Recording Common Law | 3.4 hours (1.62%) |
| CC-BY-SA 4.0 | 3.4 hours (1.62%) |
| CC-BY-NC-SA 2.0 | 2.0 hours (0.97%) |
| CC-BY-NC-ND 2.0 | 1.7 hours (0.83%) |
| CC0 1.0 | 58.0 minutes (0.47%) |
| CC-BY-ND 3.0 | 51.4 minutes (0.42%) |
| CC-BY-ND 4.0 | 46.4 minutes (0.37%) |
| CC-BY-NC-ND 2.5 | 37.4 minutes (0.30%) |
| CC-BY-NC-SA 2.5 | 34.5 minutes (0.28%) |
| CC-BY-NC 2.5 | 18.5 minutes (0.15%) |
| CC-BY-NC 2.1 | 7.5 minutes (0.06%) |
| CC-NC-Sampling+ 1.0 | 6.0 minutes (0.05%) |
| CC-BY-NC-ND 2.1 | 4.5 minutes (0.04%) |
| CC-BY-SA 2.0 | 4.5 minutes (0.04%) |
| CC-BY-ND 2.0 | 3.5 minutes (0.03%) |
| CC-BY-ND 2.5 | 3.0 minutes (0.02%) |
| Free Art License | 3.0 minutes (0.02%) |
| CC-Sampling+ 1.0 | 2.5 minutes (0.02%) |
| CC-BY 2.0 | 2.0 minutes (0.02%) |
| CC-BY 2.5 | 1.0 minutes (0.01%) |
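To get a feel for how restrictive the pack is overall, the percentages in the table can be summed by license clause. The sketch below copies the percentage column from the table and adds up the share of audio carrying a NonCommercial (NC), NoDerivatives (ND), or ShareAlike (SA) clause. The `clause_share` helper is hypothetical, and the clause is detected purely from the license name.

```python
# (license name, percentage of total duration), copied from the table above.
DURATION_SHARE = [
    ("CC-BY-NC-SA 3.0", 31.20), ("CC-BY-NC-ND 3.0", 26.70),
    ("CC-BY-NC-ND 4.0", 12.96), ("CC-BY-NC-SA 4.0", 6.65),
    ("CC-BY 4.0", 4.50), ("CC-BY-NC 3.0", 3.42), ("CC-BY-NC 4.0", 3.11),
    ("CC-BY 3.0", 2.28), ("CC-BY-SA 3.0", 1.84),
    ("FMA Sound Recording Common Law", 1.62), ("CC-BY-SA 4.0", 1.62),
    ("CC-BY-NC-SA 2.0", 0.97), ("CC-BY-NC-ND 2.0", 0.83),
    ("CC0 1.0", 0.47), ("CC-BY-ND 3.0", 0.42), ("CC-BY-ND 4.0", 0.37),
    ("CC-BY-NC-ND 2.5", 0.30), ("CC-BY-NC-SA 2.5", 0.28),
    ("CC-BY-NC 2.5", 0.15), ("CC-BY-NC 2.1", 0.06),
    ("CC-NC-Sampling+ 1.0", 0.05), ("CC-BY-NC-ND 2.1", 0.04),
    ("CC-BY-SA 2.0", 0.04), ("CC-BY-ND 2.0", 0.03), ("CC-BY-ND 2.5", 0.02),
    ("Free Art License", 0.02), ("CC-Sampling+ 1.0", 0.02),
    ("CC-BY 2.0", 0.02), ("CC-BY 2.5", 0.01),
]


def clause_share(clause: str) -> float:
    """Sum the duration percentage of licenses whose name has `clause`."""
    total = 0.0
    for name, percent in DURATION_SHARE:
        # "CC-BY-NC-SA 3.0" -> ["CC", "BY", "NC", "SA"]
        tokens = name.split(" ")[0].split("-")
        if clause in tokens:
            total += percent
    return total


for clause in ("NC", "ND", "SA"):
    print(f"{clause}: {clause_share(clause):.2f}% of total duration")
```

Because the clause is read from the name alone, the non-CC licenses and CC0 contribute to none of the three sums.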
# Citations
```
@inproceedings{fma_dataset,
title = {{FMA}: A Dataset for Music Analysis},
author = {Defferrard, Micha\"el and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
booktitle = {18th International Society for Music Information Retrieval Conference (ISMIR)},
year = {2017},
archiveprefix = {arXiv},
eprint = {1612.01840},
url = {https://arxiv.org/abs/1612.01840},
}
```
```
@inproceedings{fma_challenge,
title = {Learning to Recognize Musical Genre from Audio},
subtitle = {Challenge Overview},
author = {Defferrard, Micha\"el and Mohanty, Sharada P. and Carroll, Sean F. and Salath\'e, Marcel},
booktitle = {The 2018 Web Conference Companion},
year = {2018},
publisher = {ACM Press},
isbn = {9781450356404},
doi = {10.1145/3184558.3192310},
archiveprefix = {arXiv},
eprint = {1803.05337},
url = {https://arxiv.org/abs/1803.05337},
}
```
Provider: maas
Created: 2025-03-18



