free-music-archive-commercial-16khz-full
Hosted on ModelScope (魔搭社区). Updated 2025-12-05; indexed 2025-03-22.
Download link: https://modelscope.cn/datasets/benjamin-paine/free-music-archive-commercial-16khz-full
# FMA: A Dataset for Music Analysis
[Michaël Defferrard](https://deff.ch/), [Kirell Benzi](https://kirellbenzi.com/), [Pierre Vandergheynst](https://people.epfl.ch/pierre.vandergheynst), [Xavier Bresson](https://www.ntu.edu.sg/home/xbresson).
**International Society for Music Information Retrieval Conference (ISMIR), 2017.**
> We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.
Paper: [arXiv:1612.01840](https://arxiv.org/abs/1612.01840) - [latex and reviews](https://github.com/mdeff/paper-fma-ismir2017)
Slides: [doi:10.5281/zenodo.1066119](https://doi.org/10.5281/zenodo.1066119)
Poster: [doi:10.5281/zenodo.1035847](https://doi.org/10.5281/zenodo.1035847)
# This Pack
This is the **full** dataset, limited to the **commercially licensed** samples: **8,802 clips** of **untrimmed length**, totaling **531 hours** of audio in **10.5 GB** of disk space.
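As a quick sanity check on these figures, the sketch below derives the mean clip length and the mean stored bitrate. It assumes that "GB" means decimal gigabytes (10^9 bytes), which the description does not state; the variable names are illustrative only.

```python
# Figures quoted in the pack description above.
NUM_CLIPS = 8_802
TOTAL_HOURS = 531
SIZE_GB = 10.5  # assumption: decimal gigabytes (10**9 bytes)

total_seconds = TOTAL_HOURS * 3600

# Average duration of one clip, in seconds.
mean_clip_seconds = total_seconds / NUM_CLIPS

# Average number of stored bits per second of audio, in kbit/s.
mean_bitrate_kbps = SIZE_GB * 1e9 * 8 / total_seconds / 1000

print(f"mean clip length: {mean_clip_seconds / 60:.2f} min")
print(f"mean stored bitrate: {mean_bitrate_kbps:.1f} kbit/s")
```

Under these assumptions the average clip runs a little over three and a half minutes, and the stored bitrate is far below the 256 kbit/s that uncompressed 16-bit mono PCM at 16 kHz would need, which suggests (but does not prove) that the audio is stored in a compressed format.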
# License
- The [FMA codebase](https://github.com/mdeff/fma) is released under [The MIT License](https://github.com/mdeff/fma/blob/master/LICENSE.txt).
- The FMA metadata is released under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0).
- The individual files are released under various Creative Commons family licenses, plus a small number of other licenses. **Each file has its license attached, with the important details of that license enumerated.** To make the data easy to use for developers and trainers, a configuration is available that restricts it to commercially usable data.
Please refer to any of the following URLs for additional details.
| Class Label | License Name | URL |
| ----------- | ------------ | --- |
| 0 | CC-BY 1.0 | https://creativecommons.org/licenses/by/1.0/ |
| 1 | CC-BY 2.0 | https://creativecommons.org/licenses/by/2.0/ |
| 2 | CC-BY 2.5 | https://creativecommons.org/licenses/by/2.5/ |
| 3 | CC-BY 3.0 | https://creativecommons.org/licenses/by/3.0/ |
| 4 | CC-BY 4.0 | https://creativecommons.org/licenses/by/4.0/ |
| 5 | CC-Sampling+ 1.0 | https://creativecommons.org/licenses/sampling+/1.0/ |
| 6 | CC0 1.0 | https://creativecommons.org/publicdomain/zero/1.0/ |
| 7 | FMA Sound Recording Common Law | https://freemusicarchive.org/Sound_Recording_Common_Law |
| 8 | Free Art License | https://artlibre.org/licence/lal/en |
| 9 | Public Domain Mark 1.0 | https://creativecommons.org/publicdomain/mark/1.0/ |
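The class labels in the table above can be decoded with a plain lookup. The following is a minimal sketch, not the dataset's own tooling: the record layout and the `license` field name are hypothetical, and real rows may use different field names.

```python
from typing import Iterable, Iterator

# Class-label-to-license mapping, copied from the table above.
LICENSE_NAMES: dict[int, str] = {
    0: "CC-BY 1.0",
    1: "CC-BY 2.0",
    2: "CC-BY 2.5",
    3: "CC-BY 3.0",
    4: "CC-BY 4.0",
    5: "CC-Sampling+ 1.0",
    6: "CC0 1.0",
    7: "FMA Sound Recording Common Law",
    8: "Free Art License",
    9: "Public Domain Mark 1.0",
}


def license_name(label: int) -> str:
    """Return the license name for a class label, per the table above."""
    return LICENSE_NAMES[label]


def filter_by_license(records: Iterable[dict], allowed: set[int]) -> Iterator[dict]:
    """Yield only records whose (hypothetical) 'license' field is in `allowed`."""
    for record in records:
        if record["license"] in allowed:
            yield record


# Hypothetical records, used only to exercise the helpers.
sample = [
    {"title": "track-a", "license": 4},
    {"title": "track-b", "license": 6},
    {"title": "track-c", "license": 9},
]

# Example: keep only CC0 1.0 and Public Domain Mark 1.0 tracks.
kept = list(filter_by_license(sample, allowed={6, 9}))
print([(r["title"], license_name(r["license"])) for r in kept])
```

Filtering on the integer label rather than on the license name avoids string-matching mistakes between variants such as "CC-BY 2.0" and "CC-BY 2.5".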
## Total Duration by License
| License | Total Duration (Percentage) |
| ------- | --------------------------- |
| CC-BY 4.0 | 377.0 hours (4.65%) |
| CC-BY 3.0 | 106.9 hours (1.32%) |
| FMA Sound Recording Common Law | 19.9 hours (0.25%) |
| CC0 1.0 | 10.5 hours (0.13%) |
| CC-BY 1.0 | 10.4 hours (0.13%) |
| Free Art License | 2.7 hours (0.03%) |
| CC-BY 2.0 | 2.5 hours (0.03%) |
| CC-Sampling+ 1.0 | 53.9 minutes (0.01%) |
| CC-BY 2.5 | 11.2 minutes (0.00%) |
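The percentages in this table sum to roughly 6.55%, so they appear to be measured against a larger collection (presumably the full FMA) rather than against this pack; that reading is an inference, not something the source states. The sketch below sums the listed durations and recomputes each license's share within this pack alone.

```python
# Durations from the table above, converted to hours.
DURATION_HOURS = {
    "CC-BY 4.0": 377.0,
    "CC-BY 3.0": 106.9,
    "FMA Sound Recording Common Law": 19.9,
    "CC0 1.0": 10.5,
    "CC-BY 1.0": 10.4,
    "Free Art License": 2.7,
    "CC-BY 2.0": 2.5,
    "CC-Sampling+ 1.0": 53.9 / 60,  # listed in minutes
    "CC-BY 2.5": 11.2 / 60,         # listed in minutes
}

# Total audio covered by the table, in hours.
total_hours = sum(DURATION_HOURS.values())

# Share of each license within this pack, in percent.
shares = {name: 100 * hours / total_hours for name, hours in DURATION_HOURS.items()}

print(f"total: {total_hours:.1f} hours")
for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```

The total agrees with the 531 hours quoted for this pack, and CC-BY 4.0 alone accounts for about 71% of the audio.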
# Citations
```
@inproceedings{fma_dataset,
title = {{FMA}: A Dataset for Music Analysis},
author = {Defferrard, Micha\"el and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
booktitle = {18th International Society for Music Information Retrieval Conference (ISMIR)},
year = {2017},
archiveprefix = {arXiv},
eprint = {1612.01840},
url = {https://arxiv.org/abs/1612.01840},
}
```
```
@inproceedings{fma_challenge,
title = {Learning to Recognize Musical Genre from Audio},
subtitle = {Challenge Overview},
author = {Defferrard, Micha\"el and Mohanty, Sharada P. and Carroll, Sean F. and Salath\'e, Marcel},
booktitle = {The 2018 Web Conference Companion},
year = {2018},
publisher = {ACM Press},
isbn = {9781450356404},
doi = {10.1145/3184558.3192310},
archiveprefix = {arXiv},
eprint = {1803.05337},
url = {https://arxiv.org/abs/1803.05337},
}
```
Provider: maas
Created: 2025-03-18
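# Background Overview
FMA is an open dataset designed for music information retrieval. This pack contains 8,802 commercially licensed, untrimmed audio samples totaling 531 hours of audio in 10.5 GB. The audio is released mostly under Creative Commons licenses, and the dataset is intended to support music analysis tasks.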