free-music-archive-full

ModelScope community. Updated 2025-12-05; indexed 2025-03-22.
Download link:
https://modelscope.cn/datasets/benjamin-paine/free-music-archive-full
Description:
# FMA: A Dataset for Music Analysis

[Michaël Defferrard](https://deff.ch/), [Kirell Benzi](https://kirellbenzi.com/), [Pierre Vandergheynst](https://people.epfl.ch/pierre.vandergheynst), [Xavier Bresson](https://www.ntu.edu.sg/home/xbresson).

**International Society for Music Information Retrieval Conference (ISMIR), 2017.**

> We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.

Paper: [arXiv:1612.01840](https://arxiv.org/abs/1612.01840) - [latex and reviews](https://github.com/mdeff/paper-fma-ismir2017)

Slides: [doi:10.5281/zenodo.1066119](https://doi.org/10.5281/zenodo.1066119)

Poster: [doi:10.5281/zenodo.1035847](https://doi.org/10.5281/zenodo.1035847)

# This Pack

This is the **full** dataset, comprising a total of **106,199** clips of **untrimmed length** over **16** *unbalanced* genres totaling **8,104 hours** of audio.

Packed as Parquet files, this dataset is 593 GB in size, roughly a 34% size saving over the original ZIP file.

## Repack Notes

- 173 files were unreadable by `libsndfile / libmpg123`, these were removed.
- 202 files had licenses that were unclear on whether or not they permitted redistribution, or the full license text was unavailable. These were removed.
- Many of the remaining files had mixed or inconsistent encoding. To homogenize the dataset, all audio was re-encoded using `libmpg123`.

# License

- The [FMA codebase](https://github.com/mdeff/fma) is released under [The MIT License](https://github.com/mdeff/fma/blob/master/LICENSE.txt).
- The FMA metadata is released under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0).
- The individual files are released under various Creative Commons family licenses, with a small amount of additional licenses. **Each file has its license attached and important details of the license enumerated.** To make it easy to use for developers and trainers, a configuration is available to limit only to commercially-usable data.

Please refer to any of the following URLs for additional details.

| Class Label | License Name | URL |
| ----------- | ------------ | --- |
| 0 | CC-BY 1.0 | https://creativecommons.org/licenses/by/1.0/ |
| 1 | CC-BY 2.0 | https://creativecommons.org/licenses/by/2.0/ |
| 2 | CC-BY 2.5 | https://creativecommons.org/licenses/by/2.5/ |
| 3 | CC-BY 3.0 | https://creativecommons.org/licenses/by/3.0/ |
| 4 | CC-BY 4.0 | https://creativecommons.org/licenses/by/4.0/ |
| 5 | CC-BY-NC 2.0 | https://creativecommons.org/licenses/by-nc/2.0/ |
| 6 | CC-BY-NC 2.1 | https://creativecommons.org/licenses/by-nc/2.1/ |
| 7 | CC-BY-NC 2.5 | https://creativecommons.org/licenses/by-nc/2.5/ |
| 8 | CC-BY-NC 3.0 | https://creativecommons.org/licenses/by-nc/3.0/ |
| 9 | CC-BY-NC 4.0 | https://creativecommons.org/licenses/by-nc/4.0/ |
| 10 | CC-BY-NC-ND 2.0 | https://creativecommons.org/licenses/by-nc-nd/2.0/ |
| 11 | CC-BY-NC-ND 2.1 | https://creativecommons.org/licenses/by-nc-nd/2.1/ |
| 12 | CC-BY-NC-ND 2.5 | https://creativecommons.org/licenses/by-nc-nd/2.5/ |
| 13 | CC-BY-NC-ND 3.0 | https://creativecommons.org/licenses/by-nc-nd/3.0/ |
| 14 | CC-BY-NC-ND 4.0 | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
| 15 | CC-BY-NC-SA 2.0 | https://creativecommons.org/licenses/by-nc-sa/2.0/ |
| 16 | CC-BY-NC-SA 2.1 | https://creativecommons.org/licenses/by-nc-sa/2.1/ |
| 17 | CC-BY-NC-SA 2.5 | https://creativecommons.org/licenses/by-nc-sa/2.5/ |
| 18 | CC-BY-NC-SA 3.0 | https://creativecommons.org/licenses/by-nc-sa/3.0/ |
| 19 | CC-BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| 20 | CC-BY-ND 2.0 | https://creativecommons.org/licenses/by-nd/2.0/ |
| 21 | CC-BY-ND 2.5 | https://creativecommons.org/licenses/by-nd/2.5/ |
| 22 | CC-BY-ND 3.0 | https://creativecommons.org/licenses/by-nd/3.0/ |
| 23 | CC-BY-ND 4.0 | https://creativecommons.org/licenses/by-nd/4.0/ |
| 24 | CC-BY-SA 2.0 | https://creativecommons.org/licenses/by-sa/2.0/ |
| 25 | CC-BY-SA 2.5 | https://creativecommons.org/licenses/by-sa/2.5/ |
| 26 | CC-BY-SA 3.0 | https://creativecommons.org/licenses/by-sa/3.0/ |
| 27 | CC-BY-SA 4.0 | https://creativecommons.org/licenses/by-sa/4.0/ |
| 28 | CC-NC-Sampling+ 1.0 | https://creativecommons.org/licenses/nc-sampling+/1.0/ |
| 29 | CC-Sampling+ 1.0 | https://creativecommons.org/licenses/sampling+/1.0/ |
| 30 | FMA Sound Recording Common Law | https://freemusicarchive.org/Sound_Recording_Common_Law |
| 31 | Free Art License | https://artlibre.org/licence/lal/en |
| 32 | Free Music Philosophy (FMP) | https://irdial.com/free_and_easy.htm |

## Total Duration by License

| License | Total Duration (Percentage) |
| ------- | --------------------------- |
| CC-BY-NC-SA 3.0 | 2768.3 hours (34.16%) |
| CC-BY-NC-ND 3.0 | 2296.4 hours (28.34%) |
| CC-BY-NC-ND 4.0 | 1018.4 hours (12.57%) |
| CC-BY-NC-SA 4.0 | 533.2 hours (6.58%) |
| CC-BY 4.0 | 377.0 hours (4.65%) |
| CC-BY-NC 3.0 | 288.9 hours (3.56%) |
| CC-BY-NC 4.0 | 232.6 hours (2.87%) |
| CC-BY 3.0 | 106.9 hours (1.32%) |
| CC-BY-SA 4.0 | 99.4 hours (1.23%) |
| CC-BY-SA 3.0 | 79.7 hours (0.98%) |
| CC-BY-NC-SA 2.0 | 65.1 hours (0.80%) |
| CC-BY-NC-ND 2.0 | 56.2 hours (0.69%) |
| CC-BY-ND 3.0 | 36.8 hours (0.45%) |
| CC-BY-ND 4.0 | 25.0 hours (0.31%) |
| CC-BY-NC-ND 2.5 | 24.2 hours (0.30%) |
| FMA Sound Recording Common Law | 19.9 hours (0.25%) |
| CC-BY-NC-SA 2.5 | 18.0 hours (0.22%) |
| CC-BY-NC 2.5 | 13.3 hours (0.16%) |
| CC0 1.0 | 10.5 hours (0.13%) |
| CC-BY 1.0 | 10.4 hours (0.13%) |
| Free Music Philosophy (FMP) | 4.4 hours (0.05%) |
| Free Art License | 2.7 hours (0.03%) |
| CC-BY 2.0 | 2.5 hours (0.03%) |
| CC-BY-NC 2.1 | 2.4 hours (0.03%) |
| CC-BY-NC-SA 2.1 | 2.3 hours (0.03%) |
| CC-BY-SA 2.0 | 1.9 hours (0.02%) |
| CC-BY-NC 2.0 | 1.6 hours (0.02%) |
| CC-BY-ND 2.5 | 1.6 hours (0.02%) |
| CC-NC-Sampling+ 1.0 | 1.4 hours (0.02%) |
| CC-BY-NC-ND 2.1 | 65.0 minutes (0.01%) |
| CC-Sampling+ 1.0 | 53.9 minutes (0.01%) |
| CC-BY-SA 2.5 | 31.8 minutes (0.01%) |
| CC-BY-ND 2.0 | 29.7 minutes (0.01%) |
| CC-BY 2.5 | 11.2 minutes (0.00%) |

# Citations

```
@inproceedings{fma_dataset,
  title = {{FMA}: A Dataset for Music Analysis},
  author = {Defferrard, Micha\"el and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
  booktitle = {18th International Society for Music Information Retrieval Conference (ISMIR)},
  year = {2017},
  archiveprefix = {arXiv},
  eprint = {1612.01840},
  url = {https://arxiv.org/abs/1612.01840},
}
```

```
@inproceedings{fma_challenge,
  title = {Learning to Recognize Musical Genre from Audio},
  subtitle = {Challenge Overview},
  author = {Defferrard, Micha\"el and Mohanty, Sharada P. and Carroll, Sean F. and Salath\'e, Marcel},
  booktitle = {The 2018 Web Conference Companion},
  year = {2018},
  publisher = {ACM Press},
  isbn = {9781450356404},
  doi = {10.1145/3184558.3192310},
  archiveprefix = {arXiv},
  eprint = {1803.05337},
  url = {https://arxiv.org/abs/1803.05337},
}
```
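The figures quoted in this card can be cross-checked with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the numbers are copied from the Repack Notes and the duration table, and the `NC` check is merely a name-based heuristic. It is not the criterion behind the card's commercially-usable configuration, which is not documented here, and the absence of an `NC` component does not by itself establish that commercial use is permitted.

```python
# Sketch only: helper names are hypothetical; all numbers are copied from
# the text and tables of this card.

ORIGINAL_TRACKS = 106_574      # track count quoted in the FMA paper abstract
UNREADABLE_FILES = 173         # removed: unreadable by libsndfile / libmpg123
UNCLEAR_LICENSE_FILES = 202    # removed: unclear or unavailable license text
TOTAL_HOURS = 8_104            # total audio stated under "This Pack"

# Three rows copied from "Total Duration by License" (hours).
HOURS_BY_LICENSE = {
    "CC-BY-NC-SA 3.0": 2768.3,
    "CC-BY-NC-ND 3.0": 2296.4,
    "CC-BY 4.0": 377.0,
}


def remaining_clips() -> int:
    """Tracks left after subtracting both groups of removed files."""
    return ORIGINAL_TRACKS - UNREADABLE_FILES - UNCLEAR_LICENSE_FILES


def share_of_total(hours: float, total_hours: float = TOTAL_HOURS) -> float:
    """Percentage of the total duration, rounded to two decimals."""
    return round(100 * hours / total_hours, 2)


def has_nc_component(license_name: str) -> bool:
    """True if the license short name contains an NC (NonCommercial) part.

    Assumption: this only inspects the name, e.g. "CC-BY-NC-SA 3.0".
    """
    short_code = license_name.rsplit(" ", 1)[0]  # drop the trailing version
    return "NC" in short_code.split("-")


if __name__ == "__main__":
    print("clips after removals:", remaining_clips())
    for name, hours in HOURS_BY_LICENSE.items():
        flag = "NC" if has_nc_component(name) else "no NC"
        print(f"{name}: {share_of_total(hours):.2f}% of total ({flag})")
```

Under these assumptions, the clip count after both removals matches the 106,199 stated under "This Pack", and the percentages in the duration table are consistent with shares of the 8,104-hour total.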

Provider: maas

Created: 2025-03-18