
free-music-archive-large

ModelScope community, updated 2025-10-09, indexed 2025-03-22
Download link:
https://modelscope.cn/datasets/benjamin-paine/free-music-archive-large
Resource description:
# FMA: A Dataset for Music Analysis

[Michaël Defferrard](https://deff.ch/), [Kirell Benzi](https://kirellbenzi.com/), [Pierre Vandergheynst](https://people.epfl.ch/pierre.vandergheynst), [Xavier Bresson](https://www.ntu.edu.sg/home/xbresson).
**International Society for Music Information Retrieval Conference (ISMIR), 2017.**

> We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.

- Paper: [arXiv:1612.01840](https://arxiv.org/abs/1612.01840) - [latex and reviews](https://github.com/mdeff/paper-fma-ismir2017)
- Slides: [doi:10.5281/zenodo.1066119](https://doi.org/10.5281/zenodo.1066119)
- Poster: [doi:10.5281/zenodo.1035847](https://doi.org/10.5281/zenodo.1035847)

# This Pack

This is the **large** dataset, comprising a total of **105,024 samples** clipped at **30 seconds** over **16** *unbalanced* genres, totaling **869.2 hours** of audio.

## Repack Notes

- 173 files were unreadable by `libsndfile / libmpg123`; these were removed.
- 1377 files had licenses that were unclear on whether or not they permitted redistribution, or the full license text was unavailable. These were removed.

# License

- The [FMA codebase](https://github.com/mdeff/fma) is released under [The MIT License](https://github.com/mdeff/fma/blob/master/LICENSE.txt).
- The FMA metadata is released under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0).
- The individual files are released under various Creative Commons family licenses, with a small number of additional licenses. **Each file has its license attached and important details of the license enumerated.** To make the dataset easier for developers and trainers to use, a configuration is available that restricts it to commercially usable data.

Please refer to any of the following URLs for additional details.

| Class Label | License Name | URL |
| ----------- | ------------ | --- |
| 0 | CC-BY 1.0 | https://creativecommons.org/licenses/by/1.0/ |
| 1 | CC-BY 2.0 | https://creativecommons.org/licenses/by/2.0/ |
| 2 | CC-BY 2.5 | https://creativecommons.org/licenses/by/2.5/ |
| 3 | CC-BY 3.0 | https://creativecommons.org/licenses/by/3.0/ |
| 4 | CC-BY 4.0 | https://creativecommons.org/licenses/by/4.0/ |
| 5 | CC-BY-NC 2.0 | https://creativecommons.org/licenses/by-nc/2.0/ |
| 6 | CC-BY-NC 2.1 | https://creativecommons.org/licenses/by-nc/2.1/ |
| 7 | CC-BY-NC 2.5 | https://creativecommons.org/licenses/by-nc/2.5/ |
| 8 | CC-BY-NC 3.0 | https://creativecommons.org/licenses/by-nc/3.0/ |
| 9 | CC-BY-NC 4.0 | https://creativecommons.org/licenses/by-nc/4.0/ |
| 10 | CC-BY-NC-ND 2.0 | https://creativecommons.org/licenses/by-nc-nd/2.0/ |
| 11 | CC-BY-NC-ND 2.1 | https://creativecommons.org/licenses/by-nc-nd/2.1/ |
| 12 | CC-BY-NC-ND 2.5 | https://creativecommons.org/licenses/by-nc-nd/2.5/ |
| 13 | CC-BY-NC-ND 3.0 | https://creativecommons.org/licenses/by-nc-nd/3.0/ |
| 14 | CC-BY-NC-ND 4.0 | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
| 15 | CC-BY-NC-SA 2.0 | https://creativecommons.org/licenses/by-nc-sa/2.0/ |
| 16 | CC-BY-NC-SA 2.1 | https://creativecommons.org/licenses/by-nc-sa/2.1/ |
| 17 | CC-BY-NC-SA 2.5 | https://creativecommons.org/licenses/by-nc-sa/2.5/ |
| 18 | CC-BY-NC-SA 3.0 | https://creativecommons.org/licenses/by-nc-sa/3.0/ |
| 19 | CC-BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| 20 | CC-BY-ND 2.0 | https://creativecommons.org/licenses/by-nd/2.0/ |
| 21 | CC-BY-ND 2.5 | https://creativecommons.org/licenses/by-nd/2.5/ |
| 22 | CC-BY-ND 3.0 | https://creativecommons.org/licenses/by-nd/3.0/ |
| 23 | CC-BY-ND 4.0 | https://creativecommons.org/licenses/by-nd/4.0/ |
| 24 | CC-BY-SA 2.0 | https://creativecommons.org/licenses/by-sa/2.0/ |
| 25 | CC-BY-SA 2.5 | https://creativecommons.org/licenses/by-sa/2.5/ |
| 26 | CC-BY-SA 3.0 | https://creativecommons.org/licenses/by-sa/3.0/ |
| 27 | CC-BY-SA 4.0 | https://creativecommons.org/licenses/by-sa/4.0/ |
| 28 | CC-NC-Sampling+ 1.0 | https://creativecommons.org/licenses/nc-sampling+/1.0/ |
| 29 | CC-Sampling+ 1.0 | https://creativecommons.org/licenses/sampling+/1.0/ |
| 30 | FMA Sound Recording Common Law | https://freemusicarchive.org/Sound_Recording_Common_Law |
| 31 | Free Art License | https://artlibre.org/licence/lal/en |
| 32 | Free Music Philosophy (FMP) | https://irdial.com/free_and_easy.htm |

## Total Duration by License

| License | Total Duration (Percentage) |
| ------- | --------------------------- |
| CC-BY-NC-SA 3.0 | 291.6 hours (33.55%) |
| CC-BY-NC-ND 3.0 | 237.3 hours (27.30%) |
| CC-BY-NC-ND 4.0 | 100.6 hours (11.57%) |
| CC-BY-NC-SA 4.0 | 57.0 hours (6.56%) |
| CC-BY 4.0 | 41.3 hours (4.75%) |
| CC-BY-NC 3.0 | 38.8 hours (4.47%) |
| CC-BY-NC 4.0 | 28.1 hours (3.23%) |
| CC-BY 3.0 | 15.3 hours (1.76%) |
| CC-BY-SA 4.0 | 12.9 hours (1.48%) |
| CC-BY-SA 3.0 | 9.8 hours (1.13%) |
| CC-BY-NC-SA 2.0 | 7.0 hours (0.81%) |
| CC-BY-NC-ND 2.0 | 6.4 hours (0.74%) |
| CC-BY-ND 3.0 | 4.5 hours (0.52%) |
| FMA Sound Recording Common Law | 3.5 hours (0.40%) |
| CC-BY-ND 4.0 | 3.3 hours (0.38%) |
| CC-BY-NC-ND 2.5 | 2.9 hours (0.33%) |
| CC-BY-NC-SA 2.5 | 2.2 hours (0.25%) |
| CC0 1.0 | 1.3 hours (0.15%) |
| CC-BY-NC 2.5 | 1.2 hours (0.14%) |
| Free Music Philosophy (FMP) | 1.1 hours (0.13%) |
| CC-BY 1.0 | 52.0 minutes (0.10%) |
| CC-BY-SA 2.0 | 24.0 minutes (0.05%) |
| CC-BY 2.0 | 16.5 minutes (0.03%) |
| CC-BY-NC-SA 2.1 | 15.0 minutes (0.03%) |
| CC-BY-NC 2.1 | 15.0 minutes (0.03%) |
| CC-BY-NC 2.0 | 12.0 minutes (0.02%) |
| CC-NC-Sampling+ 1.0 | 9.5 minutes (0.02%) |
| CC-BY-NC-ND 2.1 | 8.5 minutes (0.02%) |
| Free Art License | 7.8 minutes (0.02%) |
| CC-Sampling+ 1.0 | 7.0 minutes (0.01%) |
| CC-BY-ND 2.5 | 5.7 minutes (0.01%) |
| CC-BY-SA 2.5 | 4.5 minutes (0.01%) |
| CC-BY-ND 2.0 | 3.5 minutes (0.01%) |
| CC-BY 2.5 | 1.0 minutes (0.00%) |

# Citations

```
@inproceedings{fma_dataset,
  title = {{FMA}: A Dataset for Music Analysis},
  author = {Defferrard, Micha\"el and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
  booktitle = {18th International Society for Music Information Retrieval Conference (ISMIR)},
  year = {2017},
  archiveprefix = {arXiv},
  eprint = {1612.01840},
  url = {https://arxiv.org/abs/1612.01840},
}
```

```
@inproceedings{fma_challenge,
  title = {Learning to Recognize Musical Genre from Audio},
  subtitle = {Challenge Overview},
  author = {Defferrard, Micha\"el and Mohanty, Sharada P. and Carroll, Sean F. and Salath\'e, Marcel},
  booktitle = {The 2018 Web Conference Companion},
  year = {2018},
  publisher = {ACM Press},
  isbn = {9781450356404},
  doi = {10.1145/3184558.3192310},
  archiveprefix = {arXiv},
  eprint = {1803.05337},
  url = {https://arxiv.org/abs/1803.05337},
}
```
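
The license table above maps integer class labels to license names, and the card mentions a configuration limited to commercially usable data without describing its rule. The following is a minimal sketch of how such a filter could be derived from the table. The rule itself is an assumption, not the dataset's actual configuration: only Creative Commons licenses made up solely of the BY, SA, and ND clauses are kept, and the Sampling+ and non-CC licenses are conservatively excluded. The names `LICENSES`, `ALLOWED_CLAUSES`, `allows_commercial_use`, and `COMMERCIAL_LABELS` are hypothetical.

```python
# Class labels and license names copied from the license table in the card.
LICENSES = {
    0: "CC-BY 1.0", 1: "CC-BY 2.0", 2: "CC-BY 2.5", 3: "CC-BY 3.0", 4: "CC-BY 4.0",
    5: "CC-BY-NC 2.0", 6: "CC-BY-NC 2.1", 7: "CC-BY-NC 2.5",
    8: "CC-BY-NC 3.0", 9: "CC-BY-NC 4.0",
    10: "CC-BY-NC-ND 2.0", 11: "CC-BY-NC-ND 2.1", 12: "CC-BY-NC-ND 2.5",
    13: "CC-BY-NC-ND 3.0", 14: "CC-BY-NC-ND 4.0",
    15: "CC-BY-NC-SA 2.0", 16: "CC-BY-NC-SA 2.1", 17: "CC-BY-NC-SA 2.5",
    18: "CC-BY-NC-SA 3.0", 19: "CC-BY-NC-SA 4.0",
    20: "CC-BY-ND 2.0", 21: "CC-BY-ND 2.5", 22: "CC-BY-ND 3.0", 23: "CC-BY-ND 4.0",
    24: "CC-BY-SA 2.0", 25: "CC-BY-SA 2.5", 26: "CC-BY-SA 3.0", 27: "CC-BY-SA 4.0",
    28: "CC-NC-Sampling+ 1.0", 29: "CC-Sampling+ 1.0",
    30: "FMA Sound Recording Common Law",
    31: "Free Art License",
    32: "Free Music Philosophy (FMP)",
}

# ASSUMPTION: clauses treated as compatible with commercial use.
# Anything else (NC, Sampling+, non-CC licenses) is excluded to stay conservative.
ALLOWED_CLAUSES = {"CC", "BY", "SA", "ND"}


def allows_commercial_use(license_name: str) -> bool:
    """Return True if every clause in the license family is in ALLOWED_CLAUSES."""
    family = license_name.rsplit(" ", 1)[0]  # drop the trailing version, e.g. "4.0"
    return set(family.split("-")) <= ALLOWED_CLAUSES


# Class labels that pass the assumed rule.
COMMERCIAL_LABELS = sorted(
    label for label, name in LICENSES.items() if allows_commercial_use(name)
)

if __name__ == "__main__":
    for label in COMMERCIAL_LABELS:
        print(label, LICENSES[label])
```

Under this assumed rule the retained labels are the CC-BY, CC-BY-ND, and CC-BY-SA rows of the table. Whether a given license fits a particular use still depends on the license text attached to each file.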

Provider:
maas
Created:
2025-03-18
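Dataset introduction
Background overview
This dataset is the large pack of the FMA music analysis dataset. It contains 105,024 audio samples clipped at 30 seconds across 16 unbalanced genres, for a total of about 869.2 hours. Most of the audio is released under Creative Commons family licenses, with a small number of files under other licenses, and the dataset is suited to music information retrieval tasks.
The summary above was collected and generated by 遇见数据集.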
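
The figures quoted in the card can be cross-checked with a little arithmetic. The sketch below uses only numbers stated in the card; the variable names are illustrative. It checks that the original track count minus the two groups of removed files gives the pack's sample count, and compares the stated total duration with the upper bound implied by 30-second clips.

```python
# Figures quoted in the dataset card.
TRACKS_IN_FULL_FMA = 106_574     # tracks reported in the paper abstract
UNREADABLE_FILES = 173           # removed: unreadable by libsndfile / libmpg123
UNCLEAR_LICENSE_FILES = 1_377    # removed: unclear or unavailable license
SAMPLES_IN_PACK = 105_024        # samples in this pack
CLIP_SECONDS = 30                # clips are cut at 30 seconds
STATED_HOURS = 869.2             # total duration stated for this pack
TOP_LICENSE_HOURS = 291.6        # CC-BY-NC-SA 3.0 row of the duration table

# Tracks left after both removals.
remaining = TRACKS_IN_FULL_FMA - UNREADABLE_FILES - UNCLEAR_LICENSE_FILES

# Upper bound on total duration if every clip ran the full 30 seconds.
max_hours = SAMPLES_IN_PACK * CLIP_SECONDS / 3600

# Average clip length implied by the stated total.
mean_clip_seconds = STATED_HOURS * 3600 / SAMPLES_IN_PACK

# Share of the stated total held by the largest license group.
top_share = TOP_LICENSE_HOURS / STATED_HOURS * 100

if __name__ == "__main__":
    print("remaining tracks match sample count:", remaining == SAMPLES_IN_PACK)
    print(f"upper bound: {max_hours:.1f} h, stated: {STATED_HOURS} h")
    print(f"implied mean clip length: {mean_clip_seconds:.2f} s")
    print(f"CC-BY-NC-SA 3.0 share: {top_share:.2f}%")
```

The subtraction reproduces 105,024 exactly, and the 30-second upper bound is 875.2 hours, about 6 hours above the stated 869.2. That gap would be consistent with some clips being shorter than 30 seconds, but the card does not say so; this is only a guess.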