FMA Dataset

Name: FMA Dataset
Creator: Papers with Code
License: No description available

Indexed from paperswithcode.com on 2025-03-22

Download link:

https://paperswithcode.com/dataset/fma

Description:

The Free Music Archive (FMA) is a large-scale dataset for evaluating several tasks in Music Information Retrieval. It consists of 343 days of audio from 106,574 tracks by 16,341 artists on 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length, high-quality audio and pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. The authors define four subsets:

- Full: the complete dataset.
- Large: the full dataset with audio limited to 30-second clips extracted from the middle of each track (or the entire track if it is shorter than 30 seconds).
- Medium: a selection of 25,000 30-second clips, each having a single root genre.
- Small: a balanced subset of 8,000 30-second clips, with 1,000 clips for each of 8 root genres.

The official split into training, validation, and test sets (80/10/10) uses stratified sampling to preserve the percentage of tracks per genre. All songs by a given artist are placed in one set only.
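
The two constraints on the official split (a roughly 80/10/10 division that preserves genre proportions, with each artist confined to one set) can be illustrated with a small sketch. This is not the authors' procedure, which the description does not spell out. It is one simple greedy approach under stated assumptions: the `(track_id, artist_id, genre)` tuple layout, the function name `artist_disjoint_split`, and the set names are all hypothetical, and each artist is treated as belonging to their most frequent genre.

```python
import random
from collections import Counter, defaultdict


def artist_disjoint_split(tracks, ratios=(0.8, 0.1, 0.1), seed=0):
    """Map each track_id to 'training', 'validation' or 'test'.

    tracks: list of (track_id, artist_id, genre) tuples (hypothetical layout).
    All tracks of one artist land in the same set. Artists are dealt out
    genre by genre so that each set keeps roughly the same genre mix.
    """
    names = ("training", "validation", "test")
    rng = random.Random(seed)

    # Group tracks by artist.
    by_artist = defaultdict(list)
    for track_id, artist_id, genre in tracks:
        by_artist[artist_id].append((track_id, genre))

    # Assumption: an artist is filed under their most frequent genre.
    artists_by_genre = defaultdict(list)
    for artist_id, items in by_artist.items():
        main_genre = Counter(g for _, g in items).most_common(1)[0][0]
        artists_by_genre[main_genre].append(artist_id)

    assignment = {}
    for genre in sorted(artists_by_genre):
        artists = sorted(artists_by_genre[genre])
        rng.shuffle(artists)
        counts = [0, 0, 0]  # tracks placed so far per set, within this genre
        for artist_id in artists:
            placed = sum(counts)
            # Send the artist to the set furthest below its target share.
            deficits = [ratios[i] * (placed + 1) - counts[i] for i in range(3)]
            target = deficits.index(max(deficits))
            for track_id, _ in by_artist[artist_id]:
                assignment[track_id] = names[target]
            counts[target] += len(by_artist[artist_id])
    return assignment


# Usage on a tiny made-up catalogue: artist 10 has two tracks, which stay together.
toy = [(1, 10, "Rock"), (2, 10, "Rock"), (3, 11, "Rock"), (4, 12, "Jazz")]
split = artist_disjoint_split(toy)
```

Because whole artists are moved at once, the resulting proportions are only approximate, and the deviation shrinks as the number of artists per genre grows.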

Provided by:

Papers with Code
