Reubencf/fma-labeled
Source: Hugging Face. Updated 2026-04-15; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Reubencf/fma-labeled
Resource description:
---
task_categories:
- audio-classification
- text-to-audio
- automatic-speech-recognition
language:
- en
size_categories:
- 10K<n<100K
tags:
- music
- fma
- music-labeling
- genre-classification
- mood-detection
- lyrics
- creative-commons
pretty_name: FMA Labeled (Gemini)
---
# FMA Labeled — Multi-Attribute Music Dataset
> 🏆 **Submitted to the [Uncharted Data Challenge](https://www.adaptionlabs.ai/blog/the-uncharted-data-challenge)
> hosted by [Adaption Labs](https://www.adaptionlabs.ai)**, with credit to
> **Adaptive Data by Adaption** for organizing the hackathon.

A **large-scale labeled music dataset** built on top of the Creative-Commons
subset of the [Free Music Archive (FMA)](https://freemusicarchive.org/). Every
track has been automatically annotated with lyrics, genre, mood, instruments,
tempo, key, and more using **Google Gemini (`gemini-flash-latest`)**.

It is intended for training and evaluating **music tagging**, **genre / mood
classification**, **automatic lyrics transcription**, **music retrieval**, and
**music-text multimodal** models.
## Dataset Summary
- **Total tracks**: 29,275
- **Source**: FMA Creative-Commons (CC BY, CC BY-SA, CC BY-ND, CC0) tracks
- **Average duration**: ~3–4 min per track
- **Labeler**: `gemini-flash-latest` (Flex + Batch tiers)
- **Audio**: referenced by `file_name`; audio files live in the companion
  `dataset/fma_cc/audio/` directory (or fetch from FMA directly via `track_url`)
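
The audio lookup described in the last bullet could be sketched as follows. This is a minimal illustration, not part of the dataset tooling: the helper name `resolve_audio` is hypothetical, and it assumes that `file_name` is relative to the companion `dataset/fma_cc/audio/` directory, which the card does not state explicitly.

```python
from pathlib import Path

# Assumption: `file_name` is relative to the companion audio directory.
AUDIO_ROOT = Path("dataset/fma_cc/audio")


def resolve_audio(row: dict, audio_root: Path = AUDIO_ROOT) -> str:
    """Return the local audio path if the file exists, else the FMA track page URL."""
    local = audio_root / row["file_name"]
    if local.is_file():
        return str(local)
    # Fall back to the FMA track page, from which the audio can be fetched.
    return row["track_url"]
```

If `file_name` already includes the directory prefix in your copy, pass `Path(".")` as `audio_root` instead.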
## Schema
| Field | Type | Description |
|---|---|---|
| `description` | `string` | One-sentence natural-language track description |
| `file_name` | `string` | Relative path to `.mp3` audio file |
| `lyrics` | `string` | Transcribed lyrics (empty if instrumental) |
| `genre` | `string` | Primary predicted genre |
| `has_lyrics` | `bool` | Whether the track contains vocals with lyrics |
| `language` | `string` | Two-letter ISO 639-1 code of the lyrics language (e.g. `en`, `fr`), or `instrumental` |
| `sub_genres` | `list[string]` | Sub-genre tags |
| `mood` | `list[string]` | Mood / emotion tags (e.g. `Uplifting`, `Melancholic`) |
| `instruments` | `list[string]` | Detected instruments |
| `vocal_type` | `string` | e.g. `male clean`, `female clean`, `spoken word`, `none` |
| `bpm` | `int` | Estimated tempo in beats per minute |
| `key` | `string` | Musical key (e.g. `G major`, `D minor`) |
| `time_signature` | `string` | e.g. `4/4`, `3/4`, `free` |
| `energy_level` | `string` | `low`, `medium`, `high` |
| `era_style` | `string` | Temporal / stylistic era (e.g. `modern`, `80s synthwave`) |
| `audio_quality` | `string` | `studio`, `lo-fi`, `live`, `poor` |
| `id` | `string` | FMA track id |
| `title` | `string` | Track title |
| `artist` | `string` | Artist name |
| `artist_url` | `string` | FMA artist page |
| `fma_genres` | `list[string]` | Original FMA genre labels |
| `duration` | `float` | Length in seconds |
| `license` | `string` | e.g. `CC BY`, `CC0 / Public Domain` |
| `license_url` | `string` | Link to license terms |
| `track_url` | `string` | FMA track page |
| `label_seconds` | `float` | Time in seconds Gemini took to label this track |
| `label_model` | `string` | Labeling model id |
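
Because the labels are model-generated, it can be useful to check rows against the schema before training. The sketch below transcribes the table into a Python mapping; the names `SCHEMA` and `schema_problems` are hypothetical, and the mapping of `list[string]` to `list` and `float` to "int or float" is an assumption about how rows are decoded.

```python
# Expected Python types per field, transcribed from the schema table above.
SCHEMA = {
    "description": str, "file_name": str, "lyrics": str, "genre": str,
    "has_lyrics": bool, "language": str, "sub_genres": list, "mood": list,
    "instruments": list, "vocal_type": str, "bpm": int, "key": str,
    "time_signature": str, "energy_level": str, "era_style": str,
    "audio_quality": str, "id": str, "title": str, "artist": str,
    "artist_url": str, "fma_genres": list, "duration": float,
    "license": str, "license_url": str, "track_url": str,
    "label_seconds": float, "label_model": str,
}


def _matches(value, expected) -> bool:
    # bool is a subclass of int in Python, so handle it first.
    if isinstance(value, bool):
        return expected is bool
    if expected is float:
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def schema_problems(row: dict) -> list[str]:
    """Return a list of human-readable schema violations for one row."""
    problems = []
    for field, expected in SCHEMA.items():
        if field not in row:
            problems.append(f"missing: {field}")
        elif not _matches(row[field], expected):
            got = type(row[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {got}")
    return problems
```

An empty list means the row has every field with the expected type; null values are reported as type mismatches.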
## Label Statistics
### Top Genres
| Genre | Tracks |
|---|---|
| Electronic | 5,545 |
| Avant-Garde | 1,874 |
| Experimental | 1,820 |
| Rock | 1,552 |
| Hip-Hop | 1,392 |
| Ambient | 1,251 |
| Folk | 1,237 |
| Pop | 1,142 |
| Classical | 1,139 |
| Soundtrack | 933 |
### Lyrics Language
| Language | Tracks |
|---|---|
| instrumental | 19,975 |
| en | 7,287 |
| fr | 423 |
| si | 277 |
| ru | 255 |
| es | 175 |
| la | 146 |
| de | 98 |
| pt | 72 |
| it | 70 |
### Vocal Type
| Vocal | Tracks |
|---|---|
| none | 19,300 |
| male clean | 3,935 |
| female clean | 725 |
| spoken word | 724 |
| male raspy | 295 |
### Energy Level
| Level | Tracks |
|---|---|
| high | 11,062 |
| medium | 9,877 |
| low | 8,320 |
### Audio Quality
| Quality | Tracks |
|---|---|
| studio | 25,060 |
| lo-fi | 3,817 |
| live | 247 |
| poor | 126 |
### Licenses
| License | Tracks |
|---|---|
| CC BY | 15,476 |
| CC0 / Public Domain | 7,098 |
| CC BY-SA | 3,766 |
| CC BY-ND | 2,568 |
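
Counts like those in the tables above can be recomputed from the rows. The following is a sketch under the assumption that rows behave like dictionaries keyed by the schema field names; `label_counts` is a hypothetical helper, not something shipped with the dataset.

```python
from collections import Counter


def label_counts(rows, field):
    """Count values of a scalar or list-valued field across rows."""
    counts = Counter()
    for row in rows:
        value = row[field]
        if isinstance(value, list):
            # List-valued fields (e.g. `mood`) contribute one count per tag.
            counts.update(value)
        else:
            counts[value] += 1
    return counts
```

For example, `label_counts(ds, "genre").most_common(10)` would list the ten most frequent genres for a dataset `ds` loaded as shown in the Loading section.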
## Loading
```python
from collections import Counter

from datasets import load_dataset

# Load the label table from the Parquet file shipped with the dataset.
ds = load_dataset("parquet", data_files="labels.parquet", split="train")

# Vocal tracks with English lyrics
eng_vocals = ds.filter(lambda r: r["has_lyrics"] and r["language"] == "en")

# All high-energy electronic tracks
rave = ds.filter(lambda r: r["genre"] == "Electronic" and r["energy_level"] == "high")

# Full BPM histogram (reads only the `bpm` column)
print(Counter(ds["bpm"]))
```
## Generation Pipeline
1. **Source selection**: filtered FMA to CC-licensed tracks only (~31k).
2. **Labeling**: audio was uploaded to the Gemini Files API, and
   `gemini-flash-latest` was called with a structured JSON schema covering
   lyrics, genre, mood, instruments, BPM, key, etc.
3. **Cost optimization**: the 50%-off **Flex tier** was used for streaming
   requests; the remainder was processed via the **Batch API** (50% off, async).
4. **Output**: rows were merged into `labels.jsonl` and `labels.parquet`;
   requests that failed with a 503 or a JSON-decode error were retried up to
   3 times.
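
The retry behavior in the last step might look roughly like the sketch below. It is not the author's pipeline code: `call_model` is a hypothetical callable that returns the raw JSON string for one track, `ServiceUnavailable` is a stand-in for however the client surfaces an HTTP 503, the exponential backoff is an added assumption, and "up to 3 times" is read here as one initial attempt plus three retries.

```python
import json
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for an HTTP 503 raised by the API client."""


def label_with_retries(call_model, max_retries=3, backoff_s=1.0, sleep=time.sleep):
    """Call `call_model()` and parse its JSON, retrying on 503 or decode errors."""
    for attempt in range(max_retries + 1):
        try:
            return json.loads(call_model())
        except (ServiceUnavailable, json.JSONDecodeError):
            if attempt == max_retries:
                raise  # Out of retries: surface the last error to the caller.
            # Assumed policy: wait 1s, 2s, 4s, ... between attempts.
            sleep(backoff_s * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.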
## Intended Uses
- Training **music tag / genre / mood classifiers** with rich supervision.
- **Auto-lyrics / ASR for music**: paired audio and transcribed lyrics in 10+ languages.
- **Music retrieval / recommendation**: filter by tempo, key, mood, instruments.
- **Music-text multimodal LMs**: the `description` field provides a
  natural-language caption per track.
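
As an illustration of the retrieval use case, the sketch below filters rows by tempo, key, mood, and instruments. `find_tracks` is a hypothetical helper; it assumes rows are dictionaries with non-null `bpm`, `key`, `mood`, and `instruments` values, and it compares tags case-insensitively, which is a design choice rather than something the dataset requires.

```python
def find_tracks(rows, bpm_range=None, key=None, moods=(), instruments=()):
    """Return rows matching every given criterion; unset criteria are ignored."""
    wanted_moods = {m.lower() for m in moods}
    wanted_instruments = {i.lower() for i in instruments}
    matches = []
    for row in rows:
        if bpm_range is not None and not (bpm_range[0] <= row["bpm"] <= bpm_range[1]):
            continue
        if key is not None and row["key"] != key:
            continue
        # Set containment: every requested tag must be present on the track.
        if not wanted_moods <= {m.lower() for m in row["mood"]}:
            continue
        if not wanted_instruments <= {i.lower() for i in row["instruments"]}:
            continue
        matches.append(row)
    return matches
```

A call such as `find_tracks(ds, bpm_range=(110, 130), key="G major", moods=["Uplifting"])` would return uplifting tracks labeled G major at around 120 BPM.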
## Limitations
- **Labels are model-generated**: they are Gemini `Flash`-class output, not
  human annotations, so expect noise.
- **Long tail of small languages**: non-English lyric languages have few
  samples each; useful for probing but thin for training.
- **Instrumental bias**: 68% of tracks (19,975 of 29,275) are labeled
  `instrumental`, so the `has_lyrics` filter is important for lyrics-centric work.
- **BPM / key estimation** is produced by Gemini itself and is not guaranteed
  to be accurate; use a dedicated beat tracker for rhythm-critical tasks.
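
One way to act on the BPM caveat is to compare the labeled tempo against a reference from a dedicated beat tracker. The sketch below only covers the comparison; `tempo_agrees` is a hypothetical helper, the reference tempo is assumed to come from a tool of your choice, and the 4% tolerance and the acceptance of half or double tempo are illustrative defaults rather than properties of the dataset.

```python
def tempo_agrees(label_bpm, reference_bpm, tolerance=0.04, allow_octave=True):
    """True if label_bpm is within `tolerance` of the reference tempo.

    With allow_octave=True, half and double the reference also count,
    since tempo estimators often disagree by a factor of two.
    """
    if label_bpm <= 0 or reference_bpm <= 0:
        return False
    factors = (1.0, 0.5, 2.0) if allow_octave else (1.0,)
    return any(
        abs(label_bpm - reference_bpm * f) <= tolerance * reference_bpm * f
        for f in factors
    )
```

Rows where the check fails are candidates for re-estimating `bpm` before rhythm-critical use.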
## License
Per-track license is stored in the `license` field and follows the original
FMA release (CC BY / CC BY-SA / CC BY-ND / CC0). The label metadata itself
is released under CC0 and can be reused freely.
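
Since licenses vary per track, downstream code may want to filter on the `license` field. The sketch below assumes the license strings match the values in the Licenses table exactly; `ALLOWS_DERIVATIVES`, `allows_derivatives`, and `attribution_line` are hypothetical names. It encodes only the general rule that CC BY-ND does not permit sharing adapted versions; whether a particular use counts as an adaptation is outside its scope.

```python
# Assumption: license strings match the values listed in the Licenses table.
ALLOWS_DERIVATIVES = {"CC BY", "CC BY-SA", "CC0 / Public Domain"}


def allows_derivatives(row: dict) -> bool:
    """True if the track's license permits sharing adapted versions."""
    return row["license"] in ALLOWS_DERIVATIVES


def attribution_line(row: dict) -> str:
    """Build a simple credit string from the row's metadata fields."""
    return f'"{row["title"]}" by {row["artist"]}, {row["license"]} ({row["license_url"]})'
```

With a loaded dataset `ds`, `ds.filter(allows_derivatives)` would keep only tracks whose audio may be remixed or otherwise adapted and shared.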
## Citation
```
@dataset{fma_labeled_gemini_2026,
title = {FMA Labeled — Multi-Attribute Music Dataset (Gemini)},
author = {Fernandes, Reuben},
year = {2026},
note = {Labels generated with gemini-flash-latest on the Creative-Commons subset of the Free Music Archive}
}
```
Also cite the original FMA release:
```
@inproceedings{defferrard2017fma,
title = {FMA: A Dataset For Music Analysis},
author = {Defferrard, Michaël and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
booktitle = {ISMIR},
year = {2017}
}
```
Provider:
Reubencf



