
Reubencf/fma-labeled

Hugging Face, updated 2026-04-15, indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/Reubencf/fma-labeled
Dataset description:
---
task_categories:
- audio-classification
- text-to-audio
- automatic-speech-recognition
language:
- en
size_categories:
- 10K<n<100K
tags:
- music
- fma
- music-labeling
- genre-classification
- mood-detection
- lyrics
- creative-commons
pretty_name: FMA Labeled (Gemini)
---

# FMA Labeled — Multi-Attribute Music Dataset

> 🏆 **Submitted to the [Uncharted Data Challenge](https://www.adaptionlabs.ai/blog/the-uncharted-data-challenge)
> hosted by [Adaption Labs](https://www.adaptionlabs.ai)**; credit to
> **Adaptive Data by Adaption** for organizing the hackathon.

A **large-scale labeled music dataset** built on top of the Creative-Commons subset of the [Free Music Archive (FMA)](https://freemusicarchive.org/). Every track has been automatically annotated with lyrics, genre, mood, instruments, tempo, key, and more using **Google Gemini (`gemini-flash-latest`)**.

Intended for training and evaluating **music tagging**, **genre / mood classification**, **auto-lyrics transcription**, **music retrieval**, and **music-text multimodal** models.

## Dataset Summary

- **Total tracks**: 29,275
- **Source**: FMA Creative-Commons (CC BY, CC BY-SA, CC BY-ND, CC0) tracks
- **Average duration**: ~3–4 min per track
- **Labeler**: `gemini-flash-latest` (Flex + Batch tiers)
- **Audio**: referenced by `file_name`; audio files live in the companion `dataset/fma_cc/audio/` directory (or fetch from FMA directly via `track_url`)

## Schema

| Field | Type | Description |
|---|---|---|
| `description` | `string` | One-sentence natural-language track description |
| `file_name` | `string` | Relative path to `.mp3` audio file |
| `lyrics` | `string` | Transcribed lyrics (empty if instrumental) |
| `genre` | `string` | Primary predicted genre |
| `has_lyrics` | `bool` | Whether the track contains vocals with lyrics |
| `language` | `string` | ISO code of the lyrics, or `instrumental` |
| `sub_genres` | `list[string]` | Sub-genre tags |
| `mood` | `list[string]` | Mood / emotion tags (e.g. `Uplifting`, `Melancholic`) |
| `instruments` | `list[string]` | Detected instruments |
| `vocal_type` | `string` | e.g. `male clean`, `female clean`, `spoken word`, `none` |
| `bpm` | `int` | Estimated tempo |
| `key` | `string` | Musical key (e.g. `G major`, `D minor`) |
| `time_signature` | `string` | e.g. `4/4`, `3/4`, `free` |
| `energy_level` | `string` | `low`, `medium`, `high` |
| `era_style` | `string` | Temporal / stylistic era (e.g. `modern`, `80s synthwave`) |
| `audio_quality` | `string` | `studio`, `lo-fi`, `live`, `poor` |
| `id` | `string` | FMA track id |
| `title` | `string` | Track title |
| `artist` | `string` | Artist name |
| `artist_url` | `string` | FMA artist page |
| `fma_genres` | `list[string]` | Original FMA genre labels |
| `duration` | `float` | Length in seconds |
| `license` | `string` | e.g. `CC BY`, `CC0 / Public Domain` |
| `license_url` | `string` | Link to license terms |
| `track_url` | `string` | FMA track page |
| `label_seconds` | `float` | Time Gemini took to label this track |
| `label_model` | `string` | Labeling model id |

## Label Statistics

### Top Genres

| Genre | Tracks |
|---|---|
| Electronic | 5,545 |
| Avant-Garde | 1,874 |
| Experimental | 1,820 |
| Rock | 1,552 |
| Hip-Hop | 1,392 |
| Ambient | 1,251 |
| Folk | 1,237 |
| Pop | 1,142 |
| Classical | 1,139 |
| Soundtrack | 933 |

### Lyrics Language

| Language | Tracks |
|---|---|
| instrumental | 19,975 |
| en | 7,287 |
| fr | 423 |
| si | 277 |
| ru | 255 |
| es | 175 |
| la | 146 |
| de | 98 |
| pt | 72 |
| it | 70 |

### Vocal Type

| Vocal | Tracks |
|---|---|
| none | 19,300 |
| male clean | 3,935 |
| female clean | 725 |
| spoken word | 724 |
| male raspy | 295 |

### Energy Level

| Level | Tracks |
|---|---|
| high | 11,062 |
| medium | 9,877 |
| low | 8,320 |

### Audio Quality

| Quality | Tracks |
|---|---|
| studio | 25,060 |
| lo-fi | 3,817 |
| live | 247 |
| poor | 126 |

### Licenses

| License | Tracks |
|---|---|
| CC BY | 15,476 |
| CC0 / Public Domain | 7,098 |
| CC BY-SA | 3,766 |
| CC BY-ND | 2,568 |

## Loading

```python
import collections

from datasets import load_dataset

ds = load_dataset("parquet", data_files="labels.parquet", split="train")

# Filter vocal tracks in English
eng_vocals = ds.filter(lambda r: r["has_lyrics"] and r["language"] == "en")

# All high-energy electronic tracks
rave = ds.filter(lambda r: r["genre"] == "Electronic" and r["energy_level"] == "high")

# Full BPM histogram
print(collections.Counter(r["bpm"] for r in ds))
```

## Generation Pipeline

1. **Source selection**: filtered FMA to CC-licensed tracks only (~31k).
2. **Labeling**: audio uploaded to Gemini Files API; `gemini-flash-latest` called with a structured JSON schema covering lyrics, genre, mood, instruments, BPM, key, etc.
3. **Cost optimization**: 50%-off **Flex tier** for streaming requests; remainder processed via **Batch API** (50% off, async).
4. **Output**: rows merged into `labels.jsonl` and `labels.parquet`; retries on 503 / JSON-decode failures up to 3 times.

## Intended Uses

- Training **music tag / genre / mood classifiers** with rich supervision.
- **Auto-lyrics / ASR for music**: paired audio + transcribed lyrics in 10+ languages.
- **Music retrieval / recommendation**: filter by tempo, key, mood, instruments.
- **Music-text multimodal LMs**: description field provides natural-language captions per track.

## Limitations

- **Labels are model-generated**: expect noise. Gemini `Flash`-class output, not human-annotated.
- **Long tail of small languages**: non-English lyric languages have few samples each; useful for probing but thin for training.
- **Instrumental bias**: 68% of tracks are labeled `instrumental`, so the `has_lyrics` filter is important for lyrics-centric work.
- **BPM / key estimation** is derived from the acoustic model inside Gemini and is not guaranteed tempo-accurate; use a dedicated beat-tracker for rhythm-critical tasks.

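Because the labels are model-generated, a quick consistency pass over the fields can surface some of the noise mentioned in the limitations above. The following is a minimal sketch, not part of the dataset's tooling: the function names, the BPM bounds, and the sample rows are all hypothetical, and the checks only compare fields from the schema against each other.

```python
from collections import Counter

# Hypothetical plausibility bounds for tempo; tune them for your own use case.
MIN_BPM, MAX_BPM = 30, 300


def audit_labels(rows):
    """Count rows whose model-generated labels contradict each other."""
    issues = Counter()
    for r in rows:
        lyrics = (r.get("lyrics") or "").strip()
        bpm = r.get("bpm")
        # A track flagged as having lyrics should not be tagged instrumental.
        if r["has_lyrics"] and r["language"] == "instrumental":
            issues["has_lyrics_but_instrumental"] += 1
        # Transcribed text on a track flagged as having no lyrics is suspicious.
        if not r["has_lyrics"] and lyrics:
            issues["lyrics_without_has_lyrics"] += 1
        # Missing or out-of-range tempo estimates.
        if bpm is None or not MIN_BPM <= bpm <= MAX_BPM:
            issues["implausible_bpm"] += 1
    return issues


def instrumental_share(rows):
    """Fraction of rows whose `language` label is `instrumental`."""
    rows = list(rows)
    if not rows:
        return 0.0
    return sum(r["language"] == "instrumental" for r in rows) / len(rows)


# Made-up rows for illustration only; they are not taken from the dataset.
sample = [
    {"has_lyrics": False, "language": "instrumental", "lyrics": "", "bpm": 120},
    {"has_lyrics": True, "language": "en", "lyrics": "la la la", "bpm": 96},
    {"has_lyrics": True, "language": "instrumental", "lyrics": "", "bpm": 0},
    {"has_lyrics": False, "language": "instrumental", "lyrics": "hey", "bpm": 140},
]

print(dict(audit_labels(sample)))
print(instrumental_share(sample))
```

Both functions accept any iterable of dict-like rows, so they should also work on the `ds` object from the Loading section, since iterating a `datasets.Dataset` yields one dict per row.
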
## License

Per-track license is stored in the `license` field and follows the original FMA release (CC BY / CC BY-SA / CC BY-ND / CC0). The label metadata itself is released under CC0 and may be reused freely.

## Citation

```
@dataset{fma_labeled_gemini_2026,
  title  = {FMA Labeled — Multi-Attribute Music Dataset (Gemini)},
  author = {Fernandes, Reuben},
  year   = {2026},
  note   = {Labels generated with gemini-flash-latest on the Creative-Commons subset of the Free Music Archive}
}
```

Also cite the original FMA release:

```
@inproceedings{defferrard2017fma,
  title     = {FMA: A Dataset For Music Analysis},
  author    = {Defferrard, Michaël and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
  booktitle = {ISMIR},
  year      = {2017}
}
```
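
Since the audio keeps its per-track license, downstream users may want to select tracks by the `license` field and carry attribution along with them. The sketch below is illustrative only: the helper names and sample rows are hypothetical, and grouping `CC BY` and `CC0 / Public Domain` as the set without share-alike or no-derivatives conditions is this example's assumption, not guidance from the dataset author. Check the terms at each `license_url` before redistributing audio.

```python
# License labels as they appear in the dataset card's Licenses table.
# Which of them suit a given project is an assumption of this sketch.
NO_SA_NO_ND = {"CC BY", "CC0 / Public Domain"}


def select_by_license(rows, allowed=NO_SA_NO_ND):
    """Keep only rows whose `license` label is in the allowed set."""
    return [r for r in rows if r["license"] in allowed]


def attribution_line(row):
    """Build a one-line credit from the title, artist, and license fields."""
    return (
        f'"{row["title"]}" by {row["artist"]} '
        f'({row["license"]}, {row["license_url"]})'
    )


# Made-up rows for illustration only; they are not taken from the dataset.
sample_tracks = [
    {"title": "Track A", "artist": "Artist A", "license": "CC BY",
     "license_url": "https://example.org/by"},
    {"title": "Track B", "artist": "Artist B", "license": "CC BY-ND",
     "license_url": "https://example.org/by-nd"},
    {"title": "Track C", "artist": "Artist C", "license": "CC0 / Public Domain",
     "license_url": "https://example.org/cc0"},
]

kept = select_by_license(sample_tracks)
for row in kept:
    print(attribution_line(row))
```
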
Provider:
Reubencf