treadon/banger-scorer-generated-songs
Source: Hugging Face. Updated 2026-03-26, indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/treadon/banger-scorer-generated-songs
Resource description:
---
license: apache-2.0
tags:
- music
- audio
- ai-generated-music
- ace-step
- quality-evaluation
- music-generation
- banger-scorer
task_categories:
- audio-classification
size_categories:
- n<1K
language:
- en
- es
- hi
- pa
- zh
---
# Banger Scorer Generated Songs
230 AI-generated songs across 10 genres and 5 languages, all scored by the [banger scorer](https://huggingface.co/treadon/banger-scorer). Includes 10 genre tests (20 songs each) plus 1 banger-optimized run (30 songs) that used data-driven parameter selection to maximize scores.
Every song includes its MP3 audio, generation metadata (BPM, key, seed, caption prompt), and banger score. Useful for research into AI music quality, training better scorers, or just listening to what works and what does not.

## Dataset Description
### What is this?
A systematic evaluation of AI music generation quality. 230 songs were generated with [ACE-Step 1.5](https://huggingface.co/ACE-Step/Ace-Step1.5) on an Apple M4 Pro, spanning a grid of caption styles, BPMs, keys, and languages. Each song was then scored by the [banger scorer](https://huggingface.co/treadon/banger-scorer) (MERT + MLP, trained on FMA play counts).
The dataset captures what an automated quality model thinks about AI-generated music -- and reveals clear patterns in what makes some generated songs score higher than others.
### Structure
```
230 songs total:
├── test01/ (20 songs) -- Hip Hop (English)
├── test02/ (20 songs) -- Pop/Dance (English)
├── test03/ (20 songs) -- R&B/Soul (English)
├── test04/ (20 songs) -- Latin/Reggaeton (Spanish)
├── test05/ (20 songs) -- Bollywood (Hindi)
├── test06/ (20 songs) -- Punjabi/Bhangra (Punjabi)
├── test07/ (20 songs) -- C-Pop (Chinese)
├── test08/ (20 songs) -- Rock/Alternative (English)
├── test09/ (20 songs) -- Electronic/EDM (Instrumental)
├── test10/ (20 songs) -- Acoustic/Folk (English)
└── bangers/ (30 songs) -- Banger-optimized run (mixed EDM/Bhangra/Bollywood)
```
### Per-Song Fields
| Field | Type | Description |
|-------|------|-------------|
| `audio` | MP3 (192 kbps) | The generated song, ~2 minutes, 48 kHz stereo |
| `genre` | string | Genre category (e.g., "EDM", "Hip Hop", "Bollywood") |
| `bpm` | int | Beats per minute used for generation |
| `key` | string | Musical key used for generation (e.g., "Eb minor", "D major") |
| `seed` | int | Random seed for deterministic reproduction |
| `caption` | string | Full text prompt given to ACE-Step |
| `banger_score` | float (0-10) | Score from the banger scorer model |
| `language` | string | Lyrics language (en/es/hi/pa/zh or instrumental) |
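
The field table above can be mirrored as a small typed record for validation when loading the data. This is a minimal sketch, not the dataset's official loader: the class name `SongRecord`, the `audio_path` field (the audio represented as a file path), and the `validate` method are assumptions made for illustration.

```python
from dataclasses import dataclass

# Language codes listed in the dataset card, plus "instrumental" for songs without lyrics.
VALID_LANGUAGES = {"en", "es", "hi", "pa", "zh", "instrumental"}


@dataclass
class SongRecord:
    audio_path: str       # assumption: audio is referenced by a local file path
    genre: str            # e.g. "EDM", "Hip Hop", "Bollywood"
    bpm: int              # beats per minute used for generation
    key: str              # e.g. "Eb minor", "D major"
    seed: int             # random seed used for generation
    caption: str          # text prompt given to the generator
    banger_score: float   # documented range: 0-10
    language: str         # one of VALID_LANGUAGES

    def validate(self) -> None:
        """Raise ValueError if a field falls outside the documented ranges."""
        if not 0.0 <= self.banger_score <= 10.0:
            raise ValueError(f"banger_score out of range: {self.banger_score}")
        if self.language not in VALID_LANGUAGES:
            raise ValueError(f"unknown language: {self.language}")
        if self.bpm <= 0:
            raise ValueError(f"bpm must be positive: {self.bpm}")


# Example using values from the top-10 table; the path and seed are placeholders.
example = SongRecord(
    audio_path="test09/example.mp3",  # hypothetical file name
    genre="EDM",
    bpm=130,
    key="Eb minor",
    seed=0,                           # placeholder, the real seed is in the dataset
    caption="Melodic techno, atmospheric, driving",
    banger_score=5.29,
    language="instrumental",
)
example.validate()
```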
## Generation Details
### Model
[ACE-Step 1.5](https://huggingface.co/ACE-Step/Ace-Step1.5) -- a full-song AI music generator with vocals, instruments, and structure. Generates ~2-minute songs from a text caption and optional lyrics.
### Hardware
Apple M4 Pro with 64 GB unified memory. Each song took ~110 seconds to generate (language model planning ~27s, DiT diffusion ~9s for 8 turbo steps, VAE decoding ~18s, plus overhead). Total generation time for all 230 songs: ~7 hours. Scoring all 230 songs took ~8 minutes.
**200/200 success rate on genre tests, 30/30 on the banger run. Zero generation failures.** Achieved by spawning a fresh subprocess per song to avoid MPS memory accumulation.
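
The subprocess-per-song pattern can be sketched as follows. This is an illustration of the general technique, not the actual generation script: `WORKER` is a stand-in for a hypothetical script that would load the model and render one song, and the function names are assumptions. Because each job runs in its own interpreter, any memory it allocated is released when the process exits.

```python
import subprocess
import sys


def run_job_in_fresh_process(worker_code: str, args: list, timeout_s: float = 600.0) -> bool:
    """Run one job in a new Python interpreter and return True on exit code 0."""
    result = subprocess.run(
        [sys.executable, "-c", worker_code, *args],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.returncode == 0


# Stand-in worker: a real worker would load the generator and write one audio file.
WORKER = "import sys; print('generating song with seed', sys.argv[1])"


def run_all(seeds):
    """Run every seed in its own process and collect per-song success flags."""
    return [run_job_in_fresh_process(WORKER, [str(seed)]) for seed in seeds]
```

The trade-off is that the model is reloaded for every song, which adds per-song overhead in exchange for a clean memory state.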
### Parameter Grid
Each genre test sampled 20 combinations from a grid of 4 caption styles x 5 BPMs x 5 keys (100 possible combos). Within each genre, lyrics were held constant while musical parameters varied, isolating the effect of BPM, key, and caption style on score.
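
As an illustration, the sampling step might look like the sketch below. The BPMs and keys are those listed for the Electronic/EDM test in the Genre Test Details table. The four caption styles are the EDM styles that appear in the top-10 table and are only assumed to be that test's full set. Sampling without replacement and the seed value are also assumptions.

```python
import itertools
import random

# Assumed caption styles (taken from the EDM rows of the top-10 table).
CAPTION_STYLES = [
    "Melodic techno, atmospheric, driving",
    "Deep house, groovy bassline",
    "Progressive house, euphoric drop",
    "Melodic techno, dark and dreamy",
]
BPMS = [122, 126, 130, 138, 150]
KEYS = ["A minor", "F minor", "C minor", "Eb minor", "Bb minor"]


def build_grid():
    """Return all caption x BPM x key combinations (4 * 5 * 5 = 100)."""
    return list(itertools.product(CAPTION_STYLES, BPMS, KEYS))


def sample_grid(n=20, seed=0):
    """Draw n distinct combinations from the grid, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(build_grid(), n)


combos = sample_grid()
```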
## Score Statistics

### By Genre


| Rank | Genre | Language | Mean | Best | Worst | Range |
|------|-------|----------|------|------|-------|-------|
| 1 | Punjabi/Bhangra | Punjabi | 3.77 | 4.26 | 2.79 | 1.47 |
| 2 | Electronic/EDM | Instrumental | 3.71 | **5.29** | 2.80 | 2.49 |
| 3 | Bollywood | Hindi | 3.53 | 4.38 | 2.70 | 1.68 |
| 4 | C-Pop | Chinese | 3.20 | 4.47 | 2.13 | 2.34 |
| 5 | Latin/Reggaeton | Spanish | 3.19 | 3.90 | 2.36 | 1.54 |
| 6 | Pop/Dance | English | 3.05 | 4.31 | 1.98 | 2.33 |
| 7 | Rock/Alternative | English | 3.03 | 3.66 | 2.02 | 1.64 |
| 8 | Hip Hop | English | 2.92 | 3.38 | 2.52 | 0.86 |
| 9 | Acoustic/Folk | English | 2.63 | 3.31 | 2.03 | 1.28 |
| 10 | R&B/Soul | English | 2.62 | 3.21 | 2.14 | 1.07 |
| -- | **Banger-optimized** | **Mixed** | **3.48** | **5.29** | **2.01** | **3.28** |
### Top 10 Overall (out of 230 songs)
| Rank | Score | Genre | BPM | Key | Caption Style |
|------|-------|-------|-----|-----|---------------|
| #1 | **5.29** | EDM | 130 | Eb minor | Melodic techno, atmospheric, driving |
| #2 | **5.29** | Banger Run | 128 | D minor | Dark electronic, industrial, driving |
| #3 | **4.90** | EDM | 138 | F minor | Deep house, groovy bassline |
| #4 | **4.87** | EDM | 126 | Eb minor | Progressive house, euphoric drop |
| #5 | **4.66** | EDM | 138 | Bb minor | Melodic techno, dark and dreamy |
| #6 | **4.47** | C-Pop | 120 | A minor | Emotional ballad, piano-driven |
| #7 | **4.40** | Banger Run | 128 | Bb minor | Progressive house, festival anthem |
| #8 | **4.38** | Bollywood | 120 | E minor | Club-ready beat, heavy bass |
| #9 | **4.36** | Banger Run | 130 | D minor | Deep house, groovy bassline |
| #10 | **4.31** | Pop/Dance | 128 | D major | Electronic pop, shimmering arpeggios |
### Best vs Worst Examples

The highest-scoring songs share common traits: driving four-on-the-floor beats, minor keys, 126-138 BPM, electronic/dark textures. The lowest-scoring songs tend to be slow R&B, acoustic folk, or jazzy hip hop with mellow dynamics.
## Musical Analysis
### Optimization Impact
The banger-optimized run (30 songs) used only parameter combinations that scored highest in the 200-song random sweep: dark electronic/industrial captions, 126-138 BPM, minor keys only.

| Metric | Random (200 songs) | Optimized (30 songs) | Change |
|--------|-------------------|---------------------|--------|
| Mean score | 3.17 | **3.48** | +10% |
| Songs >= 3.5 | 20% | **60%** | 3x |
| Songs >= 4.0 | 5% | **20%** | 4x |
| Top score | 5.29 | 5.29 | Same |
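Data-driven parameter selection raised the mean score and the share of songs scoring 3.5 or higher without raising the ceiling. The minimum did not improve: the optimized run's lowest score (2.01) is close to the sweep's lowest (1.98), and its range (3.28) is the widest in the genre table. The winning formula: dark electronic/industrial styles, 128-130 BPM, D minor or Bb minor.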
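
The "Change" column can be checked with a few lines of arithmetic. This sketch uses only the values from the table above; the helper names are illustrative.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Percentage change from before to after, rounded to one decimal."""
    return round((after - before) / before * 100, 1)


def ratio(before_pct: int, after_pct: int) -> float:
    """How many times larger the optimized share is than the sweep share."""
    return after_pct / before_pct


mean_gain = relative_change_pct(3.17, 3.48)   # mean score, sweep vs optimized
share_35 = ratio(20, 60)                      # songs >= 3.5, in percent
share_40 = ratio(5, 20)                       # songs >= 4.0, in percent

print(mean_gain, share_35, share_40)  # 9.8 3.0 4.0
```

The mean gain of 9.8% is what the table rounds to +10%.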

### Key and Tonality
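Minor keys outperformed major keys overall: all top-5 songs and 9 of the top 10 were in minor keys. Most genre tests used only minor or only major keys (see Genre Test Details), so the effect of key is partly confounded with genre.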


### BPM
Clear BPM sweet spots per genre. Slower BPMs (< 85) consistently underperformed. The highest-scoring region across all genres: 126-138 BPM.


### Caption Style
The caption prompt significantly affects scores. Descriptive, production-oriented captions ("melodic techno, atmospheric pads, driving beat") outperformed vague ones. The "traditional/ethnic" keyword cluster (dhol, tumbi, bhangra) achieved the highest style-level mean (3.8).

### Generated vs Training Data
AI-generated songs cluster tightly around the FMA training mean (3.17 for the 200-song sweep vs 3.27 for FMA), while real music has a long right tail up to 10. Even optimized generation peaks at 5.29 (67th percentile). By this scorer's measure, AI music does not yet match the score distribution of human-made music.

## Use Cases
- **AI music quality research.** Study what musical parameters (BPM, key, style, language) lead to higher/lower quality scores. The structured parameter grid makes controlled comparisons possible.
- **Listening comparison.** Download the top-5 and bottom-5 from each genre to hear what the scorer prefers. Do you agree with the model?
- **Train better scorers.** Use these scored songs as additional training data or evaluation benchmarks for music quality models.
- **ACE-Step evaluation.** Benchmark ACE-Step 1.5 across genres and languages with quantitative quality scores.
- **Preference modeling.** Use the ranked pairs (best vs worst within each genre) to train preference models or reward models for music generation.
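
For the preference-modeling use case, best-vs-worst pairs per genre could be built along these lines. This is a minimal sketch that assumes each record is a dict with `genre` and `banger_score` keys. The example scores are taken from the score tables in this card; the function name is illustrative.

```python
from collections import defaultdict


def best_worst_pairs(records):
    """Return {genre: (best_record, worst_record)} for genres with 2+ songs."""
    by_genre = defaultdict(list)
    for rec in records:
        by_genre[rec["genre"]].append(rec)

    pairs = {}
    for genre, songs in by_genre.items():
        if len(songs) < 2:
            continue  # a single song cannot form a preference pair
        best = max(songs, key=lambda r: r["banger_score"])
        worst = min(songs, key=lambda r: r["banger_score"])
        pairs[genre] = (best, worst)
    return pairs


# Scores from the genre and top-10 tables (a subset, not the full dataset).
records = [
    {"genre": "EDM", "banger_score": 5.29},
    {"genre": "EDM", "banger_score": 4.90},
    {"genre": "EDM", "banger_score": 2.80},
    {"genre": "Hip Hop", "banger_score": 3.38},
    {"genre": "Hip Hop", "banger_score": 2.52},
]
pairs = best_worst_pairs(records)
```

Each pair can then serve as a (preferred, rejected) example, keeping in mind that the preference comes from the scorer, not from human listeners.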
## Limitations
- **Scorer bias.** All scores come from a single model trained on FMA play counts. The model favors high-energy, beat-driven music and underrates mellow genres. Scores reflect learned statistical patterns, not absolute quality.
- **Single generator.** All songs are from ACE-Step 1.5. Results may not generalize to other music generation models.
- **Limited seeds per combo.** Each parameter combination was tested with only 1 random seed. Different seeds produce different songs from the same parameters.
- **Lyrics held constant per genre.** Within each genre test, all 20 songs share the same lyrics. Only musical parameters (BPM, key, caption style) vary.
- **2-minute songs.** All songs are approximately 2 minutes. Longer or shorter durations may produce different quality characteristics.
## How the Scores Were Computed
Each song was:
1. Loaded at 24 kHz mono
2. Truncated to 30 seconds
3. Encoded through [MERT-v1-330M](https://huggingface.co/m-a-p/MERT-v1-330M) to produce a 1024-dim embedding
4. Scored by the [banger scorer](https://huggingface.co/treadon/banger-scorer) MLP (1024 -> 512 -> 256 -> 128 -> 1)
5. Clamped to the 0-10 range
The scorer model (MAE 0.858, Spearman 0.468 on FMA test set) was trained on pre-computed MERT embeddings from [treadon/fma-mert-embeddings](https://huggingface.co/datasets/treadon/fma-mert-embeddings).
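
The steps around the encoder can be sketched in plain Python. This is a shape-level illustration only: the MERT encoding step is not reproduced (a random 1024-dim vector stands in for the embedding), the weights are random placeholders rather than the trained scorer, and the ReLU activations and bias terms are assumptions that the card does not confirm.

```python
import random

SAMPLE_RATE = 24_000                    # step 1: audio loaded at 24 kHz mono
CLIP_SECONDS = 30                       # step 2: truncate to 30 seconds
LAYER_SIZES = [1024, 512, 256, 128, 1]  # step 4: MLP layer widths


def truncate(samples, sample_rate=SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Keep at most the first `seconds` of audio."""
    return samples[: sample_rate * seconds]


def init_layers(sizes, rng):
    """Random placeholder (weights, biases) per layer, NOT the trained scorer."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def mlp_forward(x, layers):
    """Dense layers with ReLU between them (assumed) and a linear output."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x[0]


def clamp_score(raw, lo=0.0, hi=10.0):
    """Step 5: clamp the raw output to the 0-10 range."""
    return max(lo, min(hi, raw))


rng = random.Random(0)
embedding = [rng.gauss(0.0, 1.0) for _ in range(LAYER_SIZES[0])]  # stand-in for MERT output
score = clamp_score(mlp_forward(embedding, init_layers(LAYER_SIZES, rng)))
```

At 24 kHz, the 30-second window corresponds to 720,000 samples, so only the opening quarter of each 2-minute song contributes to its score.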
## Genre Test Details
| Test | Genre | BPMs | Keys | Language | Lyrics Theme |
|------|-------|------|------|----------|-------------|
| 01 | Hip Hop | 78, 85, 90, 95, 100 | C/Bb/D/E/Ab minor | English | East coast, gritty, street |
| 02 | Pop/Dance | 110, 118, 124, 128, 135 | C/G/D/A/F major | English | Upbeat, summer, dance |
| 03 | R&B/Soul | 65, 72, 80, 88, 95 | Eb/Ab/Db maj, Bb/F min | English | Romantic, intimate |
| 04 | Latin/Reggaeton | 90, 95, 100, 105, 110 | A/D/E/G/C minor | Spanish | Party, dancing, heat |
| 05 | Bollywood | 100, 110, 120, 130, 140 | C/D/A/E min, G maj | Hindi | Film dance number |
| 06 | Punjabi/Bhangra | 95, 105, 115, 125, 135 | G/D/A/C maj, E min | Punjabi | Celebration, dance |
| 07 | C-Pop | 75, 90, 105, 120, 130 | C/G/F maj, A/D min | Chinese | Modern pop, emotional |
| 08 | Rock/Alt | 120, 130, 140, 150, 160 | E/A/D/G/B minor | English | Indie, garage, alt |
| 09 | Electronic/EDM | 122, 126, 130, 138, 150 | A/F/C/Eb/Bb minor | Instrumental | House, techno, trance |
| 10 | Acoustic/Folk | 85, 95, 105, 115, 125 | G/C/D/A/E major | English | Singer-songwriter |
| -- | Banger Run | 126, 128, 130, 135, 138 | Eb/F/Bb/C/D minor | Mixed | Optimized for score |
## Citation
```bibtex
@misc{acestep2025,
title={ACE-Step: A Step Towards Music Generation Foundation Model},
author={ACE-Step Team},
year={2025},
url={https://huggingface.co/ACE-Step/Ace-Step1.5}
}
@article{li2023mert,
title={MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training},
author={Li, Yizhi and Yuan, Ruibin and Zhang, Ge and Ma, Yinghao and others},
journal={arXiv preprint arXiv:2306.00107},
year={2023}
}
```
## Dataset Card Contact
[treadon](https://huggingface.co/treadon) on Hugging Face
Provider:
treadon



