SongFormBench
Source: ModelScope community (updated 2025-12-04, indexed 2025-09-27)
Download link: https://modelscope.cn/datasets/ASLP-lab/SongFormBench
# SongFormBench 🏆
[English | [中文](README_ZH.md)]
**A High-Quality Benchmark for Music Structure Analysis**
<div align="center">


[Paper (arXiv)](https://arxiv.org/abs/2510.02797)
[GitHub](https://github.com/ASLP-lab/SongFormer)
[HuggingFace Space](https://huggingface.co/spaces/ASLP-lab/SongFormer)
[HuggingFace Model](https://huggingface.co/ASLP-lab/SongFormer)
[SongFormDB Dataset](https://huggingface.co/datasets/ASLP-lab/SongFormDB)
[SongFormBench Dataset](https://huggingface.co/datasets/ASLP-lab/SongFormBench)
[Discord](https://discord.gg/p5uBryC4Zs)
[ASLP Lab](http://www.npu-aslp.org/)
</div>
<div align="center">
<h3>
Chunbo Hao<sup>1*</sup>, Ruibin Yuan<sup>2,5*</sup>, Jixun Yao<sup>1</sup>, Qixin Deng<sup>3,5</sup>,<br>Xinyi Bai<sup>4,5</sup>, Wei Xue<sup>2</sup>, Lei Xie<sup>1†</sup>
</h3>
<p>
<sup>*</sup>Equal contribution <sup>†</sup>Corresponding author
</p>
<p>
<sup>1</sup>Audio, Speech and Language Processing Group (ASLP@NPU),<br>Northwestern Polytechnical University<br>
<sup>2</sup>Hong Kong University of Science and Technology<br>
<sup>3</sup>Northwestern University<br>
<sup>4</sup>Cornell University<br>
<sup>5</sup>Multimodal Art Projection (M-A-P)
</p>
</div>
---
## 🌟 What is SongFormBench?
SongFormBench is a **carefully curated, expert-annotated benchmark** for evaluating music structure analysis (MSA) models. It provides a unified standard for comparing MSA models.
### 📊 Dataset Composition
- **🎸 SongFormBench-HarmonixSet (BHX)**: 200 songs from HarmonixSet
- **🎤 SongFormBench-CN (BC)**: 100 Chinese popular songs
**Total: 300 high-quality annotated songs**
---
## ✨ Key Highlights
### 🎯 **Unified Evaluation Standard**
- Establishes a **standardized benchmark** for fair comparison across MSA models
- Eliminates inconsistencies in evaluation protocols
### 🏷️ **Simple Label System**
- Adopts the widely used 7-class classification system (as described in [arxiv.org/abs/2205.14700](https://arxiv.org/abs/2205.14700))
- Preserves **pre-chorus** segments for enhanced granularity
- Easy conversion to 7-class (pre-chorus → verse) for compatibility
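
The pre-chorus → verse conversion mentioned above can be sketched as a simple relabeling. This is a minimal illustration, not the benchmark's official tooling: the `(start_seconds, label)` segment layout, the function name `to_seven_class`, and the exact label spellings `"pre-chorus"` and `"verse"` are assumptions and should be checked against the actual annotation files.

```python
def to_seven_class(segments, pre_chorus_label="pre-chorus", verse_label="verse"):
    """Relabel pre-chorus segments as verse, leaving every other segment untouched.

    `segments` is a list of (start_seconds, label) pairs. The label spellings
    are assumptions; adjust the defaults to match the annotation files.
    """
    return [
        (start, verse_label if label == pre_chorus_label else label)
        for start, label in segments
    ]


# Hypothetical annotation for one song.
example = [(0.0, "intro"), (12.5, "verse"), (40.0, "pre-chorus"), (52.0, "chorus")]
print(to_seven_class(example))
# [(0.0, 'intro'), (12.5, 'verse'), (40.0, 'verse'), (52.0, 'chorus')]
```

The sketch only renames labels and keeps every segment boundary in place, so boundary-based metrics are unaffected by the conversion.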
### 👨‍🔬 **Expert-Verified Quality**
- Multi-source validation
- Manual corrections by expert annotators
### 🌏 **Multilingual Coverage**
- **First Chinese MSA dataset** (100 songs)
- Bridges the gap in Chinese music structure analysis
- Enables cross-lingual MSA research
---
## 🚀 Getting Started
### Quick Load
```python
from datasets import load_dataset
# Load the complete benchmark
dataset = load_dataset("ASLP-lab/SongFormBench")
```
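
Since this README does not list the split or column names, a quick inspection after loading can be useful. The helper below is a small sketch under that assumption: `describe_splits` is a hypothetical name, and it relies only on the returned object behaving like a mapping from split name to a dataset with `len()` and `column_names`, as `datasets.DatasetDict` does.

```python
def describe_splits(dataset_dict):
    """Summarize a DatasetDict-like mapping as {split_name: (num_rows, column_names)}."""
    return {
        name: (len(split), list(split.column_names))
        for name, split in dataset_dict.items()
    }


# Usage (requires the `datasets` package and network access):
#   from datasets import load_dataset
#   summary = describe_splits(load_dataset("ASLP-lab/SongFormBench"))
#   for name, (num_rows, columns) in summary.items():
#       print(name, num_rows, columns)
```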
---
## 📚 Resources & Links
- 📖 Paper: [arXiv:2510.02797](https://arxiv.org/abs/2510.02797)
- 💻 Code: [GitHub Repository](https://github.com/ASLP-lab/SongFormer)
- 🧑‍💻 Model: [SongFormer](https://huggingface.co/ASLP-lab/SongFormer)
- 📂 Dataset: [SongFormDB](https://huggingface.co/datasets/ASLP-lab/SongFormDB)
---
## 🤝 Citation
```bibtex
@misc{hao2025songformer,
title = {SongFormer: Scaling Music Structure Analysis with Heterogeneous Supervision},
author = {Chunbo Hao and Ruibin Yuan and Jixun Yao and Qixin Deng and Xinyi Bai and Wei Xue and Lei Xie},
year = {2025},
eprint = {2510.02797},
archivePrefix = {arXiv},
primaryClass = {eess.AS},
url = {https://arxiv.org/abs/2510.02797}
}
```
---
## 🎼 Mel Spectrogram Details
<details>
<summary>Click to expand/collapse</summary>
For environment setup, refer to the official BigVGAN implementation. If an audio source becomes unavailable, you can reconstruct the audio from the mel spectrograms using the following method.
### 🎸 SongFormBench-HarmonixSet
Uses official HarmonixSet mel spectrograms. To reproduce:
```bash
# Clone BigVGAN repository
git clone https://github.com/NVIDIA/BigVGAN.git
# Navigate to utils
cd utils/HarmonixSet
# Update BIGVGAN_REPO_DIR in inference_e2e.sh
# Run the inference script
bash inference_e2e.sh
```
### 🎤 SongFormBench-CN
Reproduce using [**bigvgan_v2_44khz_128band_256x**](https://huggingface.co/nvidia/bigvgan_v2_44khz_128band_256x).
First download bigvgan_v2_44khz_128band_256x and add its project directory to your `PYTHONPATH`; you can then use the code below:
```python
# See the implementation in utils/CN/infer.py
```
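
When lining up mel frames with structure annotations given in seconds, a frame-to-time conversion may help. The sketch below assumes a 44.1 kHz sampling rate and a 256-sample hop, values inferred from the model name `bigvgan_v2_44khz_128band_256x` rather than stated in this README; confirm them against the model's configuration before relying on them. The function name is hypothetical.

```python
def frames_to_seconds(frame_index, hop_size=256, sampling_rate=44100):
    """Convert a mel-frame index to the start time of that frame in seconds.

    hop_size and sampling_rate are assumptions inferred from the model name;
    verify them against the vocoder's config.
    """
    return frame_index * hop_size / sampling_rate


# Under these assumptions, each frame advances 256 / 44100 seconds (about 5.8 ms).
print(frames_to_seconds(1))
```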
</details>
---
## 📧 Contact
For questions, issues, or collaboration opportunities, please visit our [GitHub repository](https://github.com/ASLP-lab/SongFormer) or open an issue.
Provider: maas
Created: 2025-09-15



