SongFormBench

Name: SongFormBench
Creator: maas
Published: 2025-12-04 16:50:03
License: No description provided

ModelScope Community | Updated 2025-12-04 | Indexed 2025-09-27

Download link:

https://modelscope.cn/datasets/ASLP-lab/SongFormBench


Resource description:

# SongFormBench 🏆

[English ｜ [中文](README_ZH.md)]

**A High-Quality Benchmark for Music Structure Analysis**

<div align="center">

![Python](https://img.shields.io/badge/Python-3.10-brightgreen)
![License](https://img.shields.io/badge/License-CC%20BY%204.0-lightblue)
[![arXiv Paper](https://img.shields.io/badge/arXiv-2510.02797-blue)](https://arxiv.org/abs/2510.02797)
[![GitHub](https://img.shields.io/badge/GitHub-SongFormer-black)](https://github.com/ASLP-lab/SongFormer)
[![HuggingFace Space](https://img.shields.io/badge/HuggingFace-space-yellow)](https://huggingface.co/spaces/ASLP-lab/SongFormer)
[![HuggingFace Model](https://img.shields.io/badge/HuggingFace-model-blue)](https://huggingface.co/ASLP-lab/SongFormer)
[![Dataset SongFormDB](https://img.shields.io/badge/HF%20Dataset-SongFormDB-green)](https://huggingface.co/datasets/ASLP-lab/SongFormDB)
[![Dataset SongFormBench](https://img.shields.io/badge/HF%20Dataset-SongFormBench-orange)](https://huggingface.co/datasets/ASLP-lab/SongFormBench)
[![Discord](https://img.shields.io/badge/Discord-join%20us-purple?logo=discord&logoColor=white)](https://discord.gg/p5uBryC4Zs)
[![lab](https://img.shields.io/badge/🏫-ASLP-grey?labelColor=lightgrey)](http://www.npu-aslp.org/)

</div>

<div align="center">

<h3>
Chunbo Hao<sup>1*</sup>, Ruibin Yuan<sup>2,5*</sup>, Jixun Yao<sup>1</sup>, Qixin Deng<sup>3,5</sup>, Xinyi Bai<sup>4,5</sup>, Wei Xue<sup>2</sup>, Lei Xie<sup>1†</sup>
</h3>

<sup>*</sup>Equal contribution    <sup>†</sup>Corresponding author<br>
<sup>1</sup>Audio, Speech and Language Processing Group (ASLP@NPU), Northwestern Polytechnical University<br>
<sup>2</sup>Hong Kong University of Science and Technology<br>
<sup>3</sup>Northwestern University<br>
<sup>4</sup>Cornell University<br>
<sup>5</sup>Multimodal Art Projection (M-A-P)

</div>

---

## 🌟 What is SongFormBench?

SongFormBench is a **carefully curated, expert-annotated benchmark** designed to revolutionize music structure analysis (MSA) evaluation. Our dataset provides a unified standard for comparing MSA models.
### 📊 Dataset Composition

- **🎸 SongFormBench-HarmonixSet (BHX)**: 200 songs from HarmonixSet
- **🎤 SongFormBench-CN (BC)**: 100 Chinese popular songs

**Total: 300 high-quality annotated songs**

---

## ✨ Key Highlights

### 🎯 **Unified Evaluation Standard**

- Establishes a **standardized benchmark** for fair comparison across MSA models
- Eliminates inconsistencies in evaluation protocols

### 🏷️ **Simple Label System**

- Builds on the widely used 7-class classification system (as described in [arxiv.org/abs/2205.14700](https://arxiv.org/abs/2205.14700))
- Additionally preserves **pre-chorus** segments for finer granularity
- Easy conversion to 7-class (pre-chorus → verse) for compatibility

### 👨‍🔬 **Expert-Verified Quality**

- Multi-source validation
- Manual corrections by expert annotators

### 🌏 **Multilingual Coverage**

- **First Chinese MSA dataset** (100 songs)
- Bridges the gap in Chinese music structure analysis
- Enables cross-lingual MSA research

---

## 🚀 Getting Started

### Quick Load

```python
from datasets import load_dataset

# Load the complete benchmark
dataset = load_dataset("ASLP-lab/SongFormBench")
```

---

## 📚 Resources & Links

- 📖 Paper: [arXiv:2510.02797](https://arxiv.org/abs/2510.02797)
- 💻 Code: [GitHub Repository](https://github.com/ASLP-lab/SongFormer)
- 🧑‍💻 Model: [SongFormer](https://huggingface.co/ASLP-lab/SongFormer)
- 📂 Dataset: [SongFormDB](https://huggingface.co/datasets/ASLP-lab/SongFormDB)

---

## 🤝 Citation

```bibtex
@misc{hao2025songformer,
  title         = {SongFormer: Scaling Music Structure Analysis with Heterogeneous Supervision},
  author        = {Chunbo Hao and Ruibin Yuan and Jixun Yao and Qixin Deng and Xinyi Bai and Wei Xue and Lei Xie},
  year          = {2025},
  eprint        = {2510.02797},
  archivePrefix = {arXiv},
  primaryClass  = {eess.AS},
  url           = {https://arxiv.org/abs/2510.02797}
}
```

---

## 🎼 Mel Spectrogram Details

<details>
<summary>Click to expand/collapse</summary>

For environment setup, refer to the official BigVGAN implementation.
If the original audio source becomes unavailable, you can reconstruct the audio from the mel spectrograms as follows.

### 🎸 SongFormBench-HarmonixSet

Uses the official HarmonixSet mel spectrograms. To reproduce:

```bash
# Clone BigVGAN repository
git clone https://github.com/NVIDIA/BigVGAN.git

# Navigate to utils
cd utils/HarmonixSet

# Update BIGVGAN_REPO_DIR in inference_e2e.sh

# Run the inference script
bash inference_e2e.sh
```

### 🎤 SongFormBench-CN

Reproduce using [**bigvgan_v2_44khz_128band_256x**](https://huggingface.co/nvidia/bigvgan_v2_44khz_128band_256x).

First download bigvgan_v2_44khz_128band_256x and add its project directory to your `PYTHONPATH`. You can then use the code below:

```python
# See implementation utils/CN/infer.py
```

</details>

---

## 📧 Contact

For questions, issues, or collaboration opportunities, please visit our [GitHub repository](https://github.com/ASLP-lab/SongFormer) or open an issue.
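
The pre-chorus → verse conversion mentioned under the Simple Label System highlight can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `(start, end, label)` tuple layout, the `"pre-chorus"` and `"verse"` label spellings, the `to_seven_class` helper name, and the choice to merge adjacent segments that end up with the same label are all assumptions. Check the actual annotation schema before relying on it.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, label).
Segment = Tuple[float, float, str]

# Assumed label spellings; the dataset's real label strings may differ.
LABEL_MAP = {"pre-chorus": "verse"}


def to_seven_class(segments: List[Segment]) -> List[Segment]:
    """Relabel pre-chorus as verse, then merge touching segments that share a label."""
    merged: List[Segment] = []
    for start, end, label in segments:
        label = LABEL_MAP.get(label, label)
        # Merging is an illustrative choice: it avoids a spurious boundary
        # between a verse and the pre-chorus that was folded into it.
        if merged and merged[-1][2] == label and merged[-1][1] == start:
            prev_start = merged[-1][0]
            merged[-1] = (prev_start, end, label)
        else:
            merged.append((start, end, label))
    return merged


example = [
    (0.0, 12.5, "intro"),
    (12.5, 40.0, "verse"),
    (40.0, 52.0, "pre-chorus"),
    (52.0, 80.0, "chorus"),
]
print(to_seven_class(example))
```

If boundary metrics should still count the verse/pre-chorus boundary, drop the merge branch and keep only the relabeling step.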


Provider: maas

Created: 2025-09-15
