SongEval
Source: ModelScope (魔搭社区)
Updated: 2025-12-05
Indexed: 2025-09-13
Download link: https://modelscope.cn/datasets/ASLP-lab/SongEval
Description:
# SongEval 🎵
**A Large-Scale Benchmark Dataset for Aesthetic Evaluation of Complete Songs**
<!-- [](https://huggingface.co/datasets/ASLP-lab/SongEval) -->
[GitHub](https://github.com/ASLP-lab/SongEval)
[arXiv](https://arxiv.org/pdf/2505.10793)
[License: CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)
---
## 📖 Overview
**SongEval** is the first open-source, large-scale benchmark dataset designed for **aesthetic evaluation of complete songs**. It provides **2,399 songs** (~140 hours) annotated by **16 expert raters** across **five perceptual dimensions**. The dataset enables research on evaluating and improving music generation systems from a human aesthetic perspective.
<p align="center"> <img src="assets/intro.png" alt="SongEval" width="800"/> </p>
---
## 🌟 Features
- 🎧 **2,399 complete songs** (with vocals and accompaniment)
- ⏱️ **~140 hours** of high-quality audio
- 🌍 **English and Chinese** songs
- 🎼 **9 mainstream genres**
- 📝 **5 aesthetic dimensions**:
  - Overall Coherence
  - Memorability
  - Naturalness of Vocal Breathing and Phrasing
  - Clarity of Song Structure
  - Overall Musicality
- 📊 Ratings on a **5-point Likert scale** by **musically trained annotators**
- 🎙️ Includes outputs from **five generation models** + a subset of real/bad-case samples
<div style="display: flex; justify-content: space-between;">
<img src="assets/score.png" alt="Image 1" style="width: 48%;" />
<img src="assets/distribution.png" alt="Image 2" style="width: 48%;" />
</div>
---
## 📂 Dataset Structure
Each sample includes:
- `audio`: WAV audio of the full song
- `gender`: male or female
- `aesthetic_scores`: dict of five human-annotated scores (1–5)
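
As a rough illustration of how these fields might be consumed, the sketch below averages the five scores of one sample into a single number. The key names inside `aesthetic_scores` and the helper `mean_aesthetic_score` are assumptions made for this example, not names confirmed by the dataset card; inspect a real sample to see the actual keys.

```python
from statistics import mean

# Hypothetical sample mirroring the documented fields; the dimension keys
# below are assumed for illustration and may differ in the real dataset.
sample = {
    "audio": None,  # placeholder for the decoded WAV audio
    "gender": "female",
    "aesthetic_scores": {
        "coherence": 4.0,
        "memorability": 3.5,
        "naturalness": 4.5,
        "clarity": 3.0,
        "musicality": 4.0,
    },
}


def mean_aesthetic_score(sample: dict) -> float:
    """Average the per-dimension scores of one sample into a single number."""
    scores = sample["aesthetic_scores"]
    if not scores:
        raise ValueError("sample has no aesthetic scores")
    for name, value in scores.items():
        # The dataset card states that scores lie on a 1-5 scale.
        if not 1.0 <= value <= 5.0:
            raise ValueError(f"{name} score {value} is outside the 1-5 range")
    return mean(scores.values())


print(mean_aesthetic_score(sample))
```

Collapsing the five dimensions into one mean is only a convenience for quick comparisons; keeping the dimensions separate preserves more of what the annotators rated.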
---
## 🔍 Use Cases
- Benchmarking song generation models from an aesthetic viewpoint
- Training perceptual quality predictors for songs
- Exploring alignment between objective metrics and human judgments
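
One common way to study the last use case is to compute a rank correlation between an objective metric and the human scores. The sketch below implements Spearman's rank correlation in plain Python; the two score lists are made-up illustrative values, not data from SongEval, and the function names are chosen for this example only.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(xs), average_ranks(ys))


# Illustrative values only: one human score and one metric value per song.
human_scores = [4.2, 3.1, 2.5, 4.8, 3.6]
metric_scores = [0.71, 0.40, 0.45, 0.90, 0.55]
print(spearman(human_scores, metric_scores))
```

A rank-based measure is used here because Likert ratings are ordinal; a value near 1 would suggest the metric orders songs much as the annotators do.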
---
## 🧪 Evaluation Toolkit
We provide an open-source evaluation toolkit trained on SongEval to help researchers evaluate new music generation outputs:
👉 GitHub: [https://github.com/ASLP-lab/SongEval](https://github.com/ASLP-lab/SongEval)
---
## 📥 Download
You can load the dataset directly using 🤗 Datasets:
```python
from datasets import load_dataset

# Download (or reuse the local cache of) the dataset from the Hugging Face Hub
dataset = load_dataset("ASLP-lab/SongEval")
```
## 🙏 Acknowledgement
This project is mainly organized by the Audio, Speech and Language Processing Lab [(ASLP@NPU)](http://www.npu-aslp.org/).
We sincerely thank the **Shanghai Conservatory of Music** for their expert guidance on music theory, aesthetics, and annotation design.
We also thank AISHELL for helping to organize the song annotations.
<p align="center"> <img src="assets/logo.png" alt="Shanghai Conservatory of Music Logo"/> </p>
---
## 📬 Citation
If you use this toolkit or the SongEval dataset, please cite the following:
```
@article{yao2025songeval,
title = {SongEval: A Benchmark Dataset for Song Aesthetics Evaluation},
author = {Yao, Jixun and Ma, Guobin and Xue, Huixin and Chen, Huakang and Hao, Chunbo and Jiang, Yuepeng and Liu, Haohe and Yuan, Ruibin and Xu, Jin and Xue, Wei and others},
journal = {arXiv preprint arXiv:2505.10793},
year={2025}
}
```
Provider: maas
Created: 2025-09-04
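Dataset summary
SongEval is a large-scale dataset of 2,399 complete songs for aesthetic evaluation, covering five perceptual dimensions and nine mainstream genres. It also comes with an open-source evaluation toolkit that supports benchmarking song generation models and predicting perceptual quality.
Summary collected and generated by 遇见数据集.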



