Soul-AILab/SoulX-Singer-Eval-Dataset
Source: Hugging Face · Updated 2026-02-12 · Indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/Soul-AILab/SoulX-Singer-Eval-Dataset
Description:
---
license: cc-by-nc-4.0
task_categories:
- text-to-speech
- text-to-audio
language:
- zh
- en
tags:
- music
- SVS
- SingVoiceSynthesis
pretty_name: SoulX-Singer Eval Dataset
size_categories:
- 100M<n<1B
---
# SoulX-Singer-Eval
This corpus contains 100 singing segments from 50 distinct singers (25 Mandarin and 25 English), with 2 segments per singer. In addition, 30 target samples are cross-selected from Opencpop, M4Singer, and GTSinger, which are also in GMO-SVS. The annotation files are in JSONL format and are categorized by text representation (phoneme or word) and by usage (prompt or target). All waveforms are stored in the `audio` directory. Metadata includes the source dataset, song title, singer ID, language, and audio paths.
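As a minimal sketch of reading one of these annotation files, the snippet below parses JSONL (one JSON object per line) and counts segments per singer. The key names `singer_id` and `language` and the sample values are hypothetical placeholders, since the card lists the metadata fields but not their exact keys; an in-memory stream stands in for a real file.

```python
import io
import json
from collections import Counter


def read_jsonl(handle):
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical records; the real key names and values may differ.
sample = io.StringIO(
    '{"singer_id": "zh_001", "language": "zh"}\n'
    '{"singer_id": "zh_001", "language": "zh"}\n'
    '{"singer_id": "en_001", "language": "en"}\n'
)

records = list(read_jsonl(sample))
segments_per_singer = Counter(r["singer_id"] for r in records)
print(segments_per_singer)
```

With a real file, the same generator can be fed an open file handle in place of the `io.StringIO` object.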
Word-based annotations include `note_text`, `note_dur`, `note_pitch`, and `note_type`, all labeled at the level of individual sung notes. `note_text` gives the lyric word, while `note_dur`, `note_pitch`, and `note_type` give each note's duration, pitch class, and category (1 for rest, 2 for lyric, 3 for slur). Phone-based annotations follow the GTSinger style and include `ph`, `ep_pitches`, `ep_notedurs`, `ep_types`, and `ph_durs`.
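The note-level fields can be combined into one object per note, roughly as sketched below. This assumes that the four fields are stored as parallel lists of equal length, that durations are in seconds, and that pitches are integers; none of these is confirmed by the card. The `Note` class, the `to_notes` helper, and the sample values (including the `<rest>` placeholder) are hypothetical.

```python
from dataclasses import dataclass

# Category codes as documented in the card.
NOTE_TYPES = {1: "rest", 2: "lyric", 3: "slur"}


@dataclass
class Note:
    text: str
    dur: float   # assumed to be seconds
    pitch: int   # assumed integer encoding
    kind: str    # "rest", "lyric", or "slur"


def to_notes(record):
    """Zip the four parallel note-level fields into Note objects."""
    columns = [record[key] for key in ("note_text", "note_dur", "note_pitch", "note_type")]
    if len({len(col) for col in columns}) != 1:
        raise ValueError("note-level fields differ in length")
    return [
        Note(str(t), float(d), int(p), NOTE_TYPES[int(k)])
        for t, d, p, k in zip(*columns)
    ]


# Hypothetical record; real files may encode these fields differently.
record = {
    "note_text": ["<rest>", "la", "la", "la"],
    "note_dur": [0.5, 0.25, 0.25, 1.0],
    "note_pitch": [0, 60, 62, 62],
    "note_type": [1, 2, 2, 3],
}

notes = to_notes(record)
sung = sum(n.dur for n in notes if n.kind != "rest")
```

Here `sung` adds up the durations of the lyric and slur notes and skips the rest.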
# GMO-SVS
This corpus comprises 802 samples selected from Opencpop, M4Singer, and GTSinger. The audio and annotation files follow the same organizational structure as SoulX-Singer-Eval. Metadata, word-based annotations, and phone-based annotations all strictly follow the SoulX-Singer-Eval set specifications to ensure consistency.
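Because both corpora share one annotation schema, a single check can classify a record from either set by the fields it carries. The sketch below uses only the field names listed in this card. The `annotation_kind` helper is hypothetical, and it assumes that word-based and phone-based annotations live in separate records.

```python
WORD_KEYS = {"note_text", "note_dur", "note_pitch", "note_type"}
PHONE_KEYS = {"ph", "ep_pitches", "ep_notedurs", "ep_types", "ph_durs"}


def annotation_kind(record):
    """Return "word" or "phoneme" depending on which field set is complete."""
    keys = set(record)
    if WORD_KEYS <= keys:
        return "word"
    if PHONE_KEYS <= keys:
        return "phoneme"
    raise ValueError(f"record matches neither schema; keys found: {sorted(keys)}")


# Hypothetical, empty-valued records used only to exercise the check.
word_record = {key: [] for key in WORD_KEYS}
phone_record = {key: [] for key in PHONE_KEYS}

print(annotation_kind(word_record), annotation_kind(phone_record))
```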
Provider:
Soul-AILab



