Children's Song Dataset
Source: OpenDataLab | Updated: 2026-05-24 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/Children_s_Song_Dataset
Description:
The Children's Song Dataset is an open-source dataset for singing voice research. It contains 50 Korean and 50 English songs performed by a professional female Korean pop singer. Each song is recorded in two separate keys, giving a total of 200 recordings. Each recording is paired with a MIDI transcription and with lyric annotations at both the grapheme level and the phoneme level.
Dataset Structure: The dataset is split into Korean and English subsets. Each subset contains folders named "wav", "mid", "lyric", "txt", and "csv". A given song has the same filename in every folder. Each folder holds the following information:
- "wav": Vocal recordings in 44.1 kHz, 16-bit WAV format
- "mid": Score information in MIDI format
- "lyric": Lyric annotations at the grapheme level
- "txt": Lyric annotations at the syllable and phoneme levels
- "csv": Note onset, offset, and syllable timing in comma-separated values (CSV) format
Additional information for each song, such as the original song title, tempo, and time signature, can be found in "metadata.json".
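The folder layout described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset's own tooling: the helper names (`song_files`, `list_songs`, `missing_formats`) are hypothetical, and the file extension used inside each folder is an assumption, since the description only names the folders and states that filenames match across them.

```python
from pathlib import Path

# Folder name -> file extension.
# ASSUMPTION: the extensions are guesses based on the folder names;
# the dataset description does not state them.
FORMAT_EXTENSIONS = {
    "wav": ".wav",
    "mid": ".mid",
    "lyric": ".txt",
    "txt": ".txt",
    "csv": ".csv",
}


def song_files(subset_dir, song_name):
    """Return {folder name: expected path} for one song in one language subset."""
    subset_dir = Path(subset_dir)
    return {
        fmt: subset_dir / fmt / f"{song_name}{ext}"
        for fmt, ext in FORMAT_EXTENSIONS.items()
    }


def list_songs(subset_dir):
    """List song names (filename stems) found in the subset's 'wav' folder."""
    return sorted(p.stem for p in (Path(subset_dir) / "wav").glob("*.wav"))


def missing_formats(subset_dir, song_name):
    """Return the folders in which the expected file for this song is absent."""
    return [
        fmt
        for fmt, path in song_files(subset_dir, song_name).items()
        if not path.exists()
    ]
```

Calling `missing_formats(subset_dir, name)` for every name returned by `list_songs(subset_dir)` is one way to confirm that a downloaded copy is complete before pairing audio with its annotations. The path to each language subset is passed in by the caller because the subset folder names are not given in the description.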
Provider:
OpenDataLab
Created:
2022-05-30
Collected Summary
Dataset Introduction

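Background and Challenges
Background Overview
The Children's Song Dataset is an open-source dataset for singing voice research. It contains 200 recordings of Korean and English songs performed by a professional female Korean pop singer, each paired with a MIDI transcription and multi-level lyric annotations. The data is organized into wav, mid, lyric, txt, and csv formats and comes with song metadata. It was released by the Korea Advanced Institute of Science and Technology (KAIST) in 2021.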
The summary above was collected and generated by 遇见数据集.



