JukeBox
Source: arXiv. Updated 2020-08-08; indexed 2024-06-21.
Download link:
http://iprobe.cse.msu.edu/datasets/jukebox.html
Description:
JukeBox is a multilingual singer identification dataset created at Michigan State University. It contains 467 hours of singing audio from 936 distinct singers. The songs span 18 languages, and the recording environments range from professional studios to live concerts. To build the dataset, the research team selected artists and songs using Wikipedia and the Spotify database, downloaded the audio from the Internet Archive, and cleaned it with voice activity detection. The dataset is intended primarily for evaluating and improving speaker recognition systems on singing voice, particularly with respect to gender and language diversity.
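The description does not say which voice activity detection method was used for cleaning. As a rough illustration only, the sketch below applies a simple energy-threshold detector to a list of audio samples. The function names, the frame length, and the threshold are all hypothetical choices for this example, not part of the JukeBox pipeline.

```python
import math


def frame_energies(samples, frame_len):
    """Return the RMS energy of each non-overlapping, full-length frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def voiced_frames(samples, frame_len=400, threshold=0.02):
    """Flag frames whose RMS energy exceeds a fixed threshold.

    frame_len and threshold are assumed values for illustration.
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]


def keep_voiced(samples, frame_len=400, threshold=0.02):
    """Concatenate only the frames flagged as voiced."""
    kept = []
    for i, is_voiced in enumerate(voiced_frames(samples, frame_len, threshold)):
        if is_voiced:
            kept.extend(samples[i * frame_len:(i + 1) * frame_len])
    return kept


# Synthetic signal: silence, a 220 Hz tone at an assumed 16 kHz rate, silence.
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 220 * n / 16000) for n in range(800)]
signal = silence + tone + silence

flags = voiced_frames(signal)
cleaned = keep_voiced(signal)
```

A fixed energy threshold cannot tell singing from instrumental accompaniment, so it is only a stand-in here for whatever detector the dataset authors actually applied.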
Provider:
Michigan State University
Created:
2020-08-08



