Deep Xi dataset

Source: ieee-dataport.org (indexed 2025-03-23)
Download link:
https://ieee-dataport.org/open-access/deep-xi-dataset
Description:
The training, validation, and test sets used for Deep Xi (https://github.com/anicolson/DeepXi).

Training set:
The clean-speech recordings are from the train-clean-100 set of Librispeech (http://www.openslr.org/12/) and from the CSTR VCTK corpus (https://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html). Recordings from speakers p232 and p257 are excluded, as they are used in the test set of the DEMAND Voicebank dataset (http://ssw9.talp.cat/papers/ssw9_PS2-4_Valentini-Botinhao.pdf).

The noise recordings are from the following sources:
- The Environmental Background Noise dataset (https://personal.utdallas.edu/~nxk019000/VAD-dataset/)
- The Nonspeech dataset (http://web.cse.ohio-state.edu/pnl/corpus/HuNonspeech/HuCorpus.html)
- The QUT-NOISE dataset (https://research.qut.edu.au/saivt/databases/qut-noise-databases-and-protocols/)
- Multiple Freesound packs (https://freesound.org/)
- The noise set of the MUSAN corpus (https://www.openslr.org/17/)
- The RSG-10 noise database (http://www.steeneken.nl/wp-content/uploads/2014/04/RSG-10_Noise-data-base.pdf). Voice babble, F16, and factory (welding) are excluded, as they are used in the Deep Xi Test Set and the Test Set From 10.1016/J.SPECOM.2019.06.002.
- The Urban Sound dataset (http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_urbansound_acmmm14.pdf). Street music no. 26,270 is excluded, as it is used in the Deep Xi Test Set and the Test Set From 10.1016/J.SPECOM.2019.06.002.

Note that the clean-speech and noise recordings used for this training set are separate from those used in the Deep Xi test set, the Test Set From 10.1016/J.SPECOM.2019.06.002, and the DEMAND Voicebank test set (http://ssw9.talp.cat/papers/ssw9_PS2-4_Valentini-Botinhao.pdf).

Test set:
Noisy-speech set used to test Deep Xi (https://github.com/anicolson/DeepXi). The clean speech and noise used to create the noisy-speech set are also available. The clean-speech recordings are from Librispeech test-clean (http://www.openslr.org/12/). The noise recordings are from the RSG-10 noise database (http://www.steeneken.nl/wp-content/uploads/2014/04/RSG-10_Noise-data-base.pdf) and the Urban Sound dataset (http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_urbansound_acmmm14.pdf).

The noise recordings are as follows:
26270 - A recording of street music. It is recording no. 26,270 from the Urban Sound dataset.
SIGNAL019 - A recording of voice babble from the RSG-10 dataset.
SIGNAL020 - A recording of an F16 fighter jet from the RSG-10 dataset.
SIGNAL021 - A recording of factory welding from the RSG-10 dataset.
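
The description keeps the training material separate from the test material by excluding specific speakers and noise recordings. A minimal Python sketch of such a hold-out check is shown here. The identifiers (p232, p257, 26270, SIGNAL019 to SIGNAL021) come from the description; the file-naming scheme and the function names `is_held_out` and `training_candidates` are assumptions made for illustration and may not match how the files in the dataset are actually named.

```python
from pathlib import PurePosixPath

# Identifiers the description lists as reserved for test sets.
HELD_OUT_VCTK_SPEAKERS = {"p232", "p257"}
HELD_OUT_NOISE_IDS = {"26270", "SIGNAL019", "SIGNAL020", "SIGNAL021"}


def is_held_out(path: str) -> bool:
    """Return True if the file belongs to a held-out speaker or noise recording.

    Assumption (hypothetical naming): speech files look like
    '<speaker>_<utterance>.wav' and noise files look like '<id>.wav'.
    """
    stem = PurePosixPath(path).stem
    speaker = stem.split("_")[0]
    return speaker in HELD_OUT_VCTK_SPEAKERS or stem in HELD_OUT_NOISE_IDS


def training_candidates(paths):
    """Keep only the files that are not reserved for a test set."""
    return [p for p in paths if not is_held_out(p)]
```

Running `training_candidates` over a file listing before building a training set is one way to guard against the train/test overlap that the description explicitly rules out.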

Provider:
IEEE Dataport
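Background overview
The Deep Xi dataset is an audio dataset designed for speech enhancement and speech separation research. It contains a training set and a test set. The training set draws on Librispeech and the CSTR VCTK corpus, and the test set draws on Librispeech test-clean and the RSG-10 noise database, among other sources. The data is provided in .wav format, with a total size of 21.69 GB.
The overview above was collected and summarized by 遇见数据集.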
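
The test set is described as noisy speech created from clean speech and noise, but this page does not state the mixing procedure or the SNR levels used. The sketch below therefore shows only the generic technique of additive mixing at a target SNR, as commonly used in speech enhancement work. It is not the Deep Xi authors' method; the function names `mean_power` and `mix_at_snr` and the use of plain Python lists of samples are assumptions for illustration.

```python
import math


def mean_power(signal):
    """Average power of a sequence of float samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean speech so the clean-to-noise power ratio is snr_db.

    Assumption: both inputs are lists of float samples at the same sample
    rate, and the noise is at least as long as the clean speech.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[: len(clean)]  # trim noise to the clean-speech length
    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise segment is silent")
    # Target noise power is p_clean / 10^(snr_db / 10); scale amplitude by
    # the square root of the power ratio.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]
```

For example, `mix_at_snr(clean, noise, 5.0)` returns a noisy signal in which the scaled noise sits 5 dB below the clean speech in average power.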