five

Dataset in support of the thesis 'Speech enhancement by using deep learning algorithms'

收藏
DataCite Commons2024-07-18 更新2025-04-17 收录
下载链接:
https://eprints.soton.ac.uk/492128/
下载链接
链接失效反馈
官方服务:
资源简介:
The source code and audio datasets of my PhD project. 1. https://www.openslr.org/12 LibriSpeech is a corpus of approximately 1000 hours of 16kHz read English speech, prepared by Vassil Panayotov with the assistance of Daniel Povey. The data is derived from read audiobooks from the LibriVox project, and has been carefully segmented and aligned. Acoustic models, trained on this data set, are available at kaldi-asr.org and language models, suitable for evaluation can be found at http://www.openslr.org/11/. For more information, see the paper "LibriSpeech: an ASR corpus based on public domain audio books", Vassil Panayotov, Guoguo Chen, Daniel Povey and Sanjeev Khudanpur, ICASSP 2015 2.https://www.openslr.org/17 MUSAN is a corpus of music, speech, and noise recordings. This work was supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1232825 and by Spoken Communications. You can cite the data using the following BibTeX entry: @misc{musan2015, author = {David Snyder and Guoguo Chen and Daniel Povey}, title = {{MUSAN}: {A} {M}usic, {S}peech, and {N}oise {C}orpus}, year = {2015}, eprint = {1510.08484}, note = {arXiv:1510.08484v1} } 3. source_code.zip The program from parts of my PhD project. 4.SJ_EXP.zip The program of the subjective experiment corresponding to the last chapter.
提供机构:
University of Southampton
创建时间:
2024-07-18
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作