Dataset in support of the thesis 'Speech enhancement by using deep learning algorithms'
Source: DataCite Commons. Updated 2024-07-18; indexed 2025-04-17.
Download link:
https://eprints.soton.ac.uk/492128/
Description:
The source code and audio datasets of my PhD project.
1. https://www.openslr.org/12
LibriSpeech is a corpus of approximately 1000 hours of 16kHz read English speech, prepared by Vassil Panayotov with the assistance of Daniel Povey. The data is derived from read audiobooks from the LibriVox project, and has been carefully segmented and aligned.
Acoustic models trained on this data set are available at kaldi-asr.org, and language models suitable for evaluation can be found at http://www.openslr.org/11/.
For more information, see the paper "LibriSpeech: an ASR corpus based on public domain audio books" by Vassil Panayotov, Guoguo Chen, Daniel Povey and Sanjeev Khudanpur, ICASSP 2015.
2. https://www.openslr.org/17
MUSAN is a corpus of music, speech, and noise recordings.
This work was supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1232825 and by Spoken Communications.
You can cite the data using the following BibTeX entry:
@misc{musan2015,
  author = {David Snyder and Guoguo Chen and Daniel Povey},
  title = {{MUSAN}: {A} {M}usic, {S}peech, and {N}oise {C}orpus},
  year = {2015},
  eprint = {1510.08484},
  note = {arXiv:1510.08484v1}
}
3. source_code.zip
Source code for parts of my PhD project.
4. SJ_EXP.zip
The program for the subjective experiment described in the last chapter of the thesis.
Provider:
University of Southampton
Created:
2024-07-18



