TRAD-MNSC Dataset: Traditional Multilingual Noisy Speech Corpus for Speech Enhancement Research

Source: DataCite Commons. Updated: 2026-01-30. Indexed: 2026-05-03.
Download link:
https://ieee-dataport.org/documents/trad-mnsc-dataset-traditional-multilingual-noisy-speech-corpus-speech-enhancement
Resource description:
The Traditional Multilingual Noisy Speech Corpus (TRAD-MNSC) is a derived dataset designed for evaluating traditional signal-processing-based speech enhancement algorithms. The corpus comprises 5,600 audio files spanning seven Indian languages (Telugu, Tamil, Kannada, Malayalam, Bengali, Hindi, and Marathi) across four Signal-to-Noise Ratio (SNR) levels (5, 10, 15, and 20 dB), providing 2,800 clean-noisy speech pairs. The dataset was created by systematically mixing clean speech samples from the Kaggle dataset "Audio Dataset with 10 Indian Languages" with cafeteria environmental noise from the DEMAND database using precise SNR calibration. All audio files are sampled at 16 kHz with 16-bit resolution in mono channel format. The technical documentation presents the complete dataset structure, source attribution, mathematical formulations, noise characteristics analysis, quality control verification, and usage guidelines. The TRAD-MNSC dataset addresses the need for standardized, multilingual test corpora in speech processing research, particularly for traditional enhancement algorithms including spectral subtraction, Wiener filtering, Kalman filtering, and subspace methods.
Provider:
IEEE DataPort
Created:
2026-01-30
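
The resource description says the noisy files were produced by mixing clean speech with cafeteria noise at calibrated SNR levels, but this listing does not give the formula. The following is a minimal sketch of one common way to do such mixing, not the dataset authors' procedure. It assumes power is the mean square over the whole signal and that noise shorter than the speech is looped. The function names `mean_power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, and the signals are synthetic stand-ins rather than samples from the corpus.

```python
import math
import random


def mean_power(signal):
    """Mean-square power of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean/noise power equals `snr_db`, then add it.

    Assumption: power is measured over the full signal (no voice-activity
    weighting), which may differ from what the dataset authors used.
    Returns (noisy, scaled_noise).
    """
    if not clean or not noise:
        raise ValueError("clean and noise must be non-empty")

    # Loop the noise if it is shorter than the speech, then trim to length.
    repeats = -(-len(clean) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[:len(clean)]

    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_clean == 0.0 or p_noise == 0.0:
        raise ValueError("clean and noise must have non-zero power")

    # Solve 10*log10(p_clean / (gain^2 * p_noise)) = snr_db for gain.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled_noise)]
    return noisy, scaled_noise


def measured_snr_db(clean, scaled_noise):
    """SNR in dB between a clean signal and the noise that was added to it."""
    return 10 * math.log10(mean_power(clean) / mean_power(scaled_noise))


# Synthetic stand-ins: one second of a 440 Hz tone at 16 kHz, plus
# Gaussian noise in place of a real cafeteria recording.
rng = random.Random(0)
clean = [0.5 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

for snr in (5, 10, 15, 20):  # the four SNR levels listed for the corpus
    noisy, scaled = mix_at_snr(clean, noise, snr)
    print(snr, round(measured_snr_db(clean, scaled), 6))
```

When writing real 16-bit output, the mixed signal may also need peak normalization or clipping checks. The listing does not say how the corpus handled that step.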