"MOZHI-DLSE-ML: A Deep Learning-Ready Malayalam Noisy Speech Dataset for Neural Speech Enhancement Using TRAD-MNSC Formulation "

Name: "MOZHI-DLSE-ML: A Deep Learning-Ready Malayalam Noisy Speech Dataset for Neural Speech Enhancement Using TRAD-MNSC Formulation "
Creator: IEEE DataPort
Published: 2026-03-28 12:47:30
License: No description available

DataCite Commons | Updated: 2026-03-28 | Indexed: 2026-05-03

Download link:

https://ieee-dataport.org/documents/mozhi-dlse-ml-deep-learning-ready-malayalam-noisy-speech-dataset-neural-speech

Resource description:

"MOZHI-DLSE-ML is a rigorously verified, deep learning-ready noisy speech dataset for the Malayalam language (ISO 639-1: ml), developed at the Indian Institute of Technology Kharagpur as part of a PhD research programme on multilingual speech enhancement. The name is drawn directly from the Malayalam word Mozhi (\u0d2e\u0d4a\u0d34\u0d3f: language, speech, word, utterance) \u2014 the very word that Malayalam speakers use to refer to their own mother tongue \u2014 making MOZHI-DLSE-ML a dataset named not in description of its language but in the language itself. Malayalam was granted Classical Language status by the Government of India in 2013, recognising its literary tradition of over a thousand years, its rich corpus of Manipravalam literature blending Sanskrit and Malayalam, and the unique phonological complexity that makes it one of the most phonetically intricate languages in the world. The language has produced luminaries including Thunchaththu Ezhuthachan, considered the father of modern Malayalam literature, the social reformer-poet Kumaran Asan, and G. Sankara Kurup, who received the inaugural Jnanpith Award in 1965 \u2014 the first ever awarded in any Indian language. The dataset contains 64,000 WAV audio files, consisting of 32,000 clean\u2013noisy paired samples generated at two controlled Signal-to-Noise Ratio (SNR) levels: 10 dB and 20 dB. Noisy speech signals are produced by systematically mixing clean Malayalam speech recordings with real-world environmental noise from the DEMAND acoustic noise database using the TRAD-MNSC mixing formulation with joint normalisation, ensuring precise SNR preservation across all pairs. All audio files are stored in 16 kHz mono 16-bit PCM WAV format, enabling direct compatibility with modern deep learning frameworks including PyTorch, TensorFlow, and JAX. 
To ensure dataset reliability, a six-stage verification pipeline was implemented, with stages covering dataset structure validation, sampling-rate verification, SNR verification, perceptual quality evaluation, and acoustic differentiation analysis between clean and noisy signals. The verification process achieved 100% SNR compliance across all 32,000 pairs, with average PESQ scores of 2.05 at 10 dB and 2.95 at 20 dB, consistent with the perceptual degradation expected at these noise levels. MOZHI-DLSE-ML provides a large-scale, fully verified Malayalam clean–noisy speech corpus for machine learning and deep learning research in speech enhancement. It addresses the critical shortage of paired noisy speech datasets for Indian languages and supports the development and benchmarking of speech enhancement algorithms for the Malayalam-speaking communities of Kerala, Lakshadweep, and Puducherry."
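
The description above does not define the TRAD-MNSC formulation, so the following is only a generic sketch of mixing at a target SNR and checking the result, not the authors' method. It assumes that "joint normalisation" means applying one shared gain to both the clean and the noisy signal when the mixture would clip, which leaves the SNR unchanged. All function names (`mix_at_snr`, `measured_snr_db`, `has_expected_format`) and the `peak` parameter are hypothetical, and only the Python standard library is used.

```python
import math
import wave


def power(samples):
    """Mean power of a sequence of float samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(clean, noise, snr_db, peak=0.99):
    """Add noise to clean speech so that the clean/noise power ratio is snr_db.

    Assumption (not from the dataset description): if the mixture exceeds
    `peak`, clean and noisy signals are scaled by the same gain, so the
    SNR of the pair is preserved.
    Returns (clean_out, noisy_out) as lists of floats.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    p_clean, p_noise = power(clean), power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("clean and noise must both have non-zero power")

    # Noise gain g such that p_clean / (g^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    noisy = [c + gain * n for c, n in zip(clean, noise)]

    # Shared gain for both signals, applied only when needed to avoid clipping.
    largest = max(max(abs(s) for s in noisy), max(abs(s) for s in clean))
    shared = peak / largest if largest > peak else 1.0
    return [shared * c for c in clean], [shared * s for s in noisy]


def measured_snr_db(clean, noisy):
    """SNR in dB of a clean-noisy pair, treating (noisy - clean) as the noise."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


def has_expected_format(path, rate=16000, channels=1, sample_width=2):
    """True if the WAV file is 16 kHz, mono, 16-bit PCM (2 bytes per sample)."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == rate
            and wav.getnchannels() == channels
            and wav.getsampwidth() == sample_width
        )
```

In this sketch, a pair would count as SNR-compliant when `measured_snr_db` falls within some tolerance of the 10 dB or 20 dB target; the tolerance used in the actual verification pipeline is not stated in the description.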

Provider:

IEEE DataPort

Created:

2026-03-28
