VAAKSHAKTI-DLSE-MA: A Deep Learning-Ready Marathi Noisy Speech Dataset for Neural Speech Enhancement Using TRAD-MNSC Formulation
Source: DataCite Commons. Updated: 2026-03-28. Indexed: 2026-05-03.
Download link:
https://ieee-dataport.org/documents/vaakshakti-dlse-ma-deep-learning-ready-marathi-noisy-speech-dataset-neural-speech
Resource description:
VAAKSHAKTI-DLSE-MA is a rigorously verified, deep learning-ready noisy speech dataset for the Marathi language (ISO 639-1: mr), developed at the Indian Institute of Technology Kharagpur as part of a PhD research programme on multilingual speech enhancement. The name combines two Sanskrit words: Vaak (वाक्: speech, voice, the sacred faculty of articulate utterance) and Shakti (शक्ति: power, energy, force; the primordial creative energy in Indian philosophy). Together they form Vaakshakti (वाक्शक्ति), the power of speech. The name honours Marathi as one of India's oldest literary languages, with a continuous tradition spanning over nine centuries: the language of the Warkari saints Dnyaneshwar, Namdev, Eknath, and Tukaram, of the social reformers Mahatma Jyotirao Phule and Savitribai Phule, and of the revolutionary thinker Dr. B. R. Ambedkar. Marathi is the official language of the state of Maharashtra, constituted on 1 May 1960 with Mumbai as its capital, and is one of the twenty-two scheduled languages listed in the Eighth Schedule of the Constitution of India.
The dataset contains 64,000 WAV audio files, organised as 32,000 clean–noisy pairs generated at two controlled Signal-to-Noise Ratio (SNR) levels: 10 dB and 20 dB. Noisy speech signals are produced by systematically mixing clean Marathi speech recordings with real-world environmental noise from the DEMAND acoustic noise database, using the TRAD-MNSC mixing formulation with joint normalisation so that the target SNR is preserved across all pairs. All audio files are stored as 16 kHz, mono, 16-bit PCM WAV, which can be loaded directly by modern deep learning frameworks including PyTorch, TensorFlow, and JAX.
To ensure dataset reliability, a six-stage verification pipeline was implemented; the stages described here cover dataset structure validation, sampling-rate verification, SNR verification, perceptual quality evaluation, and acoustic differentiation analysis between clean and noisy signals. The verification process achieved 100% SNR compliance across all 32,000 pairs, with average PESQ scores of 2.05 at 10 dB and 2.95 at 20 dB, consistent with the expected pattern of greater perceptual degradation at the lower SNR. VAAKSHAKTI-DLSE-MA provides a large-scale, fully verified Marathi clean–noisy speech corpus for machine learning and deep learning research in speech enhancement. It addresses the critical shortage of paired noisy speech datasets for Indian languages and supports the development and benchmarking of speech enhancement algorithms for the Marathi-speaking communities of Maharashtra, Goa, and the broader Marathi diaspora.
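The description does not spell out the TRAD-MNSC formulation, so the following is only a generic sketch of the steps it mentions: scaling noise to a target SNR, applying one shared normalisation factor to the clean and noisy signals, and then checking the resulting SNR and file format. The function names (`mix_at_snr`, `measured_snr_db`, `write_pcm16`, `is_16k_mono_pcm16`), the `peak` parameter, and the synthetic test signals are assumptions made for illustration, not part of the dataset or its tooling.

```python
import io
import wave

import numpy as np

SAMPLE_RATE = 16_000  # Hz, as stated in the dataset description


def power(x):
    """Mean squared amplitude of a signal."""
    x = np.asarray(x, dtype=np.float64)
    return float(np.mean(x ** 2))


def mix_at_snr(clean, noise, snr_db, peak=0.99):
    """Return (clean, noisy) where noisy = clean + scaled noise at snr_db.

    Assumed interpretation of "joint normalisation": one shared gain is
    applied to both outputs, so the clean-to-noise power ratio is unchanged.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if clean.size == 0 or noise.size == 0:
        raise ValueError("clean and noise must be non-empty")
    # Loop or trim the noise so it covers the whole clean utterance.
    reps = int(np.ceil(clean.size / noise.size))
    noise = np.tile(noise, reps)[: clean.size]
    p_clean, p_noise = power(clean), power(noise)
    if p_clean == 0.0 or p_noise == 0.0:
        raise ValueError("clean and noise must have non-zero power")
    # Choose the noise gain so that p_clean / (gain^2 * p_noise) = 10^(snr/10).
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    noisy = clean + gain * noise
    # Shared scale factor keeps both signals inside [-peak, peak].
    max_abs = max(np.max(np.abs(clean)), np.max(np.abs(noisy)))
    scale = min(1.0, peak / max_abs)
    return clean * scale, noisy * scale


def measured_snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    residual = np.asarray(noisy, dtype=np.float64) - np.asarray(clean, dtype=np.float64)
    return 10.0 * np.log10(power(clean) / power(residual))


def write_pcm16(fileobj, signal, rate=SAMPLE_RATE):
    """Write a float signal in [-1, 1] as mono 16-bit PCM WAV."""
    pcm = np.round(np.clip(signal, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(fileobj, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(pcm.tobytes())


def is_16k_mono_pcm16(fileobj):
    """Check the header against the format stated in the description."""
    with wave.open(fileobj, "rb") as w:
        return (w.getframerate(), w.getnchannels(), w.getsampwidth()) == (SAMPLE_RATE, 1, 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
    clean = 0.5 * np.sin(2 * np.pi * 220 * t)  # stand-in for a speech clip
    noise = rng.normal(size=SAMPLE_RATE // 2)  # stand-in for a DEMAND clip
    for target in (10, 20):
        c, n = mix_at_snr(clean, noise, target)
        print(target, round(measured_snr_db(c, n), 3))
```

Because the same factor multiplies both outputs, the measured SNR matches the target up to floating-point error before quantisation. Rounding to 16-bit PCM introduces a small additional error, so a verification step on stored files would typically accept a tolerance rather than exact equality.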
Provider:
IEEE DataPort
Created:
2026-03-28



