"Kannada Noisy Dataset"
Source: DataCite Commons. Updated: 2026-03-12. Indexed: 2026-05-03.
Download link:
https://ieee-dataport.org/documents/kannada-noisy-dataset
Description:
This dataset contains noisy Kannada language transcriptions with corresponding unique identifiers, compiled from crowd-sourced speech data. The collection methodology incorporates:
- Source Material: Crowd-sourced audio recordings with added sea noise
- Audio Enhancement: Data combined with environmental noise effects (via bandlab.com) and sea ambient sounds (from freesound.com) to simulate realistic acoustic conditions
- Content: Diverse Kannada text covering multiple domains, including:
  - Scientific and technical topics (power-to-weight ratios, oxygen deficiency, nitrate chemistry)
  - Historical and cultural references (philosophers, historical figures)
  - General knowledge (geography, demographics, materials science)
  - Literary and philosophical content
- Format: Tab-separated TXT file with ID-text pairs
- Language: Kannada (Indic script)
- Purpose: Intended for training speech recognition, text-to-speech, or machine translation systems for Kannada
- Scale: 30+ recorded samples represented in the dataset
Provider:
IEEE DataPort
Created:
2026-03-12
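
The description gives the format as a tab-separated TXT file of ID-text pairs. A minimal Python sketch of how such a file might be read follows. It assumes UTF-8 encoding, one pair per line, and no header row, none of which the listing confirms. The function name `read_id_text_pairs` and the sample IDs and sentences are hypothetical and are not taken from the dataset.

```python
import io


def read_id_text_pairs(stream):
    """Parse lines of the form '<id><TAB><text>' into a dict (assumed layout)."""
    pairs = {}
    for line_no, raw in enumerate(stream, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only, so any later tabs stay in the text.
        sample_id, sep, text = line.partition("\t")
        if not sep:
            raise ValueError(f"line {line_no}: no tab separator found")
        pairs[sample_id.strip()] = text.strip()
    return pairs


# Hypothetical in-memory sample standing in for the real TXT file.
sample = io.StringIO("utt_001\tನಮಸ್ಕಾರ\nutt_002\tಕನ್ನಡ ಭಾಷೆ\n")
pairs = read_id_text_pairs(sample)
print(len(pairs), "pairs loaded")
```

To read the downloaded file instead, pass the result of `open(path, encoding="utf-8")` to the same function, where `path` is wherever the TXT file was saved. Splitting on the first tab only is a deliberate choice: it keeps a transcription intact even if the text itself contains a tab.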



