"Kannada Noisy Dataset"

Name: "Kannada Noisy Dataset"
Creator: IEEE DataPort
Published: 2026-03-12 04:59:08
License: No description available

DataCite Commons. Updated 2026-03-12. Indexed 2026-05-03.

Download link:

https://ieee-dataport.org/documents/kannada-noisy-dataset

Description:

"This dataset contains noisy Kannada language transcriptions with corresponding unique identifiers, compiled from crowd-sourced speech data. The collection methodology incorporates:Source Material: Crowd-sourced audio recordings with added sea noiseAudio Enhancement: Data combined with environmental noise effects (via bandlab.com) and sea ambient sounds (from freesound.com) to simulate realistic acoustic conditionsContent: Diverse Kannada text covering multiple domains including:Scientific and technical topics (power-to-weight ratios, oxygen deficiency, nitrate chemistry)Historical and cultural references (philosophers, historical figures)General knowledge (geography, demographics, materials science)Literary and philosophical contentFormat: Tab-separated TXT file with ID-text pairsLanguage: Kannada (Indic script)Purpose: Intended for training speech recognition, text-to-speech, or machine translation systems for KannadaScale: 30+ recorded samples represented in the dataset"

Provider:

IEEE DataPort

Created:

2026-03-12
