
Indonesian Speech Data by Mobile Phone_R - 359 Hours

catalogue.elra.info, indexed 2025-03-26
Download link:
https://catalogue.elra.info/en-us/repository/browse/ELRA-S0470/
Resource description:
The Indonesian speech data (read speech) was collected from 496 native Indonesian speakers and recorded in a quiet environment. The recordings are rich in content, covering multiple categories such as economics, entertainment, news, numbers, letters, and oral language. Each speaker contributed around 400 sentences. The valid data volume is 360 hours. All texts are manually transcribed with high accuracy.

Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording environment: quiet indoor environment, without echo
Recording content (read speech): economy, entertainment, news, oral language, numbers, letters
Speakers: 496 people from Indonesia; 280 females, accounting for 56%
Devices: Android mobile phones and iPhones, in a ratio of 3:1
Language: Indonesian
Transcription content: text, time point of speech data, 5 noise symbols, special identifiers
Accuracy rate: 95% (the accuracy rate of noise symbols and other identifiers is not included)
Application scenarios: speech recognition, voiceprint recognition
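
The stated audio format (16 kHz, 16-bit, uncompressed, mono WAV) can be checked on a downloaded file before it enters a training pipeline. The following is a minimal sketch using only Python's standard `wave` module. The function names `check_wav_format` and `duration_seconds` are hypothetical helpers introduced here for illustration. They are not part of the dataset or of any ELRA tooling, and the layout of the delivered files is not described in this listing.

```python
import wave

# Expected values taken from the format line in the description.
EXPECTED_RATE_HZ = 16000          # 16 kHz
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
EXPECTED_CHANNELS = 1             # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format.

    Hypothetical helper: an empty list means the file matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth()} bytes")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        # The wave module reports "NONE" for uncompressed PCM data.
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems


def duration_seconds(path):
    """Return the duration of a WAV file in seconds (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Summing `duration_seconds` over all files is one way to compare a local copy against the stated valid data volume, assuming every recording is delivered as a separate WAV file.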

Provider:
catalogue.elra.info