Nexdata | Multilingual Code-switching Speech Data | 5,000 Hours | Audio Data | Speech Recognition Data | AI & ML Training Data
Source: Datarade, indexed 2024-04-19
Download link:
https://datarade.ai/data-products/nexdata-multilingual-code-switching-speech-data-5-000-hou-nexdata
Resource description:
1. Specifications
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording environment: quiet indoor environment, without echo
Recording content (read speech): general category; human-machine interaction category
Demographics: speakers are evenly distributed across all age groups, covering children, teenagers, middle-aged, elderly, etc.
Device: Android mobile phone, iPhone
Language: Mandarin, English, Korean
Application scenarios: speech recognition; voiceprint recognition
Accuracy rate: 97%

2. About Nexdata
Nexdata owns 200,000 hours of off-the-shelf speech recognition data, 800 TB of annotated imagery data, and about 2 billion pieces of natural language processing (NLP) data. This ready-to-go NLP data supports instant delivery and can quickly improve the accuracy of AI models. For more details, please visit https://www.nexdata.ai/speechRecognition?source=Datarade
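As a rough illustration of the format listed under Specifications, the following sketch checks whether a WAV file's header matches 16 kHz, 16-bit, mono, uncompressed audio, and estimates the raw audio size that 5,000 hours in that format would occupy. It uses only Python's standard `wave` module. The function names (`matches_listed_format`, `make_silent_wav`, `raw_audio_bytes`) are hypothetical helpers introduced here and are not part of any Nexdata tooling; the size estimate assumes every hour is stored in the listed format and ignores file headers and any transcripts or metadata.

```python
import io
import wave

# Expected format, taken from the listing: 16 kHz, 16-bit, mono, uncompressed WAV.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1


def matches_listed_format(wav_source) -> bool:
    """Return True if the WAV header matches the listed specification.

    wav_source may be a path string or a binary file-like object.
    """
    with wave.open(wav_source, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getcomptype() == "NONE"  # uncompressed PCM
        )


def make_silent_wav(rate_hz: int, channels: int, seconds: float = 0.1) -> io.BytesIO:
    """Build a small in-memory 16-bit WAV of silence (demonstration only)."""
    buffer = io.BytesIO()
    frame_count = int(rate_hz * seconds)
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(EXPECTED_SAMPLE_WIDTH_BYTES)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * (frame_count * channels * EXPECTED_SAMPLE_WIDTH_BYTES))
    buffer.seek(0)
    return buffer


def raw_audio_bytes(hours: float) -> int:
    """Estimate raw PCM size for the listed format, excluding WAV headers."""
    bytes_per_second = EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS
    return int(hours * 3600 * bytes_per_second)


if __name__ == "__main__":
    print(matches_listed_format(make_silent_wav(16_000, 1)))  # listed format
    print(matches_listed_format(make_silent_wav(44_100, 2)))  # different format
    print(raw_audio_bytes(5_000))
```

Under these assumptions the raw audio works out to 32,000 bytes per second, or about 576 GB for 5,000 hours. To check a delivered file, pass its path to `matches_listed_format` in place of the in-memory buffer.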
Provider:
Nexdata
Collected summary
Dataset introduction

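Background and challenges
Background overview
This dataset is multilingual code-switching speech data comprising 5,000 hours of quiet indoor recordings in 16 kHz, 16-bit, mono WAV format. It covers Mandarin, English, and Korean, with speakers evenly distributed across age groups and an accuracy rate of 97%, and is suitable for training speech recognition and voiceprint recognition models. The data is provided by Nexdata, which holds 200,000 hours of speech recognition data, and supports instant delivery to improve AI model accuracy.
The above content was collected and summarized by 遇见数据集.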



