Nexdata | Multilingual Speech Synthesis Data | 400 Hours | TTS Data | Audio Data | AI & ML Training Data

Datarade, indexed 2024-04-19

Download link:

https://datarade.ai/data-products/nexdata-multilingual-speech-synthesis-data-400-hours-a-nexdata


Resource description:

1. Specifications

Format: 44.1 kHz/48 kHz, 16-bit/24-bit, uncompressed WAV, mono channel.
Recording environment: professional recording studio.
Recording content: general narrative sentences, interrogative sentences, etc.
Speaker: native speakers.
Annotation features: word transcription, part-of-speech, phoneme boundary, four-level accents, four-level prosodic boundary.
Device: microphone.
Languages: American English, British English, Japanese, French, Dutch, Mandarin Chinese, Cantonese, Canadian French, Australian English, Italian, New Zealand English, Spanish, Mexican Spanish.
Application scenarios: speech synthesis.

Accuracy rate:
  Word transcription: the sentence accuracy rate is not less than 99%.
  Part-of-speech annotation: the sentence accuracy rate is not less than 98%.
  Phoneme annotation: the sentence accuracy rate is not less than 98% (errors on voiced and swallowed phonemes are not counted, because their labelling is more subjective).
  Accent annotation: the word accuracy rate is not less than 95%.
  Prosodic boundary annotation: the sentence accuracy rate is not less than 97%.
  Phoneme boundary annotation: the phoneme accuracy rate is not less than 95% (the error range of the boundary is within 5%).

2. About Nexdata

Nexdata owns 200,000 hours of off-the-shelf speech recognition data, 800 TB of image/video data, and about 2 billion pieces of NLP data. This ready-to-go AI & ML training data supports instant delivery and can quickly improve the accuracy of AI models. For more details, please visit https://www.nexdata.ai/speechSynthesis?source=Datarade
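The format line in the specifications (44.1 kHz or 48 kHz, 16-bit or 24-bit, uncompressed WAV, mono) is something a buyer can check mechanically on delivered files. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav_spec` and the constant names are hypothetical and are not part of any Nexdata tooling.

```python
import wave

# Values taken from the listing's format specification.
ALLOWED_SAMPLE_RATES = {44100, 48000}  # 44.1 kHz / 48 kHz
ALLOWED_SAMPLE_WIDTHS = {2, 3}         # bytes per sample: 16-bit / 24-bit
EXPECTED_CHANNELS = 1                  # mono


def check_wav_spec(path):
    """Return a list of spec violations for one WAV file (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() not in ALLOWED_SAMPLE_RATES:
            problems.append(f"sample rate {wav.getframerate()} Hz")
        if wav.getsampwidth() not in ALLOWED_SAMPLE_WIDTHS:
            problems.append(f"sample width {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wav.getnchannels()} channels")
    return problems
```

The `wave` module only reads uncompressed PCM, so a file in another encoding raises `wave.Error` instead of returning a list; a caller may want to treat that as a violation of the "uncompressed" requirement.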

Provider:

Nexdata

Collected summary

Dataset introduction

Background and challenges

Background overview

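This is a 400-hour multilingual speech synthesis dataset covering English, Japanese, French, Chinese, and other languages. The recordings were made in a professional recording studio in high-quality audio formats. The data comes with detailed text and speech annotations, such as part-of-speech, phoneme boundaries, and prosodic information, each with a stated accuracy rate of at least 95%, and is suited to speech synthesis and AI model training.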

The above content was collected and summarized by 遇见数据集.
