
SAMPLE 16kHz Conversational Speech Data | 35,000 Hours | Large Language Model(LLM) Data | Speech ...

Databricks, indexed 2025-01-04
Download link:
https://marketplace.databricks.com/details/0142f1a6-dff0-4e82-86b6-eba23a07a581/Nexdata_SAMPLE-16kHz-Conversational-Speech-Data-35,000-Hours-Large-Language-Model(LLM)-Data-Speech-
Resource description:
1. Specifications
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Environment: quiet indoor environment, without echo
Recording content: no preset script; dozens of topics are specified, and the speakers hold a dialogue on those topics while the recording is made
Demographics: speakers are evenly distributed across all age groups, covering children, teenagers, middle-aged, elderly, etc.
Annotation: transcription text, speaker identification, gender, and noise symbols
Devices: Android mobile phone, iPhone
Languages: 100+ languages
Application scenarios: speech recognition; voiceprint recognition
Accuracy rate: the word accuracy rate is not less than 98%

2. About Nexdata
Nexdata owns off-the-shelf PB-level Large Language Model (LLM) data, 1 million hours of audio data, and 800 TB of annotated imagery data. This ready-to-go machine learning (ML) data supports instant delivery and can quickly improve the accuracy of AI models. For more details, please visit https://www.nexdata.ai/datasets/speechrecog?source=Datarade
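The format line in the specifications (16 kHz, 16-bit, uncompressed WAV, mono) can be checked on a downloaded sample before it enters a training pipeline. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav_format` and the constant names are hypothetical and not part of any Nexdata or Databricks tooling.

```python
import wave

# Expected values taken from the "Format" line of the listing.
EXPECTED_SAMPLE_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches against the listed format (empty if none).

    Hypothetical helper for illustration only.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        comptype = wav.getcomptype()

    if rate != EXPECTED_SAMPLE_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_SAMPLE_RATE_HZ} Hz")
    if width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {width * 8} bit, expected 16 bit")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"channel count is {channels}, expected {EXPECTED_CHANNELS} (mono)")
    if comptype != "NONE":
        problems.append(f"compression type is {comptype}, expected uncompressed PCM")
    return problems
```

Calling `check_wav_format("sample.wav")` on a conforming file returns an empty list. The `wave` module reads only uncompressed PCM WAV files and raises `wave.Error` for other encodings, so a caller may want to treat that exception as a format failure as well.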
Provider:
Nexdata