Nexdata | Multilingual Unsupervised Speech Data | 1 Million Hours | Spontaneous Speech | LLM | Pre-training | Large Language Model (LLM) Data

Source: Datarade (indexed 2025-01-04)
Download link:
https://datarade.ai/data-products/nexdata-multilingual-unsupervised-speech-data-1-million-ho-nexdata
Description:
1. Specifications
Format: 16 kHz, 16-bit, WAV, mono channel
Content category: Dialogue or monologue in several common domains, such as daily vlogs, travel, podcasts, technology, beauty, etc.
Language: English (USA, UK, Canada, Australia, India, Philippines, etc.), French, German, Japanese, Arabic (MSA, Gulf, Levantine, Egyptian accents, etc.), Mandarin, etc.
Recording condition: Mixed (indoor, public place, entertainment, etc.)

2. About Nexdata
Nexdata owns off-the-shelf PB-level Large Language Model (LLM) data, 1 million hours of speech data, and 800 TB of annotated imagery data. This ready-to-go data supports instant delivery and can help quickly improve the accuracy of AI models. For more details, please visit https://www.nexdata.ai/datasets/speechrecog?source=Datarade
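The format stated in the specifications (16 kHz, 16-bit, WAV, mono) can be checked on a delivered file before it enters a pre-training pipeline. The following is a minimal sketch using only Python's standard `wave` module. The function name `matches_spec` and the constant names are hypothetical, chosen for illustration, and are not part of any Nexdata or Datarade tooling. It assumes the files are uncompressed PCM WAV, which the listing does not state explicitly.

```python
import wave

# Expected values taken from the "Format" line of the specifications.
EXPECTED_SAMPLE_RATE = 16_000  # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2      # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1          # mono


def matches_spec(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono.

    Hypothetical helper: assumes an uncompressed PCM WAV file, since the
    standard-library `wave` module cannot read other encodings.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_SAMPLE_RATE
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

Calling `matches_spec("sample.wav")` on each file (the file name here is a placeholder) gives a quick way to flag recordings that would need resampling or downmixing before use.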
Provider:
Nexdata