Databoost/TTS_Multilingual_Data

Source: Hugging Face. Updated: 2025-02-11. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/Databoost/TTS_Multilingual_Data
Description:
TTS_Multilingual_Data is a large-scale multilingual corpus designed for linguistic analysis and the development of speech processing models. It supports tasks such as Text-to-Speech (TTS), Automatic Speech Recognition (ASR), and speaker identification. The dataset is distributed in Parquet format and is intended as a key resource for training and evaluating models with metrics tailored to ASR and other speech technologies. Its contents are grouped into thematic categories, including Speeches & Conferences, Conversations & Dialogues, Media Content & Entertainment, Instructions & Voice Assistants, Informal Language & Common Expressions, Accessibility & Inclusion, and Literature & Culture.
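
The description mentions ASR-oriented evaluation metrics but does not name them. As an illustration only, the sketch below computes word error rate (WER), a commonly used ASR metric. Choosing WER here is an assumption, and the function name `word_error_rate` is hypothetical and not part of the dataset or any tooling shipped with it.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization is good enough for illustration;
    real multilingual evaluation usually needs language-aware normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Calling `word_error_rate(transcript, model_output)` on a reference transcript and a recognizer's output gives the fraction of reference words that would need to be edited; a value of 0.0 means the two word sequences are identical.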
Provider:
Databoost