MohamedRashad/multilingual-tts

Name: MohamedRashad/multilingual-tts
Creator: MohamedRashad
Published: 2023-12-12 21:04:06
License: gpl-3.0 (per the dataset card metadata; the listing page itself gives no description)

Source: Hugging Face. Updated: 2023-12-12. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/MohamedRashad/multilingual-tts

Resource description:

---
license: gpl-3.0
dataset_info:
  features:
  - name: text
    dtype: string
  - name: speaker
    dtype: string
  - name: languages
    dtype: string
  - name: audio
    dtype: audio
  splits:
  - name: train
    num_bytes: 1561588634.72
    num_examples: 25540
  download_size: 1548036818
  dataset_size: 1561588634.72
task_categories:
- text-to-speech
language:
- ar
- en
- zh
- es
- fr
- hi
- ru
- pt
- ja
- de
- tr
- bn
- id
- ur
- vi
pretty_name: Multilingual TTS
size_categories:
- 10K<n<100K
---

# Before Anything and Everything ⚱

_At the time of writing this Dataset Card, ~**17,490**~ **18,412** civilians have been killed in Palestine (~**7,870**~ **8,000** are children and ~**6,121**~ **6,200** are women)._

Seek any non-profit organization to help them with what you can (For myself, [I use Mersal](https://www.every.org/mersal/f/support-humanitarian)) 🇵🇸

## Dataset Description

The Multilingual TTS dataset is a compilation of text-to-speech (TTS) samples built to showcase the richness and diversity of human languages. It contains a variety of real-world sentences in fifteen prominent languages, chosen to reflect global linguistic diversity. Each sample is accompanied by its corresponding high-quality audio output.

<style>
.image-container {
  display: flex;
  justify-content: center;
  align-items: center;
  height: 65vh;
  margin: 0;
}
.image-container img {
  max-width: 48%; /* Adjust the width as needed */
  height: auto;
}
</style>

<div class="image-container">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6116d0584ef9fdfbf45dc4d9/UX0s8S2yWSJ3NbbvmOJOi.png">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6116d0584ef9fdfbf45dc4d9/zIyPCWH7Y58gLVCeIfq4n.png">
</div>

## Key Features:

1. **Language Diversity**: The dataset covers a spectrum of languages, including Bengali, Mandarin Chinese, Turkish, Hindi, French, Vietnamese, Portuguese, Spanish, Japanese, German, Russian, Indonesian, Standard Arabic, English, and Urdu. This wide linguistic representation ensures inclusivity and applicability to a global audience.
2. **Real-World Sentences**: Comprising roughly 25,000 samples (25,540 examples in the train split, per the metadata above), the dataset mirrors authentic communication scenarios. Sentences span diverse topics, from everyday conversations to informative texts and news snippets.
3. **Multilingual Sentences**: A distinctive feature of this dataset is its inclusion of sentences that integrate multiple languages. Each sample combines at least two languages, capturing the dynamics of multilingual communication and making the dataset particularly valuable for training and evaluating multilingual TTS systems.
4. **Audio Quality**: Special attention has been given to the audio quality of each sample. The audio outputs are designed to be clear, natural-sounding, and faithful to the corresponding text.
5. **Generated by GPT-4 and ElevenLabs**: The text was generated with GPT-4 and the audio with ElevenLabs, combining language generation with speech synthesis. The combination is intended to give a high level of accuracy, coherence, and linguistic nuance in both the text and audio components.

## Potential Use Cases:

1. **Multilingual TTS Model Training**: Researchers and developers can use this dataset to train and refine multilingual TTS models, improving their proficiency across a diverse array of languages.
2. **Cross-Language Evaluation**: The dataset is a resource for evaluating how TTS systems handle multilingual scenarios, offering a benchmark for assessing model capabilities across different languages.
3. **Language Integration Testing**: Developers working on applications that require multilingual TTS functionality can use this dataset to test and optimize language integration across various linguistic contexts.

## Acknowledgments:

The creation of the Multilingual TTS dataset was made possible by **OpenAI's GPT-4** and **ElevenLabs Multilingual V2**. We extend our gratitude to the AI and language processing communities for their continuous support in advancing the field of multilingual TTS. This dataset is offered as a contribution to innovation and progress in language technologies.
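As a minimal sketch of how the card's columns (`text`, `speaker`, `languages`, `audio`) might be consumed, the snippet below defines a loader for the train split and a small per-speaker counter. It assumes the Hugging Face `datasets` package is installed and that the repository is reachable. The demo records and speaker names are invented for illustration and are not taken from the dataset.

```python
from collections import Counter
from typing import Iterable, Mapping


def load_train_stream():
    """Stream the train split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the helper below works without the `datasets` package.
    from datasets import load_dataset

    return load_dataset(
        "MohamedRashad/multilingual-tts", split="train", streaming=True
    )


def count_by_speaker(records: Iterable[Mapping[str, object]]) -> Counter:
    """Count samples per value of the `speaker` column."""
    return Counter(str(record["speaker"]) for record in records)


# Stand-in records using the card's text columns; all values are invented.
demo = [
    {"text": "Hello, 你好", "speaker": "spk_a", "languages": "en, zh"},
    {"text": "Bonjour, hola", "speaker": "spk_b", "languages": "fr, es"},
    {"text": "Guten Tag, merhaba", "speaker": "spk_a", "languages": "de, tr"},
]
speaker_counts = count_by_speaker(demo)
print(speaker_counts)
```

With network access, `count_by_speaker(itertools.islice(load_train_stream(), 100))` would run the same count over the first hundred streamed rows. Decoding the `audio` column may require an additional audio backend, depending on the installed `datasets` version.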

Provider:

MohamedRashad

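Summary of original information

Dataset description

The Multilingual TTS dataset is a carefully compiled collection of text-to-speech (TTS) samples intended to showcase the richness and diversity of human languages. It contains a variety of real-world sentences in fifteen major languages, chosen to reflect global linguistic diversity. Each sample comes with its corresponding high-quality audio output.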

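Key features

Language diversity: The dataset covers Bengali, Mandarin Chinese, Turkish, Hindi, French, Vietnamese, Portuguese, Spanish, Japanese, German, Russian, Indonesian, Standard Arabic, English, and Urdu. This broad coverage supports inclusivity and global applicability.
Real-world sentences: With about 25,000 samples, the dataset reflects authentic communication scenarios. Sentences range from everyday conversation to informative texts and news snippets.
Multilingual sentences: A distinctive feature of the dataset is that its sentences integrate multiple languages. Each sample combines at least two languages, capturing the dynamics of multilingual communication and making the dataset especially valuable for training and evaluating multilingual TTS systems.
Audio quality: Particular attention was paid to the audio quality of each sample. The audio is designed to be clear, natural, and faithful to the corresponding text.
Generated by GPT-4 and ElevenLabs: The dataset was produced with GPT-4 and ElevenLabs, combining language generation with speech synthesis, with the aim of high accuracy, coherence, and linguistic nuance in both text and audio.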
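The dataset card metadata reports 25,540 training examples, a dataset size of 1,561,588,634.72 bytes, and a download size of 1,548,036,818 bytes. As a rough sanity check on what that implies per sample, the arithmetic can be sketched as follows. The constant names are illustrative, and the figures are copied from the card metadata rather than measured.

```python
# Values copied from the dataset card metadata (train split).
NUM_EXAMPLES = 25_540
DATASET_SIZE_BYTES = 1_561_588_634.72
DOWNLOAD_SIZE_BYTES = 1_548_036_818

# Average stored size of one example (text, speaker, languages, and audio).
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# How much smaller the download is than the prepared dataset.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"average size per example: {avg_bytes_per_example / 1024:.1f} KiB")
print(f"download size / dataset size: {download_ratio:.3f}")
```

The average comes to roughly 60 KiB per example, which suggests short, compressed audio clips, although the card does not state the audio format or clip durations.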

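Potential use cases

Multilingual TTS model training: Researchers and developers can use the dataset to train and improve multilingual TTS models, raising their proficiency across many languages.
Cross-language evaluation: The dataset is a resource for evaluating how TTS systems handle multilingual scenarios, providing a benchmark for model capability across languages.
Language integration testing: Developers building applications that need multilingual TTS can use the dataset to test and optimize language integration across linguistic contexts.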
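For the cross-language evaluation use case, one possible approach is to average a per-sample metric over every language a sample contains. The sketch below ASSUMES that the `languages` column holds a comma-separated string of language codes; the card only says the column is a string, so the real format should be checked first. The `score` field and the example rows are hypothetical stand-ins for whatever metric an evaluation produces.

```python
from collections import defaultdict
from statistics import mean


def split_languages(field: str) -> list[str]:
    """Split a `languages` value, ASSUMING a comma-separated string."""
    return [part.strip().lower() for part in field.split(",") if part.strip()]


def mean_score_by_language(rows):
    """Average a per-sample score over every language the sample contains."""
    buckets = defaultdict(list)
    for row in rows:
        for lang in split_languages(row["languages"]):
            buckets[lang].append(row["score"])
    return {lang: mean(scores) for lang, scores in buckets.items()}


# Invented rows: `score` stands in for any per-sample metric (hypothetical).
rows = [
    {"languages": "en, fr", "score": 4.0},
    {"languages": "en, ar", "score": 3.0},
    {"languages": "fr, ar", "score": 5.0},
]
per_language = mean_score_by_language(rows)
print(per_language)
```

Because every sample mixes at least two languages, each score is counted once per language it contains. The per-language averages therefore overlap and are not independent measurements.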

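Acknowledgments

The Multilingual TTS dataset was made possible by OpenAI's GPT-4 and ElevenLabs Multilingual V2. The authors thank the AI and language processing communities for their continued support in advancing multilingual TTS, and offer the dataset as a contribution to innovation and progress in language technologies.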

The content above was collected and summarized by 遇见数据集.
