Emilia

Name: Emilia
Creator: maas
Published: 2026-05-06 10:22:43
License: No description provided

ModelScope community. Updated 2026-05-06; indexed 2024-08-31.

Download link:

https://modelscope.cn/datasets/amphion/Emilia

Resource overview:

# Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

The Emilia dataset is the first open-source, multilingual, in-the-wild dataset designed for speech generation. It offers over 101,000 hours of high-quality speech data across six languages: Chinese (zh), English (en), Japanese (ja), Korean (ko), German (de), and French (fr). The dataset includes various speaking styles and their corresponding transcriptions.

## README

🚧🚧🚧🚧🚧🚧

**This repository contains only the URLs for Emilia's data sources.**

**If you're interested in using the processed data (the Emilia Dataset itself), please visit [this link](https://huggingface.co/datasets/amphion/Emilia-Dataset).**

## Meta Info

<image style="width: 500px; height: 500px;" src="https://huggingface.co/datasets/amphion/Emilia/resolve/main/emilia_source_url_category_distribution.png"/>

## Dataset Usage

To reconstruct the Emilia dataset, you can download the raw audio files from the [provided URL list](https://huggingface.co/datasets/amphion/Emilia) and use our open-source [Emilia-Pipe](https://github.com/open-mmlab/Amphion/tree/main/preprocessors/Emilia) preprocessing pipeline to process the raw data and rebuild the dataset.

Additionally, users can employ Emilia-Pipe to preprocess their own raw speech data to meet specific needs. By open-sourcing the Emilia-Pipe code, we aim to empower the speech community to collaborate on large-scale speech generation research.

*Please note that Emilia does not own the copyright to the audio files; the copyright remains with the original owners of the videos or audio. Users are permitted to use this dataset only for non-commercial purposes under the CC BY-NC-4.0 license.*

## Reference

If you use the Emilia dataset or the Emilia-Pipe pipeline, please cite the following papers:

```bibtex
@inproceedings{emilia,
    author={He, Haorui and Shang, Zengqiang and Wang, Chaoren and Li, Xuyuan and Gu, Yicheng and Hua, Hua and Liu, Liwei and Yang, Chen and Li, Jiaqi and Shi, Peiyang and Wang, Yuancheng and Chen, Kai and Zhang, Pengyuan and Wu, Zhizheng},
    title={Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation},
    booktitle={Proc.~of SLT},
    year={2024}
}
```

```bibtex
@inproceedings{amphion,
    author={Zhang, Xueyao and Xue, Liumeng and Gu, Yicheng and Wang, Yuancheng and Li, Jiaqi and He, Haorui and Wang, Chaoren and Song, Ting and Chen, Xi and Fang, Zihao and Chen, Haopeng and Zhang, Junan and Tang, Tze Ying and Zou, Lexiao and Wang, Mingxuan and Han, Jun and Chen, Kai and Li, Haizhou and Wu, Zhizheng},
    title={Amphion: An Open-Source Audio, Music and Speech Generation Toolkit},
    booktitle={Proc.~of SLT},
    year={2024}
}
```
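
As a companion to the reconstruction workflow described under Dataset Usage, the sketch below shows one way to turn a source URL list into a deduplicated, per-host download plan before fetching any audio. It is only an illustration: the one-URL-per-line layout, the `plan_downloads` helper, and the `example.com` sample URLs are assumptions introduced here, not the documented format of the Emilia URL list or part of Emilia-Pipe.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def plan_downloads(lines):
    """Group source URLs by host, skipping blanks, comments, and duplicates.

    Assumption: the input holds one URL per line. The real layout of the
    Emilia URL list may differ, so adapt the parsing to the actual files.
    """
    seen = set()
    by_host = defaultdict(list)
    for raw in lines:
        url = raw.strip()
        if not url or url.startswith("#"):
            continue  # ignore empty lines and comment lines
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # ignore anything that is not a plain web URL
        if url in seen:
            continue  # keep only the first occurrence of each URL
        seen.add(url)
        by_host[parts.netloc.lower()].append(url)
    return dict(by_host)


if __name__ == "__main__":
    # Hypothetical sample entries; replace with lines read from the URL list.
    sample = [
        "https://example.com/audio/001",
        "https://example.com/audio/001",
        "# a comment line",
        "https://example.org/talks/42",
    ]
    for host, urls in plan_downloads(sample).items():
        print(host, len(urls))
```

Grouping by host makes it easier to apply per-site rate limits before handing the downloaded audio to Emilia-Pipe for processing.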

Provider:

maas

Created:

2024-10-28

Collected summary

Dataset introduction