AniSpeech
ModelScope community · Updated 2025-11-27 · Indexed 2025-05-24
Download link:
https://modelscope.cn/datasets/ShoukanLabs/AniSpeech
Description:
# AniSpeech Dataset
Welcome to the AniSpeech dataset, a continually expanding collection of captioned anime voices brought to you by ShoukanLabs.
- As we label more audio, it will automagically be uploaded here for use, separated by language.
---
## ANNOUNCEMENTS:
- An upcoming update will add an immense amount of data to the dataset. However, because we cannot go through all of it by hand, we have had to rely on automated quality estimation, so speaker splits may be inaccurate. This shouldn't impact fine-tuning multi-speaker models, but when training single-speaker models you may have to listen to multiple speakers to find missing data. We plan to completely overhaul this dataset eventually.
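Since speaker splits may be inaccurate, one way to prepare for single-speaker training is to group clips by speaker ID and pull a few samples from each group for a listening check. The sketch below is only an illustration: the record keys `speaker_id` and `clip_id` are hypothetical names, not the dataset's confirmed schema, and should be adjusted to whatever columns the data actually exposes.

```python
from collections import defaultdict


def group_by_speaker(records):
    """Map each speaker ID to the list of clip IDs labeled with it.

    `records` is any iterable of dicts; the keys "speaker_id" and
    "clip_id" are assumed names, not the dataset's documented columns.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["speaker_id"]].append(rec["clip_id"])
    return dict(groups)


def review_plan(groups, per_speaker=3):
    """Return (speaker_id, clip_count, sample_clip_ids) tuples.

    Speakers with the most clips come first; the samples are simply the
    first `per_speaker` clips of each group, intended for manual listening.
    """
    plan = [
        (speaker, len(clips), clips[:per_speaker])
        for speaker, clips in groups.items()
    ]
    plan.sort(key=lambda item: item[1], reverse=True)
    return plan
```

Listening to the samples for neighboring or similarly sized speaker IDs may reveal clips of the same voice that were assigned to a different ID.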
## Key Features
- **LJSpeech Format Compatibility:** The captions in this dataset can be converted to comply with the LJSpeech format (recent changes have sacrificed native LJSpeech support for better captions), and we plan to offer conversion scripts eventually.
- **Diverse Anime Voices:** Train your TTS models on high-quality vocal performances with variations in intonation, timbre, and pitch. The dataset offers a rich assortment of anime voices for creating generalized models.
- **Ideal for Generalized Models:** AniSpeech is a perfect choice for fine-tuning generalized models. With a diverse range of voices, it provides a solid foundation for training models that can handle a wide variety of speaking styles (all speakers are labeled with a separate speaker ID).
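Until official conversion scripts are available, a conversion to LJSpeech-style metadata could look roughly like the sketch below. LJSpeech stores one clip per line in a pipe-delimited `metadata.csv` with three fields: clip ID, transcription, and normalized transcription. The record keys `clip_id` and `caption` are hypothetical names rather than the dataset's confirmed columns, and no text normalization is performed, so the caption is repeated in the third field as a placeholder.

```python
def to_ljspeech_lines(records):
    """Yield LJSpeech-style lines: clip_id|transcription|normalized transcription.

    `records` is any iterable of dicts; the keys "clip_id" and "caption"
    are assumed names, not the dataset's documented columns.
    """
    for rec in records:
        # "|" is the field delimiter, so remove it from the caption first.
        text = rec["caption"].replace("|", " ")
        # Collapse newlines and repeated whitespace: one clip per line.
        text = " ".join(text.split())
        # No normalization is done here; the raw text fills both fields.
        yield "|".join((rec["clip_id"], text, text))


def write_ljspeech_metadata(records, path):
    """Write the converted lines to `path` (for example, metadata.csv)."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        for line in to_ljspeech_lines(records):
            fh.write(line + "\n")
```

Speaker IDs are not part of the LJSpeech layout, so a multi-speaker setup would need to carry them separately, for example by writing one metadata file per speaker.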
## Limitations
- **Single-Voice Fine-Tuning:** While AniSpeech excels in training foundation models (due to its diversity), it's not recommended for fine-tuning on a single voice. Its strength lies in contributing to the development of versatile TTS models.
- **Dataset Curation:** Due to its size, manually curating the entire dataset can be impractical. If you encounter low-quality files or incorrect captions, we encourage you to contribute by creating a pull request to help maintain and improve the dataset.
## License
This dataset is released under the [MIT License](https://huggingface.co/datasets/ShoukanLabs/AniSpeech/raw/main/license).
Your contributions to the AniSpeech dataset are invaluable, and we appreciate your efforts in advancing the field of Text-to-Speech technology.
Happy coding and synthesizing!
Provider:
maas
Created:
2025-05-20



