
Preprocessed Data and Pretrained Models for Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6349896
Description:
This is preprocessed data and pretrained models from two of our papers:

"Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings," by Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, and Junichi Yamagishi (ICASSP 2020). https://arxiv.org/abs/1910.10838

"Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis," by Erica Cooper, Xin Wang, Yi Zhao, Yusuke Yasuda, and Junichi Yamagishi (arXiv). https://arxiv.org/abs/2011.04839

This data is meant to be used with our open-source implementation, which can be found here: https://github.com/nii-yamagishilab/multi-speaker-tacotron

More information about the directory structure and how to use the data can be found in the READMEs on GitHub.
Created:
2022-03-29