Preprocessed Data and Pretrained Models for Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings

NIAID Data Ecosystem, indexed 2026-03-13

Download link:

https://zenodo.org/record/6349896

Description:

This is preprocessed data and pretrained models from two of our papers:

"Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings" by Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, and Junichi Yamagishi (ICASSP 2020)
https://arxiv.org/abs/1910.10838

"Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis" by Erica Cooper, Xin Wang, Yi Zhao, Yusuke Yasuda, and Junichi Yamagishi (arXiv)
https://arxiv.org/abs/2011.04839

This data is meant to be used with our open-source implementation:
https://github.com/nii-yamagishilab/multi-speaker-tacotron

More information about the directory structure and how to use the data can be found in the READMEs on GitHub.
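Before downloading, it can help to see which files the Zenodo record holds. The sketch below is not part of the authors' tooling. It assumes that Zenodo serves record metadata as JSON at /api/records/<id>, and that each entry in the "files" list carries "key", "size", and "links.self" fields. The helper names (record_id, api_url, list_files, fetch_record) are hypothetical and chosen only for illustration.

```python
import json
import re
import urllib.request

# Record URL as given in this listing.
RECORD_URL = "https://zenodo.org/record/6349896"


def record_id(url):
    """Return the numeric ID at the end of a Zenodo record URL."""
    match = re.search(r"/records?/(\d+)/?$", url)
    if match is None:
        raise ValueError(f"not a Zenodo record URL: {url}")
    return match.group(1)


def api_url(url):
    """Build the metadata URL for a record page URL."""
    # Assumption: record metadata is exposed at /api/records/<id>.
    return f"https://zenodo.org/api/records/{record_id(url)}"


def list_files(record):
    """Extract (name, size, link) tuples from a record-metadata dict."""
    # Assumption: each file entry has "key", "size", and links["self"].
    # Missing fields come back as None rather than raising.
    return [
        (entry["key"], entry.get("size"), entry.get("links", {}).get("self"))
        for entry in record.get("files", [])
    ]


def fetch_record(url):
    """Download and parse the record metadata (requires network access)."""
    with urllib.request.urlopen(api_url(url)) as response:
        return json.load(response)
```

Calling list_files(fetch_record(RECORD_URL)) would then give one tuple per file, if the response has the assumed shape. The actual file names and directory layout are documented in the READMEs on GitHub, so check those before relying on any particular name.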

Created:

2022-03-29