EmergentTTS-Eval
Source: ModelScope community. Updated: 2025-12-05. Indexed: 2025-07-26.
Download link:
https://modelscope.cn/datasets/bosonai/EmergentTTS-Eval
Resource description:
# EmergentTTS-Eval Dataset
This dataset accompanies the paper [EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge](https://huggingface.co/papers/2505.23009). It contains 1645 diverse test cases designed to evaluate Text-to-Speech (TTS) models on six challenging scenarios: emotions, paralinguistics, foreign words, syntactic complexity, complex pronunciation (e.g., URLs, formulas), and questions.
[GitHub](https://github.com/boson-ai/EmergentTTS-Eval-public) | [arXiv](https://arxiv.org/abs/2505.23009)
The dataset is structured as follows: each sample contains a category, the text to synthesize, the evolution depth, the language, and a baseline audio clip generated by gpt-4o-mini-tts with the alloy voice. Win-rates for evaluated models are computed against this baseline. Details on the data structure can be found in the dataset's metadata. See the linked GitHub repository for more details on usage and evaluation.
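
As a rough illustration of the sample layout and the win-rate idea described above, the following Python sketch models one test case and aggregates judge verdicts against the baseline. The field names, the `"win"`/`"tie"`/`"loss"` labels, and the choice to count a tie as half a win are all assumptions made for this example; the dataset's metadata and the GitHub repository define the actual schema and scoring.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Sample:
    """One test case. Field names are illustrative, not the dataset's real schema."""
    category: str             # one of the six scenarios, e.g. "emotions"
    text_to_synthesize: str   # text the TTS model under test must speak
    evolution_depth: int      # evolution depth of this test case
    language: str
    baseline_audio_path: str  # hypothetical path to the baseline audio clip


VALID_VERDICTS = {"win", "tie", "loss"}


def win_rate(verdicts):
    """Win-rate of a candidate TTS model against the baseline.

    `verdicts` holds one judge decision per sample, from the candidate's
    point of view. Assumption: a tie counts as half a win.
    """
    counts = Counter(verdicts)
    unknown = set(counts) - VALID_VERDICTS
    if unknown:
        raise ValueError(f"unknown verdict labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no verdicts to aggregate")
    return (counts["win"] + 0.5 * counts["tie"]) / total


if __name__ == "__main__":
    # Two wins, one tie, one loss: (2 + 0.5) / 4
    print(win_rate(["win", "tie", "loss", "win"]))  # 0.625
```

Under this convention, a win-rate of 0.5 means the candidate is judged on par with the baseline, and values above 0.5 mean it is preferred more often than not.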
Provider:
maas
Created:
2025-07-24



