EmergentTTS-Eval

Name: EmergentTTS-Eval
Creator: maas
Published: 2025-12-05 16:43:08
License: No description available

ModelScope community. Updated 2025-12-05. Indexed 2025-07-26.

Download link:

https://modelscope.cn/datasets/bosonai/EmergentTTS-Eval

Resource description:

# EmergentTTS-Eval Dataset

This dataset accompanies the paper [EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge](https://huggingface.co/papers/2505.23009). It contains 1645 diverse test cases designed to evaluate Text-to-Speech (TTS) models on six challenging scenarios: emotions, paralinguistics, foreign words, syntactic complexity, complex pronunciation (e.g., URLs, formulas), and questions.

[GitHub](https://github.com/boson-ai/EmergentTTS-Eval-public) | [arXiv](https://arxiv.org/abs/2505.23009)

The dataset is structured as follows: each sample contains a category, the text to synthesize, the evolution depth, the language, and the corresponding baseline audio generated by gpt-4o-mini-tts with the alloy voice, against which we compute win-rate. Details on the data structure can be found in the dataset's metadata. See the linked GitHub repository for more details on usage and evaluation.

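To make the per-sample structure and the win-rate idea concrete, here is a minimal sketch in Python. It is not the repository's evaluation code. The `JudgedSample` record, its field names, the category strings, and the scoring rule (a win counts 1, a tie 0.5, a loss 0) are all assumptions made for illustration. The dataset's metadata and the linked GitHub repository remain the authoritative sources for the real column names and the judging procedure.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record layout: the field names below are illustrative and
# are NOT taken from the dataset's actual schema.
@dataclass
class JudgedSample:
    category: str            # one of the six scenarios, e.g. "Emotions"
    text_to_synthesize: str  # prompt text given to the TTS model
    evolution_depth: int     # depth value attached to the test case
    language: str            # language of the sample
    outcome: str             # judge verdict vs. the baseline: "win", "tie", "loss"


# Assumed scoring rule: a tie counts as half a win.
SCORES = {"win": 1.0, "tie": 0.5, "loss": 0.0}


def win_rate_by_category(samples):
    """Return the mean score per category for a candidate TTS system."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sample in samples:
        totals[sample.category] += SCORES[sample.outcome]
        counts[sample.category] += 1
    return {category: totals[category] / counts[category] for category in counts}


if __name__ == "__main__":
    judged = [
        JudgedSample("Emotions", "placeholder text A", 0, "en", "win"),
        JudgedSample("Emotions", "placeholder text B", 1, "en", "loss"),
        JudgedSample("Questions", "placeholder text C", 0, "en", "tie"),
    ]
    print(win_rate_by_category(judged))
```

Grouping by category mirrors the six-scenario breakdown described above. The same function could group by evolution depth or language by swapping the key it aggregates on.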

Provider:

maas

Created:

2025-07-24
