SOMOS
arXiv | Updated 2022-08-24 | Indexed 2024-06-21
Download link:
https://innoetics.github.io/publications/somos-dataset/index.html
Description:
SOMOS is a large-scale dataset for evaluating neural text-to-speech (TTS) synthesis, created by the Innoetics team at Samsung Electronics, Greece. It contains 20,100 synthetic speech samples generated by 200 distinct neural TTS systems. All systems use LPCNet as a common vocoder, so that differences between samples are attributable to the acoustic models alone. The samples cover a wide range of domains and utterance lengths. Naturalness mean opinion score (MOS) ratings were collected through large-scale listening tests on Amazon Mechanical Turk, yielding 395,318 ratings in total, of which 359,380 are for synthetic speech. The dataset is intended to support research on automatic MOS prediction and, more broadly, to improve TTS evaluation methodology, in particular by allowing new TTS ideas to be assessed quickly in single-speaker scenarios.
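As a rough illustration of what a per-system MOS is, the sketch below averages naturalness ratings grouped by TTS system. It does not reflect the actual SOMOS file layout: the `(system_id, rating)` pairs, the system names, and the `mos_per_system` helper are all hypothetical, and a 1 to 5 rating scale is assumed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (system_id, rating) pairs on an assumed 1-5 scale.
# These values are made up for illustration and are not SOMOS data.
ratings = [
    ("sys_a", 4), ("sys_a", 5), ("sys_a", 3),
    ("sys_b", 2), ("sys_b", 3),
]


def mos_per_system(pairs):
    """Group ratings by system and return {system_id: mean rating}."""
    by_system = defaultdict(list)
    for system_id, rating in pairs:
        by_system[system_id].append(rating)
    return {system_id: mean(values) for system_id, values in by_system.items()}


system_mos = mos_per_system(ratings)
for system_id, score in sorted(system_mos.items()):
    print(f"{system_id}: MOS = {score:.2f}")
```

An automatic MOS predictor would aim to estimate these averages directly from audio, without collecting listener ratings.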
Provider:
Samsung Electronics, Greece
Created:
2022-04-07



