Synthesized Speech from Virtual Patient
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/Rohan-Chaudhury/Humane-Speech-Synthesis-through-Zero-Shot-Emotion-and-Disfluency-Generation
Description:
This dataset contains synthetic speech generated from a simulated virtual patient, an individual named "Pastor Zimmerman" who has a substance use disorder. Different prompts are used to elicit responses with varying emotional depth and speech disfluency. The dataset also includes waveforms for the neutral, moderate, and extreme emotional responses, each synthesized with several text-to-speech (TTS) models: Google Cloud Text-to-Speech, SpeechT5, and MMS-TTS. The associated task is speech synthesis and the evaluation of emotional cues in the generated speech.
Provided by:
University of Pittsburgh School of Nursing



