reference-voices-enhanced
Hugging Face, updated 2026-03-18, indexed 2026-03-20
Download link:
https://huggingface.co/datasets/laion/reference-voices-enhanced
Description:
Reference Voices Enhanced is a speech-enhanced dataset of AI voices containing 2,004 samples that have passed speaker deduplication and quality-control filtering. It is derived from laion/ai-voices-deduplicated. Every sample was processed with the MossFormer2_SE_48K model from ClearerVoice-Studio to remove background noise and improve speech clarity, and is output as a 48 kHz WAV file. Each sample carries 59-dimensional annotations from Empathic Insight Voice Plus (55 emotion scores and 4 quality scores), which have been updated in the JSON metadata. The dataset is organized by gender (male/female/neutral) and age group (child/adolescent/youth/adult/elderly), with 1,037 male, 910 female, and 57 neutral samples. The source samples were filtered for quality (background quality score ≥ 3.5, content enjoyment score ≥ 5.0). The dataset is suited to tasks such as speech classification, text-to-speech, speaker embedding, and sentiment analysis, and is licensed under CC-BY-4.0.
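The quality criteria and gender breakdown in the description can be sketched in a few lines of Python. This is only an illustration: the key names `background_quality`, `content_enjoyment`, and `gender` are hypothetical and may not match the dataset's actual JSON metadata schema, and the sample records are made up for the example.

```python
from collections import Counter

# Thresholds as stated in the dataset description.
MIN_BACKGROUND_QUALITY = 3.5
MIN_CONTENT_ENJOYMENT = 5.0

# Per-gender sample counts as stated in the dataset description.
STATED_COUNTS = {"male": 1037, "female": 910, "neutral": 57}


def passes_quality_filter(record: dict) -> bool:
    """Return True if a record meets both stated thresholds.

    The key names are hypothetical; adapt them to the real metadata.
    """
    return (
        record["background_quality"] >= MIN_BACKGROUND_QUALITY
        and record["content_enjoyment"] >= MIN_CONTENT_ENJOYMENT
    )


def count_by_gender(records: list[dict]) -> Counter:
    """Count the records that pass the filter, grouped by gender label."""
    return Counter(r["gender"] for r in records if passes_quality_filter(r))


# Made-up records, for illustration only.
sample_records = [
    {"gender": "male", "background_quality": 4.2, "content_enjoyment": 5.6},
    {"gender": "female", "background_quality": 3.5, "content_enjoyment": 5.0},
    {"gender": "female", "background_quality": 3.4, "content_enjoyment": 6.1},
]

kept = count_by_gender(sample_records)
total_stated = sum(STATED_COUNTS.values())
```

The stated per-gender counts sum to the stated dataset size (1,037 + 910 + 57 = 2,004), which `total_stated` reproduces as a quick consistency check.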
Provided by:
LAION eV
Created:
2026-03-18



