CLESC
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/toloka/CLESC
Description:
CLESC (Crowd Labeled Emotions and Speech Characteristics) is a dataset of 500 transcribed audio samples drawn from two open-source corpora, Common Voice (100 samples) and VoxCeleb* (400 samples), and annotated with voice-feature labels. The annotation focuses on scalable voice characteristics: pace (slow, normal, fast, variable), pitch (low, medium, high, variable), and volume (quiet, medium, loud, variable). Emotions and unique voice features are also labeled as free-text input, following the instructions provided to annotators.
Curated by: Evgeniya Sukhodolskaya, Ilya Kochik (Toloka)
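The closed label sets listed in the description lend themselves to a simple consistency check. The following is a minimal sketch, not the dataset's official tooling: the field names `pace`, `pitch`, and `volume`, the record layout as a plain dict, and the helper `validate_labels` are all assumptions made for illustration, and the actual column names in the released files may differ.

```python
# Category sets copied from the dataset description.
# The dict keys ("pace", "pitch", "volume") are hypothetical field names.
ALLOWED_LABELS = {
    "pace": {"slow", "normal", "fast", "variable"},
    "pitch": {"low", "medium", "high", "variable"},
    "volume": {"quiet", "medium", "loud", "variable"},
}


def validate_labels(record: dict) -> list[str]:
    """Return a list of problems found in one annotation record.

    An empty list means every categorical field holds an allowed value.
    Free-text fields (emotions, unique voice features) are not checked.
    """
    problems = []
    for field, allowed in ALLOWED_LABELS.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    return problems


if __name__ == "__main__":
    sample = {"pace": "fast", "pitch": "medium", "volume": "loud"}
    print(validate_labels(sample))
```

A check like this only covers the three categorical characteristics; the emotion and unique-feature annotations are free input and would need separate handling.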
References:
[1] J. S. Chung, A. Nagrani, A. Zisserman. VoxCeleb2: Deep Speaker Recognition. INTERSPEECH, 2018.
[2] A. Nagrani, J. S. Chung, A. Zisserman. VoxCeleb: a large-scale speaker identification dataset. INTERSPEECH, 2017.
Provider:
maas
Created:
2025-09-15



