
toloka/CLESC

Source: Hugging Face. Updated 2024-11-18; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/toloka/CLESC
Description:
---
dataset_info:
  features:
  - name: audio
    dtype: string
  - name: Crowd_Worker_1
    dtype: string
  - name: Crowd_Worker_2
    dtype: string
  - name: Crowd_Worker_3
    dtype: string
  - name: Expert_1
    dtype: string
  - name: Expert_2
    dtype: string
  - name: Expert_3
    dtype: string
  - name: source_dataset
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 475376
    num_examples: 500
  download_size: 112382
  dataset_size: 475376
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: cc-by-4.0
language:
- en
---

CLESC-dataset (Crowd Labeled Emotions and Speech Characteristics) is a dataset of 500 audio samples with transcriptions, drawn from two open-source datasets, Common Voice (100 samples) and VoxCeleb* (400 samples), and annotated with voice feature labels. We focus on annotating scalable voice characteristics such as pace (slow, normal, fast, variable), pitch (low, medium, high, variable), and volume (quiet, medium, loud, variable), as well as labeling emotions and unique voice features (free input, based on the instructions provided).

Curated by: Evgeniya Sukhodolskaya, Ilya Kochik (Toloka)

* References for VoxCeleb:
[1] J. S. Chung, A. Nagrani, A. Zisserman. VoxCeleb2: Deep Speaker Recognition. INTERSPEECH, 2018.
[2] A. Nagrani, J. S. Chung, A. Zisserman. VoxCeleb: a large-scale speaker identification dataset. INTERSPEECH, 2017.
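Because each sample carries three crowd and three expert annotations, one natural first step is to compare the two groups. The following is a minimal sketch of that idea, not the curators' method. The column names come from the dataset card above, but the card only says these columns are strings, so the sketch assumes each annotator cell has already been reduced to a single pace label (`slow`, `normal`, `fast`, or `variable`). The sample rows, `majority_label`, and `crowd_expert_agreement` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Column names as listed in the dataset card.
CROWD_COLUMNS = ["Crowd_Worker_1", "Crowd_Worker_2", "Crowd_Worker_3"]
EXPERT_COLUMNS = ["Expert_1", "Expert_2", "Expert_3"]


def majority_label(row, columns):
    """Return the most frequent label among `columns`, or None if the top two tie."""
    counts = Counter(row[c] for c in columns).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear majority
    return counts[0][0]


def crowd_expert_agreement(rows):
    """Share of rows where crowd and expert majorities match.

    Rows in which either group has no clear majority are skipped.
    Returns None when no row is comparable.
    """
    comparable = 0
    matches = 0
    for row in rows:
        crowd = majority_label(row, CROWD_COLUMNS)
        expert = majority_label(row, EXPERT_COLUMNS)
        if crowd is None or expert is None:
            continue
        comparable += 1
        matches += crowd == expert
    return matches / comparable if comparable else None


# Hypothetical rows: real cells may hold richer strings that need parsing first.
SAMPLE_ROWS = [
    {"Crowd_Worker_1": "fast", "Crowd_Worker_2": "fast", "Crowd_Worker_3": "normal",
     "Expert_1": "fast", "Expert_2": "fast", "Expert_3": "fast"},
    {"Crowd_Worker_1": "slow", "Crowd_Worker_2": "normal", "Crowd_Worker_3": "variable",
     "Expert_1": "normal", "Expert_2": "normal", "Expert_3": "slow"},
    {"Crowd_Worker_1": "slow", "Crowd_Worker_2": "slow", "Crowd_Worker_3": "slow",
     "Expert_1": "normal", "Expert_2": "normal", "Expert_3": "slow"},
]

if __name__ == "__main__":
    print(crowd_expert_agreement(SAMPLE_ROWS))
```

With the real data, the rows would need to be loaded and each annotator string parsed into per-characteristic labels before a comparison like this applies. Majority vote is only one possible aggregation and ignores annotator reliability.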
Provider:
toloka