toloka/CLESC

Name: toloka/CLESC
Creator: toloka
Published: 2024-11-18 15:10:32
License: cc-by-4.0

Hugging Face | Updated 2024-11-18 | Indexed 2025-04-12

Download link:

https://hf-mirror.com/datasets/toloka/CLESC

Resource description:

---
dataset_info:
  features:
  - name: audio
    dtype: string
  - name: Crowd_Worker_1
    dtype: string
  - name: Crowd_Worker_2
    dtype: string
  - name: Crowd_Worker_3
    dtype: string
  - name: Expert_1
    dtype: string
  - name: Expert_2
    dtype: string
  - name: Expert_3
    dtype: string
  - name: source_dataset
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 475376
    num_examples: 500
  download_size: 112382
  dataset_size: 475376
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: cc-by-4.0
language:
- en
---

CLESC-dataset (Crowd Labeled Emotions and Speech Characteristics) is a dataset of 500 audio samples with transcriptions, drawn from two open-source datasets, Common Voice (100 samples) and VoxCeleb* (400 samples), and annotated with voice feature labels. We focus on annotating scalable voice characteristics such as pace (slow, normal, fast, variable), pitch (low, medium, high, variable), and volume (quiet, medium, loud, variable), as well as labeling emotions and unique voice features (free input, based on the instructions provided).

Curated by: Evgeniya Sukhodolskaya, Ilya Kochik (Toloka)

[1] J. S. Chung, A. Nagrani, A. Zisserman. VoxCeleb2: Deep Speaker Recognition. INTERSPEECH, 2018.
[2] A. Nagrani, J. S. Chung, A. Zisserman. VoxCeleb: a large-scale speaker identification dataset. INTERSPEECH, 2017.
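The schema in the metadata above can be checked against a loaded row with a small helper. The following is a minimal sketch, not part of the original card: the column names and dtypes are copied from the `dataset_info` block, while `EXPECTED_COLUMNS`, `schema_problems`, and `load_clesc` are hypothetical names introduced here for illustration. It assumes the Hugging Face `datasets` package is installed and that the repository id `toloka/CLESC` (taken from the download link) resolves to the `train` split described in the card.

```python
from typing import Any, Dict, List

# Column names and dtypes as listed in the card's dataset_info block
# (string -> str, int64 -> int).
EXPECTED_COLUMNS: Dict[str, type] = {
    "audio": str,
    "Crowd_Worker_1": str,
    "Crowd_Worker_2": str,
    "Crowd_Worker_3": str,
    "Expert_1": str,
    "Expert_2": str,
    "Expert_3": str,
    "source_dataset": str,
    "__index_level_0__": int,
}


def schema_problems(row: Dict[str, Any]) -> List[str]:
    """Return a list of mismatches between a row and the card's schema.

    An empty list means every expected column is present with the
    expected Python type. None values are accepted as missing data
    (an assumption; the card does not say whether nulls occur).
    """
    problems: List[str] = []
    for name, expected in EXPECTED_COLUMNS.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif row[name] is not None and not isinstance(row[name], expected):
            problems.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(row[name]).__name__}"
            )
    return problems


def load_clesc():
    """Load the train split (hypothetical helper; needs network access)."""
    # Imported lazily so schema_problems works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("toloka/CLESC", split="train")
```

A typical use would be `ds = load_clesc()` followed by `schema_problems(ds[0])`. The card states that the split holds 500 examples, so `len(ds)` is a quick sanity check on the download.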

Provided by:

toloka
