HUME-PROSODY, HUMEVOCALBURST, MODULATE-SONATA, MODULATE-STREAM
arXiv · Updated: 2024-03-21 · Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2403.14048v1
Resource description:
This paper presents four large-scale datasets designed for audio machine learning tasks: HUME-PROSODY, HUMEVOCALBURST, MODULATE-SONATA, and MODULATE-STREAM. HUME-PROSODY and HUMEVOCALBURST focus on speech emotion recognition and non-verbal vocal expression, containing 41 and 36 hours of audio from 1004 and 1702 speakers, respectively, across diverse countries and cultures. MODULATE-SONATA is an emotional speech dataset performed by professional actors, with 6 hours of audio from 23 actors. MODULATE-STREAM is a large-scale dataset of 7000 hours of game-streaming audio, intended for analyzing human responses to dynamic events. The datasets were created to address the shortage of large-scale, specialized datasets in the audio domain, particularly for human-computer interaction and human behavior analysis.
Provided by:
Hume AI, New York, USA
Created:
2024-03-21