Multimodal Speech Segmentation Behavior Corpus
Source: arXiv | Updated: 2018-05-11 | Indexed: 2024-06-21
Download link:
https://git.io/eyeseg-data
Overview:
The Multimodal Speech Segmentation Behavior Corpus was created by the Multimodal Computing and Interaction research institute at Saarland University, Germany, to record how experts behave while performing speech segmentation tasks. It contains 4,277 records and covers several modalities: the experts' gaze data, audio playback, video, and screen recordings. Gaze behavior was captured with a Tobii TX300 eye tracker, and screen content and audio were recorded with Tobii Studio software. The corpus is intended mainly for studying how human experts use multimodal information when segmenting speech, and whether modeling that behavior can improve the accuracy of automatic speech segmentation.
Provider:
Multimodal Computing and Interaction research institute, Saarland University, Germany
Created:
2017-12-13



