MELD-ST

Name: MELD-ST
Creator: Kyoto University, Japan
Published: 2024-05-22 06:40:38
License: No description available

Source: arXiv; updated 2024-05-22; indexed 2024-08-06

Download link:

http://arxiv.org/abs/2405.13233v1

Resource overview:

MELD-ST is an emotion-aware speech translation dataset created by Kyoto University, covering the English-to-Japanese and English-to-German language pairs. It contains approximately 10,000 emotion-labeled speech segments taken from the TV series *Friends*; the emotion labels come from the MELD dataset. To build the dataset, the creators extracted audio and subtitles from Blu-ray discs and used optical character recognition (OCR) tools for text cleaning and timestamp extraction. MELD-ST is intended for research on emotion-aware speech translation, with the aim of using emotion labels to improve translation system performance, particularly on emotionally expressive utterances.
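
As a rough illustration of what one emotion-labeled, time-stamped segment of this kind might look like in code, here is a minimal Python sketch. The `Segment` fields, the `make_segment` helper, and the `HH:MM:SS,mmm` timestamp format are assumptions made for illustration and are not taken from the dataset's actual release format; only the seven emotion categories come from MELD.

```python
from dataclasses import dataclass

# The seven emotion categories used by the MELD dataset.
MELD_EMOTIONS = {"anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"}

# Target languages described for MELD-ST (codes chosen here for illustration).
TARGET_LANGS = {"ja", "de"}


@dataclass(frozen=True)
class Segment:
    """One hypothetical emotion-labeled speech translation example."""
    audio_path: str   # path to the English audio clip (assumed layout)
    source_text: str  # English transcript
    target_text: str  # Japanese or German translation
    target_lang: str  # "ja" or "de"
    emotion: str      # one of MELD_EMOTIONS
    start: float      # segment start, in seconds
    end: float        # segment end, in seconds


def parse_timestamp(ts: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp (assumed format) to seconds."""
    hms, millis = ts.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000.0


def make_segment(audio_path, source_text, target_text, target_lang,
                 emotion, start_ts, end_ts) -> Segment:
    """Validate the label, language, and time span, then build a Segment."""
    if emotion not in MELD_EMOTIONS:
        raise ValueError(f"unknown emotion label: {emotion!r}")
    if target_lang not in TARGET_LANGS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    start, end = parse_timestamp(start_ts), parse_timestamp(end_ts)
    if end <= start:
        raise ValueError("segment end must be after its start")
    return Segment(audio_path, source_text, target_text, target_lang,
                   emotion, start, end)
```

Validating the emotion label and the time span when a record is built keeps malformed OCR output from silently reaching a downstream translation model.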

Provider:

Kyoto University, Japan

Created:

2024-05-22
