Data underlying the research between ASR and MT quality of Automatic Subtitling Platforms

Name: Data underlying the research between ASR and MT quality of Automatic Subtitling Platforms
Creator: 4TU.ResearchData
Published: 2023-10-18 09:29:20
License: No description available

Source: DataCite Commons. Updated: 2023-10-18. Indexed: 2024-07-03.

Download link:

https://data.4tu.nl/datasets/7cfa296a-72b7-4460-acd4-86193b43701e/1

Resource description:

In the first experiment, an ASR accuracy comparison, one set of speech-to-text data (hereafter Veed 0 and Iflyrec 0) is generated by submitting the “Qantas Safety video” to “Iflyrec” and “Veed”. The reference speech-to-text data is transcribed from Qantas’ official channel on YouTube. In the second experiment, an automatic subtitling translation comparison, three sets of data are collected and analyzed. The author uses the original speech-to-text data from “Iflyrec” and “Veed” to generate one set of automatic subtitling translations (hereafter Veed 1 and Iflyrec 1), and then inputs the speech-to-text data on these two platforms to generate the final automatic subtitling translation version (hereafter Veed 2 and Iflyrec 2). For the human translation reference, this paper uses a translation by a tutor affiliated with the Civil Aviation University of China.
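
The description does not say which metric the ASR accuracy comparison uses. As a rough sketch only, assuming word error rate (WER) against the reference transcript, a comparison of a platform output such as Veed 0 or Iflyrec 0 could look like the following. The function name and the sample sentences are hypothetical and are not taken from the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: WER is used here purely for illustration; the dataset
    description does not name the metric applied in the study.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, not drawn from the dataset.
reference_text = "please fasten your seatbelt"
platform_output = "please fasten you seatbelt"
print(word_error_rate(reference_text, platform_output))
```

Lowercasing and whitespace splitting are simplifications; punctuation handling and number normalization would have to be settled before scores from the two platforms could be compared fairly.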

Provider:

4TU.ResearchData

Created:

2023-10-18
