Simultaneous Interpretation (Speech Feature - Text Quality Dual - Modal Evaluation Model)

Name: Simultaneous Interpretation (Speech Feature - Text Quality Dual - Modal Evaluation Model)
Creator: li hu lin
Published: 2025-09-03 00:00:00
License: No description available

Source: Science Data Bank (ScienceDB). Updated: 2025-09-03. Indexed: 2026-04-23.

Download link:

https://www.scidb.cn/detail?dataSetId=fa2b7718d7ff445d90ea63d64e3e4540

Description:

This dataset accompanies a dual-modal ("speech features and text quality") assessment study of simultaneous interpretation quality. It contains six folders. The data are derived from the English-Chinese simultaneous interpretation of President Trump's 2025 inauguration speech and were processed in several stages. First, audio of the source speech and of the simultaneous interpretations by two interpreters (14 samples from interpreter A and 10 samples from interpreter B) was obtained from YouTube. The audio was denoised by spectral subtraction in Praat 6.3.07+ and cut into 24 samples by source-language sentence unit. The interpreted speech was transcribed with the Google Speech-to-Text API (2024 version) and then manually proofread. Next, six speech features were extracted with Praat, and text quality was rated by two experts under double-blind conditions. Reliability and statistical analyses were carried out in SPSS 26.0+, and the data were consolidated in Excel 2019. All processing was done on a standard computer.

The dataset covers the 2025 speech and its simultaneous interpretation. It carries no spatial information, and its temporal resolution is the individual sentence (10-20 seconds per sentence on average). The tabular data contain 24 records, one per sample. Rows are labelled by sample ID, and columns cover source-language features (such as speech rate, SPM), speech features (such as words per minute), and text quality (such as error rate, %), each with its unit indicated.
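The per-sample measures named in the description can be illustrated with a minimal Python sketch. It is not the authors' procedure: the record fields, the example ID format, the error-rate definition (errors divided by word count), and the use of Pearson correlation as an inter-rater agreement check are all assumptions made for illustration. The dataset page does not state how these values were computed in Praat or SPSS.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Sample:
    """One sentence-level record (all field names are hypothetical)."""
    sample_id: str      # assumed ID format, e.g. "A01"
    interpreter: str    # "A" or "B"
    duration_s: float   # length of the sentence unit in seconds
    word_count: int     # words in the proofread transcript
    error_count: int    # errors flagged by a rater


def words_per_minute(word_count: int, duration_s: float) -> float:
    """Speech rate: word count scaled to a 60-second window."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return word_count * 60.0 / duration_s


def error_rate_percent(error_count: int, word_count: int) -> float:
    """Assumed definition: errors as a percentage of transcript words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * error_count / word_count


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two raters' scores (assumed agreement check)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must not be constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented values for illustration only; not taken from the dataset.
    s = Sample("A01", "A", duration_s=15.0, word_count=30, error_count=3)
    print(words_per_minute(s.word_count, s.duration_s))
    print(error_rate_percent(s.error_count, s.word_count))
```

With the 24 records loaded from the dataset's spreadsheet, the same functions could be applied row by row. The actual column names and units should be taken from the files themselves.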

Provider:

li hu lin

Created:

2025-09-03
