
Simultaneous Interpretation (Speech Feature-Text Quality Dual-Modal Evaluation Model)

Science Data Bank · Updated 2025-09-03 · Indexed 2026-04-23
Download link:
https://www.scidb.cn/detail?dataSetId=fa2b7718d7ff445d90ea63d64e3e4540
Description:
This dataset accompanies a dual-modal ("Speech Features-Text Quality") assessment study of simultaneous interpretation quality. It contains six folders. The data are derived from English-Chinese simultaneous interpretation of President Trump's 2025 inauguration speech and were processed in several stages. First, audio of the source speech and of the simultaneous interpretations by two interpreters (14 samples from interpreter A and 10 from interpreter B) was obtained from YouTube. The audio was then denoised in Praat 6.3.07+ (spectral subtraction) and cut into 24 samples by source-language sentence unit. The interpretations were transcribed with the Google Speech-to-Text API (2024 version) and manually proofread. Next, six speech features were extracted with Praat, and two experts evaluated text quality in a double-blind manner. Reliability checks and statistical analyses were run in SPSS 26.0+, and the data were integrated in Excel 2019; all processing was done on a standard computer.

The dataset covers the 2025 speech and its simultaneous interpretation. It has no spatial information, and its temporal resolution is the individual sentence (10-20 seconds per sentence on average). The tabular data contain 24 records, one per sample. Rows are labeled by sample ID; columns include source-language features (such as speech rate, SPM), speech features (such as words per minute), and text quality (such as error rate, %), each with its unit clearly indicated.
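As a rough illustration of the table layout described above, the following Python sketch models one record per sample and derives a per-minute rate from a unit count and a segment duration. The field names, the sample ID format, and every numeric value are assumptions made for illustration only; they are not taken from the dataset files, and the description does not say how the authors computed their rates.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One row of the table. All field names here are hypothetical."""
    sample_id: str     # assumed ID format, e.g. "A01"
    interpreter: str   # "A" or "B"
    unit_count: int    # words counted in the proofread transcript
    duration_s: float  # length of the audio segment in seconds


def rate_per_minute(unit_count: int, duration_s: float) -> float:
    """Convert a count over a segment into a per-minute rate."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return unit_count * 60.0 / duration_s


def count_by_interpreter(samples: list[Sample]) -> dict[str, int]:
    """Tally how many samples belong to each interpreter."""
    counts: dict[str, int] = {}
    for s in samples:
        counts[s.interpreter] = counts.get(s.interpreter, 0) + 1
    return counts


# Placeholder rows that only mirror the stated 14 + 10 split.
# The counts and durations are invented, not measured values.
placeholder_samples = (
    [Sample(f"A{i:02d}", "A", 30, 15.0) for i in range(1, 15)]
    + [Sample(f"B{i:02d}", "B", 30, 15.0) for i in range(1, 11)]
)

split = count_by_interpreter(placeholder_samples)
example_rate = rate_per_minute(30, 15.0)  # 30 units in 15 s
```

A check such as `sum(split.values()) == 24` is a quick way to confirm that a loaded table matches the record count stated in the description before running any analysis.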
Provider:
li hu lin
Created:
2025-09-03