Analysis of Suprasegmental Features in Interpretation for the Sino-US High-Level Strategic Dialogue 2021
Source: DataCite Commons. Updated: 2025-10-09. Indexed: 2026-05-05.
Download link:
https://www.scidb.cn/detail?dataSetId=4ba3904e482844ffb700567df66f15e1
Description:
This dataset is derived from the public corpus of the China-US High-Level Strategic Dialogue held in Anchorage, Alaska, USA in March 2021 (link: https://weibo.com/tv/show/1034:4618002222743693) and comprises six core files. Data processing follows a standardized workflow: Adobe Premiere Pro is used to convert the videos into WAV-format audio (44.1 kHz, 16-bit), and segments containing more than 2 seconds of noise are removed; iFlytek Tingjian (Version 4.0) is used for transcription, followed by manual verification of the text; Praat (Version 5.3.51) is used to extract suprasegmental indicators; and SPSS (Version 26.0) is used to verify consistency (Cohen’s κ = 0.85-1.00, ICC = 0.92-0.96, p < 0.001).
US Speeches and Interpretations Comparison.docx contains 19 Segments, comprising 38 English speech sentences, 33 Chinese interpretation sentences, and the duration (in seconds) of each Segment. Rows are labeled by Segment number; columns are labeled "text" and "duration".
US Intonation Transmission Detailed Data.docx includes intonation data for 19 Segments and two types of tables (intonation transmission rate and semantic point comparison ratio, both with 19 records, expressed in %), along with the calculation formula.
Interpretation Duration of Both Sides.docx records the total interpretation time, interference time, and effective time (in seconds/minutes) for 22 Segments from the Chinese side and 19 Segments from the US side, for a total of 41 records. Rows are labeled by Segment number.
Chinese Speeches and Interpretations Comparison.docx contains 22 Segments, comprising 97 Chinese speech sentences, 97 English interpretation sentences, and the duration (in seconds) of each Segment. Rows are labeled by Segment number; columns are labeled "text" and "duration".
Chinese Intonation Transmission Detailed Data.docx includes intonation data for 22 Segments (with the semitone formula attached) and two types of tables (transmission rate and semantic point matching ratio, both with 22 records, expressed in %).
Research Experiment Reproduction Process.docx is an operational guide that specifies the tools, preprocessing steps, indicator calculation methods, and verification standards; it contains no tables.
The dataset has no missing data. Known errors are initial speech recognition errors (already corrected) and a ±0.02 s deviation in Praat annotation (error ≤8.2%). Praat can be downloaded from its official website (https://www.fon.hum.uva.nl/praat/), iFlytek Tingjian is an online tool (https://www.iflyrec.com/), and the other software can be substituted by any version offering the same functionality.
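The description mentions a semitone formula and a Cohen's κ consistency check but does not reproduce either. The sketch below illustrates both under stated assumptions: it uses the conventional definition of semitones (12 times the base-2 logarithm of a frequency ratio) and the textbook definition of Cohen's κ for two annotators. The 100 Hz reference frequency, the function names, and the example labels are hypothetical and are not taken from the dataset; the dataset's own formula and the SPSS procedure may differ.

```python
import math
from collections import Counter


def hz_to_semitones(f_hz, ref_hz=100.0):
    """Convert a pitch value in Hz to semitones relative to ref_hz.

    ref_hz=100.0 is an assumed reference, not a value from the dataset.
    """
    if f_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_hz / ref_hz)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(hz_to_semitones(200.0))
    labels_a = ["rise", "fall", "rise", "fall"]
    labels_b = ["rise", "fall", "rise", "rise"]
    print(cohens_kappa(labels_a, labels_b))
```

A pitch one octave above the reference maps to 12 semitones, which is why semitone scales are often preferred over raw Hz when comparing speakers with different pitch ranges.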
Provider:
Science Data Bank
Created:
2025-10-09



