Global Acoustic Parameters Dataset: Forensic Speaker Comparison under Voice Disguise Conditions (Brazilian Portuguese)

Source: Figshare. Updated 2025-12-15. Indexed 2026-04-28.

Download link:

https://figshare.com/articles/dataset/Global_Acoustic_Parameters_Dataset_Forensic_Speaker_Comparison_under_Voice_Disguise_Conditions_Brazilian_Portuguese_/30884714

Description:

This dataset contains a comma-separated values (CSV) file with global acoustic parameters extracted for a study on voice disguise and the robustness of acoustic descriptors in forensic speaker comparison. The data were collected as part of a Master’s thesis conducted at the Institute of Language Studies, State University of Campinas (UNICAMP), focusing on Brazilian Portuguese.

The dataset comprises measurements from ten native speakers of Brazilian Portuguese (five male and five female, mean age ≈ 25 years). Participants read a standardized narrative text, an adapted excerpt from A Menina do Narizinho Arrebitado by Monteiro Lobato (public domain, approximately 1,049 words), designed to elicit naturalistic speech while maintaining experimental control.

Recordings were carried out in a sound-treated environment using a Zoom H4N PRO digital recorder at a sampling rate of 44.1 kHz and 32-bit resolution. Each speaker was recorded under seven speaking conditions: (i) natural voice (control), (ii) lowered fundamental frequency (F0), (iii) raised F0, (iv) hoarse voice, (v) nasal obstruction (holding the nose), (vi) mechanical obstruction (pencil held between the teeth), and (vii) use of an N95 mask.

Global acoustic parameters were extracted using Praat software by combining outputs from two specialized scripts: the Prosody Descriptor Extractor (Barbosa, 2021), used to obtain fundamental frequency statistics, intensity, and spectral balance measures; and the Acoustic Parameters Descriptor for Forensics (APD) (Barbosa, 2018), used to extract global formant-related measures. The dataset includes F0 statistics (e.g., mean, median, standard deviation, range, quartiles, skewness, and peak measures), first-derivative F0 metrics, mean and median values for formant frequencies (F1–F4), and voice quality and intensity measures such as jitter, shimmer, harmonics-to-noise ratio (HNR), and spectral emphasis.

Prior to analysis, the data were preprocessed to remove outliers using the interquartile range (IQR) method (threshold = 1.5), following the procedures described in the associated thesis.

The research was funded by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), process no. 88887.807823/2023-00. Data collection followed ethical standards for research involving human participants.
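
The IQR-based outlier removal mentioned in the description can be illustrated with a minimal Python sketch. It is not the thesis's actual procedure, which the description does not spell out. It assumes the common Tukey-fence reading of "threshold = 1.5", where values outside [Q1 − 1.5·IQR, Q3 + 1.5·IQR] are dropped. The quartile interpolation method, the function names, and the sample values are all hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    Assumption: quartiles use Python's "inclusive" interpolation;
    the thesis may have used a different quartile definition.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


# Hypothetical F0 means in Hz, NOT taken from the dataset.
f0_mean_hz = [118.0, 121.5, 119.2, 120.3, 122.1, 117.8, 240.0]
cleaned = remove_outliers(f0_mean_hz)
print(cleaned)
```

In practice the filter would likely be applied per parameter, and possibly per speaker or per speaking condition. The description does not say which grouping the thesis used, so that choice is left open here.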

Created:

2025-12-15