Synthetic vowels of speakers with Parkinson’s disease and Parkinsonism

Name: Synthetic vowels of speakers with Parkinson’s disease and Parkinsonism
Creator: figshare
Published: 2025-06-01 02:59:38
License: not specified

Source: DataCite Commons (updated 2025-06-01, indexed 2024-07-27)

Download link:

https://figshare.com/articles/Synthetic_vowels_of_speakers_with_Parkinson_s_disease_and_Parkinsonism/7628819/1

Resource description:

The dataset contains synthesized replicas of the sustained vowels /A/ and /I/ produced by healthy controls and by patients with Parkinson’s disease, multiple system atrophy, and progressive supranuclear palsy. The dataset can be used as a reference for evaluating pitch detectors, detectors of modal fundamental frequency, and detectors of subharmonics.

Coding system

Each recording is named by a unique alphanumeric code in the format Uvxy, where U is the upper-case abbreviation of the group (HC = healthy control, PD = Parkinson’s disease, MSA = multiple system atrophy, PSP = progressive supranuclear palsy), v is the numeric identifier of the subject within the group, x denotes the type of vowel (a = vowel /A/, i = vowel /I/), and y is the repetition number. Together, U and v uniquely identify each speaker, whereas x and y identify that speaker’s individual recordings.

Recordings

All recordings are briefly described in the table dataset.csv. All files of each record (see records.zip) are identified by the corresponding code and a suffix. The suffix describes the type of the file and is separated from the code by an underscore. The naming of the files is illustrated on the record HC8a1, whose code denotes the first repetition of the vowel /A/ performed by the healthy speaker HC8. The record HC8a1 consists of the following files:

HC8a1.wav = waveform of the synthesized replica. This is the reference signal used for the evaluation. Its jitter, shimmer, and harmonics-to-noise ratio (HNR) can be found in dataset.csv.

HC8a1_clean.wav = waveform of the synthesized replica without added noise. We provide this signal to make the model more versatile. Users may add a different kind of noise to this signal or manipulate the HNR. Note that both signals required normalization prior to writing into the wav file. The original scaling factor between HC8a1 and HC8a1_clean can be determined from the total power of the signals and the reference HNR value.

HC8a1_LF.wav = sample of the glottal pulse used for the synthesis.

HC8a1_impulses.csv = list of the impulses’ locations in seconds and the corresponding amplitudes. The positions of the pulses were corrected to match the beginning of the glottal pulse, i.e., the first sample of the signal HC8a1_LF.wav begins at each of these positions. The jitter and shimmer listed in dataset.csv are median values. Jitter and shimmer by other definitions can be calculated from the positions and amplitudes of the pulses provided in this file.

HC8a1_subharmonics.csv = list of subharmonic intervals, each described by its start time and end time in seconds. The corresponding index of amplitude modulation, expressed as SHR in percent, can be found in the table dataset.csv. When no subharmonic was determined by the supervised parameterization, no file was included for the speaker and SHR in dataset.csv was set to zero [1].

[1] Note that the supervised detection had lower sensitivity because of the sensitivity of the pitch trace in PRAAT, so subharmonics occur much less often in the synthesized data than in the original dataset analyzed by automated segmentation. This is not a problem, because subharmonics were synthesized only in the given intervals. It also illustrates why it is important to detect subharmonics by means other than pitch.
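The coding system described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the dataset: the names `RecordCode`, `parse_code`, and `record_files` are hypothetical, and the file list assumes that every file of a record follows the stated rule of record code, underscore, suffix.

```python
import re
from typing import List, NamedTuple

# Group abbreviations exactly as listed in the dataset description.
GROUPS = {
    "HC": "healthy control",
    "PD": "Parkinson's disease",
    "MSA": "multiple system atrophy",
    "PSP": "progressive supranuclear palsy",
}

# Uvxy: U = group (upper case), v = subject number,
# x = vowel (a or i), y = repetition number.
CODE_RE = re.compile(r"^(HC|PD|MSA|PSP)(\d+)([ai])(\d+)$")


class RecordCode(NamedTuple):
    """Hypothetical container for the four parts of a record code."""
    group: str
    subject: int
    vowel: str
    repetition: int

    @property
    def speaker(self) -> str:
        # U and v together uniquely identify the speaker.
        return f"{self.group}{self.subject}"


def parse_code(code: str) -> RecordCode:
    """Split a record code such as 'HC8a1' into its parts."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"not a valid record code: {code!r}")
    group, subject, vowel, repetition = match.groups()
    return RecordCode(group, int(subject), vowel, int(repetition))


def record_files(code: str) -> List[str]:
    """File names expected for a record (assumed: code + '_' + suffix).

    The subharmonics file may be absent when no subharmonic was found.
    """
    parse_code(code)  # validate the code before building names
    return [
        f"{code}.wav",
        f"{code}_clean.wav",
        f"{code}_LF.wav",
        f"{code}_impulses.csv",
        f"{code}_subharmonics.csv",
    ]
```

For example, `parse_code("HC8a1")` yields group `HC`, subject `8`, vowel `a`, repetition `1`, and its `speaker` property is `HC8`.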

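As the description notes, jitter and shimmer under definitions other than the median values in dataset.csv can be computed from the pulse positions and amplitudes. The sketch below uses the common "local" definition (mean absolute difference between consecutive values, divided by their mean, in percent); this choice is an assumption and is not necessarily the definition used by the dataset authors. The function names are hypothetical, and reading the CSV is left out because the description does not document the file's column layout.

```python
from statistics import mean
from typing import Sequence


def local_perturbation(values: Sequence[float]) -> float:
    """Mean absolute consecutive difference relative to the mean, in percent."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def local_jitter(pulse_times: Sequence[float]) -> float:
    """Local jitter (%) from pulse positions given in seconds."""
    # Periods are the intervals between consecutive pulse positions.
    periods = [t1 - t0 for t0, t1 in zip(pulse_times, pulse_times[1:])]
    return local_perturbation(periods)


def local_shimmer(pulse_amplitudes: Sequence[float]) -> float:
    """Local shimmer (%) from the amplitudes of consecutive pulses."""
    return local_perturbation(pulse_amplitudes)
```

Both functions take plain sequences, so they can be fed from whatever reader matches the actual layout of the impulses file.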

Provider:

figshare

Created:

2019-10-29
