Synthetic vowels of speakers with Parkinson’s disease and Parkinsonism
Source: Figshare. Updated 2019-10-29; indexed 2026-04-29.
Download link:
https://figshare.com/articles/dataset/Synthetic_vowels_of_speakers_with_Parkinson_s_disease_and_Parkinsonism/7628819
Description:
The dataset contains synthesized replicas of sustained vowels /A/ and /I/ performed by healthy controls and by patients with Parkinson’s disease, multiple system atrophy, and progressive supranuclear palsy. The dataset can be used as a reference for evaluating pitch detectors, detectors of modal fundamental frequency, and detectors of subharmonics.
Coding system
Each recording is named by a unique alphanumeric code in the format Uvxy, where U is the upper-case abbreviation of the group (HC = healthy control, PD = Parkinson’s disease, MSA = multiple system atrophy, PSP = progressive supranuclear palsy), v is the numeric identifier of the subject within the group, x denotes the type of vowel (a = vowel /A/, i = vowel /I/), and y is the repetition number. Together, U and v uniquely identify each speaker, whereas x and y identify that speaker’s recordings.
Recordings
All recordings are briefly described in the table dataset.csv. All files of each record (see records.zip) are identified by the corresponding code and a suffix. The suffix describes the type of file and is separated from the code by an underscore. File naming is illustrated on the record HC8a1, whose code denotes the first repetition of the vowel /A/ performed by healthy speaker HC8. The record HC8a1 consists of the following files:
HC8a1.wav = waveform of the synthesized replica. This is the reference signal used for the evaluation. Parameters of jitter, shimmer, and harmonic-to-noise ratio (HNR) can be found in dataset.csv.
HC8a1_clean.wav = waveform of the synthesized replica without added noise. We provide this signal to make the model more versatile: authors may add a different kind of noise to this signal or manipulate the HNR. Note that both signals required normalization prior to writing into the wav file. The original scaling factor between HC8a1 and HC8a1_clean can be determined from the total power of the signals and the reference HNR value.
HC8a1_LF.wav = sample of the glottal pulse used for the synthesis.
HC8a2_impulses.csv = list of impulse locations in seconds and corresponding amplitudes. The positions of the pulses were corrected to match the beginning of the glottal pulse, i.e., the first sample of the signal HC8a1_LF.wav begins at each of these positions. The jitter and shimmer listed in dataset.csv are median values. Jitter and shimmer by other definitions can be calculated from the positions and amplitudes of the pulses provided in this file.
HC8a1_subharmonics.csv = list of subharmonic intervals, each described by its start time and end time in seconds. The corresponding index of amplitude modulation, expressed as SHR in percent, can be found in the table dataset.csv. When no subharmonic was determined by the supervised parameterization, no file was included for the speaker and SHR in dataset.csv was set to zero [1].
[1] Note that the supervised detection had lower sensitivity due to the sensitivity of the pitch trace in PRAAT, so the occurrence of subharmonics in the synthesized data is much lower than in the original dataset analyzed by automated segmentation. This is not a problem because subharmonics were synthesized only at the given intervals; it also illustrates why it is important to detect subharmonics by means other than pitch.
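The Uvxy coding system described above can be decoded mechanically. The following is a minimal Python sketch, not part of the dataset: the names parse_record_code and RecordCode are hypothetical, and the regular expression assumes that the subject identifier and the repetition number are plain decimal digits and that any suffix follows a single underscore.

```python
import os
import re
from typing import NamedTuple, Optional

# Group abbreviations as listed in the dataset description.
GROUPS = {
    "HC": "healthy control",
    "PD": "Parkinson's disease",
    "MSA": "multiple system atrophy",
    "PSP": "progressive supranuclear palsy",
}

# Assumed pattern: group, subject number, vowel letter, repetition, optional _suffix.
_CODE_RE = re.compile(r"^(HC|PD|MSA|PSP)(\d+)([ai])(\d+)(?:_(.+))?$")


class RecordCode(NamedTuple):
    """Hypothetical container for the parts of a Uvxy code."""
    group: str             # U: HC, PD, MSA or PSP
    subject: int           # v: subject number within the group
    vowel: str             # x: 'a' for /A/, 'i' for /I/
    repetition: int        # y: repetition number
    suffix: Optional[str]  # file-type suffix such as 'clean' or 'LF', if any

    @property
    def speaker(self) -> str:
        # U and v together identify the speaker, e.g. 'HC8'.
        return f"{self.group}{self.subject}"


def parse_record_code(name: str) -> RecordCode:
    """Parse a record code or file name such as 'HC8a1' or 'HC8a1_clean.wav'."""
    stem = os.path.splitext(os.path.basename(name))[0]
    match = _CODE_RE.match(stem)
    if match is None:
        raise ValueError(f"not a valid Uvxy record code: {name!r}")
    group, subject, vowel, repetition, suffix = match.groups()
    return RecordCode(group, int(subject), vowel, int(repetition), suffix)
```

For example, parse_record_code("HC8a1_clean.wav") yields group "HC", subject 8, vowel "a", repetition 1, and suffix "clean", so files can be grouped by speaker or by record without relying on directory layout.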
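As the description notes, jitter and shimmer under other definitions can be recomputed from the pulse positions and amplitudes in the impulse list. The sketch below shows one common choice, the "local" definition (mean absolute difference between consecutive values, divided by the mean value, in percent). This is an illustrative assumption and not necessarily the definition behind the median values in dataset.csv. The function names are hypothetical, and reading the CSV is left to the reader because the column layout is not specified in the description.

```python
from typing import Sequence


def _local_perturbation_percent(values: Sequence[float]) -> float:
    """Mean absolute difference of consecutive values over the mean value, in percent."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def local_jitter_percent(pulse_times: Sequence[float]) -> float:
    """Local jitter from pulse positions in seconds (assumed sorted ascending)."""
    # Periods are the intervals between consecutive pulses.
    periods = [t1 - t0 for t0, t1 in zip(pulse_times, pulse_times[1:])]
    return _local_perturbation_percent(periods)


def local_shimmer_percent(pulse_amplitudes: Sequence[float]) -> float:
    """Local shimmer from the amplitudes of consecutive pulses."""
    return _local_perturbation_percent(list(pulse_amplitudes))
```

A perfectly periodic pulse train with constant amplitude gives values of (numerically) zero for both measures, which is a quick sanity check before applying the functions to a record.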
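The noise-free *_clean.wav signal is provided so that a different kind of noise can be added or the HNR changed. A minimal sketch of that idea follows, assuming HNR in dB is defined as 10·log10(P_harmonic / P_noise), that the clean signal is treated as the harmonic component, and that white Gaussian noise is an acceptable stand-in; the noise model used by the dataset authors is not stated in the description. The function names are hypothetical, and signals are plain lists of floats so the example needs only the standard library.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple


def mean_power(signal: Sequence[float]) -> float:
    """Average power (mean of squared samples)."""
    return sum(v * v for v in signal) / len(signal)


def add_noise_at_hnr(
    clean: Sequence[float],
    hnr_db: float,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float]]:
    """Return (noisy, noise) where noise is scaled to give the requested HNR.

    Assumption: white Gaussian noise; any other noise source can be
    substituted, since only its power enters the gain computation.
    """
    rng = rng or random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # Solve 10*log10(P_clean / (gain^2 * P_noise)) = hnr_db for gain.
    gain = math.sqrt(mean_power(clean) / (mean_power(noise) * 10.0 ** (hnr_db / 10.0)))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled
```

Because both WAV files were normalized before writing, the absolute scale of a loaded signal is arbitrary; the computation above depends only on the power ratio, so it is unaffected by that normalization.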
Created:
2019-10-29



