Mobile Device Voice Recordings at King's College London (MDVR-KCL) from both early and advanced Parkinson's disease patients and healthy controls

Source: Mendeley Data. Updated 2024-03-27. Indexed 2024-06-28.

Download link:

https://zenodo.org/record/2867216


Resource description:

Dataset description

This description first covers the local conditions and other metadata, then the recording procedure and annotation methodology. Finally, it briefly describes the dataset deployment and publication.

Meta Information

The dataset was recorded at King's College London (KCL) Hospital, Denmark Hill, Brixton, London SE5 9RS, from 26 to 29 September 2017. We performed the voice recordings in a typical examination room of about ten square meters with a typical reverberation time of approx. 500 ms. Because the recordings were made in the realistic situation of a phone call (i.e. the participant holds the phone to the preferred ear and the microphone is in direct proximity to the mouth), one can assume that all recordings were made within the reverberation radius and can thus be considered “clean”.

Recording Procedure

We used a Motorola Moto G4 smartphone as the recording device. To perform the voice recordings on the device, we developed a “Toggle Recording App”, which uses the same functionality as the voice recording module of the i-PROGNOSIS smartphone application but is deployed as a standalone Android application. The voice capturing service therefore runs as a standalone background service on the recording device and triggers voice recordings via the on- and off-hook signals of the smartphone. Because we record the microphone signal directly, and not the GSM (“Global System for Mobile Communications”) compressed stream, we obtain high-quality recordings with a sample rate of 44.1 kHz and a bit depth of 16 bit (audio CD quality). The raw, uncompressed data is written directly to the external storage of the smartphone (SD card) in the well-known WAVE file format (.wav).
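The stated recording format (44.1 kHz, 16 bit, uncompressed WAVE) can be checked per file before any further processing. The following is a minimal sketch using only Python's standard `wave` module. The function names `check_wav_format` and `write_silent_wav` are hypothetical helpers introduced here for illustration and are not part of the dataset or of the recording app. The description does not state the number of channels, so the sketch reports it without checking it.

```python
import wave

# Format stated in the dataset description.
EXPECTED_RATE_HZ = 44_100
EXPECTED_BIT_DEPTH = 16


def check_wav_format(path):
    """Return (ok, details) for an uncompressed WAVE file.

    ok is True when sample rate and bit depth match the stated format.
    """
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        details = {
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "channels": wav.getnchannels(),       # reported, not checked
            "duration_s": wav.getnframes() / rate,
        }
    ok = (
        details["sample_rate_hz"] == EXPECTED_RATE_HZ
        and details["bit_depth"] == EXPECTED_BIT_DEPTH
    )
    return ok, details


def write_silent_wav(path, seconds=1.0, rate_hz=EXPECTED_RATE_HZ):
    """Hypothetical demo helper: write a mono 16-bit file containing silence."""
    n_frames = int(seconds * rate_hz)
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bit
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * n_frames)
```

Running `check_wav_format` over every downloaded file and collecting the ones where `ok` is false is a cheap way to catch files that were resampled or re-encoded somewhere between publication and local storage.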
We used the following workflow to perform a voice recording:

1. Ask the participant to relax a bit and then to make a phone call to the test executor (off-hook signal triggered).
2. Ask the participant to read out “The North Wind and the Sun”.
3. Depending on the constitution of the participant, also ask them to read out the “Tech. Engin. Computer applications in geography” snippet.
4. Start a spontaneous dialog with the participant: the test executor asks random questions about places of interest, local traffic, or personal interests, if acceptable.
5. The test executor ends the call with a farewell (on-hook signal triggered).

Annotation Scheme

For each healthy control (HC) and Parkinson's disease (PD) participant, we labeled the data with scores on the Hoehn & Yahr (H&Y) scale, as well as on part 5 of UPDRS II and part 18 of UPDRS III (UPDRS: Unified Parkinson's Disease Rating Scale). The voice recordings are named according to the scheme SI_HS_HYR_UPDRS II-5_UPDRS III-18, with

- SI as the subject identification in the form IDNN, N in [0, 9]
- HS as the health status label (hc or pd accordingly)
- HYR as the expert-assessed H&Y scale rating
- UPDRS II-5 as the corresponding expert peer-reviewed score
- UPDRS III-18 as the corresponding expert-assessed score

For example, an audio recording with the file name “ID02_pd_1_2_1.wav” is a recording of the third participant (the first participant was anonymized as ID00), who has PD, an H&Y rating of 1, a UPDRS II-5 score of 2, and a UPDRS III-18 score of 1.

Note that all healthy controls were also evaluated on these scales, because Parkinson's disease and voice degradation correlate but do not match exactly. As a result, the dataset includes one HC participant (ID31) with UPDRS II-5 and III-18 ratings of 1, and also includes PD patients with UPDRS II-5 and III-18 ratings of 0. This does not mean that the dataset contains ambiguous information; it means that an expert was not able to hear voice degradation that would warrant a UPDRS rating greater than zero.
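The file naming scheme above lends itself to automatic label extraction. The following is a minimal sketch of a parser, checked against the documented example “ID02_pd_1_2_1.wav”. The names `RecordingLabel` and `parse_recording_name` are hypothetical and introduced here for illustration. The pattern assumes that all three ratings are plain integers, as in the example; if the published files use other values (for instance half-step H&Y ratings), the pattern would need to be adapted.

```python
import re
from dataclasses import dataclass

# Assumed pattern: IDNN_<hc|pd>_<H&Y>_<UPDRS II-5>_<UPDRS III-18>.wav
# with integer ratings, as in the documented example.
_NAME_PATTERN = re.compile(r"^(ID\d{2})_(hc|pd)_(\d+)_(\d+)_(\d+)\.wav$")


@dataclass(frozen=True)
class RecordingLabel:
    subject_id: str      # e.g. "ID02"
    health_status: str   # "hc" or "pd"
    hoehn_yahr: int      # H&Y scale rating
    updrs_ii_5: int      # UPDRS II part 5 score
    updrs_iii_18: int    # UPDRS III part 18 score


def parse_recording_name(file_name: str) -> RecordingLabel:
    """Split a recording file name into its label fields."""
    match = _NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"file name does not follow the scheme: {file_name!r}")
    subject_id, status, hy, updrs2, updrs3 = match.groups()
    return RecordingLabel(subject_id, status, int(hy), int(updrs2), int(updrs3))


if __name__ == "__main__":
    print(parse_recording_name("ID02_pd_1_2_1.wav"))
```

Raising on non-matching names, rather than skipping them silently, makes it obvious when a file deviates from the documented scheme.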
Machine learning approaches may nevertheless be able to classify such cases correctly, or at least learn that PD and voice degradation correlate without matching in every case.

Appendix

North Wind and the Sun (Orthographic Version):

“The North Wind and the Sun were disputing which was the stronger, when a traveler came along wrapped in a warm cloak. They agreed that the one who first succeeded in making the traveler take his cloak off should be considered stronger than the other. Then the North Wind blew as hard as he could, but the more he blew the more closely did the traveler fold his cloak around him; and at last the North Wind gave up the attempt. Then the Sun shone out warmly, and immediately the traveler took off his cloak. And so the North Wind was obliged to confess that the Sun was the stronger of the two.”

BNC – Tech. Engin. Computer applications in geography snippet:

“[...] This is because there is less scattering of blue light as the atmospheric path length and consequently the degree of scattering of the incoming radiation is reduced. For the same reason, the sun appears to be whiter and less orange-coloured as the observer's altitude increases; this is because a greater proportion of the sunlight comes directly to the observer's eye. Figure 5.7 is a schematic representation of the path of electromagnetic energy in the visible spectrum as it travels from the sun to the Earth and back again towards a sensor mounted on an orbiting satellite. The paths of waves representing energy prone to scattering (that is, the shorter wavelengths) as it travels from sun to Earth are shown. To the sensor it appears that all the energy has been reflected from point P on the ground whereas, in fact, it has not, because some has been scattered within the atmosphere and has never reached the ground at all. [...]”


Created:

2023-06-28

Collected summary

Dataset introduction

Background and challenges

Background overview

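The dataset contains high-quality mobile-device voice recordings from early- and advanced-stage Parkinson's disease patients and healthy controls, with a sample rate of 44.1 kHz and a bit depth of 16 bit. The recordings cover read text and spontaneous dialog and are labeled with health status, Hoehn & Yahr rating, and UPDRS scores, which makes them suitable for machine learning research on the early detection of Parkinson's disease.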

The summary above was collected and generated by 遇见数据集.