Speech databases of typical children and children with SLI

Name: Speech databases of typical children and children with SLI
Creator: figshare
Published: 2020-09-04 12:56:21
License: no description available

Source: DataCite Commons. Updated: 2020-09-04. Indexed: 2024-07-25.

Download link:

https://figshare.com/articles/dataset/New_draft_item/2360626

Description:

The Laboratory of Artificial Neural Network Applications (LANNA) at the Czech Technical University in Prague (head of laboratory: Professor Jana Tučková) collaborates with the Department of Paediatric Neurology, 2nd Faculty of Medicine of Charles University in Prague, and with the Motol University Hospital (head of clinic: Professor Vladimír Komárek) on a project that studies children with specific language impairment (SLI).

The speech database contains two subgroups of recordings of children's speech from different types of speakers. The first subgroup (healthy) consists of recordings of children without speech disorders; the second subgroup (patients) consists of recordings of children with SLI. The children with SLI are classified by degree of severity (1: mild, 2: moderate, 3: severe); speech therapists and specialists from Motol Hospital assigned this classification. The children's speech was recorded between 2003 and 2013. The recordings were mostly made in a schoolroom or a speech therapist's consulting room, with background noise present. This setting reflects the natural environment in which the children live and is important for capturing their normal behavior.

The database of healthy children's speech was created as a reference database for the computer processing of children's speech. It was recorded on a SONY digital Dictaphone (sampling frequency fs = 16 kHz, 16-bit resolution, stereo, standard wav format) and on an MD SONY MZ-N710 (sampling frequency fs = 44.1 kHz, 16-bit resolution, stereo, standard wav format). The corpus was recorded in the natural environment of a schoolroom and in a clinic. This subgroup contains a total of 44 native Czech participants (15 boys, 29 girls) aged 4 to 12 years and was recorded during the period 2003–2005.

The database of children with SLI was recorded in a private speech therapist's office. The speech was captured with a SHURE lapel microphone using an AVID setup (MBox USB AD/DA converter and ProTools LE software) on an Apple laptop (iBook G4). The recordings are saved in the standard wav format at a sampling frequency of 44.1 kHz with 16-bit resolution in mono. This subgroup contains a total of 54 native Czech participants (35 boys, 19 girls) aged 6 to 12 years and was recorded during the period 2009–2013.

This package contains wav data sets for developing and testing methods for detecting children with SLI.

Software pack:

FORANA - original software for formant analysis, based on the MATLAB programming environment. Its development was driven mainly by the need to perform formant analysis correctly and to fully automate the extraction of formants from recorded speech signals. Development of the application is ongoing. The software was developed at LANNA, CTU FEE in Prague.

LABELING - a program for segmentation of the speech signal. It is part of the SOMLab program system. The software was developed at LANNA, CTU FEE in Prague.

PRAAT - acoustic analysis software created by Paul Boersma and David Weenink of the Institute of Phonetic Sciences of the University of Amsterdam. Home page: http://www.praat.org or http://www.fon.hum.uva.nl/praat/.

openSMILE - a feature extraction tool that extracts large audio feature spaces in real time. It combines features from Music Information Retrieval and Speech Processing. SMILE is an acronym for Speech & Music Interpretation by Large-space Extraction. It is written in C++ and is available both as a standalone command-line executable and as a dynamic library. Its main features are on-line incremental processing and modularity. Feature extractor components can be freely interconnected to create new and custom features, all via a simple configuration file. New components can be added to openSMILE via a binary plugin interface and a comprehensive API.

Citing: Florian Eyben, Martin Wöllmer, Björn Schuller: "openSMILE - The Munich Versatile and Fast Open-Source Audio Feature Extractor", In Proc. ACM Multimedia (MM), Florence, Italy, ACM, ISBN 978-1-60558-933-6, pp. 1459-1462, October 2010. doi:10.1145/1873951.1874246
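Because the two subgroups were recorded with different equipment, it may help to check each file's header against the formats stated in the description before further processing. The following is a minimal sketch using only Python's standard `wave` module; it is not part of the software pack. The names `WavFormat`, `DOCUMENTED_FORMATS`, `read_wav_format`, and `matches_documented_format` are illustrative assumptions, as are the subgroup keys "healthy" and "patients", which are taken from the wording of the description rather than from any documented directory layout.

```python
import wave
from dataclasses import dataclass


@dataclass(frozen=True)
class WavFormat:
    """Header properties of a PCM wav file."""
    sample_rate_hz: int
    channels: int
    bits_per_sample: int


# Formats as stated in the dataset description (keys are hypothetical labels).
DOCUMENTED_FORMATS = {
    "healthy": {
        WavFormat(16000, 2, 16),  # SONY digital Dictaphone, stereo
        WavFormat(44100, 2, 16),  # MD SONY MZ-N710, stereo
    },
    "patients": {
        WavFormat(44100, 1, 16),  # SHURE lapel microphone via MBox, mono
    },
}


def read_wav_format(path: str) -> WavFormat:
    """Read sampling rate, channel count, and bit depth from a wav header."""
    with wave.open(path, "rb") as wav:
        return WavFormat(
            sample_rate_hz=wav.getframerate(),
            channels=wav.getnchannels(),
            bits_per_sample=wav.getsampwidth() * 8,
        )


def matches_documented_format(path: str, subgroup: str) -> bool:
    """Return True if the file's header matches a format documented for the subgroup."""
    return read_wav_format(path) in DOCUMENTED_FORMATS[subgroup]
```

A call such as `matches_documented_format("some_recording.wav", "patients")` (the file name is a placeholder) would flag files whose headers differ from the description, for example recordings that need downmixing or resampling before the two subgroups are compared.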

Provider:

figshare

Created:

2016-02-18
