Non-acoustic Speech Dataset
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7090119
Resource description:
Non-acoustic speech sensing system based on flexible piezoelectric
Version 1.0.0
This Read_Me.txt file briefly describes the non-acoustic speech dataset and gives instructions for accessing it.
The flexible-piezoelectric non-acoustic speech sensing system is designed for testing device models in high-noise, complex environments. The system recorded jaw vibration signals from six male and five female subjects speaking ten different control commands in 90 dB of background noise. The data are reliable and highly intelligible, and reach 93.7% recognition accuracy by calculation. Overall, this work provides a non-acoustic speech dataset for Mandarin and documents the body part measured, the number of subjects, and the recording environment.
The dataset is available at:
https://doi.org/10.5281/zenodo.7090120
The data descriptor paper with details of data collection and cleaning process is under submission. For proper citation of the manuscript, please refer to the latest version of this dataset which includes the details.
This dataset and its descriptor paper were created by:
Shiji Yuan, Ying Sun, Dezhi Zheng, Xinlei Chen, Ying Ding, Shuai Wang, Shangchun Fan
For questions or suggestions, please e-mail Dezhi Zheng
Description:
Ten common words were chosen as the core of the vocabulary in this dataset. These ten command words can be used for commands in IoT or robotics applications: "forward", "backward", "right", "left", "stop", "up", "down", "draw", "drop", and "reset".
Recordings were made in Adobe Audition 2022 as mono, 16-bit audio at a 16 kHz sampling frequency and saved in WAV format. The dataset is organized in two ways: by subject number and by corpus number. In the first layout, the speech data of the 11 subjects are stored in separate folders named by subject serial number, and each folder contains subfolders categorized by corpus. In the second layout, the speech data of the ten corpus items are stored in separate folders named after the corpus content. Each data entry is labeled with its subject number, corpus number, and recording order. For example, the first recording of corpus 10 by subject 1 is labeled 1-10_1.
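The labeling scheme described above can be decoded mechanically. The following is a minimal Python sketch, assuming every label follows the subject-corpus_order pattern of the 1-10_1 example and that files may carry a .wav extension. The names RecordingLabel and parse_label are hypothetical and are not part of the dataset or its scripts.

```python
import re
from typing import NamedTuple


class RecordingLabel(NamedTuple):
    """Fields encoded in a label such as 1-10_1 (hypothetical helper type)."""
    subject: int  # subject number
    corpus: int   # corpus (command word) number
    order: int    # recording order for this subject and corpus


# Assumed pattern: <subject>-<corpus>_<order>, inferred from the README example.
_LABEL_RE = re.compile(r"^(\d+)-(\d+)_(\d+)$")


def parse_label(name: str) -> RecordingLabel:
    """Split a label (with or without a .wav extension) into its three numbers."""
    stem = name[:-4] if name.lower().endswith(".wav") else name
    match = _LABEL_RE.match(stem)
    if match is None:
        raise ValueError(f"label does not match subject-corpus_order: {name!r}")
    subject, corpus, order = (int(group) for group in match.groups())
    return RecordingLabel(subject, corpus, order)
```

Calling parse_label("1-10_1") yields subject 1, corpus 10, and recording order 1, matching the example in the README. Labels that deviate from the assumed pattern raise a ValueError instead of being silently misread.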
After data collection, a filtering algorithm that automatically detects low-quality non-acoustic speech data was applied to remove problematic recordings that are very short or very quiet. The script of the data filtering algorithm is provided in this repository.
For details of the data filtering process, please refer to the script (speech data filtering algorithm, in MATLAB) in this repository and to the data descriptor paper.
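To illustrate what "very short or very quiet" could mean in practice, here is a rough Python sketch that flags a recording by duration and RMS amplitude. It is not the authors' MATLAB script: the criteria, the thresholds MIN_DURATION_S and MIN_RMS, and the function names are all assumptions chosen for illustration. It only assumes the mono, 16-bit WAV format stated in this README.

```python
import array
import math
import sys
import wave

# Hypothetical thresholds; the values used by the authors' script are not stated here.
MIN_DURATION_S = 0.2  # recordings shorter than this count as "very short"
MIN_RMS = 100.0       # RMS below this (in 16-bit sample units) counts as "very quiet"


def measure(path):
    """Return (duration in seconds, RMS amplitude) of a mono 16-bit WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono, 16-bit PCM audio")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()      # WAV sample data is little-endian
    if not samples:
        return 0.0, 0.0
    duration = len(samples) / rate
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return duration, rms


def is_problematic(path, min_duration=MIN_DURATION_S, min_rms=MIN_RMS):
    """Flag a recording that is too short or too quiet under the assumed thresholds."""
    duration, rms = measure(path)
    return duration < min_duration or rms < min_rms
```

Because this repository contains only the processed version of the data, a check like this is mainly useful as a sanity test on the downloaded files or on new recordings. The MATLAB script remains the reference for the filtering that was actually applied.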
The dataset in this repository is the processed version. The raw dataset and removed audio files are not included in this repository.
File list:
Non-acoustic Speech Dataset.zip
speech data filtering algorithm.zip
Readme.txt
Created:
2022-09-20



