Comparison results on spectrogram dataset.

Figshare2024-04-26 更新2026-04-28 收录

下载链接：

https://figshare.com/articles/dataset/Comparison_results_on_spectrogram_dataset_/25703777

下载链接

链接失效反馈

官方服务：

资源简介：

Digital speech recognition is a challenging problem that requires the ability to learn complex signal characteristics such as frequency, pitch, intensity, timbre, and melody, which traditional methods often face issues in recognizing. This article introduces three solutions based on convolutional neural networks (CNN) to solve the problem: 1D-CNN is designed to learn directly from digital data; 2DS-CNN and 2DM-CNN have a more complex architecture, transferring raw waveform into transformed images using Fourier transform to learn essential features. Experimental results on four large data sets, containing 30,000 samples for each, show that the three proposed models achieve superior performance compared to well-known models such as GoogLeNet and AlexNet, with the best accuracy of 95.87%, 99.65%, and 99.76%, respectively. With 5-10% higher performance than other models, the proposed solution has demonstrated the ability to effectively learn features, improve recognition accuracy and speed, and open up the potential for broad applications in virtual assistants, medical recording, and voice commands.

创建时间：

2024-04-26

5,000+

优质数据集

54 个

任务类型

进入经典数据集