Abnormal sound detection for pumped storage units based on improved ViT model
中国科学数据 (China Scientific Data) | Updated 2026-03-30 | Indexed 2026-04-25
Download link:
https://www.sciengine.com/AA/doi/10.16232/j.cnki.1001-4179.2026.03.029
Resource description:
Anomaly detection for pumped storage units faces three challenges: frequent changes in operating conditions, scarce fault acoustic samples, and imbalanced data. To address them, this paper proposes an improved Vision Transformer (ViT)-based method for abnormal acoustic signal detection. First, the Mel-spectrogram algorithm converts one-dimensional acoustic signals into two-dimensional spectrograms, enriching the information content of the fault samples. These spectrograms are then fed into the ViT network, which leverages the interaction between the self-attention layers and image features to learn features that are invariant across multiple operating conditions. In addition, a domain prompt and prompt adaptation module is introduced; it predicts the unit's status in the target domain by assessing feature similarities between the source and target domains. Experimental results on a real-world dataset show that the proposed method achieves an average accuracy of 90.0%, a recall of 87.9%, and an F1-score of 0.887. On the MIMII public dataset, it outperforms the other methods compared, improving accuracy by 8.7%, recall by 6.92%, and F1-score by 4.52% on average. The proposed model therefore effectively accomplishes anomaly detection under multiple operating states and with limited fault samples.
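The abstract states that the unit's status in the target domain is predicted from feature similarities between the source and target domains, but it does not say which similarity measure or decision rule is used. The following is only a minimal sketch of one plausible reading, assuming cosine similarity and per-class mean feature vectors ("prototypes") computed from source-domain samples. The function names, the prototype idea, and the toy three-dimensional features are all hypothetical illustrations, not the paper's module.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def predict_status(target_feature, source_prototypes):
    """Return (label, similarity) of the most similar source prototype.

    Assumption: the status is the label of the nearest prototype under
    cosine similarity; the paper may use a different rule.
    """
    scores = {
        label: cosine_similarity(target_feature, prototype)
        for label, prototype in source_prototypes.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


# Toy source-domain features (hypothetical values, not real data).
source_features = {
    "normal": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "abnormal": [[0.1, 1.0, 0.8], [0.0, 0.9, 1.0]],
}
prototypes = {
    label: mean_vector(vectors) for label, vectors in source_features.items()
}

# A target-domain feature vector that lies close to the "abnormal" samples.
label, score = predict_status([0.2, 0.8, 0.9], prototypes)
print(label, round(score, 3))
```

In a real pipeline the feature vectors would come from the ViT encoder applied to Mel-spectrograms rather than from hand-written lists, and the comparison would likely be learned rather than fixed.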
Created:
2026-03-30



