Audio data preprocessing and prediction of phased operation times of control and protection switch
Source: DataCite Commons. Updated: 2025-04-27. Indexed: 2025-04-16.
Download link:
https://www.scidb.cn/detail?dataSetId=dd98b6dc629e40c1ba8f9aee506c6862
Description:
Use Python to predict the number of switch operations. The processing steps are:
1. Denoise the audio signals with a wavelet denoising algorithm to reduce the impact of noise on subsequent analysis.
2. Apply a band-pass filter (low cut-off at the 1% energy frequency, high cut-off at the 95% energy frequency) to reduce the influence of factors such as season and radio signals.
3. Normalize the data.
4. Extract features. The selected features are 'peak to peak value', 'effective value' (RMS), 'margin index', 'kurtosis index', 'skewness index', 'centroid frequency', 'frequency standard deviation', 'energy percentage of the first sub-band', 'energy percentage of the second sub-band', 'energy percentage of the third sub-band', and 'information entropy'.
5. Preprocess the data. Duplicate values: delete duplicate rows and keep the first occurrence. Missing values: fill with the column mean. Outliers: if the proportion of outliers in a column does not exceed the minimum threshold of 10%, the values are processed using the 1st and 99th percentiles; if the proportion exceeds the maximum threshold of 50%, the feature column is deleted; if the proportion lies between the two thresholds, the sample rows containing the outliers are deleted.
6. Discretize the data.
7. Compute the feature correlation coefficients and keep the features whose correlation coefficient with the number of operations is greater than 0.2. Then, among the remaining features (the number-of-operations column excluded), for features whose correlation coefficient with each other is greater than 0.8, remove the one with the lower correlation with the number of operations.
8. Plot the quantity distribution and inspect how the features change.
9. Train and predict with six models, all with default parameters: logistic regression, random forest, support vector machine, K-nearest neighbor, gradient boosting, and Bayesian classification.
10. Evaluate using accuracy, precision, recall, and F1 score.
11. Plot performance charts for the models, with the machine learning models on the horizontal axis and the evaluation metrics on the vertical axis.
12. Plot the cumulative correct discrimination rate.
The dataset contains 3 files: forecast.py, wav_and_power.py, and noise.py. forecast.py implements the steps described above. wav_and_power.py plots the time-domain waveform and power spectrum corresponding to each number of operations (note: to run the power-spectrum part, remove the # comment markers from that code). noise.py compares the new switch with and without signal preprocessing. File placement location: D:\program. Dataset used by the programs: https://doi.org/10.57760/sciencedb.16554 (file placement location: D:\program).
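To make the time-domain part of step 4 concrete, here is a minimal sketch that computes five of the listed features using only the Python standard library. The formulas are the definitions commonly used in vibration and acoustic analysis and are an assumption here; forecast.py may define them differently. The function name `time_domain_features` is hypothetical.

```python
import math


def time_domain_features(x):
    """Return common time-domain features of a 1-D signal (list of floats).

    Assumed definitions (not confirmed by the dataset description):
      peak_to_peak = max(x) - min(x)
      rms          = sqrt(mean(x^2))              ("effective value")
      margin       = max|x| / (mean(sqrt|x|))^2   ("margin index")
      kurtosis     = mean((x - mu)^4) / sigma^4   ("kurtosis index")
      skewness     = mean((x - mu)^3) / sigma^3   ("skewness index")
    """
    n = len(x)
    if n == 0:
        raise ValueError("signal must not be empty")

    mean = sum(x) / n
    # Population variance (divide by n, not n - 1).
    variance = sum((v - mean) ** 2 for v in x) / n
    sigma = math.sqrt(variance)
    if sigma == 0:
        raise ValueError("constant signal: kurtosis and skewness are undefined")

    peak = max(abs(v) for v in x)
    # Square of the mean root amplitude, the denominator of the margin index.
    root_amplitude = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2

    return {
        "peak_to_peak": max(x) - min(x),
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "margin": peak / root_amplitude,
        "kurtosis": sum((v - mean) ** 4 for v in x) / n / sigma ** 4,
        "skewness": sum((v - mean) ** 3 for v in x) / n / sigma ** 3,
    }
```

Calling `time_domain_features(samples)` on one denoised, normalized recording would give one row of the feature table; the frequency-domain features (centroid frequency, sub-band energies, information entropy) need a spectrum and are not covered by this sketch.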
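The three-way outlier rule in step 5 can be sketched as follows. The description does not say how outliers are detected or what "processing" with the 1st and 99th percentiles means, so two assumptions are made: outliers are flagged with the 1.5 x IQR rule by default, and percentile processing is interpreted as capping values at those percentiles. The names `iqr_outliers` and `handle_outliers` are hypothetical, and the table is represented as a plain dict of equal-length columns to keep the example free of third-party dependencies.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag outliers with the k * IQR rule (an assumed detection method)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def handle_outliers(table, min_ratio=0.10, max_ratio=0.50, detect=iqr_outliers):
    """Apply the step-5 outlier rule to a dict mapping column name -> list.

    ratio <= min_ratio             : cap values at the 1st/99th percentiles
    ratio >  max_ratio             : drop the whole feature column
    min_ratio < ratio <= max_ratio : drop the rows that contain an outlier
    """
    n_rows = len(next(iter(table.values())))
    rows_to_drop = set()
    kept = {}

    for name, values in table.items():
        mask = detect(values)
        ratio = sum(mask) / n_rows
        if ratio > max_ratio:
            continue  # too many outliers: the column is not kept
        if ratio <= min_ratio:
            # Few outliers: cap at the 1st and 99th percentiles (assumed meaning).
            pct = statistics.quantiles(values, n=100, method="inclusive")
            p1, p99 = pct[0], pct[98]
            kept[name] = [min(max(v, p1), p99) for v in values]
        else:
            # In between: remember which rows hold an outlier in this column.
            rows_to_drop.update(i for i, flag in enumerate(mask) if flag)
            kept[name] = list(values)

    # Remove the flagged rows from every surviving column at once.
    return {
        name: [v for i, v in enumerate(col) if i not in rows_to_drop]
        for name, col in kept.items()
    }
```

Because the default IQR detector can flag at most about half of a column, the 50% branch is unlikely to trigger with it; a different detector, for example one based on a fixed physical range, can be passed through the `detect` parameter.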
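Step 7 combines a relevance filter with a redundancy filter. The sketch below shows one possible reading of it, under these assumptions: the Pearson coefficient is used, correlations are compared by absolute value, and redundancy is resolved greedily from the most relevant feature downward. The function names and the dict-of-columns layout are hypothetical and not taken from forecast.py.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen here: a constant column counts as uncorrelated
    return cov / (norm_x * norm_y)


def select_features(features, target, keep_above=0.2, redundant_above=0.8):
    """Return the names of the features kept by the two-stage filter.

    features: dict mapping feature name -> list of values
    target:   list with the number of operations for each sample
    """
    # Stage 1: keep features sufficiently correlated with the target.
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    candidates = [name for name in features if relevance[name] > keep_above]

    # Stage 2: visit candidates from most to least relevant and skip any
    # candidate that is strongly correlated with an already selected feature.
    candidates.sort(key=lambda name: relevance[name], reverse=True)
    selected = []
    for name in candidates:
        redundant = any(
            abs(pearson(features[name], features[other])) > redundant_above
            for other in selected
        )
        if not redundant:
            selected.append(name)
    return selected
```

The greedy order guarantees that, within any group of mutually correlated features, the one most correlated with the number of operations survives, which matches the intent stated in step 7.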
Provided by:
Science Data Bank
Created:
2024-11-18



