Dataset for Machine Learning-Based Photovoltaic Arc-Fault Detection Using FFT and Preprocessed Current/Voltage Signals
Source: IEEE. Indexed: 2026-04-17
Download link:
https://ieee-dataport.org/documents/dataset-machine-learning-based-photovoltaic-arc-fault-detection-using-fft-and
Description:
This dataset contains raw and processed measurements acquired during controlled DC series arc-fault experiments for photovoltaic (PV) systems. Sixteen per-experiment CSV files (processed_signals/) record time, current (CH1), and voltage (CH2) at a 4 μs sampling interval under IEC 63027–compliant conditions. From these signals, fixed 200-sample windows were extracted, DC-removed and Hann-windowed, and converted into 204 time- and frequency-domain predictors (including FFT magnitudes). A binary label {Normal, Arc} was assigned to each window, yielding over 1.2 M labeled instances. Engineered features are provided both as CSV and as Apache Parquet (fft_features/experiments_fft.parquet) for fast, typed, columnar access. For reproducibility, we additionally include variant feature sets with Min–Max and Z-Score scaling, test-set predictions, and evaluation artifacts (ROC/AUC points, bootstrap confidence intervals, McNemar tests, and inference-latency tables). The dataset supports benchmarking of supervised models for PV arc-fault detection and reproduces the experiments reported in our IEEE Access manuscript, where RF, KNN, MLP, and CNN were evaluated with temporal cross-validation. Typical results show high F1/AUC and sub-millisecond inference latency for lightweight models, indicating suitability for edge deployment in PV inverters. Files are organized into processed_signals/, fft_features/ (engineered predictors + labels), normalized_features_minmax/, standardized_features_zscore/, predictions/, evaluation_metrics/, notebooks/, and docs/ (README and notes). A Jupyter notebook is provided to reproduce the entire pipeline.
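The windowing step described above (200-sample windows, DC removal, Hann window, FFT magnitudes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `window_features`, the non-overlapping stride, and the use of NumPy's symmetric `np.hanning` are assumptions made for illustration, and the sketch produces only the FFT-magnitude portion of the features, not the full set of 204 predictors. The synthetic test signal stands in for the CH1 current column.

```python
import numpy as np

SAMPLE_INTERVAL_S = 4e-6  # 4 μs sampling interval (from the description)
WINDOW = 200              # fixed 200-sample windows (from the description)


def window_features(signal, window=WINDOW):
    """Return FFT magnitudes for each window of a 1-D signal.

    Assumption: windows do not overlap; the dataset description does not
    state the stride actually used.
    """
    signal = np.asarray(signal, dtype=float)
    n_windows = signal.size // window
    # Drop any trailing samples that do not fill a whole window.
    frames = signal[: n_windows * window].reshape(n_windows, window)
    # DC removal: subtract each window's mean.
    frames = frames - frames.mean(axis=1, keepdims=True)
    # Hann window (assumption: NumPy's symmetric variant).
    frames = frames * np.hanning(window)
    # One-sided FFT magnitudes, shape (n_windows, window // 2 + 1).
    return np.abs(np.fft.rfft(frames, axis=1))


# Frequency axis for the one-sided spectrum, in Hz.
freqs = np.fft.rfftfreq(WINDOW, d=SAMPLE_INTERVAL_S)

# Synthetic stand-in for a current trace: 8 A DC offset plus a 25 kHz tone.
t = np.arange(5 * WINDOW) * SAMPLE_INTERVAL_S
demo_signal = 8.0 + np.sin(2 * np.pi * 25e3 * t)

feats = window_features(demo_signal)
peak_bins = feats.argmax(axis=1)
print(feats.shape, freqs[peak_bins])
```

At a 4 μs interval the sampling rate is 250 kHz, so a 200-sample window spans 0.8 ms and the one-sided spectrum has 101 bins spaced 1.25 kHz apart, up to the 125 kHz Nyquist limit.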
Provided by:
Filipe Ramos; Bruno Lima; José Neto; Gabryel Gouveia; Michel Oliveira



