five

PyHydroQC Sensor Data QC: Single Site Example

收藏
DataONE2021-12-05 更新2024-06-08 收录
下载链接:
https://search.dataone.org/view/sha256:ad76dca73192d82d1e4b2cce298ce7b75dca169e58bd7edd80e48c4e33f7aacc
下载链接
链接失效反馈
官方服务:
资源简介:
This resource contains an example script for using the software package PyHydroQC. PyHydroQC was developed to identify and correct anomalous values in time series data collected by in situ aquatic sensors. For more information, see the code repository: https://github.com/AmberSJones/PyHydroQC and the documentation: https://ambersjones.github.io/PyHydroQC/. The package may be installed from the Python Package Index. This script applies the functions to data from a single site in the Logan River Observatory, which is included in the repository. The data collected in the Logan River Observatory are sourced at http://lrodata.usu.edu/tsa/ or on HydroShare: https://www.hydroshare.org/search/?q=logan%20river%20observatory. Anomaly detection methods include ARIMA (AutoRegressive Integrated Moving Average) and LSTM (Long Short Term Memory). These are time series regression methods that detect anomalies by comparing model estimates to sensor observations and labeling points as anomalous when they exceed a threshold. There are multiple possible approaches for applying LSTM for anomaly detection/correction. - Vanilla LSTM: uses past values of a single variable to estimate the next value of that variable. - Multivariate Vanilla LSTM: uses past values of multiple variables to estimate the next value for all variables. - Bidirectional LSTM: uses past and future values of a single variable to estimate a value for that variable at the time step of interest. - Multivariate Bidirectional LSTM: uses past and future values of multiple variables to estimate a value for all variables at the time step of interest. The correction approach uses piecewise ARIMA models. Each group of consecutive anomalous points is considered as a unit to be corrected. Separate ARIMA models are developed for valid points preceding and following the anomalous group. Model estimates are blended to achieve a correction. The anomaly detection and correction workflow involves the following steps: 1. Retrieving data 2. Applying rules-based detection to screen data and apply initial corrections 3. Identifying and correcting sensor drift and calibration (if applicable) 4. Developing a model (i.e., ARIMA or LSTM) 5. Applying model to make time series predictions 6. Determining a threshold and detecting anomalies by comparing sensor observations to modeled results 7. Widening the window over which an anomaly is identified 8. Aggregating detections resulting from multiple models 9. Making corrections for anomalous events Instructions to run the notebook through the CUAHSI JupyterHub: 1. Click \"Open with...\" at the top of the resource and select the CUAHSI JupyterHub. You may need to sign into CUAHSI JupyterHub using your HydroShare credentials. 2. Select 'Python 3.8 - Scientific' as the server and click Start. 2. From your JupyterHub directory, click on the PyHydroQC_example.ipynb file. 3. Execute each cell in the code by clicking the Run button.
创建时间:
2021-12-05
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作