LEN-DB - Local earthquakes detection: a benchmark dataset of 3-component seismograms built on a global scale
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/3648231
Description:
In this study (the paper) we present a large dataset of 1,249,411 3-component seismograms, recorded along the vertical, north, and east components of 1,487 broad-band or very broad-band receivers distributed worldwide. It includes 631,105 3-component seismograms generated by 304,878 local earthquakes and labeled as earthquakes (EQ), and 618,306 labeled as noise (AN). We collect only local-earthquake data because small-magnitude events, which generate relatively small amplitudes and are easily attenuated, are often hard to detect yet provide valuable information about earthquake processes.
The labeled data are split into two HDF5-Groups: EQ and AN. Each group contains one HDF5-Dataset per 3-component seismogram, named according to the format net_sta_starttime, where net, sta, and starttime are the seismic network, the station, and the start time of the seismograms. Each HDF5-Dataset (i.e. each triplet of seismograms) has an attribute that gives access to the respective metadata. In addition, the HDF5-Group Stations gives access to the stations' metadata through one HDF5-Dataset per receiver used for collecting the waveforms, named according to the format net_sta.
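The layout described above can be browsed with a few lines of Python. The following is a minimal sketch, assuming the third-party h5py package is installed and that the file follows the EQ / AN / Stations layout exactly as described. The helper names (`parse_trace_key`, `station_key`, `iter_waveforms`), the file path, and the example key are hypothetical; the attribute names and the exact start-time format are not specified here, so the sketch returns them as-is rather than interpreting them.

```python
from typing import Any, Dict, Iterator, NamedTuple, Optional, Tuple


class TraceKey(NamedTuple):
    """Parts of an HDF5-Dataset name of the form net_sta_starttime."""
    net: str
    sta: str
    starttime: str


def parse_trace_key(name: str) -> TraceKey:
    # Split on the first two underscores only, so whatever format the
    # start time uses is kept intact in the third field.
    parts = name.split("_", 2)
    if len(parts) != 3:
        raise ValueError(f"expected net_sta_starttime, got {name!r}")
    return TraceKey(*parts)


def station_key(key: TraceKey) -> str:
    # Name of the matching HDF5-Dataset in the Stations group (net_sta).
    return f"{key.net}_{key.sta}"


def iter_waveforms(
    path: str, label: str = "EQ", limit: Optional[int] = None
) -> Iterator[Tuple[TraceKey, Any, Dict[str, Any]]]:
    """Yield (key, waveform array, attributes) from the EQ or AN group."""
    import h5py  # third-party; imported here so the helpers above work without it

    with h5py.File(path, "r") as f:
        group = f[label]
        for i, (name, dset) in enumerate(group.items()):
            if limit is not None and i >= limit:
                break
            # dset[()] reads the whole triplet into memory as a NumPy array;
            # dset.attrs holds the metadata attribute mentioned above.
            yield parse_trace_key(name), dset[()], dict(dset.attrs)
```

For example, `for key, data, attrs in iter_waveforms("LEN-DB.hdf5", "AN", limit=5): print(key, data.shape, list(attrs))` would print the first five noise triplets, where `"LEN-DB.hdf5"` stands in for wherever the downloaded file was saved. Passing a `limit` is advisable when exploring, since each group holds several hundred thousand datasets.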
This global dataset is intended for a wide range of seismological and signal processing tasks on single-station recordings, and its size particularly suits machine learning (ML) applications. Applying ML to this dataset shows that a simple Convolutional Neural Network of 67,939 parameters discriminates between earthquake and noise single-station recordings with high accuracy (93.2%), even when applied in regions not covered by the training set. We make the dataset publicly available as a single file in HDF5 data format, intending to provide the seismological and broader scientific community with a time-series benchmark to be used as a testing ground in seismology and signal processing.
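As a rough point of reference for the reported accuracy, the class counts given in the description can be turned into a majority-class baseline. This is only a sketch: it assumes the evaluation data have about the same class balance as the full dataset, which is not stated here, and the variable names are illustrative.

```python
# Class counts as given in the dataset description.
n_eq = 631_105       # seismograms labeled as earthquakes (EQ)
n_an = 618_306       # seismograms labeled as noise (AN)
n_total = 1_249_411  # total 3-component seismograms

# The two classes should account for every seismogram.
assert n_eq + n_an == n_total

# Accuracy of a trivial classifier that always predicts the larger class.
baseline = max(n_eq, n_an) / n_total
reported = 0.932  # CNN accuracy quoted in the description

print(f"majority-class baseline: {baseline:.1%}")
print(f"reported CNN accuracy:   {reported:.1%}")
```

Because the two classes are nearly balanced, the trivial baseline sits at roughly 50.5%, so the quoted 93.2% is well above what class imbalance alone could explain.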
Created:
2020-11-02



