
Clustering Analysis Methods for GNSS Observations: A Data-Driven Approach to Identifying California’s Major Faults

Source: DataCite Commons. Updated 2023-09-15; indexed 2025-04-16.
Download link:
https://dataverse.jpl.nasa.gov/citation?persistentId=doi:10.48577/jpl.51PKC3
Description:
We present a data-driven approach to clustering, or grouping, Global Navigation Satellite System (GNSS) stations according to their observed velocities, displacements, and/or other selected characteristics. Clustering GNSS stations not only has the potential to identify useful scientific information (e.g., separating regions of post-seismic motion), but is also a necessary initial step in other GNSS analysis methods, such as those used to detect aseismic transient signals (Granat et al., 2013). Using this approach, desired features of the data can be selected for clustering, including some subset of the three displacement or velocity components, uncertainty estimates, the station location, and any other relevant information present in the data set. Based on those selections, the clustering procedure autonomously groups the GNSS stations according to a selected clustering method; some methods require that the number of groups be specified in advance, while others estimate the number of groups from the data. We have implemented this approach as a Python application, allowing us to draw upon the full range of open-source clustering methods available in Python’s scikit-learn package (Pedregosa et al., 2011). The application returns the GNSS stations labeled by group both in tabular form and as a color-coded KML file for overlay in Google Earth with other sources of information. Our implementation is designed to work with the GNSS displacement and velocity information available from GeoGateway (Heflin et al., 2020; Donnellan et al., in press), a map-based science gateway supported by NASA, but is easily extendable to other data sources or output formats. Our analysis is focused on California and western Nevada. The results typically show partitions that follow faults or geologic boundaries, including partitions associated with recent large earthquakes and post-seismic motion.
The San Andreas fault system is the most prominent partition, reflecting Pacific-North American plate boundary motion. Deformation reflected as class boundaries is distributed north and south of the central California creeping section. In most models, the southern San Andreas fault connects with the Eastern California Shear Zone (ECSZ) rather than continuing through the San Gorgonio Pass.
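
The workflow described above (select features, cluster the stations, return labels as a table and as color-coded KML) can be illustrated with a minimal sketch. This is not the authors' application: the station IDs, coordinates, and velocities are invented for illustration, the function names are hypothetical, and a small pure-Python k-means stands in for a scikit-learn estimator so the example has no dependencies. With scikit-learn installed, the clustering step would more plausibly be a call such as `sklearn.cluster.KMeans(n_clusters=2).fit_predict(features)`.

```python
import xml.etree.ElementTree as ET

# Hypothetical station records: (id, lat, lon, east mm/yr, north mm/yr).
# All values are made up for illustration; they are not GeoGateway data.
STATIONS = [
    ("STA1", 35.0, -120.5, -25.0, 30.0),
    ("STA2", 35.2, -120.3, -24.0, 29.0),
    ("STA3", 35.1, -118.0, -5.0, 6.0),
    ("STA4", 35.3, -117.8, -4.0, 5.0),
]

# KML colors are aabbggrr hex strings (alpha, blue, green, red).
COLORS = ["ff0000ff", "ffff0000", "ff00ff00", "ff00ffff"]  # red, blue, green, yellow


def select_features(stations, columns=(3, 4)):
    """Pick the record columns to cluster on (default: east and north velocity)."""
    return [[float(rec[c]) for c in columns] for rec in stations]


def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means, seeded with the first k points (deterministic)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def to_kml(stations, labels):
    """Build a KML document with one colored placemark per station."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    doc = ET.SubElement(kml, "Document")
    for (sid, lat, lon, *_), lab in zip(stations, labels):
        pm = ET.SubElement(doc, "Placemark")
        ET.SubElement(pm, "name").text = f"{sid} (group {lab})"
        icon_style = ET.SubElement(ET.SubElement(pm, "Style"), "IconStyle")
        ET.SubElement(icon_style, "color").text = COLORS[lab % len(COLORS)]
        point = ET.SubElement(pm, "Point")
        # KML coordinates are written as longitude,latitude.
        ET.SubElement(point, "coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


features = select_features(STATIONS)
labels = kmeans(features, k=2)

# Tabular output: station id and group label.
for rec, lab in zip(STATIONS, labels):
    print(f"{rec[0]}\t{lab}")

kml_text = to_kml(STATIONS, labels)
```

Changing the `columns` argument of `select_features` is how this sketch mimics the feature selection described in the abstract, for example adding latitude and longitude so that station location also influences the grouping. Features on different scales would normally need to be standardized first, which the sketch omits.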

Provider:
Root
Created:
2023-09-14