Data from: Statistical significance for maximally persistent topological features via the Gumbel distribution
Source: DataCite Commons. Updated 2022-11-04; indexed 2024-07-13.
Download link:
https://idn.duke.edu/ark:/87924/r48k7ft9j
Description:
Topological data analysis (TDA) is finding traction as a novel way to discover and quantify structure in data. While there has been great success in descriptive characterizations, a rigorous statistical framework for the field is still in development. Here we look at a commonly used metric -- the length of the maximally persistent feature in a point cloud -- and develop a framework for hypothesis testing. Because the distribution of persistence lengths in Poisson spatial point clouds is well-aligned with the probabilistic theory of extreme values, we argue that critical values of the Gumbel distribution should be used when assessing statistical significance. For one-dimensional topological features (holes) in two-dimensional point clouds, we use the theory to predict an asymptotic rescaling of maximally persistent features that results in convergence in distribution to an approximately standard Gumbel random variable. We then propose a model for critical values as a function of point density and demonstrate its effectiveness on some standard TDA challenges.
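
As a rough illustration of the testing step described above, the sketch below computes upper-tail critical values and p-values for a standard Gumbel variable, whose CDF is F(z) = exp(-exp(-z)). It assumes the maximal persistence has already been rescaled to an approximately standard Gumbel statistic; that rescaling and the density-dependent critical-value model are specific to the dataset's accompanying work and are not reproduced here. The function names and the input `z` are hypothetical, chosen only for this example.

```python
import math


def gumbel_critical_value(alpha: float) -> float:
    """Upper-tail critical value c of a standard Gumbel: P(Z > c) = alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Invert F(c) = exp(-exp(-c)) = 1 - alpha.
    return -math.log(-math.log1p(-alpha))


def gumbel_p_value(z: float) -> float:
    """Upper-tail probability P(Z > z) for a standard Gumbel Z."""
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at large z.
    return -math.expm1(-math.exp(-z))


def is_significant(z: float, alpha: float = 0.05) -> bool:
    """Reject the null when the rescaled statistic z exceeds the critical value.

    ASSUMPTION: z is an already-rescaled maximal persistence; how to rescale
    it is not shown in this sketch.
    """
    return z > gumbel_critical_value(alpha)
```

For example, `gumbel_critical_value(0.05)` is roughly 2.97, so under these assumptions a rescaled statistic above that value would be flagged at the 5% level.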
Provider:
Duke Research Data Repository
Created:
2022-11-03



