Data from: Complexity of possibly-gapped histogram and Analysis of Histogram (ANOHT)
Source: DataONE | Updated: 2018-01-31 | Indexed: 2024-06-25
Download link:
https://search.dataone.org/view/null
Description:
We demonstrate that gaps and distributional patterns are inseparable pieces of biological and mechanistic information contained in real-valued measurement data. Such patterns are discovered through data-driven, possibly-gapped histograms, which in turn give rise to a geometry-based Analysis of Histogram (ANOHT). Constructing a possibly-gapped histogram is a complex problem of statistical mechanics because the ensemble of candidate histograms is captured by a two-layer Ising model. The construction is also a distinctive problem of information theory from the perspective of data compression via uniformity. By defining a Hamiltonian (or energy) as the sum of the total coding lengths of the bin boundaries and the total decoding errors within bins, the problem of computing the minimum-energy macroscopic states is, surprisingly, resolved by applying the hierarchical clustering algorithm. A possibly-gapped histogram thus corresponds to a macrostate. The first phase of ANOHT is developed for simultaneous comparison of multiple treatments, while the second phase is developed, based on classic empirical process theory, for a tree geometry that can check the authenticity of branches of treatments. The well-known Iris data set is used to illustrate our technical developments, and a large baseball pitching data set and a heavily right-censored divorce data set are analyzed to showcase the existence of gaps and the utility of ANOHT.
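The idea of building a possibly-gapped histogram through hierarchical clustering can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes single linkage and a user-chosen number of bins, and it does not evaluate the Hamiltonian (coding lengths plus decoding errors) described above. The function name `gapped_histogram` and its parameters are hypothetical. For one-dimensional data, cutting a single-linkage tree into k clusters is equivalent to cutting the sorted values at the k - 1 widest gaps, which is what the code does.

```python
def gapped_histogram(values, n_bins):
    """Return (low, high, count) bins; the space between bins is an explicit gap.

    Illustrative only: single linkage and a fixed n_bins are assumptions,
    not the energy-based selection described in the abstract.
    """
    xs = sorted(values)
    if not 1 <= n_bins <= len(xs):
        raise ValueError("n_bins must be between 1 and len(values)")

    # Gap widths between consecutive sorted points, tagged with the index
    # of the point on the left side of each gap.
    gaps = [(xs[i + 1] - xs[i], i) for i in range(len(xs) - 1)]

    # Keep the n_bins - 1 widest gaps as cut points, then restore index order.
    cut_after = sorted(i for _, i in sorted(gaps, reverse=True)[: n_bins - 1])

    bins, start = [], 0
    for i in cut_after:
        bins.append((xs[start], xs[i], i - start + 1))
        start = i + 1
    bins.append((xs[start], xs[-1], len(xs) - start))
    return bins


if __name__ == "__main__":
    # Three well-separated groups yield three bins with gaps in between.
    print(gapped_histogram([1, 2, 3, 10, 11, 20], 3))
    # [(1, 3, 3), (10, 11, 2), (20, 20, 1)]
```

Each bin spans only the observed range of its cluster, so intervals containing no data (here 3 to 10 and 11 to 20) are left uncovered instead of being reported as zero-count bins.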
Created:
2018-01-31



