Data from: Complexity of possibly gapped histogram and analysis of histogram
Source: DataCite Commons. Updated: 2025-04-01. Indexed: 2025-04-09.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.bs632
Description:
We demonstrate that gaps and distributional patterns embedded within
real-valued measurements are inseparable biological and mechanistic
information contents of the system. Such patterns are discovered through
a data-driven possibly gapped histogram, which in turn leads to the
geometry-based analysis of histogram (ANOHT). Constructing a possibly
gapped histogram is a complex problem of statistical mechanics because the
ensemble of candidate histograms is captured by a two-layer Ising
model. This construction is also a distinctive problem of information
theory from the perspective of data compression via uniformity. When the
Hamiltonian (or energy) is defined as the sum of the total coding lengths of
boundaries and the total decoding errors within bins, the problem of computing
the minimum-energy macroscopic states is, surprisingly, resolved by applying
the hierarchical clustering algorithm. Thus, a possibly gapped histogram
corresponds to a macro-state. The first phase of ANOHT is
developed for simultaneous comparison of multiple treatments, while the
second phase of ANOHT is developed based on classical empirical process
theory for a tree-geometry that can check the authenticity of branches of
the treatment tree. The well-known Iris data are used to illustrate our
technical developments. A large baseball pitching dataset and a
heavily right-censored divorce dataset are also analysed to showcase the
existence of gaps and the utility of ANOHT.
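
As a rough illustration of the idea described above (an energy built from boundary coding cost plus within-bin error, minimised with help from hierarchical clustering), the following is a minimal sketch and not the authors' implementation. The function names `single_linkage_bins` and `toy_energy`, the choice of single linkage, the flat per-boundary cost, and the absolute-deviation-from-uniform error are all assumptions made for this example; the dataset description does not specify them.

```python
def single_linkage_bins(values, n_bins):
    """Split 1-D data into n_bins groups by cutting at the widest gaps.

    Assumption: in one dimension, cutting a single-linkage dendrogram into
    n_bins clusters is the same as cutting at the (n_bins - 1) largest gaps
    between consecutive sorted values.
    """
    xs = sorted(values)
    if not 1 <= n_bins <= len(xs):
        raise ValueError("n_bins must be between 1 and the number of values")
    # Indices i of the gaps xs[i+1] - xs[i], widest first.
    widest = sorted(range(len(xs) - 1),
                    key=lambda i: xs[i + 1] - xs[i],
                    reverse=True)
    cuts = sorted(widest[: n_bins - 1])
    bins, start = [], 0
    for c in cuts:
        bins.append(xs[start : c + 1])
        start = c + 1
    bins.append(xs[start:])
    return bins


def toy_energy(bins, boundary_cost=1.0):
    """Hypothetical energy: boundary cost plus deviation from uniformity.

    Each bin pays for two boundaries. Its error term is the sum of absolute
    differences between the sorted values and evenly spaced points spanning
    the bin, a stand-in for the "decoding error" under a uniform code.
    """
    energy = 0.0
    for b in bins:
        energy += 2 * boundary_cost
        m = len(b)
        if m > 1:
            lo, hi = b[0], b[-1]
            for j, x in enumerate(b):
                energy += abs(x - (lo + (hi - lo) * j / (m - 1)))
    return energy


# Toy data with two obvious gaps (made up for illustration only).
data = [10, 11, 12, 50, 51, 52, 90]
best_k = min(range(1, len(data) + 1),
             key=lambda k: toy_energy(single_linkage_bins(data, k)))
print(best_k, single_linkage_bins(data, best_k))
```

On this toy input, too few bins are penalised by the uniformity error across the gaps, while too many bins are penalised by the boundary cost, so the minimum falls at the partition that separates the data at its gaps.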
Provider:
Dryad
Created:
2018-01-31



