Visualizing Probability Distributions Across Bivariate Cyclic Temporal Granularities
Source: DataCite Commons. Updated: 2021-07-26. Indexed: 2024-07-28.
Download link:
https://tandf.figshare.com/articles/dataset/Visualizing_probability_distributions_across_bivariate_cyclic_temporal_granularities/14749324
Description:
Deconstructing a time index into time granularities can assist in exploration and automated analysis of large temporal datasets. This article describes classes of time deconstructions using linear and cyclic time granularities. Linear granularities respect the linear progression of time such as hours, days, weeks and months. Cyclic granularities can be circular such as hour-of-the-day, quasi-circular such as day-of-the-month, and aperiodic such as public holidays. The hierarchical structure of granularities creates a nested ordering: hour-of-the-day and second-of-the-minute are single-order-up. Hour-of-the-week is multiple-order-up, because it passes over day-of-the-week. Methods are provided for creating all possible granularities for a time index. A recommendation algorithm provides an indication whether a pair of granularities can be meaningfully examined together (a “harmony”), or when they cannot (a “clash”). Time granularities can be used to create data visualizations to explore for periodicities, associations and anomalies. The granularities form categorical variables (ordered or unordered) which induce groupings of the observations. Assuming a numeric response variable, the resulting graphics are then displays of distributions compared across combinations of categorical variables. The methods implemented in the open source R package gravitas are consistent with a tidy workflow, with probability distributions examined using the range of graphics available in ggplot2. Supplementary files for this article are available online.
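As a rough illustration of the ideas in the description, the sketch below derives three cyclic granularities from an hourly time index and checks which pairs leave category combinations unobserved. It is written in plain Python rather than with the gravitas package, so none of it reflects that package's API. The helper names (`hour_of_day`, `day_of_week`, `hour_of_week`, `empty_combinations`) are hypothetical, and treating "some combinations never occur" as the sign of a clash is an assumption made here for illustration, not necessarily the article's exact recommendation algorithm.

```python
from datetime import datetime, timedelta
from itertools import product


def hour_of_day(ts):
    # Circular granularity, single-order-up: hour within its day (0..23).
    return ts.hour


def day_of_week(ts):
    # Circular granularity: day within its week (Monday = 0 .. Sunday = 6).
    return ts.weekday()


def hour_of_week(ts):
    # Multiple-order-up: passes over day-of-week to index the hour within the week (0..167).
    return ts.weekday() * 24 + ts.hour


def empty_combinations(timestamps, gran_a, gran_b):
    """Return the (a, b) category pairs that never occur together in the data."""
    levels_a = {gran_a(t) for t in timestamps}
    levels_b = {gran_b(t) for t in timestamps}
    observed = {(gran_a(t), gran_b(t)) for t in timestamps}
    return set(product(levels_a, levels_b)) - observed


# Four weeks of hourly timestamps (synthetic index, for illustration only).
start = datetime(2021, 6, 7)
timestamps = [start + timedelta(hours=h) for h in range(4 * 7 * 24)]

# Every hour-of-day occurs on every day-of-week: no gaps, a harmony-like pair.
harmony_gaps = empty_combinations(timestamps, hour_of_day, day_of_week)

# Each hour-of-week fixes exactly one hour-of-day: most cells are empty, a clash-like pair.
clash_gaps = empty_combinations(timestamps, hour_of_day, hour_of_week)

print(len(harmony_gaps), len(clash_gaps))
```

In a pair with no empty cells, each combination of categories defines a group of observations whose response distribution can be drawn and compared. In a pair with many structurally empty cells, most panels of such a display would contain no data.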
Provider:
Taylor & Francis
Created:
2021-06-08



