ZeitZeiger: supervised learning for high-dimensional data from an oscillatory system

Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.hn8gp
Description:
Numerous biological systems oscillate over time or space. Despite these oscillators’ importance, data from an oscillatory system is problematic for existing methods of regularized supervised learning. We present ZeitZeiger, a method to predict a periodic variable (e.g. time of day) from a high-dimensional observation. ZeitZeiger learns a sparse representation of the variation associated with the periodic variable in the training observations, then uses maximum-likelihood to make a prediction for a test observation. We applied ZeitZeiger to a comprehensive dataset of genome-wide gene expression from the mammalian circadian oscillator. Using the expression of 13 genes, ZeitZeiger predicted circadian time (internal time of day) in each of 12 mouse organs to within ∼1 h, resulting in a multi-organ predictor of circadian time. Compared to the state-of-the-art approach, ZeitZeiger was faster, more accurate and used fewer genes. We then validated the multi-organ predictor on 20 additional datasets comprising nearly 800 samples. Our results suggest that ZeitZeiger not only makes accurate predictions, but also gives insight into the behavior and structure of the oscillator from which the data originated. As our ability to collect high-dimensional data from various biological oscillators increases, ZeitZeiger should enhance efforts to convert these data to knowledge.
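
The prediction step described above (model how each feature varies with the periodic variable, then pick the time that maximizes the likelihood of a test observation) can be illustrated with a minimal sketch. This is not ZeitZeiger's implementation: the sparse-representation step is omitted, and the single-harmonic least-squares fit, the independent Gaussian noise model, the 24 h period, and all function names (`design`, `fit`, `predict`, `circular_error`) are assumptions made here for illustration only.

```python
import numpy as np

PERIOD = 24.0  # hours; assumed period of the oscillator


def design(t):
    """Single-harmonic design matrix (intercept, cos, sin) for times t."""
    w = 2 * np.pi * np.asarray(t, dtype=float) / PERIOD
    return np.column_stack([np.ones_like(w), np.cos(w), np.sin(w)])


def fit(X, t):
    """Fit a periodic mean and a residual variance for each feature.

    X has shape (n_samples, n_features); t holds the known time of each sample.
    """
    A = design(t)
    coef, *_ = np.linalg.lstsq(A, X, rcond=None)  # shape (3, n_features)
    resid = X - A @ coef
    var = resid.var(axis=0, ddof=3) + 1e-12  # guard against zero variance
    return coef, var


def predict(x, coef, var, grid_size=240):
    """Return the grid time that maximizes the Gaussian log-likelihood of x."""
    grid = np.linspace(0.0, PERIOD, grid_size, endpoint=False)
    mu = design(grid) @ coef  # expected features at each candidate time
    # Terms that do not depend on time are dropped from the log-likelihood.
    loglik = -0.5 * (((x - mu) ** 2) / var).sum(axis=1)
    return grid[np.argmax(loglik)]


def circular_error(pred, true):
    """Absolute error on a circle of length PERIOD (23.5 h vs 0.5 h is 1 h)."""
    d = (np.asarray(pred) - np.asarray(true)) % PERIOD
    return np.minimum(d, PERIOD - d)
```

With training data `X_train` and times `t_train`, one would call `coef, var = fit(X_train, t_train)` and then `predict(x_test, coef, var)` for each test observation. Errors should be scored with `circular_error`, because the predicted variable wraps around at the period.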
Created: 2017-01-08