Data_Sheet_5_Mind the Queue: A Case Study in Visualizing Heterogeneous Behavioral Patterns in Livestock Sensor Data Using Unsupervised Machine Learning Techniques.ZIP
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Data_Sheet_5_Mind_the_Queue_A_Case_Study_in_Visualizing_Heterogeneous_Behavioral_Patterns_in_Livestock_Sensor_Data_Using_Unsupervised_Machine_Learning_Techniques_ZIP/12799829
Description:
Sensor technologies allow ethologists to continuously monitor the behaviors of large numbers of animals over extended periods of time. This creates new opportunities to study livestock behavior in commercial settings, but also new methodological challenges. Densely sampled behavioral data from large heterogeneous groups can contain a range of complex patterns and stochastic structures that may be difficult to visualize using conventional exploratory data analysis techniques. The goal of this research was to assess the efficacy of unsupervised machine learning tools in recovering complex behavioral patterns from such datasets to better inform subsequent statistical modeling. This methodological case study was carried out using records on milking order, or the sequence in which cows arrange themselves as they enter the milking parlor. Data were collected over a 6-month period from a closed group of 200 mixed-parity Holstein cattle on an organic dairy. Cows at the front and rear of the queue proved more consistent in their entry position than animals at the center of the queue, a systematic pattern of heterogeneity more clearly visualized using entropy estimates, a scale- and distribution-free alternative to variance that is robust to outliers. Dimension reduction techniques were then used to visualize relationships between cows. No evidence of social cohesion was recovered, but Diffusion Map embeddings proved more adept than PCA at revealing the underlying linear geometry of these data. Median parlor entry positions from the pre- and post-pasture subperiods were highly correlated (R = 0.91), suggesting a surprising degree of temporal stationarity. Data Mechanics visualizations, however, revealed heterogeneous non-stationarity among subgroups of animals in the center of the group, as well as herd-level temporal outliers. A repeated measures model recovered inconsistent evidence of relationships between entry position and cow attributes.
Mutual conditional entropy tests, a permutation-based approach to assessing bivariate correlations that is robust to non-independence, confirmed a significant but non-linear association with peak milk yield, but revealed the age effect to be potentially confounded by health status. Finally, queueing records were related back to behaviors recorded via ear tag accelerometers using linear models and mutual conditional entropy tests. Both approaches recovered consistent evidence of differences in home pen behaviors across subsections of the queue.
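To illustrate the entropy idea mentioned in the description, the following is a minimal sketch of how the consistency of one cow's entry position could be summarized with Shannon entropy. It is not the authors' estimator: the function name `entry_entropy`, the choice of 10 equal-width queue segments, and the example records are all assumptions made for illustration and are not taken from the dataset.

```python
import math
from collections import Counter


def entry_entropy(positions, herd_size, n_bins=10):
    """Shannon entropy (in bits) of one cow's binned parlor entry positions.

    positions: 1-based queue positions observed across milkings.
    herd_size: number of cows in the queue (200 in the description above).
    n_bins:    number of equal-width queue segments (an assumed choice).
    """
    # Map each position to a queue segment; the last segment absorbs the tail.
    bins = [min((p - 1) * n_bins // herd_size, n_bins - 1) for p in positions]
    counts = Counter(bins)
    total = len(bins)
    # Sum p * log2(1/p) over the occupied segments only.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical records, invented for illustration only.
front_cow = [3, 5, 2, 8, 4, 6, 1, 7]               # always near the front
middle_cow = [40, 150, 95, 120, 60, 180, 101, 75]  # wanders through the queue

print(entry_entropy(front_cow, herd_size=200))
print(entry_entropy(middle_cow, herd_size=200))
```

Because the entropy depends only on how often each queue segment is occupied, a single extreme entry position adds one rare segment rather than inflating the estimate the way it would inflate a variance.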
Created:
2020-08-13



