Systematic review of validation of supervised machine learning models in accelerometer-based animal behaviour classification literature
Source: DataCite Commons. Updated: 2026-01-28. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.fxpnvx14d
Description:
Supervised machine learning has been used to detect fine-scale animal
behaviour from accelerometer data, but a standardised protocol for
implementing this workflow is currently lacking. As the application of
machine learning to ecological problems expands, it is essential to
establish technical protocols and validation standards that align with
those in other "big data" fields. Overfitting is a
prevalent and often misunderstood challenge in machine learning. Overfit
models adapt too closely to the training data, memorising specific instances
rather than discerning the underlying signal. Associated results can
indicate high performance on the training set, yet these models are
unlikely to generalise to new data. Overfitting can be detected through
rigorous validation using independent test sets. Our systematic review of
119 studies using accelerometer-based supervised machine learning to
classify animal behaviour reveals that 79% (94 papers) did not validate
their models sufficiently well to robustly identify potential overfitting.
Although this does not inherently imply that these models are overfit, the
absence of independent test sets limits the interpretability of their
results. To address these challenges, we provide a theoretical
overview of overfitting in the context of animal accelerometry and propose
guidelines for optimal validation techniques. We aim to equip ecologists
with the tools necessary to adapt general machine learning validation
theory to the specific requirements of biologging, facilitating reliable
overfitting detection and advancing the field.
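
The role of an independent test set in detecting overfitting can be illustrated with a minimal sketch. It is not taken from the dataset or the review itself. It assumes that each accelerometer window is tagged with the ID of the animal it came from, and that independence is obtained by holding out whole individuals, which is one common choice rather than the only one. The function names `split_by_individual` and `generalisation_gap` are hypothetical.

```python
import random
from collections import defaultdict


def split_by_individual(individual_ids, test_fraction=0.25, seed=0):
    """Split window indices so that no individual is in both train and test.

    individual_ids: one animal ID per accelerometer window (assumed input).
    Returns (train_idx, test_idx) as sorted lists of window indices.
    """
    by_individual = defaultdict(list)
    for idx, animal in enumerate(individual_ids):
        by_individual[animal].append(idx)

    animals = sorted(by_individual)
    if len(animals) < 2:
        raise ValueError("need at least two individuals for a held-out test set")

    # Shuffle individuals (not windows) with a fixed seed for reproducibility.
    random.Random(seed).shuffle(animals)
    n_test = max(1, round(len(animals) * test_fraction))
    n_test = min(n_test, len(animals) - 1)  # always keep one training animal

    test_animals = animals[:n_test]
    train_animals = animals[n_test:]
    train_idx = sorted(i for a in train_animals for i in by_individual[a])
    test_idx = sorted(i for a in test_animals for i in by_individual[a])
    return train_idx, test_idx


def generalisation_gap(train_score, test_score):
    """Training score minus test score; a large positive gap suggests overfitting."""
    return train_score - test_score


# Illustrative IDs only: eight windows recorded from four animals.
ids = ["a", "a", "b", "b", "c", "c", "d", "d"]
train_idx, test_idx = split_by_individual(ids)
```

A model would then be fitted on the windows in `train_idx` only and scored on both subsets. A training score far above the test score, as reported by `generalisation_gap`, is the signature of overfitting that a random window-level split could hide.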
Provider:
Dryad
Created:
2025-06-24



