F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contours
Source: DataCite Commons. Updated: 2025-05-08. Indexed: 2025-05-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.prr4xgxw8
Description:
The fundamental frequency (F0) is a key parameter for characterising
structures in vertebrate vocalisations, for instance defining vocal
repertoires and their variations at different biological scales (e.g.,
population dialects, individual signatures). However, annotating F0
contours manually is too laborious, and automating the task is complex. Despite
significant advancements in the fields of speech and music for automatic
F0 estimation, similar progress in bioacoustics has been limited. To
address this gap, we compile and publish a benchmark dataset of over
250,000 calls from 13 taxa, each paired with ground truth F0 values (each
call is associated with a series of time × frequency points delineating its
frequency contour). These vocalisations range from high to low SNR, from
infrasound to ultrasound, from high to low harmonicity, and some
include non-linear phenomena. This dataset can be used to train supervised
and/or self-supervised models to estimate F0 values (similarly to CREPE
or PESTO, for instance). The provided ground truth also makes it possible to
evaluate and compare the performance of different algorithms on these signals
(see the associated manuscript for a first benchmark and baseline). Pretrained
models and scripts to train or evaluate models on this dataset are
available in a separate GitHub repository.
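
As an illustration of how annotated contours of this kind could be used for evaluation, the following is a minimal sketch, not the authors' benchmark code. It assumes each call's annotation is available as two arrays (times in seconds, frequencies in Hz); the function names, the linear interpolation between annotated points, and the 50-cent tolerance (a convention borrowed from music pitch tracking) are all assumptions made for this example and are not taken from the dataset or its repository.

```python
import numpy as np


def contour_to_grid(ann_times, ann_freqs, grid_times):
    """Interpolate annotated (time, frequency) points onto a frame grid.

    Hypothetical helper: frames outside the annotated time span get NaN,
    meaning "no ground truth F0 here".
    """
    ann_times = np.asarray(ann_times, dtype=float)
    ann_freqs = np.asarray(ann_freqs, dtype=float)
    grid_times = np.asarray(grid_times, dtype=float)

    # np.interp requires increasing x-coordinates, so sort by time first.
    order = np.argsort(ann_times)
    ann_times, ann_freqs = ann_times[order], ann_freqs[order]

    f0 = np.interp(grid_times, ann_times, ann_freqs)
    outside = (grid_times < ann_times[0]) | (grid_times > ann_times[-1])
    f0[outside] = np.nan
    return f0


def pitch_accuracy(ref_f0, est_f0, tolerance_cents=50.0):
    """Fraction of annotated frames whose estimate is within the tolerance.

    Frames with a missing or non-positive estimate count as errors.
    The 50-cent default is an assumed threshold, not a dataset convention.
    """
    ref_f0 = np.asarray(ref_f0, dtype=float)
    est_f0 = np.asarray(est_f0, dtype=float)

    annotated = np.isfinite(ref_f0) & (ref_f0 > 0)
    if not annotated.any():
        return float("nan")

    valid_est = np.isfinite(est_f0) & (est_f0 > 0)
    both = annotated & valid_est

    # Deviation in cents; infinite where the estimate is unusable.
    cents = np.full(ref_f0.shape, np.inf)
    cents[both] = 1200.0 * np.abs(np.log2(est_f0[both] / ref_f0[both]))
    return float(np.mean(cents[annotated] <= tolerance_cents))


# Toy values for illustration only (not taken from the dataset).
grid = np.array([-0.5, 0.0, 0.5, 1.0, 1.5])
reference = contour_to_grid([0.0, 1.0], [1000.0, 2000.0], grid)
estimate = np.array([900.0, 1000.0, 1500.0, 4000.0, 100.0])
print(pitch_accuracy(reference, estimate))
```

Measuring deviation in cents rather than in Hz keeps the tolerance comparable across calls ranging from infrasound to ultrasound, since a fixed Hz threshold would be far stricter at high frequencies than at low ones.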
Provider:
Dryad
Created:
2025-05-08



