sounding_out_chorus
Source: Mendeley Data. Updated 2024-05-10. Indexed 2024-06-28.
Download link:
https://zenodo.org/records/8338003
Description:
This repository contains the data for the paper "Towards interpretable learned representations for Ecoacoustics using variational auto-encoding". The dataset consists of a series of 1-minute WAV files recorded across UK and Ecuadorian habitats. Each sample has 26 acoustic indices calculated and a full list of avian species and abundances. This dataset is an updated version of a previous release (10.5281/zenodo.1255218), adding KML maps containing GPS data for each sample site and updated label metadata. A data module for use in a PyTorch machine learning pipeline is also available.

Abstract

Ecoacoustics is an emerging science that seeks to understand the role of sound in ecological processes. Passive acoustic monitoring is increasingly being used to collect vast quantities of whole-soundscape audio recordings in order to study variations in acoustic community activity across spatial and temporal scales. However, extracting relevant information from audio recordings for ecological inference is non-trivial. Recent approaches to machine-learned acoustic features appear promising but are limited by inductive biases, crude temporal integration methods and few means to interpret downstream inference. To address these limitations, we developed and trained a self-supervised representation learning algorithm, a convolutional Variational Auto-Encoder (VAE), to embed latent features from acoustic survey data collected from sites representing a gradient of habitat degradation in temperate and tropical ecozones, and we use prediction of survey site as a test case for interpreting inference.

We investigate approaches to interpretability by mapping discriminative descriptors back to the spectro-temporal domain to observe how soundscape components change as we interpolate across a linear classification boundary traversing latent feature space. We advance temporal integration methods by encoding a probabilistic soundscape descriptor capable of capturing multi-modal distributions of latent features over time. Our results suggest that varying combinations of soundscape components (biophony, geophony and anthrophony) are used to infer sites along a degradation gradient, and that increased sensitivity to periodic signals improves on previous research using time-averaged representations for site classification. We also find that the VAE is highly sensitive to differences in the frequency response of recorder hardware, and we demonstrate a simple linear transformation to mitigate the effect of hardware variance on the learned representation. Our work paves the way for the development of a new class of deep neural networks that afford more interpretable machine-learned ecoacoustic representations to advance the fundamental and applied science and support global conservation efforts.

Sampling Methods (extract from paper)

Surveys were designed to monitor the acoustic characteristics of sites across a gradient of degradation, ranging from primary forest, through secondary forest (or areas in the process of ecological restoration), to agricultural monocultures, providing a space-for-time substitution to investigate changes in soundscapes across a gradient of ecological status. Samples were taken for 1 minute in every 15 for 10 sequential days at each site. Full dawn and dusk recordings were also collected. At each site, 15 recorders were placed in a grid-like system, spaced a minimum of 200 m from their neighbours in the UK (300 m in Ecuador), to mitigate acoustic overlap and avoid spatial pseudo-replication.
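As a rough check on the sampling design, the number of scheduled 1-minute samples can be worked out directly. This is a minimal sketch that assumes the 1-in-15 schedule ran around the clock for all 10 days and that all 15 recorders ran without gaps. It ignores the additional full dawn and dusk recordings, and the function name `scheduled_samples` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def scheduled_samples(days: int = 10, recorders: int = 15, period_min: int = 15) -> dict:
    """Count scheduled 1-minute samples for one site.

    Assumes one sample per `period_min` minutes, 24 hours a day,
    with every recorder running for the full deployment.
    """
    per_recorder_per_day = MINUTES_PER_DAY // period_min
    per_recorder = per_recorder_per_day * days
    per_site = per_recorder * recorders
    return {
        "per_recorder_per_day": per_recorder_per_day,
        "per_recorder": per_recorder,
        "per_site": per_site,
    }


counts = scheduled_samples()
print(counts)
# {'per_recorder_per_day': 96, 'per_recorder': 960, 'per_site': 14400}
```

Under these assumptions, each site yields 14,400 scheduled minutes of audio before any filtering.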
Wildlife Acoustics Song Meters equipped with two-channel omnidirectional microphones were used: seven SM2+ and eight SM3 devices were deployed. Gains were matched between recorders (analogue gains at +36 dB on the SM2+ and +12 dB on the SM3, which has an inbuilt +12 dB gain), and recordings were made at a resolution of 16 bits with a sampling rate of 48 kHz. To provide a cleaner validation data set, local weather recordings were used to select the 3 days with the lowest wind and rain from each site, giving 4,725 1-minute recordings in total. Sites are labelled by their quality in descending order, i.e. UK1 (primary), UK2 (regenerating), UK3 (degraded).
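The stated recording format (1 minute, 16 bits, 48 kHz) implies a fixed frame count for each file, which gives a cheap sanity check when loading the data. The sketch below uses only Python's standard `wave` module. It assumes the distributed files keep both microphone channels, which the description does not confirm. The helper names `check_recording` and `write_silence` and the synthetic test files are illustrative and are not part of the dataset or its PyTorch data module.

```python
import os
import tempfile
import wave

SAMPLE_RATE_HZ = 48_000
SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
CHANNELS = 2            # assumption: both microphone channels are kept
DURATION_S = 60

EXPECTED_FRAMES = SAMPLE_RATE_HZ * DURATION_S                      # 2,880,000
EXPECTED_BYTES = EXPECTED_FRAMES * CHANNELS * SAMPLE_WIDTH_BYTES   # 11,520,000


def check_recording(path: str, tolerance_frames: int = 0) -> bool:
    """Return True if the WAV header matches the expected format."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == SAMPLE_RATE_HZ
            and wav.getsampwidth() == SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == CHANNELS
            and abs(wav.getnframes() - EXPECTED_FRAMES) <= tolerance_frames
        )


def write_silence(path: str, seconds: int = DURATION_S) -> None:
    """Write a silent WAV file in the expected format (demo only)."""
    n_bytes = seconds * SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(bytes(n_bytes))


# Demonstrate on synthetic files rather than real recordings.
with tempfile.TemporaryDirectory() as tmp:
    full_path = os.path.join(tmp, "full.wav")
    short_path = os.path.join(tmp, "short.wav")
    write_silence(full_path)
    write_silence(short_path, seconds=30)
    full_ok = check_recording(full_path)
    short_ok = check_recording(short_path)

print(full_ok, short_ok)
```

Real recordings may differ from the nominal length by a few frames, so `tolerance_frames` can be raised when scanning the actual files.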
Created:
2024-01-08



