Animal acoustic identification, denoising, and source separation using generative adversarial networks
Source: DataCite Commons | Updated: 2026-01-29 | Indexed: 2026-04-25
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.vhhmgqp6k
Resource description:
Soundscapes contain rich ecological information, offering insights into
both biodiversity and ecosystem dynamics. However, the sheer volume of
data produced by passive acoustic monitoring presents significant
challenges for scalable analysis and ecological interpretation. While
convolutional neural networks (CNNs) have advanced species classification
in bioacoustics, they often struggle to localize acoustic targets in
time-frequency space and to quantify soundscape characteristics. In this
study, we propose a novel spectrogram-to-spectrogram translation framework
based on generative adversarial networks (GANs) to isolate and quantify
acoustic sources within soundscape recordings. Our method is trained on
paired spectrogram images: original full-spectrogram representations and
target spectrogram representations containing only the vocalizations of
specific sound labels. This design enables the model to learn
source-specific mappings and to perform both species-level and community-level
separation of acoustic components in soundscape recordings. We developed
and evaluated two GAN-based models: a species-level GAN targeting eight
avian species, and a community-level GAN distinguishing among avian,
insect, and anthropogenic sound sources. The models were trained and
tested using soundscape recordings collected from the Yaoluoping National
Nature Reserve, eastern China. The species-level model achieved a mean F1
score of 0.76 for pixel-wise detection, while the community-level model
reached 0.79 across categories. In addition to precise temporal-spectral
localization, our approach captures each source’s acoustic occupancy and
frequency distribution patterns, offering deeper ecological insight.
In species classification, our model achieved a mean F1 score of 0.97,
comparable to the baseline CNN classifiers ResNet50 (0.95) and VGG16
(0.98) across multiple species. Our GAN approach for
extracting sound sources also significantly outperformed conventional
methods in denoising and source separation, as indicated by lower
image-level mean squared error. These results demonstrate the
utility of GANs in advancing ecoacoustic analyses and biodiversity
monitoring. By enabling robust source separation and fine-resolution
signal mapping, the proposed approach contributes a scalable and
transferable tool for soundscape quantification.
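
The description reports pixel-wise F1, image-level mean squared error, and acoustic occupancy without giving their formulas. The following is a minimal sketch of how such metrics could be computed on a pair of spectrograms, not the authors' implementation. The function names, the 0.5 binarization threshold, the (frequency, time) array layout, and the definition of occupancy as the fraction of time frames containing any active pixel are all assumptions made for illustration.

```python
import numpy as np


def pixelwise_f1(pred, target, threshold=0.5):
    """Pixel-wise F1 between two spectrograms scaled to [0, 1].

    Both inputs are binarized at `threshold` (an assumed value).
    Returns 1.0 when both masks are empty (assumed convention).
    """
    pred_mask = pred >= threshold
    target_mask = target >= threshold
    tp = np.logical_and(pred_mask, target_mask).sum()
    fp = np.logical_and(pred_mask, ~target_mask).sum()
    fn = np.logical_and(~pred_mask, target_mask).sum()
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 1.0


def image_mse(pred, target):
    """Image-level mean squared error over all pixels."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))


def acoustic_occupancy(spec, threshold=0.5):
    """Fraction of time frames with at least one active pixel.

    Assumes rows are frequency bins and columns are time frames.
    """
    active = spec >= threshold
    return float(active.any(axis=0).mean())


def frequency_profile(spec, threshold=0.5):
    """Per-frequency-bin fraction of time frames that are active."""
    active = spec >= threshold
    return active.mean(axis=1)


# Toy 4x4 example: the target holds a 2x2 call, the prediction
# extends it by one extra time frame (two false-positive pixels).
target = np.zeros((4, 4))
target[1:3, 1:3] = 1.0
pred = np.zeros((4, 4))
pred[1:3, 1:4] = 1.0

f1 = pixelwise_f1(pred, target)        # 0.8
mse = image_mse(pred, target)          # 0.125
occupancy = acoustic_occupancy(pred)   # 0.75
profile = frequency_profile(pred)      # [0.0, 0.75, 0.75, 0.0]
```

In this toy case the prediction recovers every target pixel but adds two extra ones, so recall is 1.0, precision is 4/6, and F1 is 0.8. Real spectrograms would need to be normalized to a common range before thresholding.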
Provider:
Dryad
Created:
2025-08-18



