Animal acoustic identification, denoising, and source separation using generative adversarial networks

Name: Animal acoustic identification, denoising, and source separation using generative adversarial networks
Creator: Dryad
Published: 2026-01-29 00:42:46
License: No description available

DataCite Commons | Updated: 2026-01-29 | Indexed: 2026-04-25

Download link:

https://datadryad.org/dataset/doi:10.5061/dryad.vhhmgqp6k

Description:

Soundscapes contain rich ecological information, offering insights into both biodiversity and ecosystem dynamics. However, the sheer volume of data produced by passive acoustic monitoring presents significant challenges for scalable analysis and ecological interpretation. While convolutional neural networks (CNNs) have advanced species classification in bioacoustics, they often struggle to localize acoustic targets in time-frequency space and to quantify soundscape characteristics. In this study, we propose a novel spectrogram-to-spectrogram translation framework based on generative adversarial networks (GANs) to isolate and quantify acoustic sources within soundscape recordings. Our method is trained on paired spectrogram images: the original full spectrogram and a target spectrogram containing only the vocalizations of a specific sound label. This design enables the model to learn source-specific mappings and to perform both species-level and community-level separation of acoustic components in soundscape recordings. We developed and evaluated two GAN-based models: a species-level GAN targeting eight avian species, and a community-level GAN distinguishing among avian, insect, and anthropogenic sound sources. The models were trained and tested using soundscape recordings collected from the Yaoluoping National Nature Reserve, eastern China. The species-level model achieved a mean F1 score of 0.76 for pixel-wise detection, while the community-level model reached 0.79 across categories. In addition to precise temporal-spectral localization, our approach captures each source's acoustic occupancy and frequency distribution patterns, offering deeper ecological insight. In a classification comparison against baseline CNN classifiers, our model achieved a mean F1 score of 0.97, comparable to ResNet50 (0.95) and VGG16 (0.98) across multiple species. Our GAN approach for extracting sound sources also significantly outperformed conventional methods in denoising and source separation, as indicated by lower image-level mean squared error. These results demonstrate the utility of GANs in advancing ecoacoustic analyses and biodiversity monitoring. By enabling robust source separation and fine-resolution signal mapping, the proposed approach contributes a scalable and transferable tool for soundscape quantification.
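
The description above reports pixel-wise F1, image-level mean squared error, acoustic occupancy, and frequency distribution for predicted versus target spectrograms. The following is a minimal Python sketch of how such quantities could be computed from a pair of spectrogram arrays. It is not the authors' code: the function names, the 0.5 binarization threshold, the (frequency x time) array layout, and the definition of occupancy as the fraction of time frames with at least one active pixel are all assumptions made for illustration.

```python
import numpy as np

# Assumption: spectrograms are 2-D arrays shaped (frequency bins, time frames)
# with values scaled to [0, 1]; a pixel counts as "active" at or above `threshold`.

def pixelwise_f1(pred, target, threshold=0.5):
    """F1 score over pixels after binarizing both spectrograms."""
    p = np.asarray(pred) >= threshold
    t = np.asarray(target) >= threshold
    tp = np.sum(p & t)    # active in both
    fp = np.sum(p & ~t)   # active only in the prediction
    fn = np.sum(~p & t)   # active only in the target
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 1.0

def image_mse(pred, target):
    """Mean squared error over all pixels of the two images."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))

def acoustic_occupancy(spec, threshold=0.5):
    """Fraction of time frames containing at least one active pixel (assumed definition)."""
    active = np.asarray(spec) >= threshold
    return float(active.any(axis=0).mean())

def frequency_distribution(spec, threshold=0.5):
    """Share of active pixels falling in each frequency bin."""
    active = np.asarray(spec) >= threshold
    counts = active.sum(axis=1).astype(float)
    total = counts.sum()
    return counts / total if total else counts

# Toy example: 4 frequency bins x 6 time frames.
target = np.zeros((4, 6))
target[1:3, 2:5] = 1.0   # true source occupies frames 2 to 4
pred = np.zeros((4, 6))
pred[1:3, 3:6] = 1.0     # prediction is shifted by one frame

f1 = pixelwise_f1(pred, target)
mse = image_mse(pred, target)
occupancy = acoustic_occupancy(pred)
freq_share = frequency_distribution(pred)
```

In this toy case the prediction overlaps the target on four of six active pixels, so the pixel-wise F1 is 2/3 and the MSE is 4/24. On real data the threshold and the occupancy definition would need to match whatever the dataset's accompanying documentation specifies.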

Provider:

Dryad

Created:

2025-08-18