
Biodenoising validation

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13736464
Description:
Biodenoising_validation is a benchmark dataset for animal vocalization denoising. It contains 62 pairs of clean animal vocalizations and noise excerpts. The data sources are listed in the clean.csv and noise.csv files. The dataset is provided at two sample rates: 16000 and 44100. Each sample-rate folder contains clean, noise, and noisy subfolders, along with metadata on the data sources.

Methodology
We programmatically create mixtures by pairing vocalizations with noise at random Signal-to-Noise Ratios (SNR) drawn from a uniform distribution between -5 and 10 dB (average SNR: 2.8 dB). To ensure reproducibility, we start with a fixed seed that controls the SNR of the mixtures. The samples are between 1 and 60 seconds long (20.14 seconds on average). We split the vocalizations and noises into two lists: underwater (11 vocalizations and 26 noises) and terrestrial (51 vocalizations and 20 noises). For each case separately, we sort the vocalizations and the noise samples by duration and pair them in that order, e.g. matching the longest calls with the longest noises.

Citation
Miron, Marius, Sara Keen, Jen-Yu Liu, Benjamin Hoffman, Masato Hagiwara, Olivier Pietquin, Felix Effenberger, Maddie Cusimano, "Biodenoising: animal vocalization denoising without access to clean data,"

License
This dataset is provided for educational purposes only, and the material it contains should not be used for any commercial purpose without the express permission of the copyright holders.

Contact: info@mariusmiron.com
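
The mixing procedure in the methodology above (sort by duration, pair, draw a seeded SNR from a uniform distribution, scale the noise, add) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the seed value 0, and the power-based SNR definition are assumptions made for illustration, and signals are plain lists of floats rather than audio files. The description also does not say how the unequal list sizes (e.g. 51 terrestrial vocalizations against 20 noises) are reconciled; the sketch simply stops at the shorter list.

```python
import math
import random


def pair_by_duration(vocalizations, noises):
    """Pair (name, duration_seconds) items, longest with longest.

    Assumption: unmatched items in the longer list are dropped; the
    dataset description does not state how it handles this case.
    """
    longest_first = lambda items: sorted(items, key=lambda it: it[1], reverse=True)
    return list(zip(longest_first(vocalizations), longest_first(noises)))


def power(signal):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale noise so 10*log10(P_clean / P_noise) == snr_db, then add it.

    Returns (mixture, scaled_noise). Both inputs must have equal length.
    """
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    mixture = [c + n for c, n in zip(clean, scaled)]
    return mixture, scaled


# A fixed seed makes the drawn SNRs reproducible (seed 0 is hypothetical).
rng = random.Random(0)
snr_db = rng.uniform(-5.0, 10.0)

# Toy signals standing in for one vocalization/noise pair.
clean = [math.sin(0.05 * i) for i in range(2000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mixture, scaled_noise = mix_at_snr(clean, noise, snr_db)

# The achieved SNR matches the requested one up to rounding error.
achieved = 10 * math.log10(power(clean) / power(scaled_noise))
print(f"requested {snr_db:.2f} dB, achieved {achieved:.2f} dB")
```

The expected value of a uniform draw on [-5, 10] dB is 2.5 dB, so the reported 2.8 dB average is consistent with a finite sample of 62 draws.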
Created:
2024-09-09