augCENSE-18k

Source: NIAID Data Ecosystem (indexed 2026-03-12)

Download link:

https://zenodo.org/record/4733680

Description:

Created by Félix Gontier and Mathieu Lagrange, LS2N, CNRS, Ecole Centrale Nantes.
Contact: mathieu.lagrange@cnrs.fr

If used for research, please refer to:

@article{gontier2021training,
  title={Polyphonic training set synthesis improves self-supervised urban sound classification},
  author={Félix Gontier and Vincent Lostanlen and Mathieu Lagrange and Nicolas Fortin and Jean-Francois Petiot and Catherine Lavandier},
  journal={The Journal of the Acoustical Society of America},
  year={2021},
  publisher={Acoustical Society of America}
}

augCENSE-18k is a derivative of CENSE-2k, obtained by randomly time stretching and pitch shifting audio clips of the voice and birds classes. The total duration of the dataset is 18k seconds, i.e., the same as simCENSE-18k, with balanced material over classes.

Each audio sample is cut into one or several 3-second parts, each resulting in a spectrogram of size 23x29, leading to a dataset of 609 spectrograms. A low-volume amorphic background noise recording is added, and the cut audio sample is centered within the 3 seconds if it is shorter.

>>> import numpy
>>> a = numpy.load('augCENSE-18k_train_spectralData.npy')
>>> a.shape
(4421, 23, 29)
>>> a = numpy.load('augCENSE-18k_train_presence.npy')
>>> a.shape
(4421, 16, 3)

The three dimensions of the presence array correspond to the sceneId, the frameId (time), and the sourceId (traffic, voice, birds). Annotation is provided as a binary indicator of source presence over one second, that is, 8 consecutive 125 ms frames, with a hop of one frame.
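The array layout described above can be sanity-checked with a short sketch. It is not part of the dataset's tooling: the helper names (n_windows, window_span_s, check_pair) are hypothetical, and it assumes that the 23-element axis of the spectrogram is time, which the description does not state explicitly. Under that assumption, sliding an 8-frame window with a hop of one frame over 23 frames gives 23 - 8 + 1 = 16 windows, matching the second axis of the presence array. Small zero-filled stand-in arrays are used in place of the real .npy files.

```python
import numpy as np

FRAME_S = 0.125   # frame duration given in the description (125 ms)
WIN_FRAMES = 8    # one-second annotation window = 8 frames
HOP_FRAMES = 1    # hop of one frame
SOURCES = ("traffic", "voice", "birds")  # sourceId order from the description


def n_windows(n_frames, win=WIN_FRAMES, hop=HOP_FRAMES):
    """Number of sliding windows of length `win` that fit in `n_frames`."""
    return (n_frames - win) // hop + 1


def window_span_s(k, win=WIN_FRAMES, hop=HOP_FRAMES, frame_s=FRAME_S):
    """Start and end time, in seconds, covered by annotation window k."""
    start = k * hop * frame_s
    return start, start + win * frame_s


def check_pair(spectra, presence):
    """Check that two arrays follow the documented (scene, time, ...) layout."""
    assert spectra.ndim == 3 and presence.ndim == 3
    assert spectra.shape[0] == presence.shape[0]   # one row per scene
    assert presence.shape[2] == len(SOURCES)       # traffic, voice, birds
    # Assumption: axis 1 of the spectrogram array (size 23) is time.
    assert presence.shape[1] == n_windows(spectra.shape[1])


# Stand-in arrays with the documented per-scene shapes (only 4 scenes).
spectra = np.zeros((4, 23, 29), dtype=np.float32)
presence = np.zeros((4, 16, 3), dtype=np.int8)
check_pair(spectra, presence)

print(n_windows(23))       # 16
print(window_span_s(15))   # (1.875, 2.875)
```

With the real files, spectra and presence would instead come from numpy.load on the two file names shown in the session above, and presence[i, k, SOURCES.index("birds")] would be read as the birds indicator for window k of scene i.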

Created:

2021-06-02
