
augCENSE-18k

Indexed from NIAID Data Ecosystem on 2026-03-12
Download link:
https://zenodo.org/record/4733680
Description:
Created by Félix Gontier and Mathieu Lagrange, LS2N, CNRS, Ecole Centrale Nantes.
Contact: mathieu.lagrange@cnrs.fr

If used for research, please refer to:

@article{gontier2021training,
  title={Polyphonic training set synthesis improves self-supervised urban sound classification},
  author={Félix Gontier and Vincent Lostanlen and Mathieu Lagrange and Nicolas Fortin and Jean-Francois Petiot and Catherine Lavandier},
  journal={The Journal of the Acoustical Society of America},
  year={2021},
  publisher={Acoustical Society of America}
}

augCENSE-18k is a derivative of CENSE-2k, obtained by applying random time stretching and pitch shifting to audio clips of the voice and birds classes. The total duration of the dataset is 18k seconds, i.e., the same as simCENSE-18k, with material balanced across classes. Each audio sample is cut into one or several 3-second parts, each resulting in a spectrogram of size 23x29, leading to a dataset of 609 spectrograms. Low-volume amorphic background noise is added, and the cut audio sample is centered within the 3 seconds if it is shorter.

>>> import numpy
>>> a=numpy.load('augCENSE-18k_train_spectralData.npy')
>>> a.shape
(4421, 23, 29)
>>> a=numpy.load('augCENSE-18k_train_presence.npy')
>>> a.shape
(4421, 16, 3)

The three dimensions correspond to the sceneId, the frameId (time), and the sourceId (traffic, voice, birds). Annotation is provided as a binary indicator of source presence over one second, that is, 8 consecutive 125 ms frames with a hop of one frame.
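As a minimal sketch of how the two arrays described above might be loaded together and sanity-checked, the snippet below pairs a spectrogram file with its presence file and summarizes how often each source is annotated as present. The helper names `load_split` and `presence_rate` are hypothetical and not part of the dataset. The source order (traffic, voice, birds) is taken from the description, and the presence values are assumed to be stored as 0/1.

```python
import numpy as np

# Source order as listed in the dataset description (assumed to match axis 2).
SOURCES = ("traffic", "voice", "birds")


def load_split(spectral_path, presence_path):
    """Load spectrograms and presence annotations and check that they line up.

    Hypothetical helper: the dataset itself only ships the .npy files.
    """
    spectra = np.load(spectral_path)
    presence = np.load(presence_path)
    if spectra.ndim != 3 or presence.ndim != 3:
        raise ValueError("expected two 3-D arrays")
    if spectra.shape[0] != presence.shape[0]:
        raise ValueError("spectrogram and presence arrays differ in scene count")
    if presence.shape[2] != len(SOURCES):
        raise ValueError("last presence axis should index the three sources")
    return spectra, presence


def presence_rate(presence):
    """Fraction of annotated frames in which each source is marked present.

    Assumes the annotations are binary (0 = absent, 1 = present).
    """
    # Collapse the scene and frame axes, then average per source.
    rates = presence.reshape(-1, presence.shape[2]).mean(axis=0)
    return dict(zip(SOURCES, rates.tolist()))
```

With the files named in the description, `load_split('augCENSE-18k_train_spectralData.npy', 'augCENSE-18k_train_presence.npy')` would be expected to return arrays of shape (4421, 23, 29) and (4421, 16, 3), matching the interpreter session above.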
Created:
2021-06-02