simFSD-18k
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4733697
Description:
Created By Félix Gontier and Mathieu Lagrange, LS2N, CNRS, Ecole Centrale Nantes
Contact: mathieu.lagrange@cnrs.fr
If used for research, please cite:
@article{gontier2021training,
title={Polyphonic training set synthesis improves self-supervised urban sound classification},
author={Félix Gontier and Vincent Lostanlen and Mathieu Lagrange and Nicolas Fortin and Jean-Francois Petiot and Catherine Lavandier},
journal={The Journal of the Acoustical Society of America},
year={2021},
publisher={Acoustical Society of America}
}
simFSD-18k is a dataset of synthetic acoustic scenes made with Freesound and LibriSpeech samples.
simFSD-18k contains 400 acoustic scenes, each 45 seconds long.
We synthesized these polyphonic scenes via the simScene software, based on FSD-2k.
The total duration of the dataset is 18k seconds (400 scenes of 45 seconds each), i.e., five hours.
The audio is made available as third-octave spectral data; see demoTob.zip for a Python implementation of its computation from audio.
>>> import numpy
>>> a=numpy.load('simFSD-18k_training_spectralData.npy')
>>> a.shape
(280, 351, 29)
>>> a=numpy.load('simFSD-18k_training_presence.npy')
>>> a.shape
(280, 344, 3)
The three dimensions of the presence array correspond to the sceneId, the frameId (time), and the sourceId (traffic, voice, birds). The annotation is a binary indicator of source presence over one second, that is, 8 consecutive 125 ms frames, with a hop of one frame.
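The one-second annotation windows can be paired with the spectral frames as in the minimal sketch below. A window of 8 frames with a hop of one frame over 351 frames yields 351 - 8 + 1 = 344 windows, which is consistent with the shapes printed above. The alignment is an assumption not stated in the description: window t is taken to cover frames t to t+7. The helper name `window_spectra` is hypothetical, and a zero-filled stand-in array is used in place of the real file.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

FRAMES_PER_WINDOW = 8  # 8 frames of 125 ms = 1 s, per the description


def window_spectra(spectral, n=FRAMES_PER_WINDOW):
    """Group frames into overlapping windows with a hop of one frame.

    spectral: (scenes, frames, bands)
    returns:  (scenes, frames - n + 1, n, bands)
    Assumption: window t covers frames t .. t + n - 1.
    """
    # sliding_window_view appends the window axis last:
    # (scenes, windows, bands, n)
    windows = sliding_window_view(spectral, n, axis=1)
    # Move the window axis next to the window index.
    return np.moveaxis(windows, -1, 2)


# Stand-in with the documented training shape; in practice, replace with
# numpy.load('simFSD-18k_training_spectralData.npy').
spectral = np.zeros((280, 351, 29), dtype=np.float32)
windows = window_spectra(spectral)
print(windows.shape)  # (280, 344, 8, 29)
```

Under this assumption, `windows[s, t]` holds the 8 spectral frames that the presence label at `[s, t, :]` would describe. The result is a read-only view, so no data is copied.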
Created:
2021-06-02



