OPEN-WINDOW: SOUND EVENT DATABASE FOR RESEARCH AND DEVELOPMENT
Source: Mendeley Data. Updated 2024-03-27; indexed 2024-06-27.
Download link:
https://zenodo.org/record/4118394
Resource description:
(1) Background
Situated in the domain of urban sound scene classification by humans and machines, the research in this project is a first step towards mapping urban noise pollution experienced indoors and finding ways to reduce its negative impact in people's homes. The acoustic distinction between outdoor and indoor scenes is an active research field and can be automated with some success. A much subtler difference is the change in the indoor soundscape induced by an open window. Being able to detect this change, however, would enable applications in warning systems and is a prerequisite for an app-based urban sound mapping project. Acoustic detection requires neither line of sight, nor sensors at the window frame, nor knowledge of the number of windows or their size. The difficulty of the task, however, varies substantially with the amount of sound inside and outside. From the point of view of machine classification, the lack of specificity is the most problematic aspect: very few sounds, if any, can be assumed to originate exclusively from outside and to be present at all times to aid automatic detection. The required generalisation ability, however, can be assumed for humans, who might also use very subtle cues in the change of reverberation.

(2) Dataset

(a) Recording locations
The recordings were made at three different locations.
Farm: A farm in Brook, Surrey, United Kingdom. The recordings were made in an open-plan studio flat area in the centre of the farm. The recordings in this location have the lowest levels of background noise, mainly because of the quiet surroundings.
Office 1: An office at the University of Surrey, Guildford, United Kingdom. The recordings were made in an open-plan office located on the first floor, at the Centre for Vision, Speech and Signal Processing (CVSSP).
Since this office accommodates 16 researchers, recordings in this location have the highest level of background noise.
Office 2: An office at the University of Surrey, Guildford, United Kingdom. The recordings were made in a small open-plan office at the CVSSP. This office accommodates 8 researchers, and the recordings made here are considered to have a medium level of background noise.

(b) Recording equipment
The recordings at the two offices and at the studio flat on the farm were made with a dedicated laptop, a Focusrite Clarett 4pre USB external sound card (44,100 Hz sample rate at 16 bits per sample), and a Behringer ECM 8000 microphone.

(c) Recording setup
The Behringer ECM 8000 microphone is connected to the External Line Return (XLR) input of the Focusrite Clarett external sound card via an XLR cable. The external sound card is connected to the dedicated laptop and controlled using Ableton Live 10 software, which is used to set the configuration and export the recorded audio files. The microphone is placed approximately 10 cm from the window and fixed in a microphone holder. At each location, 90 audio sessions were recorded: 60 one-minute recordings for the static-state setup and 30 fifteen-second recordings for the transitional-state setup.

(d) File naming conventions
The naming convention for the audio recordings is as follows: [Location] [State] [Time] [IDX]
[State] is one of the following: O stands for open, C stands for closed, OC stands for a transition from open to closed, and CO stands for a transition from closed to open.
[Time] is one of the following: AM stands for morning (9:00 to 12:00), N stands for noon (13:00 to 15:00), and PM stands for afternoon (17:00 to 20:00).
[IDX] is the file ID number.
For example, “Farm C PM 01.wav” means that the file was recorded at the farm in the afternoon with the window closed, and that the file ID is 01.
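The naming convention in (d) can be sketched as a small parser. This is only an illustration, not part of the dataset's tooling: the names `Recording` and `parse_name` are hypothetical, and the pattern assumes that fields are separated by a space (as in the example above) or by an underscore, which the description does not confirm.

```python
import re
from typing import NamedTuple

# Codes taken from the naming convention in the dataset description.
STATES = {
    "O": "open",
    "C": "closed",
    "OC": "open-to-closed",
    "CO": "closed-to-open",
}
TIMES = {"AM": "morning", "N": "noon", "PM": "afternoon"}


class Recording(NamedTuple):
    """Hypothetical container for the four fields of a file name."""
    location: str
    state: str
    time: str
    idx: int


# Assumption: fields are separated by a single space or underscore.
_PATTERN = re.compile(
    r"^(?P<location>.+?)[ _](?P<state>OC|CO|O|C)[ _]"
    r"(?P<time>AM|PM|N)[ _](?P<idx>\d+)\.wav$"
)


def parse_name(filename: str) -> Recording:
    """Split '[Location] [State] [Time] [IDX].wav' into its fields."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return Recording(
        location=match["location"],
        state=STATES[match["state"]],
        time=TIMES[match["time"]],
        idx=int(match["idx"]),
    )


print(parse_name("Farm C PM 01.wav"))
```

If the actual files use a different separator or capitalisation, only `_PATTERN` would need to change.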

(e) Dataset acquisition
A recording kit consisting of a dedicated laptop and microphone will be given to volunteers. Custom-programmed software will remind the user to specify the window state (establishing the so-called ground truth).

(f) Specifications
 - Open-Window contains 270 audio recordings totalling 3.37 hours of audio.
 - Each audio recording belongs to one of four classes representing the window states: two stationary states (Open, Close) and two transitional states (Open-Close, Close-Open).
 - The recordings were carried out in different locations and at different times of the day.
   - Three locations: Office1, Office2, Farm
   - Three periods of the day: Morning, Afternoon, Evening
 - The recordings are split into six folds.
   - Fold 1 is the test set.
   - Fold 2 is the validation set.
   - Folds 3-6 comprise the training set.
   - Each fold is balanced in terms of the class and location distribution.
 - The annotations/metadata can be found in annotations.csv.
 - The recordings for the stationary states are approximately 60 seconds long, while the recordings for the transitional states are approximately 15 seconds long.
 - The format of the recordings is 2-channel 16-bit PCM sampled at 44.1 kHz.
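The totals and the fold assignment stated above can be cross-checked with a few lines of arithmetic. This is a minimal sketch using the nominal durations (60 s and 15 s); the function names `dataset_totals` and `split_for_fold` are hypothetical and not part of the dataset, and the real recordings are only approximately this long.

```python
# Figures taken from the dataset description (nominal values).
LOCATIONS = 3                 # Office1, Office2, Farm
STATIC_PER_LOCATION = 60      # one-minute recordings (Open, Close)
STATIC_SECONDS = 60
TRANSITION_PER_LOCATION = 30  # fifteen-second recordings (Open-Close, Close-Open)
TRANSITION_SECONDS = 15


def dataset_totals() -> tuple[int, float]:
    """Return (number of recordings, nominal duration in hours)."""
    recordings = LOCATIONS * (STATIC_PER_LOCATION + TRANSITION_PER_LOCATION)
    seconds = LOCATIONS * (
        STATIC_PER_LOCATION * STATIC_SECONDS
        + TRANSITION_PER_LOCATION * TRANSITION_SECONDS
    )
    return recordings, seconds / 3600


def split_for_fold(fold: int) -> str:
    """Map a fold number (1-6) to the split named in the description."""
    if fold == 1:
        return "test"
    if fold == 2:
        return "validation"
    if 3 <= fold <= 6:
        return "train"
    raise ValueError(f"fold must be between 1 and 6, got {fold}")


count, hours = dataset_totals()
print(count, hours)
print([split_for_fold(f) for f in range(1, 7)])
```

The nominal figures give 270 recordings and 3.375 hours, which is consistent with the 270 recordings and 3.37 hours quoted in the specifications.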
Created:
2023-06-28



