
AudioSet-EV: an AudioSet-derived distribution of Emergency Vehicle Siren sounds

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14882313
Resource description:
AudioSet-EV is a case-study-tailored distribution of AudioSet (©Google) (AS) for acoustic emergency vehicle (EV) siren detection and recognition. By selectively grouping siren and non-siren urban sounds, enforcing taxonomy consistency, and mitigating class imbalances, AudioSet-EV offers a robust, large-scale resource for research in Machine Learning and Deep Learning acoustic modeling.

Methodology

Our design methodology combines a systematic selection and filtering of relevant AS samples, performed with AudioSet-Tools, with a binary distinction between True Positive (siren-related) and True Negative (non-siren) samples, mitigating class imbalances and label contamination. We emphasize that, because the original AS labels are weak, the label association process cannot be guaranteed to be fully reliable.

We structured AudioSet-EV into two primary groups:

Positives: only EV-siren-related classes, specifically 'Police car (siren)', 'Ambulance (siren)', 'Fire engine, fire truck (siren)', and the ontology container class 'Emergency vehicle', to account for any weakly labeled or meaningful sound.

Negatives: a diverse and challenging set, comprising vehicle-related sounds ('Car', 'Car passing by', 'Power windows, electric windows', 'Tire squeal', 'Motor vehicle (road)', 'Truck', 'Air brake', 'Ice cream truck, ice cream van', 'Bus', 'Motorcycle', 'Skidding', 'Race car, auto racing', 'Bicycle', 'Train', 'Rail transport', 'Train wheels squealing', 'Railroad car, train wagon', 'Skateboard'), alarm signals ('Car alarm', 'Vehicle horn, car horn, honking', 'Bicycle bell', 'Train horn', 'Train whistle', 'Foghorn', 'Toot', 'Reversing beeps', 'Beep, bleep', 'Civil defense siren', 'Alarm', 'Smoke detector, smoke alarm', 'Fire alarm', 'Buzzer'), and environmental noises ('Traffic noise, roadway noise', 'Outside, rural or natural', 'Outside, urban or manmade').
We also included some Speech, Music, and Engine-related sounds to improve robustness against waveform pattern similarities and semantic taxonomy proximities.

Pre-Processing

For the Positives category, segment processing followed these steps:

Selection by Label: balanced, unbalanced, and eval AS segments were filtered according to our Positives label selection.
Segments Merging: given the scarcity and sparsity of results, samples were aggregated across the resulting intermediate .csv files to achieve greater consistency.
Blacklist Filtering: to refine our selection, every 'Civil defense siren' sample was removed to prevent contamination with non-emergency-vehicle sounds.

For the Negatives category, processing followed these steps:

Selection by Label: balanced, unbalanced, and eval AS entries matching our defined non-siren labels were extracted.
Segments Merging: the extracted negative subsets were merged into a single non-siren set.
Partial Blacklist Filtering: to avoid overlaps with the Positives category, samples containing at least one positive class label were removed. 'Civil defense siren' is the exception: it is retained, although it is taxonomically included within the 'Siren' container class.
Class Re-Balancing: to minimize imbalance among ontology child (leaf) classes, label occurrence counts were equalized while preserving dataset diversity. Overall class uniformity is not feasible due to the ontological structure of AS and the presence of weakly multi-labeled entries.

The final .csv files were processed through two independent instances of our AudioSet-Tools downloader, configured to resample YouTube audio to 32 kHz, downmix files to mono, and skip amplitude normalization. Note that, given the large number of Negatives, multiple instances of this subset exist, because the class down-sampling process is randomized.
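The selection, merging, blacklist, and re-balancing steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the AudioSet-Tools implementation: the function names, the in-memory segment dictionaries, the abbreviated Negatives set, and the per-label cap used for re-balancing are all hypothetical. Labels are assumed to be already mapped to their display names. The blacklist exception is read here as "'Civil defense siren' does not itself count as a positive label", and the greedy capped down-sampling is only one possible reading of the randomized re-balancing.

```python
import random
from collections import Counter

# Positives as listed in the description; Negatives abbreviated for the demo.
POSITIVES = {"Police car (siren)", "Ambulance (siren)",
             "Fire engine, fire truck (siren)", "Emergency vehicle"}
NEGATIVES = {"Car", "Truck", "Bus", "Car alarm", "Civil defense siren"}
BLACKLIST = {"Civil defense siren"}

def select_by_label(segments, wanted):
    """Keep segments whose weak label set intersects `wanted`."""
    return [s for s in segments if s["labels"] & wanted]

def merge(*subsets):
    """Concatenate subsets, dropping duplicates by (ytid, start)."""
    seen, merged = set(), []
    for subset in subsets:
        for seg in subset:
            key = (seg["ytid"], seg["start"])
            if key not in seen:
                seen.add(key)
                merged.append(seg)
    return merged

def build_positives(*splits):
    """Select, merge, then drop every blacklisted segment."""
    selected = merge(*(select_by_label(s, POSITIVES) for s in splits))
    return [s for s in selected if not (s["labels"] & BLACKLIST)]

def build_negatives(*splits):
    """Select, merge, then drop segments carrying any positive label."""
    selected = merge(*(select_by_label(s, NEGATIVES) for s in splits))
    return [s for s in selected if not (s["labels"] & POSITIVES)]

def rebalance(segments, tracked, cap, seed=0):
    """Randomized down-sampling: no tracked label exceeds `cap` occurrences."""
    shuffled = list(segments)
    random.Random(seed).shuffle(shuffled)
    counts, kept = Counter(), []
    for seg in shuffled:
        hits = seg["labels"] & tracked
        if all(counts[label] < cap for label in hits):
            kept.append(seg)
            counts.update(hits)
    return kept

# Toy stand-ins for two AS splits (real input would come from the AS .csv files).
balanced = [
    {"ytid": "a", "start": 0.0, "labels": {"Ambulance (siren)", "Emergency vehicle"}},
    {"ytid": "b", "start": 10.0, "labels": {"Emergency vehicle", "Civil defense siren"}},
    {"ytid": "c", "start": 0.0, "labels": {"Car", "Police car (siren)"}},
]
unbalanced = [
    {"ytid": "a", "start": 0.0, "labels": {"Ambulance (siren)", "Emergency vehicle"}},
    {"ytid": "d", "start": 30.0, "labels": {"Car"}},
    {"ytid": "e", "start": 30.0, "labels": {"Car", "Truck"}},
    {"ytid": "f", "start": 0.0, "labels": {"Civil defense siren"}},
    {"ytid": "g", "start": 0.0, "labels": {"Car"}},
]

positives = build_positives(balanced, unbalanced)
negatives = build_negatives(balanced, unbalanced)
balanced_negatives = rebalance(negatives, NEGATIVES, cap=2)
```

Because the shuffle depends on the seed, different seeds can yield different Negatives subsets of the same size, which mirrors the remark that multiple instances of the Negatives subset exist.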
Summary Statistics

| | Samples | Emergency_Vehicle | Siren | Police car (siren) | Ambulance (siren) | Fire engine, fire truck (siren) |
|---|---|---|---|---|---|---|
| Positives | 8409 | 5700 | 4352 | 3643 | 1931 | 3187 |
| Downloaded | 7324 | 4972 | 3768 | 3124 | 1637 | 2852 |
| Difference (abs.) | 1085 | 728 | 584 | 519 | 294 | 335 |

References

S. Giacomelli et al., "AudioSet-Tools: a Python Research Framework for Custom AudioSet Distributing and Processing" (under peer review)
GitHub dataset folder: https://github.com/StefanoGiacomelli/audioset-tools/tree/main/EV-benchmark/AudioSet-EV
AudioSet-Tools: https://github.com/StefanoGiacomelli/audioset-tools/
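As a quick check of the Summary Statistics figures, the short sketch below recomputes the Difference (abs.) row from the Positives and Downloaded counts and derives a download yield per column. The counts are copied from the description; the function and variable names are illustrative only, and the description does not state why the missing segments could not be downloaded.

```python
# Counts copied from the Summary Statistics of the description.
COLUMNS = ["Samples", "Emergency_Vehicle", "Siren", "Police car (siren)",
           "Ambulance (siren)", "Fire engine, fire truck (siren)"]
LISTED = [8409, 5700, 4352, 3643, 1931, 3187]       # "Positives" row
DOWNLOADED = [7324, 4972, 3768, 3124, 1637, 2852]   # "Downloaded" row

def download_report(columns, listed, downloaded):
    """Return missing count and download yield (in percent) per column."""
    report = {}
    for name, n_listed, n_down in zip(columns, listed, downloaded):
        report[name] = {
            "missing": n_listed - n_down,
            "yield_pct": round(100.0 * n_down / n_listed, 1),
        }
    return report

report = download_report(COLUMNS, LISTED, DOWNLOADED)
for name, row in report.items():
    print(f"{name:32s} missing={row['missing']:5d}  yield={row['yield_pct']:.1f}%")
```

The recomputed differences match the published row, and the yield falls between about 85% and 90% in every column. The per-class counts add up to more than the Samples total, which is consistent with the weak multi-labeling noted in the description: one segment can carry several of these labels.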
Created:
2025-02-17