Acoustic dataset for "Dual‐Signal Buzz Pollination Monitoring: How Flight and Floral Vibrations Complement Each Other to Improve Bee Species Identification"
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/ggdyfh78d7
Description:
Research hypothesis
We hypothesized that the acoustic signatures of bee floral buzzes and flight sounds encode species-specific patterns that can be automatically distinguished using machine-learning methods. In particular, by combining “static” features (event duration, fundamental frequency, and a selected subset of Mel-frequency cepstral coefficients) with “dynamic” or cinematic features (statistical moments, RMS envelope, spectral band-power in predefined bands, and a rolling‐element fault indicator), a Random Forest classifier can reliably separate floral buzz from flight and further discriminate among five bee species.
Data description & collection
This dataset comprises 200 individual sound events (100 floral buzzes and 100 flight buzzes) recorded from five bee species. For each event we provide:
Metadata (buzz_data.csv)
id: unique integer identifier
type: behavior label (“floral” or “flight” buzzes)
Acoustic features
duration_s: buzz or flight duration in seconds
fundamental_frequency: fundamental frequency in Hz
Fifteen selected MFCCs (e.g. mel_7, mel_20, …, mel_167)
All raw audio files (.wav), per-sample time‐amplitude exports (.txt), and acoustic feature tables are provided.
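As a rough illustration of the kind of "dynamic" features named in the hypothesis, the sketch below computes a framed RMS envelope and the spectral power in three frequency bands. It is not the authors' extraction script. The layout of the .txt exports is not described here, so the example runs on a synthetic 350 Hz tone. The sample rate, frame length, hop size, and the choice to measure band power on the raw signal rather than on the envelope are all assumptions; only the band limits are taken from the findings reported on this page.

```python
import numpy as np


def rms_envelope(signal, frame_len, hop):
    """Root-mean-square amplitude of successive, overlapping frames."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.sqrt(np.mean(signal[i * hop: i * hop + frame_len] ** 2))
        for i in range(n_frames)
    ])


def band_power(signal, sample_rate, f_lo, f_hi):
    """Summed spectral power in [f_lo, f_hi) Hz from a one-sided FFT."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return float(spectrum[mask].sum())


# Synthetic stand-in for one recording (assumption): 0.5 s tone at 350 Hz.
sample_rate = 8000  # Hz, hypothetical
t = np.arange(int(0.5 * sample_rate)) / sample_rate
signal = 0.8 * np.sin(2 * np.pi * 350.0 * t)

envelope = rms_envelope(signal, frame_len=400, hop=200)

# Band limits (Hz) as listed among the most discriminative bands.
bands = [(110, 130), (225, 250), (310, 400)]
powers = {band: band_power(signal, sample_rate, *band) for band in bands}
strongest = max(powers, key=powers.get)
```

To apply this to a real recording, replace the synthetic `signal` with the amplitude column of one export and set `sample_rate` to the rate of the corresponding .wav file.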
Notable findings
Using a Random Forest classifier (200 trees) with 1 000 randomized train/test splits (33 % hold-out each), we observe:
Floral buzz classification: 82.2 % ± 6.7 % accuracy
Flight buzz classification: 90.9 % ± 5.2 % accuracy
Combined classification: 95.0 % ± 3.9 % accuracy
Feature-importance analysis consistently ranked fundamental frequency, certain MFCC quantiles (e.g. the 75th and 97.5th percentiles of MFCC 5 and 11), and specific envelope band-power bands (310–400 Hz, 110–130 Hz, 225–250 Hz) as the most discriminative features.
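The evaluation protocol described above (a 200-tree Random Forest scored over many random 33 % hold-out splits) can be sketched with scikit-learn as follows. This is a minimal sketch, not the authors' classification script. The stratified splitting, the fixed seeds, and the reading of "±" as a standard deviation across splits are assumptions, and the feature matrix is synthetic (`make_classification`), so the printed numbers will not match the reported accuracies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedShuffleSplit


def repeated_holdout_accuracy(X, y, n_splits=1000, test_size=0.33,
                              n_trees=200, seed=0):
    """Mean and standard deviation of hold-out accuracy over random splits."""
    splitter = StratifiedShuffleSplit(
        n_splits=n_splits, test_size=test_size, random_state=seed
    )
    scores = []
    for train_idx, test_idx in splitter.split(X, y):
        clf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        clf.fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[test_idx], y[test_idx]))
    scores = np.asarray(scores)
    return scores.mean(), scores.std()


# Synthetic placeholder: 200 events, 17 features, 5 classes (not real data).
X, y = make_classification(
    n_samples=200, n_features=17, n_informative=8, n_classes=5, random_state=0
)

# Only 5 splits here to keep the demo fast; the description uses 1 000.
mean_acc, std_acc = repeated_holdout_accuracy(X, y, n_splits=5)
print(f"accuracy: {mean_acc:.3f} ± {std_acc:.3f}")
```

With the real feature tables, `X` would hold the feature columns and `y` the species or behavior labels, and `n_splits` would be raised back to 1000.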
Interpretation & reuse
Researchers can load the CSV tables into Python, R, or other tools to:
Reproduce or benchmark acoustic classification models
Explore species-specific acoustic patterns via feature analysis
Extend the dataset with new recordings or alternative feature-extraction methods
Integrate into automated pollinator monitoring pipelines
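As a starting point for the uses listed above, the following sketch reads a table shaped like buzz_data.csv with Python's standard csv module and averages one feature per behavior type. The column names id, type, duration_s, and fundamental_frequency follow the metadata description; the four rows are invented placeholders for illustration, not values from the dataset, and the helper name `mean_by_type` is hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Placeholder rows mimicking the documented columns (NOT real measurements).
sample_csv = """id,type,duration_s,fundamental_frequency
1,floral,0.82,340.5
2,flight,1.90,210.0
3,floral,0.64,355.1
4,flight,2.35,198.4
"""


def mean_by_type(handle, column):
    """Average a numeric column separately for each value of 'type'."""
    values = defaultdict(list)
    for row in csv.DictReader(handle):
        values[row["type"]].append(float(row[column]))
    return {label: mean(vals) for label, vals in values.items()}


summary = mean_by_type(io.StringIO(sample_csv), "fundamental_frequency")
for label, value in summary.items():
    print(f"{label}: {value:.1f} Hz")
```

For the downloaded file, the same function can be called on `open("buzz_data.csv", newline="")` in place of the in-memory string, assuming the file uses a comma delimiter and a header row.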
The accompanying GitHub repository (DOI: …) contains all Python scripts for feature extraction and model training. Users should first run the cinematic-feature extraction script on the raw .txt files, then apply the classification script to reproduce our results.
Licensing & citation
This dataset is released under CC BY 4.0. Please cite the Mendeley Data DOI and our code DOI (Zenodo) when reusing.
Created:
2025-07-16



