Acoustic dataset for "Dual‐Signal Buzz Pollination Monitoring: How Flight and Floral Vibrations Complement Each Other to Improve Bee Species Identification"

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://data.mendeley.com/datasets/ggdyfh78d7

Description:

Research hypothesis

We hypothesized that the acoustic signatures of bee floral buzzes and flight sounds encode species-specific patterns that can be automatically distinguished using machine-learning methods. In particular, by combining “static” features (event duration, fundamental frequency, and a selected subset of Mel-frequency cepstral coefficients) with “dynamic” or cinematic features (statistical moments, RMS envelope, spectral band-power in predefined bands, and a rolling-element fault indicator), a Random Forest classifier can reliably separate floral buzzes from flight buzzes and further discriminate among five bee species.

Data description & collection

This dataset comprises 200 individual sound events (100 floral buzzes and 100 flight buzzes) from five bee species. For each event we provide:

Metadata (buzz_data.csv)
- id: unique integer identifier
- type: behavior label (“floral” or “flight” buzzes)

Acoustic features
- duration_s: buzz or flight duration in seconds
- fundamental_frequency (Hz)
- Fifteen selected MFCC coefficients (e.g. mel_7, mel_20, …, mel_167)

All raw audio files (.wav), per-sample time-amplitude exports (.txt), and acoustic feature tables are provided.

Notable findings

Using a Random Forest classifier (200 trees) with 1,000 randomized train/test splits (33 % hold-out each), we observe:
- Floral buzz classification: 82.2 % ± 6.7 % accuracy
- Flight buzz classification: 90.9 % ± 5.2 % accuracy
- Combined classification: 95.0 % ± 3.9 % accuracy

Feature-importance analysis consistently ranked fundamental frequency, certain MFCC quantiles (e.g. the 75th and 97.5th percentiles of MFCC 5 and 11), and specific envelope band-power bands (310–400 Hz, 110–130 Hz, 225–250 Hz) as most discriminative.
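The evaluation protocol described above (a 200-tree Random Forest scored over repeated random splits with a 33 % hold-out, reported as mean ± standard deviation) could look roughly like the following sketch. It is not the authors' script. It assumes scikit-learn and NumPy are installed, runs on synthetic placeholder features rather than the dataset itself, uses only 20 splits to keep the run short, and treats the spread as the standard deviation across splits. All variable names are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature table (NOT the real data):
# 200 events, 5 species, 40 events per species.
n_species, per_species = 5, 40
species = np.repeat(np.arange(n_species), per_species)

# Two made-up features whose means shift with the species label.
fundamental_hz = 200 + 40 * species + rng.normal(0, 10, species.size)
duration_s = 0.5 + 0.1 * species + rng.normal(0, 0.05, species.size)
X = np.column_stack([fundamental_hz, duration_s])

# 200 trees, as stated in the description.
clf = RandomForestClassifier(n_estimators=200, random_state=0)

# Repeated random train/test splits with a 33 % hold-out.
# The description reports 1,000 splits; 20 are used here for speed.
splitter = ShuffleSplit(n_splits=20, test_size=0.33, random_state=0)
scores = cross_val_score(clf, X, species, cv=splitter, scoring="accuracy")

mean_acc, sd_acc = scores.mean(), scores.std()
print(f"accuracy: {100 * mean_acc:.1f} % ± {100 * sd_acc:.1f} %")
```

To approximate the reported setup, X and species would be replaced by the feature columns and species labels from the provided tables, and n_splits raised to 1000. Whether the original splits were stratified by species is not stated, so the unstratified ShuffleSplit here is an assumption.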

Interpretation & reuse

Researchers can load the CSV tables into Python, R, or other tools to:
- Reproduce or benchmark acoustic classification models
- Explore species-specific acoustic patterns via feature analysis
- Extend the dataset with new recordings or alternative feature-extraction methods
- Integrate into automated pollinator monitoring pipelines

The accompanying GitHub repository (DOI: …) contains all Python scripts for feature extraction and model training. Users should first run the cinematic-feature extraction script on the raw .txt files, then apply the classification script to reproduce our results.

Licensing & citation

This dataset is released under CC BY 4.0. Please cite the Mendeley Data DOI and our code DOI (Zenodo) when reusing.
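
As a starting point for loading the metadata table into Python, a minimal standard-library sketch is shown below. The column names id, type, and duration_s come from the description above; the exact header for the fundamental-frequency column (written here as fundamental_frequency) is an assumption, and the inline rows are invented placeholders, not values from the dataset.

```python
import csv
import io
from collections import Counter

# Placeholder rows mimicking the described layout of buzz_data.csv.
# These numbers are invented for illustration, not real measurements.
SAMPLE_CSV = """id,type,duration_s,fundamental_frequency
1,floral,0.82,310.0
2,flight,1.40,185.5
3,floral,0.65,295.2
4,flight,2.10,190.1
"""


def load_events(handle):
    """Parse rows from an open CSV handle, converting numeric columns."""
    events = []
    for row in csv.DictReader(handle):
        events.append({
            "id": int(row["id"]),
            "type": row["type"],
            "duration_s": float(row["duration_s"]),
            "fundamental_frequency": float(row["fundamental_frequency"]),
        })
    return events


events = load_events(io.StringIO(SAMPLE_CSV))

# Count events per behavior label and average the fundamental frequency.
counts = Counter(e["type"] for e in events)
mean_f0 = {
    label: sum(e["fundamental_frequency"] for e in events if e["type"] == label)
    / counts[label]
    for label in counts
}
print(counts)
print(mean_f0)
```

With the downloaded file, the same function can be called as load_events(open("buzz_data.csv", newline="")), after adjusting the column names to match the actual header row.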

Created:

2025-07-16