Bat-aggregated time series workflow
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.w0vt4b8zf
Resource description:
This dataset and its accompanying code provide radar-based detections of Brazilian free-tailed bats (Tadarida brasiliensis) across select regions of California and Texas, compiled from weather radar data from the NEXRAD (NEXt-generation weather RADar) system. NEXRAD radars, operated by the US National Weather Service, continuously monitor the airspace and detect various airborne organisms, including birds, insects, and bats.
The dataset was generated using the ‘BATS’ Python toolkit (program included), which automates the retrieval, processing, and classification of radar data. It employs a pre-trained machine learning model specifically designed to detect radar echoes associated with Brazilian free-tailed bats. The dataset includes the results from machine learning models trained and tested on radar data; the best model achieved an AUC (area under the ROC curve) of 0.963, indicating strong discrimination between bat and non-bat echoes. The dataset also includes pre-trained neural network and random forest models for reproducibility.
This dataset provides valuable spatiotemporal information on bat presence at a large landscape scale and across extended timeframes. By distilling radar data into efficient summaries of bat occurrence, the dataset enables researchers to explore patterns in bat activity and their potential ecosystem services, such as insect consumption, in agricultural regions.
Methods
Data Description
This dataset provides detailed radar-based detections of Brazilian free-tailed bats (Tadarida brasiliensis) across select regions of California and Texas. The data were compiled from the NEXRAD (NEXt-generation weather RADar) system, which operates S-band Doppler weather radars across the United States. NEXRAD radars detect various airborne targets such as birds, insects, and bats.
The dataset was processed using the 'BATS' Python toolkit, which automates the retrieval and classification of radar data. Using radar data sourced from the Amazon Web Services (AWS) repository, the BATS toolkit classifies radar echoes with a machine learning model trained to identify Brazilian free-tailed bats. The dataset contains bat presence information at a pixel resolution of 70 meters, derived from radar data over multiple time periods in 2018 and 2019. These data will be useful for researchers exploring bat ecology, insectivorous bat ecosystem services, and landscape-level bat monitoring.
The dataset includes:
Radar data processed to detect bat presence in California (2018) and Texas (2019)
Classified radar pixels indicating bat presence or absence
Machine learning-derived bat occurrence probabilities (thresholded for binary classification)
GeoTIFF files that aggregate radar data over six-month periods
Data Collection
The dataset was generated using NEXRAD radar data, sourced from AWS. The BATS Python toolkit facilitated the collection and processing of radar data files, automating the pipeline from raw radar retrieval to bat detection. Radar data was selected based on specific regions, timeframes, and weather conditions associated with confirmed Brazilian free-tailed bat emergence events. The radar data collected spans 11 weather-free days in California (2018) and 7 days in Texas (2019). Reference data on bat emergence was gathered from field observations provided by local bat monitoring organizations.
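As a rough illustration of the retrieval step, the sketch below builds the object prefix under which one station's Level II volume scans for one UTC day are grouped. It is not taken from the BATS toolkit. The `YYYY/MM/DD/STATION/` layout reflects how the public NEXRAD Level II archive on AWS is commonly organized and should be verified against the current archive documentation; the bucket name is left as a parameter, and the station code `KDAX` is only a hypothetical example, since the source does not name the radars used.

```python
from datetime import date


def level2_day_prefix(station: str, day: date) -> str:
    """Return the assumed key prefix for one station and one UTC day.

    Assumption: the archive groups files as YYYY/MM/DD/STATION/.
    """
    return f"{day:%Y/%m/%d}/{station.upper()}/"


def level2_day_uri(bucket: str, station: str, day: date) -> str:
    """Join a caller-supplied bucket name with the day prefix as an s3:// URI."""
    return f"s3://{bucket}/{level2_day_prefix(station, day)}"


if __name__ == "__main__":
    # "example-nexrad-bucket" is a placeholder, not a real bucket name.
    print(level2_day_uri("example-nexrad-bucket", "kdax", date(2018, 7, 15)))
```

A listing call against that prefix (for example with an S3 client) would then return the individual volume scan files for the selected day.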
Data Processing
Once downloaded, the raw radar data (Level II “.gz” files) was processed using the Py-ART library, which is designed for radar data manipulation. Py-ART converted the radar data from its native polar coordinates into a uniform Cartesian grid, with a resampled pixel resolution of 70 meters to facilitate accurate bat detection.
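To make the polar-to-Cartesian step concrete, here is a simplified nearest-neighbour resampler in plain Python. It is a stand-in for illustration only: the dataset itself was gridded with Py-ART, whose interpolation is more sophisticated. The function name, the evenly spaced azimuth assumption, and the `half_width_m` parameter are all hypothetical.

```python
import math


def polar_to_cartesian(sweep, gate_spacing_m, pixel_m=70.0, half_width_m=350.0):
    """Resample one radar sweep onto a square grid centred on the radar.

    sweep[i][j] holds the value for azimuth index i (evenly spaced, ray 0
    pointing north, increasing clockwise) and range gate j.
    Rows of the result run north to south, columns west to east.
    Pixels beyond the last gate are returned as None.
    """
    n_az = len(sweep)
    n_gates = len(sweep[0])
    az_step = 360.0 / n_az
    n_pix = int(round(2 * half_width_m / pixel_m))
    grid = []
    for row in range(n_pix):
        # y coordinate of the pixel centre; the first row is northernmost.
        y = half_width_m - (row + 0.5) * pixel_m
        out_row = []
        for col in range(n_pix):
            x = -half_width_m + (col + 0.5) * pixel_m
            rng = math.hypot(x, y)
            # Radar azimuth convention: 0 degrees is north, clockwise positive.
            az = math.degrees(math.atan2(x, y)) % 360.0
            i = int(round(az / az_step)) % n_az  # nearest ray
            j = int(rng // gate_spacing_m)       # gate containing this range
            out_row.append(sweep[i][j] if j < n_gates else None)
        grid.append(out_row)
    return grid
```

With the default 70 m pixels, a 700 m wide window yields a 10 by 10 grid; a real radar domain would use a much larger `half_width_m`.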
The processed radar data was then classified using a machine learning pipeline. The BATS toolkit includes scripts for classification, in which radar echoes were evaluated by pre-trained machine learning models. Three machine learning models were compared: random forest (RF), support vector machines (SVM), and artificial neural networks (ANN). The ANN model, selected for its superior performance (AUC of 0.963), was used to classify each radar pixel as either containing or not containing Brazilian free-tailed bats. The model's probability output is converted to a binary classification using a 90% probability threshold, a deliberately strict cutoff intended to minimize false positives.
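The thresholding step can be sketched in a few lines. This is an illustrative example rather than the toolkit's code: the function name is hypothetical, and treating a probability exactly equal to 0.9 as a detection (`>=`) is an assumption, since the source does not say whether the cutoff is inclusive.

```python
def threshold_probabilities(prob_grid, threshold=0.9):
    """Convert a grid of bat probabilities into a 0/1 presence mask.

    Pixels with no data (None) are treated as absence.
    Assumption: the threshold is inclusive.
    """
    return [
        [1 if p is not None and p >= threshold else 0 for p in row]
        for row in prob_grid
    ]


if __name__ == "__main__":
    probs = [[0.95, 0.50], [0.90, None]]
    print(threshold_probabilities(probs))
```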
Evaluation and Quality Control
To assess the accuracy of the models and their classifications, predictions were evaluated using standard binary classification metrics: precision, recall, AUC (area under the ROC curve), and precision-recall curves. Hyperparameter tuning and spatial cross-validation were performed to account for spatial autocorrelation in the radar data and to improve the generalization of the machine learning models.
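For readers unfamiliar with these metrics, the following sketch computes precision and recall at a fixed threshold, and AUC as the fraction of positive/negative pairs in which the positive example receives the higher score (ties count as half). The function names are hypothetical and the pairwise AUC is written for clarity, not speed; it is not the evaluation code used to produce the reported 0.963.

```python
def precision_recall(labels, scores, threshold=0.9):
    """Precision and recall when scores >= threshold count as detections."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))
```

Because AUC depends only on how scores are ranked, it is unaffected by the choice of the 90% cutoff, whereas precision and recall change with it.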
Training data for the model was primarily sourced from California, while independent testing was conducted using radar data from Texas. The dataset also includes labeled data representing noise sources (such as birds, vehicles, and weather phenomena) to reduce false positives during classification.
By processing large volumes of radar data and applying machine learning algorithms, the BATS toolkit condensed terabytes of raw radar data into concise GeoTIFF maps of bat presence, enabling efficient analysis of bat populations across landscapes.
Created:
2024-10-15
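One plausible way to condense many per-scan masks into a single map is to count, for each pixel, how many scans flagged bat presence. The sketch below shows that idea only; the source does not state which aggregation statistic the GeoTIFFs actually contain, so the per-pixel count and the function name are assumptions. Writing the result to a GeoTIFF would need a raster library and is not shown.

```python
def aggregate_masks(masks):
    """Sum equally shaped 0/1 grids into a per-pixel detection count."""
    if not masks:
        raise ValueError("need at least one mask")
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    counts = [[0] * n_cols for _ in range(n_rows)]
    for mask in masks:
        if len(mask) != n_rows or any(len(r) != n_cols for r in mask):
            raise ValueError("all masks must share one grid shape")
        for r in range(n_rows):
            for c in range(n_cols):
                counts[r][c] += mask[r][c]
    return counts


if __name__ == "__main__":
    scans = [
        [[1, 0], [0, 1]],
        [[1, 1], [0, 0]],
        [[0, 0], [0, 1]],
    ]
    print(aggregate_masks(scans))
```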



