Colonial Bird Nest Detection Dataset: Biscayne National Park Aerial Monitoring 2010-2024
Source: DataCite Commons. Updated 2026-04-01; indexed 2026-05-03.
Download link:
https://dataverse.fiu.edu/citation?persistentId=doi:10.34703/gzx1-9v95/UD9HTD
Resource description:
This dataset supports automated wildlife monitoring and computer vision research for colonial waterbird populations at Biscayne National Park, Florida. It includes 15,759 georeferenced aerial survey images with 161,744 ground-truth bounding box annotations, flight survey metadata from 117 helicopter surveys across 15 years, a trained YOLOv5s6 object detection model, and complete source code and documentation. Data were collected under a Cooperative Ecosystem Studies Unit (CESU) agreement between the National Park Service and the Florida International University GIS Center.

📁 What Is in This Dataverse Record

This Dataverse record contains curated metadata, model artifacts, tabular data, and documentation: a representative subset designed for citation and reuse. The full 113 GB image dataset is too large for direct Dataverse deposit and is hosted separately on the Open Science Data Federation (OSDF) via the FIU Pelican origin server (see the Accessible section below).
- README.md: Dataset overview and usage guide
- DATA_DICTIONARY.md: Field definitions for all tabular files
- 1_Survey_Images/image_index.csv: Per-image index with filename, year, location, annotation counts, and direct OSDF URL for each of the 15,759 images
- 1_Survey_Images/README.md: Survey images directory documentation
- 2_Flight_Survey_Metadata/flight_survey_log.csv: 117 helicopter surveys (date, site, observer, conditions)
- 2_Flight_Survey_Metadata/location_reference.csv: 11 canonical monitoring site names, coordinates, and aliases
- 3_Detection_Model/weights/best.pt: YOLOv5s6 trained weights, best checkpoint (PyTorch, ~14 MB)
- 3_Detection_Model/evaluation/: confusion_matrix.png, F1/P/R/PR curves, results.csv, results.png
- 3_Detection_Model/README.md: Model architecture, training setup, and evaluation results
- 4_Source_Code_Documentation/notebooks/: 5 Jupyter notebooks (data preprocessing, model training, evaluation)
- 4_Source_Code_Documentation/project_docs/: System design PDF (birdNestDesign_v1.pdf)

💡 Note: The full image dataset (15,759 JPEG images, YOLO annotations, train/val/test splits, 113 GB) is hosted on OSDF/Pelican and is NOT deposited here. Use image_index.csv to access individual images by their OSDF URLs, or download the full set via pelican object sync (see the Accessible section). No account is required for read access.

📌 F: Findable

This dataset is registered with a persistent identifier (DOI) in the FIU GIS Center Dataverse Collection.
The survey image files (113 GB in total) are accessible via the Open Science Data Federation (OSDF) through a resolvable Pelican federated URL:

- Pelican Federation URL: pelican://osg-htc.org/envistor/FIU-GIS-CENTER/birdnest-biscayne/survey-images/
- HTTPS Direct Access: https://pelican.fiu.edu:8443/envistor/FIU-GIS-CENTER/birdnest-biscayne/survey-images/
- Tabular image index with per-file OSDF URLs: 1_Survey_Images/image_index.csv (deposited in this Dataverse record)

Keywords are mapped to ITIS taxonomic identifiers for all four monitored species (Double-crested Cormorant, Great Egret, Great Blue Heron, Tricolored Heron).

🔒 A: Accessible

Tabular data, model files, and documentation are deposited directly in this Dataverse record and are freely downloadable under CC0 1.0.

Survey images (113 GB, 15,759 JPEG files) are hosted on the Open Science Data Federation (OSDF) via the FIU Pelican origin server. OSDF is a distributed data delivery network operated by the OSG Consortium, designed for large-scale scientific data access; public namespaces can be read without an account.

How to access images via Pelican/OSDF:

1. No account is required for read access. The /envistor/FIU-GIS-CENTER/ namespace is configured as PublicReads.
2. Install the Pelican client (see "Accessing Data" in the Pelican Platform documentation).
3. Download a single file:
    pelican object get pelican://osg-htc.org/envistor/FIU-GIS-CENTER/birdnest-biscayne/survey-images/images/2024/ArsenickerKey/example.jpg ./local_dir/
4. Download the full image directory (resumable):
    pelican object sync pelican://osg-htc.org/envistor/FIU-GIS-CENTER/birdnest-biscayne/survey-images/ ./birdnest_images/
5. Alternatively, browse files via HTTPS:
    https://pelican.fiu.edu:8443/envistor/FIU-GIS-CENTER/birdnest-biscayne/survey-images/

For programmatic access, or to request a token for write access, see the Pelican Platform Documentation or contact the FIU GIS Center at levente.juhasz@ufl.edu.
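The two access prefixes listed above point at the same namespace, so a Pelican URL can be rewritten into its HTTPS form when the Pelican client is not installed. The following is a minimal sketch, assuming that object paths are identical under both prefixes (the description shows this for the top-level directory only). The helper names `pelican_to_https` and `download` are illustrative and are not part of the dataset's own tooling.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlsplit

# Both values are copied from the dataset description.
FEDERATION_HOST = "osg-htc.org"
HTTPS_ORIGIN = "https://pelican.fiu.edu:8443"


def pelican_to_https(pelican_url: str) -> str:
    """Rewrite a pelican:// URL into the HTTPS direct-access form.

    Assumption: the object path is the same under both prefixes.
    """
    parts = urlsplit(pelican_url)
    if parts.scheme != "pelican" or parts.netloc != FEDERATION_HOST:
        raise ValueError(f"not a pelican://{FEDERATION_HOST}/ URL: {pelican_url}")
    return HTTPS_ORIGIN + parts.path


def download(pelican_url: str, dest_dir: str) -> Path:
    """Fetch one object over HTTPS and return the local file path."""
    https_url = pelican_to_https(pelican_url)
    dest = Path(dest_dir) / Path(urlsplit(https_url).path).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(https_url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

For bulk transfers, `pelican object sync` is likely the better choice because the description states that it is resumable; the sketch above has no retry or resume logic.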
🔁 I: Interoperable

Survey Images

- Format: JPEG, captured via helicopter-mounted camera during monthly NPS surveys
- Volume: 15,759 images across 117 flights (2010–2024)
- Directory structure: images/{year}/{location}/ (e.g., images/2016/MangroveKey/)
- 11 canonical monitoring sites: ArsenickerKey, JonesLagoon, KeyBiscayne, MangroveKey, RaggedKey, RaggedKeyNorth, SoldierKey, WestArsenickerKey, and others in Biscayne National Park (25.35°N–25.67°N, 80.08°W–80.25°W)
- Image index: image_index.csv provides per-image metadata including filename, year, location, annotation count by class, and direct OSDF URL for programmatic access

Annotations

- Format: YOLO v5 (normalized bounding boxes: class_id cx cy w h, one annotation per line)
- Class mapping:
    0 = occupied_nest (138,400 instances)
    1 = egg (3,611 instances)
    2 = chick (8,495 instances)
    3 = non_occupied_nest (11,238 instances)
- Total: 161,744 bounding box annotations across 15,759 images (avg 10.3 per image)
- Directory structure: annotations/{year}/{location}/, mirroring the image structure
- Provenance: Originally drawn as circles on aerial photographs by NPS trained observers; converted to YOLO bounding boxes by FIU GIS Center

Training Splits

- splits/train.txt: 12,609 images (80%)
- splits/valid.txt: 1,575 images (10%)
- splits/test.txt: 1,575 images (10%)
- Splits are stratified by location to avoid spatial data leakage

Detection Model

- Architecture: YOLOv5s6 (the small P6 variant of YOLOv5, which adds a stride-64 P6 detection output; selected for inference speed on edge hardware over the larger YOLOv5m6/l6 variants)
- Training: Amazon SageMaker ml.g5.12xlarge (4× NVIDIA A10G), April 2025
- Best checkpoint: 3_Detection_Model/weights/best.pt (PyTorch, ~14 MB)
- Performance: mAP@0.5 = 0.458 (all classes), F1 = 0.45 at a confidence threshold of 0.35
- Evaluation outputs: confusion matrix, precision/recall curves, F1 curve, results.csv

♻ R: Reusable

License: CC0 1.0 Universal (public domain). No attribution required, but citation is appreciated.
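The annotation format described in the Interoperable section (class_id cx cy w h, normalized to the image size) can be converted to pixel-space corner coordinates with a few lines of Python. This is a minimal sketch that uses the class mapping given above. The image width and height are not stated in the description, so they are passed in by the caller, and the names `Box`, `parse_yolo_line`, `parse_yolo_text`, and `count_labels` are illustrative.

```python
from collections import Counter
from typing import List, NamedTuple

# Class mapping as given in the dataset description.
CLASS_NAMES = {0: "occupied_nest", 1: "egg", 2: "chick", 3: "non_occupied_nest"}


class Box(NamedTuple):
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_line(line: str, img_w: int, img_h: int) -> Box:
    """Convert one 'class_id cx cy w h' line to a pixel-space box."""
    class_id, cx, cy, w, h = line.split()
    # Scale the normalized center and size to pixels.
    cx_px, cy_px = float(cx) * img_w, float(cy) * img_h
    w_px, h_px = float(w) * img_w, float(h) * img_h
    return Box(
        CLASS_NAMES[int(class_id)],
        cx_px - w_px / 2,
        cy_px - h_px / 2,
        cx_px + w_px / 2,
        cy_px + h_px / 2,
    )


def parse_yolo_text(text: str, img_w: int, img_h: int) -> List[Box]:
    """Parse a whole annotation file, skipping blank lines."""
    return [parse_yolo_line(ln, img_w, img_h) for ln in text.splitlines() if ln.strip()]


def count_labels(boxes: List[Box]) -> Counter:
    """Tally boxes per class name."""
    return Counter(box.label for box in boxes)


# Example with a made-up annotation and a hypothetical 1000 x 800 image.
example = parse_yolo_line("0 0.5 0.5 0.25 0.125", img_w=1000, img_h=800)
print(example)
```

In the example, the box is centered at (500, 400) pixels with a size of 250 by 100 pixels, so its corners are (375, 350) and (625, 450).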
How to reuse this dataset

1. Run inference with the provided model:
    git clone https://github.com/Keven1894/birdnest-biscayne-detection
    cd birdnest-biscayne-detection && pip install -r requirements.txt
    python detect.py --weights path/to/best.pt --source path/to/images/
2. Re-train or fine-tune: Use the provided splits/train.txt and splits/valid.txt with any YOLOv5-compatible framework. The dataset is compatible with Ultralytics YOLOv5, Ultralytics YOLO11, and Roboflow.
3. Access images programmatically using image_index.csv: Each row in image_index.csv contains a direct OSDF URL. Use requests or pelican object get to fetch specific subsets by year, location, or annotation density.
4. Extend to new survey years: New NPS survey images can be annotated with CVAT or Roboflow and ingested using the preprocessing scripts in the repository.

Source Code & Documentation

- GitHub Repository: https://github.com/Keven1894/birdnest-biscayne-detection
- Contains: training pipeline, inference scripts, Flask web demo, annotation preprocessing, and Jupyter notebooks
- Supplementary notebooks and project documentation are also deposited in this Dataverse record under 4_Source_Code_Documentation/

Suggested Citation

Juhasz, L. & Guan, B. (2025). Colonial Bird Nest Detection Dataset: Biscayne National Park Aerial Monitoring 2010–2024. FIU GIS Center / University of Florida GATOR Lab Dataverse. https://doi.org/10.34703/gzx1-9v95/UD9HTD
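For the programmatic access route described under "How to reuse this dataset", one way to pick a subset by year, location, or annotation density is to filter image_index.csv before downloading anything. The sketch below is hedged: the description says the index holds filename, year, location, annotation counts, and an OSDF URL, but it does not give the exact column headers. The names `year`, `location`, `n_annotations`, and `osdf_url` are therefore assumptions; check DATA_DICTIONARY.md for the real ones. The sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterator, Optional, TextIO


def select_image_urls(
    index_file: TextIO,
    year: Optional[int] = None,
    location: Optional[str] = None,
    min_annotations: int = 0,
    url_col: str = "osdf_url",         # assumed column name
    count_col: str = "n_annotations",  # assumed column name
) -> Iterator[str]:
    """Yield the URL of every index row that passes all given filters."""
    for row in csv.DictReader(index_file):
        if year is not None and int(row["year"]) != year:
            continue
        if location is not None and row["location"] != location:
            continue
        if int(row[count_col]) < min_annotations:
            continue
        yield row[url_col]


# Made-up rows with placeholder URLs, used only to show the filter.
SAMPLE = """filename,year,location,n_annotations,osdf_url
a.jpg,2016,MangroveKey,12,https://example.org/a.jpg
b.jpg,2016,MangroveKey,3,https://example.org/b.jpg
c.jpg,2024,ArsenickerKey,20,https://example.org/c.jpg
"""

urls = list(
    select_image_urls(
        io.StringIO(SAMPLE), year=2016, location="MangroveKey", min_annotations=10
    )
)
print(urls)
```

With the real index, the same function could be called on a file opened with `open("1_Survey_Images/image_index.csv", newline="")`, and each returned URL passed to requests or `pelican object get`.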
Provider:
FIU Research Data Portal
Created:
2026-04-01



