Plant detection dataset from Mapillary images in Guadeloupe
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14916324
Description:
This dataset contains automated plant identification results derived from onboard vehicle (wardriving) camera data in Guadeloupe, using Mapillary imagery and the Pl@ntNet API for species recognition.
The results come from automated queries to the Mapillary API over a large bounding box covering the main islands of Guadeloupe (Basse-Terre, Grande-Terre, Marie-Galante). The bounding box coordinates were set to [-61.9, 15.9, -61.1, 16.6] (minimum longitude, minimum latitude, maximum longitude, maximum latitude), and the queries were performed on 2025-02-12. Each returned Mapillary image was then sent to the Pl@ntNet API to attempt automatic identification of the plant visible in that image.
The dataset is stored as a CSV file with one row per image. Below is a description of each column:
image_id
A unique identifier for the Mapillary image.
lat
Latitude of the image (WGS84).
lon
Longitude of the image (WGS84).
capture_date
Unix timestamp (in milliseconds) corresponding to the date/time the image was captured.
image_url
Direct link to the image resource as hosted on Mapillary's associated CDN.
best_match_scientific_name
The top-ranked plant species name returned by Pl@ntNet.
If the field is “Not Found,” it means the Pl@ntNet API did not return a valid species match (e.g., status code 404).
best_match_probability
Probability (score) associated with the best_match_scientific_name, as provided by Pl@ntNet.
May be empty or zero when no valid species was found.
plantnet_data
A JSON string containing the full set of results from Pl@ntNet.
Includes details such as predicted organs, alternative species suggestions, taxonomic information, and associated metadata.
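As an illustration of the column layout above, here is a minimal sketch that reads rows with Python's standard library and converts each field to a typed value. The helper names (parse_row, read_rows) and the two sample rows are made up for this example and are not taken from the dataset; the internal structure of plantnet_data is not assumed beyond it being a JSON string.

```python
import csv
import io
import json
from datetime import datetime, timezone


def parse_row(row):
    """Convert one raw CSV row (a dict of strings) into typed Python values."""
    prob_text = (row.get("best_match_probability") or "").strip()
    raw_json = (row.get("plantnet_data") or "").strip()
    try:
        plantnet = json.loads(raw_json) if raw_json else None
    except json.JSONDecodeError:
        plantnet = None  # keep going if a cell holds malformed JSON
    # capture_date is a Unix timestamp in milliseconds, so divide by 1000.
    seconds = int(float(row["capture_date"])) / 1000
    return {
        "image_id": row["image_id"],
        "lat": float(row["lat"]),
        "lon": float(row["lon"]),
        "captured_at": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "image_url": row["image_url"],
        "species": row["best_match_scientific_name"],
        # The probability may be empty when no valid species was found.
        "probability": float(prob_text) if prob_text else None,
        "plantnet": plantnet,
    }


def read_rows(handle):
    """Read every row from an open CSV file handle."""
    return [parse_row(r) for r in csv.DictReader(handle)]


# Hypothetical sample rows, invented for illustration only.
SAMPLE = io.StringIO(
    "image_id,lat,lon,capture_date,image_url,"
    "best_match_scientific_name,best_match_probability,plantnet_data\n"
    '1001,16.25,-61.55,1700000000000,https://example.invalid/1001.jpg,'
    'Cocos nucifera,0.42,"{""results"": []}"\n'
    "1002,16.01,-61.70,1600000000000,https://example.invalid/1002.jpg,"
    "Not Found,,\n"
)
rows = read_rows(SAMPLE)
for r in rows:
    print(r["image_id"], r["captured_at"].date(), r["species"], r["probability"])
```

For the real file, the same read_rows function can be called on a handle opened with open(path, newline="", encoding="utf-8"), where path is wherever the CSV from the deposit was saved.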
Data Collection Method
The Mapillary API was queried systematically across the bounding box ([-61.9, 15.9, -61.1, 16.6]) on 2025-02-12, retrieving available images taken along roads or other traversable paths. For each image, the latitude, longitude, and capture date were extracted.
Each image was then submitted to the Pl@ntNet API for plant identification. Only the best match (top-ranked species) and the associated probability are recorded in specific columns, while the entire Pl@ntNet response is stored under plantnet_data.
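The description does not say how the bounding box was traversed. One common way to query a large area systematically is to split it into smaller tiles and query each tile in turn; the sketch below shows only that tiling step. The function name tile_bbox and the 8 x 7 grid size are assumptions made for illustration, not the method used to build this dataset, and no API call is made.

```python
def tile_bbox(bbox, n_cols, n_rows):
    """Split (min_lon, min_lat, max_lon, max_lat) into n_cols x n_rows tiles."""
    min_lon, min_lat, max_lon, max_lat = bbox
    if n_cols < 1 or n_rows < 1:
        raise ValueError("n_cols and n_rows must be at least 1")
    lon_step = (max_lon - min_lon) / n_cols
    lat_step = (max_lat - min_lat) / n_rows
    tiles = []
    for r in range(n_rows):
        for c in range(n_cols):
            west = min_lon + c * lon_step
            south = min_lat + r * lat_step
            # Snap the outer edges to the exact bounds to avoid float drift.
            east = max_lon if c == n_cols - 1 else west + lon_step
            north = max_lat if r == n_rows - 1 else south + lat_step
            tiles.append((west, south, east, north))
    return tiles


# Bounding box from the dataset description.
GUADELOUPE_BBOX = (-61.9, 15.9, -61.1, 16.6)

# Hypothetical grid size: roughly 0.1 degree per tile in each direction.
tiles = tile_bbox(GUADELOUPE_BBOX, n_cols=8, n_rows=7)
print(len(tiles), "tiles; first:", tiles[0], "last:", tiles[-1])
```

Each tile could then be passed as the bounding box of a separate image search, which keeps individual responses small and makes a failed request easy to retry.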
Notes and Caveats
Single-species assumption: Each image is labeled with at most one best match species. In reality, scenes may contain multiple plant taxa; however, for simplicity, this dataset focuses on the top match returned by Pl@ntNet. Full results are nevertheless stored in the plantnet_data column.
Spatial bias: Data covers only roads and areas with Mapillary coverage at the time of querying.
Temporal mismatch: The capture_date of each image can vary widely; it is not uniform across all images. This can introduce seasonal or inter-annual discrepancies in plant appearance.
Accuracy: Automated identification may yield false positives or imprecise probabilities, especially if the plant is partially obscured, out of focus, or not a common species in Pl@ntNet’s training set.
Licenses and usage:
Mapillary imagery is subject to the CC-BY-SA licence (see this page).
Pl@ntNet usage data follows Pl@ntNet’s license and usage policy.
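Given the accuracy caveat above, a user may want to discard "Not Found" rows and low-scoring matches before counting species. The following is a minimal sketch under stated assumptions: the function name confident_matches, the 0.5 threshold, and the sample records are all hypothetical and chosen for illustration; an appropriate threshold depends on the intended analysis.

```python
from collections import Counter


def confident_matches(records, min_probability=0.5):
    """Return (species, probability) pairs at or above the threshold.

    Rows marked "Not Found" and rows with an empty or unparsable
    probability are skipped.
    """
    kept = []
    for rec in records:
        name = (rec.get("best_match_scientific_name") or "").strip()
        prob_text = str(rec.get("best_match_probability") or "").strip()
        if not name or name == "Not Found" or not prob_text:
            continue
        try:
            prob = float(prob_text)
        except ValueError:
            continue
        if prob >= min_probability:
            kept.append((name, prob))
    return kept


# Hypothetical records, invented for illustration only.
records = [
    {"best_match_scientific_name": "Cocos nucifera", "best_match_probability": "0.82"},
    {"best_match_scientific_name": "Cocos nucifera", "best_match_probability": "0.31"},
    {"best_match_scientific_name": "Not Found", "best_match_probability": ""},
    {"best_match_scientific_name": "Mangifera indica", "best_match_probability": "0.67"},
    {"best_match_scientific_name": "Cocos nucifera", "best_match_probability": "0.55"},
]

kept = confident_matches(records, min_probability=0.5)
counts = Counter(name for name, _ in kept)
print(counts.most_common())
```

Filtering on the top score alone does not address the single-species assumption; alternative suggestions remain available in the plantnet_data column for analyses that need them.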
Acknowledgments
The Mapillary platform for providing access to street-level imagery and enabling community-based data collection.
The Pl@ntNet team for making plant identification services available via API.
This dataset is released for research and educational purposes. Users are encouraged to cite this deposit if it contributes to any publications or projects.
Created:
2025-02-24



