EcoCleanR: Enhancing data quality of biogeographic ranges with application for marine invertebrates
Source: DataONE | Updated: 2026-01-22 | Indexed: 2026-02-07
Download link:
https://search.dataone.org/view/sha256:f86e098fc55fb169dbf8875dc98045b3ca7c8cfaabce41779428766230a72464
Resource description:
Published distribution data, while invaluable for understanding species' biogeography, often suffer from limitations such as dated and static representations of ranges, a bias toward latitudinal information, and a lack of resolution in sampling frequency and variation in abundance throughout a species' distribution. Extensive open-source biodiversity data now allow us to construct biogeographic ranges with more modern observations, which can be useful in conservation, evolution, and ecological studies. However, data quality remains a persistent challenge, hampering data reliability and usability. We introduce EcoCleanR, an R package that integrates existing tools with new functionalities to address data integration and quality assessment through a systematic, step-by-step approach for marine occurrence data. This package enhances the process of identifying and resolving common issues in biodiversity data, including taxonomy and georeferencing errors. It provides: (a) example scripts to ...

# EcoCleanR: Enhancing data quality of biogeographic ranges with application for marine invertebrates
Dataset DOI: [10.5061/dryad.fj6q57488](https://doi.org/10.5061/dryad.fj6q57488)
## Description of the data and file structure
We have provided sample data files extracted in April 2025 to demonstrate a case study on the species *Mexacanthina lugubris*, a marine mollusc that inhabits rocky intertidal environments along the Pacific coast. The data sources used for this case study include GBIF, OBIS, iDigBio, and a local database (InvertEbase). The number of occurrence records may vary depending on when the data are extracted.
We created a merged dataset and removed all duplicate records using the EcoCleanR functions `ec_db_merge()` and `ec_rm_duplicate()`, resulting in an output file named **`ecodata.csv`**. All subsequent analyses can be performed using this file, with reproducible results as demonstrated in the package README.
Note: Cells containing `NA` indicate that no value was ...
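
To make the merge-and-deduplicate step concrete, here is a minimal conceptual sketch in Python using only the standard library. It is not the EcoCleanR implementation and does not reproduce the behaviour or arguments of `ec_db_merge()` or `ec_rm_duplicate()`. The column names, the choice of key fields, and the toy rows are all assumptions made for illustration, and none of the values come from the dataset. The sketch also shows one way to treat `NA` cells as missing values when reading a CSV file.

```python
import csv
import io

# Hypothetical key columns used to decide whether two records are duplicates.
# The real ecodata.csv may use different column names.
KEY_FIELDS = ("scientific_name", "latitude", "longitude", "event_date")


def read_records(text, source):
    """Parse CSV text into dicts, tag each row with its source,
    and map empty or 'NA' cells to None."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        clean = {k: (None if v in ("", "NA") else v) for k, v in row.items()}
        clean["source"] = source
        rows.append(clean)
    return rows


def merge_and_deduplicate(*tables):
    """Concatenate tables in order and keep the first record seen
    for each combination of KEY_FIELDS."""
    seen = set()
    merged = []
    for table in tables:
        for rec in table:
            key = tuple(rec.get(field) for field in KEY_FIELDS)
            if key not in seen:
                seen.add(key)
                merged.append(rec)
    return merged


# Toy rows invented for illustration; they are not taken from the dataset.
gbif_csv = """scientific_name,latitude,longitude,event_date
Mexacanthina lugubris,32.5,-117.1,2020-05-01
Mexacanthina lugubris,31.8,-116.6,NA
"""
obis_csv = """scientific_name,latitude,longitude,event_date
Mexacanthina lugubris,32.5,-117.1,2020-05-01
Mexacanthina lugubris,30.0,-115.8,2019-07-12
"""

records = merge_and_deduplicate(
    read_records(gbif_csv, "GBIF"),
    read_records(obis_csv, "OBIS"),
)
print(len(records))  # 3: the record shared by both sources is kept once
```

Exact string matching on the key fields is the simplest possible duplicate rule. A real workflow would likely need to normalise coordinates and dates before comparing them, which is the kind of detail the package README should be consulted for.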
Created:
2026-01-23



