EcoCleanR: Enhancing data quality of biogeographic ranges with application for marine invertebrates

Source: DataONE. Updated 2026-01-22; indexed 2026-02-07.
Download link:
https://search.dataone.org/view/sha256:f86e098fc55fb169dbf8875dc98045b3ca7c8cfaabce41779428766230a72464
Description:
Published distribution data, while invaluable for understanding species' biogeography, often suffer from limitations such as dated and static representations of ranges, a bias toward latitudinal information, and a lack of resolution in sampling frequency and variation in abundance throughout a species' distribution. Extensive open-source biodiversity data now allow us to construct biogeographic ranges with more modern observations, which can be useful in conservation, evolution, and ecological studies. However, data quality remains a persistent challenge, hampering data reliability and usability.

We introduce EcoCleanR, an R package that integrates existing tools with new functionalities to address data integration and quality assessment through a systematic, step-by-step approach for marine occurrence data. This package enhances the process of identifying and resolving common issues in biodiversity data, including taxonomy and georeferencing errors. It provides: (a) example scripts to ...

# EcoCleanR: Enhancing data quality of biogeographic ranges with application for marine invertebrates

Dataset DOI: [10.5061/dryad.fj6q57488](https://doi.org/10.5061/dryad.fj6q57488)

## Description of the data and file structure

We have provided sample data files extracted in April 2025 to demonstrate a case study on the species *Mexacanthina lugubris*, a marine mollusc that inhabits rocky intertidal environments along the Pacific coast. The data sources used for this case study include GBIF, OBIS, iDigBio, and a local database (InvertEbase). The number of occurrence records may vary depending on when the data are extracted.

We created a merged dataset and removed all duplicate records using the EcoCleanR functions `ec_db_merge()` and `ec_rm_duplicate()`, resulting in an output file named **`ecodata.csv`**. All subsequent analyses can be performed using this file, with reproducible results as demonstrated in the package README.

Note: Cells containing `NA` indicate that no value was ...
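
The merge-then-deduplicate step described above can be illustrated with a small language-neutral sketch. This is not EcoCleanR's implementation and does not reproduce the behaviour or signatures of `ec_db_merge()` or `ec_rm_duplicate()`. The column names, the `NA` fill for missing fields, and the duplicate key (name, rounded coordinates, date) are all assumptions made for illustration, and the real `ecodata.csv` schema may differ.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, str]

# Hypothetical shared columns; the real ecodata.csv schema may differ.
COLUMNS = ["source", "scientific_name", "latitude", "longitude", "event_date"]


def merge_sources(sources: Dict[str, Iterable[Record]]) -> List[Record]:
    """Stack records from several sources into one list with shared columns."""
    merged = []
    for name, records in sources.items():
        for rec in records:
            # Missing fields are filled with "NA", mirroring the note above.
            row = {col: rec.get(col, "NA") for col in COLUMNS}
            row["source"] = name
            merged.append(row)
    return merged


def _coord(value: str, decimals: int) -> Optional[float]:
    """Parse a coordinate and round it; return None if it is not numeric."""
    try:
        return round(float(value), decimals)
    except ValueError:
        return None


def remove_duplicates(records: List[Record], decimals: int = 4) -> List[Record]:
    """Keep the first record seen for each assumed duplicate key."""
    seen = set()
    unique = []
    for rec in records:
        key = (
            rec["scientific_name"].strip().lower(),
            _coord(rec["latitude"], decimals),
            _coord(rec["longitude"], decimals),
            rec["event_date"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

Rounding coordinates before comparison is one possible design choice: it treats `32.5` and `32.50000` as the same location, at the cost of collapsing records that differ only beyond the chosen precision.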
Created:
2026-01-23