
Influence of different data cleaning solutions of point-occurrence records on downstream macroecological diversity models

Source: DataONE. Updated: 2022-07-14. Indexed: 2025-05-10.
Download link:
https://search.dataone.org/view/sha256:4d982b0ee6ef69bc03cdc16111693093e010822a8efc51f36772ae1869bb3c96
Resource description:
Digital point-occurrence records from the Global Biodiversity Information Facility (GBIF) and other data providers enable a wide range of research in macroecology and biogeography. However, data errors may hamper immediate use. Manual data cleaning is time-consuming and often infeasible, given that the databases may contain thousands or millions of records. Automated data cleaning pipelines are therefore of high importance. This study examined the extent to which cleaned data from six pipelines using data cleaning tools (e.g., the GBIF web application, different R packages) affect downstream species distribution models. In addition, we assessed how the pipeline data differ from expert data. From 13,889 North American Ephedra observations in GBIF, the pipelines removed 31.7% to 62.7% of the records as false positives, invalid coordinates, or duplicates, leading to data sets that included between 9,484 (GBIF application) and 5,196 records (manual-guided filtering). The expert data consisted of 703 thoroug...
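As a rough illustration of the kind of automated filtering described above, the following Python sketch drops records with invalid coordinates and exact duplicates, then reports the share of records removed. It is not one of the six pipelines from the study. The record layout, the function names (`has_valid_coordinates`, `clean_records`, `removed_percent`), the treatment of (0, 0) as invalid, and the four-decimal rounding used to detect duplicates are all assumptions made for this example. Detecting false positives requires expert or taxonomic knowledge and is not attempted here.

```python
import math

# Hypothetical record layout: (species, latitude, longitude).
Record = tuple[str, float, float]


def has_valid_coordinates(lat: float, lon: float) -> bool:
    """Return True if the coordinate pair is plausible (assumed rules)."""
    if math.isnan(lat) or math.isnan(lon):
        return False
    # Assumption: (0, 0) is treated as a placeholder, not a real locality.
    if lat == 0.0 and lon == 0.0:
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def clean_records(records: list[Record]) -> list[Record]:
    """Drop invalid coordinates and duplicates, keeping first occurrences."""
    seen: set[tuple[str, float, float]] = set()
    kept: list[Record] = []
    for species, lat, lon in records:
        if not has_valid_coordinates(lat, lon):
            continue
        # Assumption: records equal after rounding to 4 decimals are duplicates.
        key = (species, round(lat, 4), round(lon, 4))
        if key in seen:
            continue
        seen.add(key)
        kept.append((species, lat, lon))
    return kept


def removed_percent(n_before: int, n_after: int) -> float:
    """Percentage of records removed by cleaning."""
    return 100.0 * (n_before - n_after) / n_before


if __name__ == "__main__":
    sample = [
        ("Ephedra viridis", 36.1, -112.1),
        ("Ephedra viridis", 36.1, -112.1),   # duplicate
        ("Ephedra nevadensis", 0.0, 0.0),    # placeholder coordinates
        ("Ephedra trifurca", 95.0, -110.0),  # latitude out of range
    ]
    cleaned = clean_records(sample)
    print(len(cleaned), removed_percent(len(sample), len(cleaned)))
```

Applied to the figures in the description, `removed_percent(13889, 9484)` gives roughly 31.7, which matches the lower bound reported for the GBIF application.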
Created:
2025-04-28