Harmonized Tree Species Occurrence Points for Europe
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/4061816
Description:
This data set is a harmonized collection of existing data from GBIF, the EU-Forest project and the LUCAS survey. It contains about 3 million observations, each supplemented with variables (e.g. location accuracy, land cover type, canopy height) that allow records to be filtered for specific needs.
The .rds file was created from an sf object in R. The .csv file contains the records as a table, with easting and northing in the coordinate reference system ETRS89 / LAEA Europe (EPSG:3035).
The code is publicly available on GitLab.
Variables:
id = unique point identifier
easting = x coordinate
northing = y coordinate
country = ISO country code
species = Latin species name
genus = genus name
scientific_name = long species name
gbif_taxon_key = taxon key from GBIF
gbif_genus_key = genus key from GBIF
taxon_rank = species or genus
year = year of observation
accessed_through = database through which data was accessed (GBIF, LUCAS, EU-Forest)
dataset_info = data set name (individual sub-data-set)
citation = DOI citation of the individual data set
license = distribution license
location_accuracy = spatial accuracy of observation (meters)
flag_location_issue = known location issues present
flag_date_issue = known date issues present
eoo = extent of occurrence: the concept of natural geographical range used for the EU-Forest data set (Mauri et al., 2017), applied to all other data points (1 = point inside species range; 0 = point outside; NA = EOO polygon not available for this species)
dbh = diameter at breast height (only recorded for observations from the EU-Forest data set)
lc1 = LUCAS land cover type 1 (only recorded for observations from LUCAS data)
lc2 = LUCAS land cover type 2 (only recorded for observations from LUCAS data)
landmask_country = land mask overlay 30 meters (NA = not on land)
corine = CORINE 2018 land cover type (extracted from the 100 meter raster data set)
nightlights = light pollution observed by VIIRS (proxy for remoteness / distance to human structures)
canopy_height = canopy height derived from GEDI waveform LiDAR point data
natura_2000 = Natura 2000 site code (if a point falls inside a protected area (GIS-layer) this variable contains the site identification code; all sites can be explored on an interactive map)
freq_location = number of points with identical location (in some cases one location has multiple observations that differ in species and/or year, which may cause difficulties in certain modeling tasks)
geometry = point geometry in ETRS89 / LAEA Europe
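The variables above can be combined into a simple filter. The following is a minimal sketch in Python, not the authors' workflow: it assumes the .csv uses the column names listed above and encodes missing values as "NA" (both should be checked against the actual file), and it uses a tiny inline sample with invented values in place of the real download. The thresholds are arbitrary illustrations.

```python
import csv
import io

# Tiny inline sample standing in for the real .csv. The values are invented
# for illustration, and "NA" as the missing-value marker is an assumption.
SAMPLE_CSV = """id,easting,northing,species,year,location_accuracy,eoo,freq_location
1,4321000,3210000,Fagus sylvatica,2005,30,1,1
2,4000000,2800000,Picea abies,NA,10,1,2
3,4000000,2800000,Quercus robur,1998,5000,0,2
4,3900000,3100000,Abies alba,2012,NA,NA,1
"""

NUMERIC_FIELDS = (
    "easting", "northing", "year", "location_accuracy", "eoo", "freq_location",
)


def parse_row(row):
    """Map 'NA' and empty strings to None and numeric columns to float."""
    parsed = {}
    for key, value in row.items():
        if value in ("NA", ""):
            parsed[key] = None
        elif key in NUMERIC_FIELDS:
            parsed[key] = float(value)
        else:
            parsed[key] = value
    return parsed


def load_records(handle):
    """Read all rows from an open CSV file handle into a list of dicts."""
    return [parse_row(row) for row in csv.DictReader(handle)]


def filter_records(records, max_accuracy_m=100.0, require_inside_range=True):
    """Keep points with a known accuracy <= max_accuracy_m and, optionally, eoo == 1."""
    kept = []
    for rec in records:
        accuracy = rec["location_accuracy"]
        if accuracy is None or accuracy > max_accuracy_m:
            continue
        if require_inside_range and rec["eoo"] != 1:
            continue
        kept.append(rec)
    return kept


records = load_records(io.StringIO(SAMPLE_CSV))
selected = filter_records(records)
print([rec["id"] for rec in selected])
```

For the real file, the `io.StringIO(SAMPLE_CSV)` handle would be replaced by `open(path, newline="", encoding="utf-8")`. Note that this sketch drops records with an unknown location accuracy; whether that is appropriate depends on the application.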
See this detailed documentation for more insights into each variable.
If you would like to know more about the creation of this data set, see:
the R-Markdown documenting the process (GitLab repository)
the talk at OpenGeoHub Summer School 2020 (YouTube)
Some advice: This data set is a puzzle with pieces from many different sources. Take some time to explore it before including it in your work. Use summary statistics to see which variables have NAs and how many. Choose your filtering criteria wisely. For example, some points with the highest location accuracy have no recorded year of observation. You would exclude these if "year > 1990" were your criterion.
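The advice above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the published code: the column names follow the variable list, missing values are assumed to be written as "NA", years are assumed to be integers, and the inline sample rows are invented. The helper names `count_missing` and `filter_by_year` are hypothetical.

```python
import csv
import io
from collections import Counter

# Invented sample rows; the real .csv has many more columns and records.
SAMPLE_CSV = """id,species,year,location_accuracy
1,Fagus sylvatica,2005,30
2,Picea abies,NA,10
3,Quercus robur,1985,5000
4,Abies alba,2012,NA
"""


def count_missing(rows, missing="NA"):
    """Count, per column, how many rows carry the missing-value marker."""
    counts = Counter()
    for row in rows:
        for key, value in row.items():
            if value == missing:
                counts[key] += 1
    return counts


def filter_by_year(rows, min_year, keep_missing_year, missing="NA"):
    """Keep rows with year > min_year; rows without a year are kept only on request."""
    kept = []
    for row in rows:
        if row["year"] == missing:
            if keep_missing_year:
                kept.append(row)
        elif int(row["year"]) > min_year:
            kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing_counts = count_missing(rows)
strict = filter_by_year(rows, 1990, keep_missing_year=False)
lenient = filter_by_year(rows, 1990, keep_missing_year=True)

print(dict(missing_counts))
print([r["id"] for r in strict], [r["id"] for r in lenient])
```

In the sample, record 2 has the best location accuracy but no year, so the strict filter silently discards it while the lenient one keeps it. Making that choice an explicit parameter is one way to avoid losing such points by accident.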
This work has received funding from the European Union's Innovation and Networks Executive Agency (INEA) under Grant Agreement Connecting Europe Facility (CEF) Telecom project 2018-EU-IA-0095 (https://ec.europa.eu/inea/en/connecting-europe-facility/cef-telecom/2018-eu-ia-0095).
Created:
2021-10-21



