A global consistent database of plankton and detritus from in situ imaging by the Underwater Vision Profiler 5
Source: DataCite Commons. Updated: 2025-10-25. Indexed: 2026-05-05.
Download link:
https://www.seanoe.org/data/00964/107583/
Resource description:
Dataset summary
Plankton and detritus are essential components of the Earth’s oceans, influencing biogeochemical cycles and carbon sequestration. Climate change impacts their composition and marine ecosystems as a whole. To improve our understanding of these changes, standardized observation methods and integrated global datasets are needed to enhance the accuracy of ecological and climate models. Here, we present a global dataset of plankton and detritus obtained with two versions of the Underwater Vision Profiler 5 (UVP5). This release contains the images classified into 33 homogenized categories, as well as their associated metadata, totalling 3,114 profiles and ca. 8 million objects acquired between 2008 and 2018 at global scale. The geographical distribution of the dataset is unbalanced: the Equatorial region (30° S - 30° N) is the most represented, followed by the high latitudes of the Northern Hemisphere and lastly the high latitudes of the Southern Hemisphere. Detritus is the most abundant category in terms of concentration (90%) and biovolume (95%), although its classification into different morphotypes is still not well established. Copepoda was the most abundant taxon within the plankton, followed by Trichodesmium colonies. The two versions of the UVP5 (SD and HD) have different imagers, resulting in different effective size ranges for analysing plankton and detritus from the images (HD: objects >600 µm; SD: objects >1 mm); morphological properties (grey levels, etc.) show similar patterns, although their ranges may differ. A large number of images of plankton and detritus will be collected by the UVP5 in the future, and the public availability of this dataset will allow it to be used as a training set for machine learning and to be improved by the scientific community. This will reduce uncertainty by classifying previously unclassified objects and expanding the classification categories, ultimately enhancing biodiversity quantification.
Data tables
The data set is organised according to:
- samples : Underwater Vision Profiler 5 profiles, taken at a given point in space and time.
- objects : individual UVP images, taken at a given depth along each profile, on which various morphological features were measured and that were then classified taxonomically in EcoTaxa.
Samples and objects have unique identifiers; `sample_id` links the different tables of the data set together. All files are tab-separated values, UTF-8 encoded and gzip compressed.
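As a minimal sketch of how the tables could be read and linked, the Python snippet below uses only the standard library to parse gzip-compressed, tab-separated text and attach sample metadata to each object through `sample_id`. The helper names (`read_tsv_gz`, `link_objects_to_samples`) and the tiny in-memory tables are hypothetical stand-ins, not part of the dataset.

```python
import csv
import gzip
import io


def read_tsv_gz(raw_bytes):
    """Parse gzip-compressed, UTF-8, tab-separated bytes into a list of dicts."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def link_objects_to_samples(objects, samples):
    """Attach the matching sample row to each object, keyed on sample_id."""
    by_id = {s["sample_id"]: s for s in samples}
    return [{**obj, **by_id[obj["sample_id"]]} for obj in objects]


# Made-up rows standing in for samples.tsv.gz and objects.tsv.gz.
samples_raw = gzip.compress(
    "sample_id\tlat\tlon\tuvp_model\n1\t-10.5\t45.0\tSD\n".encode("utf-8")
)
objects_raw = gzip.compress(
    "object_id\tsample_id\tdepth\tgroup\n100\t1\t12.3\tCopepoda\n".encode("utf-8")
)

linked = link_objects_to_samples(read_tsv_gz(objects_raw), read_tsv_gz(samples_raw))
print(linked[0]["group"], linked[0]["uvp_model"])
```

With the real files on disk, `gzip.open(path, "rt", encoding="utf-8", newline="")` can be passed directly to `csv.DictReader` instead of decompressing bytes in memory. All values are read as strings and need converting to numbers before any arithmetic.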
samples.tsv.gz
- sample_id unique sample identifier
- sample_name original sample identifier
- project EcoPart project title
- lat, lon location [decimal degrees]
- datetime date and time of start of profile [ISO 8601: YYYY-MM-DDTHH:MM:SSZ]
- pixel_size size of one pixel [mm]
- uvp_model version of the UVP: SD: standard definition, ZD: zoomed, HD: high definition
samples_volume.tsv.gz
Along a profile, the UVP takes many images, each of a fixed volume. The profiles are cut into 5 m depth bins in which the number of images taken is recorded and hence the imaged volume is known. This is necessary to compute concentrations.
- sample_id unique sample identifier
- mid_depth_bin middle of the depth bin (2.5 = from 0 to 5 m depth) [m]
- water_volume_imaged volume imaged = number of full images × unit volume [L]
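The role of the imaged volume can be illustrated with a short Python sketch: objects are counted per (`sample_id`, `mid_depth_bin`) and the count is divided by `water_volume_imaged`. The function name `concentration_per_bin` and the example rows are hypothetical, and the values are assumed to have already been converted from text to numbers.

```python
from collections import Counter


def concentration_per_bin(objects, volumes):
    """Return objects per litre for every (sample_id, mid_depth_bin) with a known volume.

    objects: iterable of dicts with keys sample_id and mid_depth_bin
    volumes: iterable of dicts with keys sample_id, mid_depth_bin, water_volume_imaged [L]
    """
    counts = Counter((o["sample_id"], o["mid_depth_bin"]) for o in objects)
    result = {}
    for v in volumes:
        key = (v["sample_id"], v["mid_depth_bin"])
        if v["water_volume_imaged"] > 0:
            # Bins that were imaged but contain no object get a concentration of 0.
            result[key] = counts[key] / v["water_volume_imaged"]
    return result


# Made-up example: three objects in one profile, three imaged depth bins.
example_objects = [
    {"sample_id": 1, "mid_depth_bin": 2.5},
    {"sample_id": 1, "mid_depth_bin": 2.5},
    {"sample_id": 1, "mid_depth_bin": 7.5},
]
example_volumes = [
    {"sample_id": 1, "mid_depth_bin": 2.5, "water_volume_imaged": 10.0},
    {"sample_id": 1, "mid_depth_bin": 7.5, "water_volume_imaged": 20.0},
    {"sample_id": 1, "mid_depth_bin": 12.5, "water_volume_imaged": 20.0},
]
conc = concentration_per_bin(example_objects, example_volumes)
print(conc)
```

Iterating over the volume table rather than the object table keeps imaged bins that happen to contain no objects, which would otherwise be silently dropped instead of reported as zero.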
objects.tsv.gz
- object_id unique object identifier
- object_name original object identifier
- sample_id unique sample identifier
- depth depth at which the image was taken [m]
- mid_depth_bin corresponding depth bin [m]; to match with samples_volume
- taxon original taxonomic name as in EcoTaxa; is not consistent across projects
- lineage taxonomic lineage corresponding to that name
- classif_author unique, anonymised identifier of the user who performed this classification
- classif_datetime date and time at which the classification was performed
- group broader taxonomic name, for which the identification is consistent over the whole dataset
- group_lineage taxonomic lineage corresponding to this broader group
- area_mm2 measurements on the object, in real world units (i.e. comparable across the whole dataset) …
- major_mm
- area measurements on the object, in [pixels] and therefore not directly comparable among the different UVP models and units
- mean …
- skeleton_area
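To illustrate why pixel-based measurements are not comparable across UVP models while the `_mm` columns are, here is a small Python sketch converting an area in pixels to mm² with the `pixel_size` of the sample. It assumes square pixels whose side length is `pixel_size`. Whether the dataset's `area_mm2` column was derived exactly this way is not stated here, and both the function name `area_px_to_mm2` and the numbers are made up for illustration.

```python
import math


def area_px_to_mm2(area_px, pixel_size_mm):
    """Convert an area in pixels to mm², assuming square pixels of side pixel_size_mm."""
    return area_px * pixel_size_mm ** 2


# Made-up values: the same physical area seen by two hypothetical imagers.
coarse = area_px_to_mm2(area_px=100, pixel_size_mm=0.2)  # 100 px at 0.2 mm per pixel
fine = area_px_to_mm2(area_px=400, pixel_size_mm=0.1)    # 400 px at 0.1 mm per pixel
print(math.isclose(coarse, fine))
```

The pixel counts differ by a factor of four even though the physical area is the same, which is why the real-world-unit columns are the ones to use when pooling SD, ZD and HD profiles.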
properties_per_bin.tsv.gz
The information above makes it possible to compute concentrations, biovolumes, and average grey levels within a given depth bin. The code to do so is in `summarise_objects_properties.R`.
- sample_id unique sample identifier
- depth_range depth range over which the concentration and biovolume are computed, written (start,end] in [m], where `(` means the bound is excluded and `]` means it is included
- group broad taxonomic group
- concentration concentration [ind/L]
- biovolume biovolume [mm3/L]
- avg_grey average grey level of particles [no unit; 0 is black, 255 is white]
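The half-open interval convention of `depth_range` can be sketched in Python as follows. The exact text format of the column is an assumption (here a string such as `(0,5]`), and `parse_depth_range` and `in_depth_range` are hypothetical helpers rather than functions from `summarise_objects_properties.R`.

```python
import re


def parse_depth_range(text):
    """Parse a string like "(0,5]" into (start, end) in metres."""
    match = re.fullmatch(r"\(\s*([0-9.]+)\s*,\s*([0-9.]+)\s*\]", text.strip())
    if match is None:
        raise ValueError(f"unexpected depth_range format: {text!r}")
    return float(match.group(1)), float(match.group(2))


def in_depth_range(depth, start, end):
    """True if depth lies in (start, end]: start excluded, end included."""
    return start < depth <= end


start, end = parse_depth_range("(0,5]")
print(in_depth_range(5.0, start, end))  # the upper bound belongs to the bin
print(in_depth_range(0.0, start, end))  # the lower bound does not
```

Under this convention an object at exactly 5 m belongs to the (0,5] bin and not to (5,10], so each depth is assigned to a single bin.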
ODV_biovolumes.txt, ODV_concentrations.txt, ODV_grey_levels.txt
This is the same information as above, formatted in a way that Ocean Data View https://odv.awi.de can read. In ODV, go to Import > ODV Spreadsheet and accept all default choices.
Images
The images are provided in a separate, much larger, zip file. They are stored with the format `sample_id/object_id.jpg`, where `sample_id` and `object_id` are the integer identifiers used in the data tables above.
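The path of an image can be rebuilt from a row of the objects table, as in this brief Python sketch. The function name `image_path` is hypothetical, and the sketch assumes the archive has no additional top-level folder in front of `sample_id`, which should be checked against the actual zip file.

```python
from pathlib import PurePosixPath


def image_path(sample_id, object_id):
    """Build the relative path sample_id/object_id.jpg from integer identifiers."""
    return PurePosixPath(str(int(sample_id))) / f"{int(object_id)}.jpg"


# Identifiers read from a TSV file arrive as strings; int() normalises them.
path = image_path("12", "3456")
print(path)
```

Because member names inside zip archives use forward slashes, `str(path)` can be compared against the names returned by `zipfile.ZipFile.namelist()` to read a single image without extracting the whole archive.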
Provider:
SEANOE
Created:
2025-07-25



