UVP6Net : plankton images captured with the UVP6

Name: UVP6Net : plankton images captured with the UVP6
Creator: SEANOE
License: no description available

Indexed from doi.org on 2025-01-15

Download link:

https://doi.org/10.17882/101948


Description:

Plankton was imaged with the UVP6 in contrasting oceanic regions. The full images were processed by the UVP6 firmware, and the regions of interest (ROIs) around each individual object were recorded. A set of associated features was measured on the objects (see Picheral et al. 2021, doi:10.1002/lom3.10475, for more information). All objects were classified by a limited number of operators into 110 different classes using the web application EcoTaxa (http://ecotaxa.obs-vlfr.fr). The dataset corresponds to the 634 459 objects that have an area greater than 73 square pixels (equivalent spherical diameter of 9.8 pixels, corresponding to the default size limit of 620 µm in the UVP6 configuration). The different files provide information about the features of the objects, their taxonomic identification, and the raw images. For the purpose of training machine learning classifiers, the images in each class were split into training, validation, and test sets, with proportions of 70%, 15%, and 15%.

An additional folder is provided, which includes the subset of images used to train the single embedded classification model of the UVP6 actually deployed on the NKE CTS5 floats (10.5281/zenodo.10694203). These images correspond to UVP6Net objects filtered to retain only those with a size of 79 square pixels, to fit with the 645 µm class from EcoPart, resulting in a total of 595,595 objects. The taxonomic identification was also made coarser (from 110 classes to 20) to ensure adequate performance of the classification model on power-constrained hardware. Images in this subset display objects as shades of grey/white on a black background.

The first folder (uvp6net.tar) contains:

taxa.csv.gz
Table of the classification of each object in the dataset, with columns:
  objid: unique object identifier in EcoTaxa (integer number)
  taxon_level1: taxonomic name corresponding to the level 1 classification
  lineage_level1: taxonomic lineage corresponding to the level 1 classification
  taxon_level2: name of the taxon corresponding to the level 2 classification
  plankton: whether the object is plankton or not (boolean)
  set: subset to which the image is assigned (train: training, val: validation, or test)
  img_path: local path of the image corresponding to the taxon (of level 1), named according to the object id

features_native.csv.gz
Table of morphological features computed by ZooCAM. All features are computed on the object only, not the background. All area/length measures are in pixels. All grey levels are encoded in 8 bits (0 = black, 255 = white). Columns:
  objid: unique object identifier in EcoTaxa (integer number)
  area: surface area of the object (integer number)
  mean: average grey value within the object; sum of the grey values of all pixels in the object divided by the number of pixels
  stddev: standard deviation of the grey values used to generate the mean grey value
  mode: modal grey value within the object
  min: minimum grey value within the object (0 = black)
  max: maximum grey value within the object (255 = white)
  perim: length of the outside boundary of the object
  width: width of the smallest rectangle enclosing the object
  height: height of the smallest rectangle enclosing the object
  major: primary axis of the best-fitting ellipse for the object
  minor: secondary axis of the best-fitting ellipse for the object
  angle: angle between the primary axis and a line parallel to the x-axis of the image
  circ: circularity = (4 * pi * area) / perim^2; a value of 1 indicates a perfect circle, a value approaching 0 indicates an increasingly elongated polygon
  feret: maximum Feret diameter, i.e., the longest distance between any two points along the object boundary
  intden: integrated density, i.e., the sum of the grey values of the pixels in the object (= area * mean)
  median: median grey value within the object
  skew: skewness of the histogram of grey level values
  kurt: kurtosis of the histogram of grey level values
  %area: percentage of the object's surface area that is made up of holes, defined as the background grey level
  area_exc: surface area of the object excluding holes, in square pixels (= area * (1 - (%area / 100)))
  fractal: fractal dimension of the object boundary (Berube and Jebrak 1999)
  skelarea: surface area of the skeleton, in pixels; in a binary image, the skeleton is obtained by repeatedly removing pixels from the edges of objects until they are reduced to the width of a single pixel
  slope: slope of the normalized cumulative histogram of grey levels
  histcum1, 2, 3: grey level value at the first, second, and third quartile of the normalized cumulative histogram of grey levels
  nb1, nb2, nb3: number of remaining objects in the image after thresholding on levels histcum1, 2, and 3
  symetrieh: bilateral horizontal symmetry index
  symetriev: bilateral vertical symmetry index
  symetriehc: symmetry of the largest remaining object in relation to the horizontal axis
  [...]
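
To illustrate how the two tables relate, here is a minimal Python sketch that reads both gzip-compressed tables, joins them on objid, and groups the objects by their set column. It uses only the standard library. It assumes the files are comma-separated UTF-8 text with a header row, which the description does not state; the function names are hypothetical helpers, not part of the dataset or of EcoTaxa.

```python
import csv
import gzip
from collections import defaultdict


def read_gzipped_csv(path):
    """Read a gzip-compressed CSV file into a list of dicts (one per row)."""
    # Assumption: comma-separated UTF-8 text with a header row.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def join_on_objid(taxa_rows, feature_rows):
    """Attach the feature columns to each taxa row that shares its objid."""
    features_by_id = {row["objid"]: row for row in feature_rows}
    joined = []
    for row in taxa_rows:
        features = features_by_id.get(row["objid"])
        if features is not None:
            # On a column-name clash the taxa value is kept
            # (only objid is expected to appear in both tables).
            joined.append({**features, **row})
    return joined


def split_by_set(rows):
    """Group rows by the value of their 'set' column (train, val, or test)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["set"]].append(row)
    return dict(groups)
```

Calling read_gzipped_csv on taxa.csv.gz and features_native.csv.gz (at whatever path they occupy once uvp6net.tar is extracted), then join_on_objid and split_by_set, would yield one list of records per subset. All values are returned as strings, so numeric columns need converting before use.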

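
Three of the columns in features_native.csv.gz (circ, intden, area_exc) are defined by explicit formulas in the column descriptions. The short Python sketch below restates those formulas as functions, which can serve as a sanity check when recomputing features from area, perim, mean, and %area. The function and parameter names are illustrative and are not taken from ZooCAM or EcoTaxa.

```python
import math


def circularity(area, perim):
    """circ = (4 * pi * area) / perim^2; equals 1 for a perfect circle."""
    return (4.0 * math.pi * area) / (perim ** 2)


def integrated_density(area, mean):
    """intden = area * mean, the sum of grey values over the object's pixels."""
    return area * mean


def area_exc(area, pct_area):
    """area_exc = area * (1 - %area / 100), as given in the column description."""
    return area * (1.0 - pct_area / 100.0)
```

For example, a square of side 10 pixels (area 100, perimeter 40) has a circularity of pi/4, about 0.785, whereas an ideal circle gives exactly 1. Perimeters measured on pixelated outlines will generally not reproduce the ideal value.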

Provider:

SEANOE