
GobhiSet: Dataset of raw, manually and automatically annotated RGB images across phenology of Brassica oleracea var. Botrytis

Mendeley Data | Updated: 2024-04-18 | Indexed: 2024-06-26
Download link:
https://data.mendeley.com/datasets/dcjjcwc5dh
Description:
This dataset comprises raw aerial RGB images and orthomosaics of Brassica oleracea crops, captured with a DJI Phantom 4 on several dates. The images are uniformly distributed across the crop area and have been annotated both manually and automatically. The dataset is intended to support crop detection, segmentation, and growth modelling using this manually and automatically annotated pixel information. The public repository contains 244 raw RGB images acquired on six dates in October and November 2020 at an experimental farm in Portici, Italy. Each raw image measures 5472×3648 pixels. The first three image sets, captured on 8 October, 21 October, and 29 October 2020, were manually annotated with bounding boxes in the Visual Geometry Group Image Annotator (VIA), and the annotations were exported in the Common Objects in Context (COCO) segmentation format. The manual labels for these three dates, including region and shape attributes, are provided in JavaScript Object Notation (JSON). These three dates served as training data for the annotator, to improve the automated labelling across all six dates: 8 October, 21 October, 29 October, 11 November, 18 November, and 25 November. In terms of the quantitative assessment criteria, the annotation of 21 October 2020 was identified as the benchmark. Seven classes, designated Row 1 through Row 7, are used to label the crops within each row. Additional attributes, such as the individual crop ID and the repetitiveness of individual crop specimens, are given in the Comma Separated Values (CSV) version of the manual annotation. To generate the automated annotations, a Grounding DINO + Segment Anything Model (SAM) framework was trained on the manual annotations, and the resulting labels were archived in Pascal Visual Object Classes (PASCAL VOC) format.
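As a rough illustration of how PASCAL VOC labels like those described above might be read, the sketch below parses a VOC-style XML string with Python's standard library. The sample annotation (file name, coordinates, and the exact spelling of the class labels) is invented for the example and is an assumption, not content taken from the repository; `read_voc_boxes` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOC-style annotation. The file name, labels and coordinates
# are made up for illustration and are NOT taken from the dataset.
SAMPLE_VOC = """<annotation>
  <filename>example.JPG</filename>
  <size><width>5472</width><height>3648</height><depth>3</depth></size>
  <object>
    <name>Row 1</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>400</xmax><ymax>520</ymax></bndbox>
  </object>
  <object>
    <name>Row 2</name>
    <bndbox><xmin>900</xmin><ymin>210</ymin><xmax>1150</xmax><ymax>500</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples from VOC XML."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # VOC coordinates are sometimes written as floats, so parse defensively.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, *coords))
    return boxes


boxes = read_voc_boxes(SAMPLE_VOC)
for label, xmin, ymin, xmax, ymax in boxes:
    # Print each label with its bounding-box area in pixels.
    print(label, (xmax - xmin) * (ymax - ymin))
```

For a real file, the XML text would be read from disk first; the tag layout assumed here is the conventional VOC one and should be checked against the files actually shipped in the repository.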
The segmentation masks derived from the automated annotations are provided as Portable Network Graphics (PNG) images for three scenarios: aerial images, individual crop rows, and orthomosaics. These automated annotations support growth monitoring across the crop phenology, evaluated on binary masks of individually identified crop rows captured on different dates. The code used for these processes is available to ensure transparency and reproducibility. Beyond supplying annotation information, the dataset can also help refine machine learning models.
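One simple way to picture growth monitoring from per-row binary masks is to track the fraction of foreground pixels on each date. The sketch below does this on toy 4×4 masks using only the standard library. The masks, the coverage metric, and the helper names `coverage` and `growth_series` are assumptions for illustration; the dataset's own code may evaluate growth differently.

```python
def coverage(mask):
    """Fraction of non-zero pixels in a 2D binary mask given as rows of ints."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    foreground = sum(1 for row in mask for pixel in row if pixel)
    return foreground / total


def growth_series(masks_by_date):
    """Return (date, coverage, change since previous date) in date order."""
    series = []
    previous = None
    for date in sorted(masks_by_date):  # ISO dates sort chronologically
        current = coverage(masks_by_date[date])
        delta = None if previous is None else current - previous
        series.append((date, current, delta))
        previous = current
    return series


# Toy masks standing in for one crop row on two acquisition dates.
# Real masks would be decoded from the PNG files with an imaging library.
toy_masks = {
    "2020-10-08": [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    "2020-10-21": [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
}

series = growth_series(toy_masks)
for date, cov, delta in series:
    print(date, cov, delta)
```

Pixel coverage is only a proxy for canopy area; comparing dates this way assumes the masks for a given row share the same extent and ground resolution.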

Created:
2024-02-10