Ground-based Pixel-level Cloud Dataset (GPCD)

Name: Ground-based Pixel-level Cloud Dataset (GPCD)
Creator: IEEE Dataport
License: No description available

Indexed from ieee-dataport.org on 2025-01-22

Download link:

https://ieee-dataport.org/documents/ground-based-pixel-level-cloud-dataset-gpcd


Description:

Using PVIFS-02 whole-sky imagers, we collected 500,000 independent cloud images between 2021 and 2023 in two Chinese cities, one in the south and one in the north. The images from southern China are clear, with well-defined cloud edges. In contrast, the images from northern China appear relatively blurred, because the region is frequently affected by sand and dust. This blurring makes cloud detection and classification more challenging.

To train and test algorithms for pixel-level cloud detection and classification, we selected 714 images containing various types of clouds, normalized them to a resolution of 1024 × 1024, and manually annotated them at the pixel level. The annotated dataset, referred to as the Ground-based Pixel-level Cloud Dataset (GPCD), is shown in Fig. \ref{fig3}. It contains two types of annotation files: one with a binarized cloud-sky segmentation, and one that classifies clouds into eight categories at the pixel level according to the cloud genera definitions of the World Meteorological Organization (WMO), the approximate appearance of clouds, and sky conditions in practice. Table \ref{tab1} presents the cloud genera and descriptions for each category in GPCD.

To further enhance the robustness and applicability of GPCD, the dataset was subdivided into two region-specific subsets according to the geographical origin of the images: GPCD-North, comprising data collected in northern China, and GPCD-South, comprising data collected in southern China. The subsets account for regional atmospheric differences that may influence cloud morphology and behavior. By analyzing the subsets separately, we aim to evaluate the performance and robustness of cloud detection and classification algorithms under regional variation. This approach allows a more nuanced understanding of algorithm performance across diverse climatic conditions and also facilitates testing of transfer learning capabilities.
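
The relationship between the two annotation types described above can be illustrated with a minimal Python sketch. It is not taken from the dataset's own tooling. It assumes, hypothetically, that a category mask is an integer array in which 0 marks clear sky and 1 to 8 mark the eight cloud categories; the actual GPCD file format and label encoding are not stated in this description. The helper names `to_binary_mask`, `cloud_fraction`, and `iou` are illustrative only.

```python
import numpy as np

# Assumption (not from the dataset description): label 0 is clear sky,
# labels 1..8 are the eight cloud categories.
SKY_LABEL = 0


def to_binary_mask(class_mask: np.ndarray) -> np.ndarray:
    """Collapse a per-pixel category mask into a cloud (1) / sky (0) mask."""
    return (class_mask != SKY_LABEL).astype(np.uint8)


def cloud_fraction(binary_mask: np.ndarray) -> float:
    """Fraction of pixels labeled as cloud in a binary mask."""
    return float(binary_mask.mean())


def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union of two binary cloud masks."""
    pred_b = pred.astype(bool)
    target_b = target.astype(bool)
    union = np.logical_or(pred_b, target_b).sum()
    if union == 0:
        # Both masks are entirely sky, so they agree perfectly.
        return 1.0
    return float(np.logical_and(pred_b, target_b).sum() / union)


if __name__ == "__main__":
    # Synthetic 1024 x 1024 category mask: top half is cloud category 3.
    class_mask = np.zeros((1024, 1024), dtype=np.uint8)
    class_mask[:512, :] = 3

    target = to_binary_mask(class_mask)
    print("cloud fraction:", cloud_fraction(target))

    # A hypothetical prediction covering only the top quarter of the image.
    pred = np.zeros_like(target)
    pred[:256, :] = 1
    print("IoU:", iou(pred, target))
```

Under the same assumptions, computing `iou` separately on images from GPCD-North and GPCD-South would be one simple way to compare detection quality across the two regions.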


Provider:

IEEE Dataport
