Ready-To-Train AI4Arctic Sea Ice Challenge Dataset
Source: DataCite Commons | Updated: 2025-04-01 | Indexed: 2025-04-10
Download link:
https://data.dtu.dk/articles/dataset/Ready-To-Train_AI4Arctic_Sea_Ice_Challenge_Dataset/21316608/3
Description:
The AI4Arctic Sea Ice Challenge Datasets were produced for the AI4EO sea ice competition initiated by the European Space Agency (ESA) ɸ-lab. The purpose of the competition is to develop deep learning models that automatically produce sea ice charts, including sea ice concentration, stage of development and floe size (form) information. The training datasets contain Sentinel-1 active microwave Synthetic Aperture Radar (SAR) data and corresponding passive MicroWave Radiometer (MWR) data from the AMSR2 satellite sensor. SAR data has a high spatial resolution but is ambiguous between open water and sea ice, whereas MWR data has good contrast between open water and ice. However, the coarse resolution of the AMSR2 MWR observations introduces its own obstacles, e.g. land spill-over, which can lead to erroneous sea ice predictions along coastlines adjacent to open water. The label data in the challenge datasets are ice charts produced for the safety of navigation by the Greenland ice service at the Danish Meteorological Institute (DMI) and by the Canadian Ice Service (CIS). The challenge datasets also contain auxiliary data such as the distance to land and numerical weather prediction model data. The scenes cover the period from January 8, 2018 to December 21, 2021.

Two versions of the dataset exist, the 'raw' and the 'ready-to-train' versions, each with a corresponding test dataset. The datasets each consist of the same 512 training and 20 test (without label data) scenes. The 'ready-to-train' version has been further prepared for model training: the data have been downsampled from 40 m to 80 m pixel spacing and standard scaled, the ice charts have been converted (to sea ice concentration, stage of development and floe size), NaN values have been removed, masks have been aligned, etc. This is the Ready-To-Train version. Further details are described in the common manual published together with the datasets, "AI4Arctic_challenge-dataset-manual".
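The preparation steps named above (downsampling from 40 m to 80 m pixel spacing, standard scaling, NaN removal) can be illustrated with a minimal NumPy sketch. This is not the dataset's actual processing code: the function names, the 2x2 block-mean resampling, the fill value of 0 and the order of the steps are all assumptions made for illustration. The dataset manual and the get-started toolkit are the authoritative sources.

```python
import numpy as np


def downsample_2x(img: np.ndarray) -> np.ndarray:
    """Halve the resolution (e.g. 40 m -> 80 m pixel spacing) by averaging 2x2 blocks.

    Assumption: block averaging; the real dataset may use another resampling method.
    """
    h, w = img.shape
    h2, w2 = h - h % 2, w - w % 2  # crop a trailing odd row/column, if any
    blocks = img[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2)
    return blocks.mean(axis=(1, 3))


def standard_scale(img: np.ndarray, mean: float, std: float) -> np.ndarray:
    """Standard scaling with a precomputed (hypothetical) global mean and std."""
    return (img - mean) / std


def fill_nan(img: np.ndarray, fill_value: float = 0.0):
    """Replace NaN pixels with fill_value and return (filled image, validity mask)."""
    valid = ~np.isnan(img)
    return np.where(valid, img, fill_value), valid


if __name__ == "__main__":
    # Toy 4x4 "scene" with one missing pixel; the values are made up.
    scene = np.arange(16, dtype=np.float64).reshape(4, 4)
    scene[0, 0] = np.nan
    coarse = downsample_2x(scene)  # any block containing a NaN becomes NaN
    scaled = standard_scale(coarse, mean=8.0, std=4.0)
    filled, valid = fill_nan(scaled)
    print(filled)
    print(valid)
```

Returning the validity mask alongside the filled array keeps the information about which pixels were originally missing, so they can be excluded from a training loss.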
Code with a get-started toolkit for the 'ready-to-train' dataset: https://github.com/astokholm/AI4ArcticSeaIceChallenge
A quick video overview of the challenge is available at: https://youtu.be/iuXIeLPyKfg
This item is part of the Collection https://doi.org/10.11583/DTU.c.6244065

Version 2 has 20 additional scenes and has been reprocessed to accommodate the updated mean and standard deviation (std). Furthermore, the SOD and FLOE variables have been slightly altered from version 1, as the dominant ice code threshold was incorrectly set to 70% and 50%, respectively, instead of the 65% specified in the dataset manual. Version 3 removes a scene with a faulty ice chart.
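To make the dominant ice code threshold concrete, here is a small hypothetical sketch of how a 65% threshold could pick a single class for an ice chart polygon. The function name, the dictionary input and the choice to measure each class as a share of the polygon's total partial concentration are assumptions for illustration only. The exact rule used to build the SOD and FLOE variables is defined in the dataset manual.

```python
from __future__ import annotations


def dominant_class(partials: dict[str, float], threshold: float = 0.65) -> str | None:
    """Return the class whose share of the total meets the threshold, else None.

    partials maps a (hypothetical) class name to its partial concentration.
    Assumption: the share is computed relative to the sum of all partial
    concentrations in the polygon.
    """
    total = sum(partials.values())
    if total <= 0:
        return None  # no ice reported, so no dominant class
    name, value = max(partials.items(), key=lambda kv: kv[1])
    return name if value / total >= threshold else None


if __name__ == "__main__":
    # 70% of the ice is one class: dominant under a 65% threshold.
    print(dominant_class({"first_year": 70.0, "young": 30.0}))
    # 60/40 split: no class reaches 65%, so the result is None.
    print(dominant_class({"first_year": 60.0, "young": 40.0}))
```

Under this sketch, a 60/40 polygon has no dominant class at the 65% threshold but would have had one at a 50% threshold, which shows how a change in threshold between dataset versions can change the labels.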
Provider:
Technical University of Denmark
Created:
2023-02-14



