Raw AI4Arctic Sea Ice Challenge Dataset
Source: DataCite Commons. Updated 2025-05-01. Indexed 2025-04-10.
Download link:
https://data.dtu.dk/articles/dataset/Raw_AI4Arctic_Sea_Ice_Challenge_Dataset/21284967/3
Description:
The AI4Arctic Sea Ice Challenge Datasets were produced for the AI4EO sea ice competition initiated by the European Space Agency (ESA) ɸ-lab. The purpose of the competition is to develop deep learning models that automatically produce sea ice charts, including sea ice concentration, stage of development and floe size (form) information.

The training datasets contain Sentinel-1 active microwave Synthetic Aperture Radar (SAR) data and corresponding passive MicroWave Radiometer (MWR) data from the AMSR2 satellite sensor. SAR data has a high spatial resolution but is ambiguous between open water and sea ice, whereas MWR data has good contrast between open water and ice. However, the coarse resolution of the AMSR2 MWR observations introduces its own obstacles, e.g. land spill-over, which can lead to erroneous sea ice predictions along coastlines adjacent to open water.

The label data in the challenge datasets are ice charts produced for the safety of navigation by the Greenland ice service at the Danish Meteorological Institute (DMI) and by the Canadian Ice Service (CIS). The challenge datasets also contain auxiliary data such as the distance to land and numerical weather prediction model data. The scenes span January 8, 2018 to December 21, 2021.

Two versions of the dataset exist, the 'raw' and the 'ready-to-train' versions, each with a corresponding test dataset. Both versions consist of the same 512 training scenes and 20 test scenes (without label data). The 'ready-to-train' version has been further prepared for model training: downsampling from 40 m to 80 m pixel spacing, standard scaling, conversion of the ice charts (sea ice concentration, stage of development and floe size), removal of NaN values, mask alignment, etc. This is the 'raw' version.

The netCDF files are bundled in groups of about 25. Each bundle is named after the Sentinel-1 satellite that acquired the SAR images, followed by the acquisition times of the first and last files in the bundle, i.e. S1(A/B)_FirstDate_LastDate.zip. Further details are described in the common manual published together with the datasets, "AI4Arctic_challenge-dataset-manual".

Code with a get-started toolkit for the 'ready-to-train' dataset: https://github.com/astokholm/AI4ArcticSeaIceChallenge
A short video overview of the challenge: https://youtu.be/iuXIeLPyKfg
This item is part of the Collection https://doi.org/10.11583/DTU.c.6244065

Version 2 updated two zip files that contained four corrupted netCDF files: S1A_20190419T203541_20190823T114541.zip and S1B_20191028T132359_20200714T184241.zip. In addition, 20 more scenes were added in "added_v2.zip".

Version 3 fixes an error with a duplicate zip file starting with "S1A_20190419T203541_", adds the "S1A_20181018T121002_20190415T211043.zip" file, and removes a scene with a faulty ice chart, resulting in an updated "S1A_20191201T205227_20200619T122818.zip" file.
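The bundle naming scheme S1(A/B)_FirstDate_LastDate.zip can be parsed mechanically, for example to find which bundle covers a given acquisition time. The following is a minimal Python sketch, not part of the official toolkit. It assumes the timestamp layout YYYYMMDDTHHMMSS, which is inferred from the file names quoted in the version notes; the names `BundleName` and `parse_bundle_name` are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple


class BundleName(NamedTuple):
    """Fields encoded in a bundle file name (hypothetical helper type)."""
    satellite: str   # "S1A" or "S1B"
    first: datetime  # acquisition time of the first file in the bundle
    last: datetime   # acquisition time of the last file in the bundle


# Assumed layout: S1A_YYYYMMDDTHHMMSS_YYYYMMDDTHHMMSS.zip (inferred from examples).
_PATTERN = re.compile(r"^(S1[AB])_(\d{8}T\d{6})_(\d{8}T\d{6})\.zip$")
_TIME_FORMAT = "%Y%m%dT%H%M%S"


def parse_bundle_name(filename: str) -> BundleName:
    """Split a bundle file name into satellite and first/last acquisition time."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a S1(A/B)_FirstDate_LastDate.zip name: {filename!r}")
    satellite, first, last = match.groups()
    return BundleName(
        satellite,
        datetime.strptime(first, _TIME_FORMAT),
        datetime.strptime(last, _TIME_FORMAT),
    )
```

Files that do not follow the scheme, such as "added_v2.zip", raise a ValueError under this sketch and would need separate handling.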
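Since each bundle is a zip archive of netCDF files, its contents can be inspected without extracting it, which is a cheap way to check a download. The sketch below uses only the Python standard library. It assumes the scene files carry the conventional ".nc" extension, which the description does not state; `list_netcdf_members` is a hypothetical helper name.

```python
import zipfile
from typing import BinaryIO, List, Union


def list_netcdf_members(bundle: Union[str, BinaryIO]) -> List[str]:
    """Return the sorted names of netCDF files inside a bundle zip.

    `bundle` is a path or a binary file-like object. The ".nc" suffix
    is an assumption about how the scene files are named.
    """
    with zipfile.ZipFile(bundle) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".nc")
        )
```

Comparing the length of the returned list with the expected group size of about 25 gives a quick sanity check before loading any scene.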
Provider:
Technical University of Denmark
Created:
2023-02-14
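Collected summary
Dataset introduction

Background and challenges
Background overview
The dataset was designed for the AI4EO sea ice competition and contains Sentinel-1 SAR and AMSR2 MWR data, together with ice chart label data produced by DMI and CIS. It comes in a raw version and a ready-to-train version and is intended to support the development of deep learning models that automatically generate sea ice charts.
The summary above was collected and generated by 遇见数据集.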



