North Australia Sentinel 2 Satellite Composite Imagery - 15th percentile true colour (NESP MaC 3.17, AIMS)
Source: Research Data Australia (indexed 2024-12-14)
Download link:
https://researchdata.edu.au/north-australia-sentinel-317-aims/2921548
Description:
This dataset contains cloud free composite satellite images for the northern Australia region based on 10 m resolution Sentinel 2 imagery from 2015 – 2024. This image collection was created as part of the NESP MaC 3.17 project and is intended to allow mapping of the reef features in northern Australia. A new, improved version (version 2, published July 2024) has succeeded the draft version (published March 2024).
This collection contains composite imagery for 333 Sentinel 2 tiles around the northern coastline of Australia, including the Great Barrier Reef. This dataset uses a true colour contrast and colour enhancement style using the bands B2 (blue), B3 (green), and B4 (red). This is useful for interpreting shallow features, mapping the vegetation on cays, and identifying beach rock.
Changelog:
This dataset will be progressively improved and made available for download. These additions will be noted in this change log.
2024-07-22 - Version 2 composites using an improved contrast enhancement and a noise prediction algorithm to include only low noise images in the composite (Git tag: "composites_v2")
2024-03-07 - Initial release draft composites using 15th percentile (Git tag: "composites_v1")
Methods:
The satellite image composites were created by combining multiple Sentinel 2 images using the Google Earth Engine. The core algorithm was:
1. For each Sentinel 2 tile, filter the "COPERNICUS/S2_HARMONIZED" image collection by
- tile ID
- maximum cloud cover 20%
- date between '2015-06-27' and '2024-05-31'
- asset_size > 100000000 (remove small fragments of tiles)
Note: A maximum cloud cover of 20% was used to improve the processing times. In most cases this filtering does not have an effect on the final composite as images with higher cloud coverage mostly result in higher noise levels and are not used in the final composite.
2. Split images by "SENSING_ORBIT_NUMBER" (see "Using SENSING_ORBIT_NUMBER for a more balanced composite" for more information).
3. For each SENSING_ORBIT_NUMBER collection filter out all noise-adding images:
3.1 Calculate the noise level for each image in the collection (see "Image noise level calculation" for more information) and sort the collection by noise level.
3.2 Remove all images with a very high noise index (>15).
3.3 Calculate a baseline noise level using a minimum number of images (min_images_in_collection=30). This minimum number of images is needed to ensure a smooth composite where cloud "holes" in one image are covered by other images.
3.4 Iterate over the remaining images (images not used in the baseline noise level calculation) and check whether adding each image to the composite increases or reduces the noise. If it reduces the noise, add it to the composite. If it increases the noise, stop iterating over images.
4. Combine SENSING_ORBIT_NUMBER collections into one image collection.
5. Remove sun-glint (true colour only) and apply atmospheric correction on each image (see "Sun-glint removal and atmospheric correction" for more information).
6. Duplicate image collection to first create a composite image without cloud masking and using the 30th percentile of the images in the collection (i.e. for each pixel the 30th percentile value of all images is used).
7. Apply cloud masking to all images in the original image collection (see "Cloud Masking" for more information) and create a composite by using the 30th percentile of the images in the collection (i.e. for each pixel the 30th percentile value of all images is used).
8. Combine the two composite images (no cloud mask composite and cloud mask composite). This solves the problem of some coral cays and islands being misinterpreted as clouds and therefore creating holes in the composite image. These holes are "plugged" with the underlying composite without cloud masking. (Lawrey et al. 2022)
9. The final composite was exported as a cloud optimized 8 bit GeoTIFF.
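Steps 6 to 8 above can be illustrated with a minimal pure-Python sketch. It is not the project's Earth Engine code: images are tiny single-band grids, masked pixels are represented as `None`, and the nearest-rank percentile definition is an assumption (Earth Engine's reducer may interpolate differently). All function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence

Pixel = Optional[int]        # None marks a pixel removed by the cloud mask
Image = List[List[Pixel]]    # single-band image as rows of pixel values


def percentile(values: Sequence[int], pct: float) -> int:
    """Nearest-rank percentile of a non-empty sequence (assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def apply_cloud_mask(image: Image, cloud: List[List[bool]]) -> Image:
    """Replace every pixel flagged as cloud with None."""
    return [[None if is_cloud else value for value, is_cloud in zip(row, mask_row)]
            for row, mask_row in zip(image, cloud)]


def composite(images: Sequence[Image], pct: float = 30) -> Image:
    """Per-pixel percentile over the image stack, ignoring masked pixels."""
    rows, cols = len(images[0]), len(images[0][0])
    result: Image = []
    for r in range(rows):
        out_row: List[Pixel] = []
        for c in range(cols):
            valid = [img[r][c] for img in images if img[r][c] is not None]
            out_row.append(percentile(valid, pct) if valid else None)
        result.append(out_row)
    return result


def plug_holes(masked: Image, unmasked: Image) -> Image:
    """Fill holes in the cloud-masked composite from the unmasked composite."""
    return [[m if m is not None else u for m, u in zip(m_row, u_row)]
            for m_row, u_row in zip(masked, unmasked)]


# Three 1 x 3 images: open water, a bright cay always flagged as cloud, reef.
raw = [[[20, 180, 30]], [[22, 185, 31]], [[200, 190, 32]]]
clouds = [[[False, True, False]], [[False, True, False]], [[True, True, False]]]

unmasked_composite = composite(raw)
masked_composite = composite([apply_cloud_mask(i, m) for i, m in zip(raw, clouds)])
final = plug_holes(masked_composite, unmasked_composite)
```

In this toy example the cay pixel is masked in every image, so the cloud-masked composite has a hole there, and the value from the unmasked composite plugs it.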
Note: The following tiles were generated with no "maximum cloud cover" as they did not have enough images to create a composite with the standard settings:
- 46LGM
- 46LGN
- 46LHM
- 50KKD
- 50KPG
- 53LMH
- 53LMJ
- 53LNH
- 53LPH
- 53LPJ
- 54LVP
- 57JVH
- 59JKJ
[...] time then the resulting image would be cloud free. (Lawrey et al. 2022)
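The noise-based image selection in step 3 of the core algorithm can be sketched as a greedy loop. The thresholds come from the text, but the composite noise model (`predicted_noise`) is purely hypothetical: the description does not say how the noise of a composite is predicted, so a simple "mean noise divided by the square root of the image count" stand-in is used here. For brevity the function works on noise levels rather than on images.

```python
import math
from typing import Callable, List, Sequence

MAX_NOISE_INDEX = 15            # from the text: drop images above this index
MIN_IMAGES_IN_COLLECTION = 30   # from the text: size of the baseline set


def predicted_noise(noise_levels: Sequence[float]) -> float:
    """HYPOTHETICAL model: mean image noise reduced by sqrt(number of images)."""
    n = len(noise_levels)
    return sum(noise_levels) / n / math.sqrt(n)


def select_images(
    noise_levels: Sequence[float],
    min_images: int = MIN_IMAGES_IN_COLLECTION,
    max_noise: float = MAX_NOISE_INDEX,
    predict: Callable[[Sequence[float]], float] = predicted_noise,
) -> List[float]:
    """Return the noise levels of the images kept for the composite."""
    # Steps 3.1 and 3.2: sort by noise level and drop very noisy images.
    candidates = sorted(level for level in noise_levels if level <= max_noise)
    # Step 3.3: the lowest-noise images form the baseline.
    selected = candidates[:min_images]
    if not selected:
        return []
    current = predict(selected)
    # Step 3.4: add images while the predicted noise keeps falling.
    for level in candidates[min_images:]:
        trial = predict(selected + [level])
        if trial >= current:
            break  # the first image that adds noise ends the iteration
        selected.append(level)
        current = trial
    return selected


# Toy run with a baseline of 3 images instead of 30.
kept = select_images([9, 1, 20, 1.2, 1, 1], min_images=3)
```

Because the candidates are sorted by noise level, stopping at the first image that increases the predicted noise also excludes every noisier image after it.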
Image noise level calculation:
The noise level for each image in this dataset is calculated to ensure high-quality composites by minimizing the inclusion of noisy images. This process begins by creating a water mask using the Normalized Difference Water Index (NDWI) derived from the NIR and Green bands. High reflectance areas in the NIR and SWIR bands, indicative of sun-glint, are identified and masked by the water mask to focus on water areas affected by sun-glint. The proportion of high sun-glint pixels within these water areas is calculated and amplified to compute a noise index. If no water pixels are detected, a high noise index value is assigned.
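As a rough illustration of the calculation described above, the sketch below computes a noise index from per-pixel reflectances. The NDWI formula (Green - NIR) / (Green + NIR) is the standard definition; everything else is an assumption made for the example: the water threshold, the glint threshold, requiring both NIR and SWIR to be bright, the amplification factor, and the value assigned when no water is found.

```python
from typing import Sequence, Tuple

# All constants below are HYPOTHETICAL illustration values, not the project's.
NDWI_WATER_THRESHOLD = 0.0   # NDWI above this counts as water
GLINT_THRESHOLD = 0.1        # NIR and SWIR reflectance above this counts as glint
AMPLIFICATION = 100.0        # scales the glint proportion into an index
NO_WATER_NOISE = 100.0       # "high" index assigned when no water is detected


def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index from the Green and NIR bands."""
    total = green + nir
    return (green - nir) / total if total else 0.0


def noise_index(pixels: Sequence[Tuple[float, float, float]]) -> float:
    """Noise index for an image given (green, nir, swir) reflectance per pixel."""
    water = [p for p in pixels if ndwi(p[0], p[1]) > NDWI_WATER_THRESHOLD]
    if not water:
        return NO_WATER_NOISE
    glint = [p for p in water if p[1] > GLINT_THRESHOLD and p[2] > GLINT_THRESHOLD]
    return len(glint) / len(water) * AMPLIFICATION


sample = [
    (0.20, 0.12, 0.11),  # water with sun glint
    (0.10, 0.02, 0.01),  # calm water
    (0.10, 0.03, 0.02),  # calm water
    (0.08, 0.02, 0.01),  # calm water
    (0.10, 0.30, 0.25),  # land, excluded by the water mask
]
index = noise_index(sample)
```

Here one of the four water pixels is glinty, so the proportion of 0.25 is amplified to an index of 25.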
Sun glint removal and atmospheric correction:
Sun glint was removed from the images using the infrared B8 band to estimate the reflection off the water from the sun glint. B8 penetrates water less than 0.5 m and so in water areas it only detects reflections off the surface of the water. The sun glint detected by B8 correlates very highly with the sun glint experienced by the visible channels (B2, B3 and B4) and so the sun glint in these channels can be removed by subtracting B8 from these channels.
Eric Lawrey developed this algorithm by fine tuning the value of the scaling between the B8 channel and each individual visible channel (B2, B3 and B4) so that the maximum level of sun glint would be removed. This work was based on a representative set of images, trying to determine a set of values that represent a good compromise across different water surface conditions.
This algorithm is an adjustment of the algorithm already used in Lawrey et al. 2022.
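The subtraction described above can be sketched for a single pixel as follows. The per-band scaling factors are hypothetical placeholders, not the tuned values from the project, and clamping negative results to zero is an added assumption. The atmospheric correction step is not shown.

```python
from typing import Dict

# HYPOTHETICAL scaling between B8 and each visible band (not the tuned values).
GLINT_SCALE: Dict[str, float] = {"B2": 0.85, "B3": 0.90, "B4": 0.95}


def remove_sun_glint(pixel: Dict[str, float],
                     scale: Dict[str, float] = GLINT_SCALE) -> Dict[str, float]:
    """Subtract the scaled B8 (near infrared) value from each visible band."""
    b8 = pixel["B8"]
    # Clamp at zero so strong glint cannot produce negative reflectance.
    return {band: max(0.0, pixel[band] - factor * b8)
            for band, factor in scale.items()}


corrected = remove_sun_glint({"B2": 0.10, "B3": 0.09, "B4": 0.06, "B8": 0.04})
```

Over deep water B8 is close to zero when there is no glint, so the correction leaves the visible bands almost unchanged in that case.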
Cloud Masking:
Each image was processed to mask out clouds and their shadows before creating the composite image.
The cloud masking uses the COPERNICUS/S2_CLOUD_PROBABILITY dataset developed by SentinelHub (Google, n.d.; Zupanc, 2017). The mask includes the cloud areas, plus a mask to remove cloud shadows. The cloud shadows were estimated by projecting the cloud mask in the direction opposite the angle to the sun. The shadow distance was estimated in two parts.
A low cloud mask was created based on the assumption that small clouds have a small shadow distance. These were detected using a 35% cloud probability threshold. These were projected over 400 m, followed by a 150 m buffer to expand the final mask.
A high cloud mask was created to cover longer shadows created by taller, larger clouds. These clouds were detected based on an 80% cloud probability threshold, followed by an erosion and dilation of 300 m to remove small clouds. These were then projected over a 1.5 km distance followed by a 300 m buffer.
The parameters for the cloud masking (probability threshold, projection distance and buffer radius) were determined through trial and error on a small number of scenes. As such there are probably significant potential improvements that could be made to this algorithm.
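The shadow projection can be illustrated with a small geometric sketch. It assumes a north-up raster with 10 m pixels and a sun azimuth measured in degrees clockwise from north; the function names are hypothetical, and the erosion, dilation and buffer steps are left out.

```python
import math
from typing import Set, Tuple

PIXEL_SIZE_M = 10.0  # Sentinel 2 resolution stated in the text


def shadow_offset(sun_azimuth_deg: float, distance_m: float,
                  pixel_size_m: float = PIXEL_SIZE_M) -> Tuple[int, int]:
    """Pixel shift (columns east, rows south) of a shadow cast away from the sun."""
    direction = math.radians((sun_azimuth_deg + 180.0) % 360.0)
    east_m = distance_m * math.sin(direction)
    north_m = distance_m * math.cos(direction)
    # Rows increase southward in a north-up raster, hence the sign flip.
    return round(east_m / pixel_size_m), round(-north_m / pixel_size_m)


def project_mask(cloud: Set[Tuple[int, int]],
                 offset: Tuple[int, int]) -> Set[Tuple[int, int]]:
    """Sweep cloud pixels (col, row) along the offset so the whole path is masked."""
    d_col, d_row = offset
    steps = max(abs(d_col), abs(d_row), 1)
    shadow: Set[Tuple[int, int]] = set()
    for i in range(steps + 1):
        shift_c = round(d_col * i / steps)
        shift_r = round(d_row * i / steps)
        shadow.update((c + shift_c, r + shift_r) for c, r in cloud)
    return shadow


low_offset = shadow_offset(90.0, 400.0)    # low clouds, sun due east
high_offset = shadow_offset(0.0, 1500.0)   # high clouds, sun due north
low_shadow = project_mask({(100, 50)}, low_offset)
```

With the sun due east, the 400 m low-cloud projection shifts the mask 40 pixels west; with the sun due north, the 1.5 km high-cloud projection shifts it 150 pixels south.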
Erosion, dilation and buffer operations were performed at a lower resolution than the native satellite imagery to improve computational speed. The resolution was adjusted so that each operation ran at approximately a 4 pixel resolution, which made the cloud mask significantly coarser than the 10 m Sentinel imagery. This resolution was chosen as a trade-off between the coarseness of the mask and the processing time for these operations. Even with 4-pixel filter resolutions, these operations still accounted for over 90% of the total processing, with each image taking approximately 10 min to compute on the Google Earth Engine. (Lawrey et al. 2022)
Format:
GeoTIFF - LZW compressed, 8 bit channels, 0 as NoData, imagery as values 1 - 255. Internal tiling and overviews. Average size: 12500 x 11300 pixels and 300 MB per image.
The images in this dataset are all named using a naming convention. An example file name is `AU_AIMS_MARB-S2-comp_p15_TrueColour_51KTV_v2_2015-2024.tif`. The name is made up from:
- Dataset name (`AU_AIMS_MARB-S2-comp`)
- Algorithm descriptor (`p15` for 15th percentile)
- Colour and contrast enhancement applied (`TrueColour`)
- Sentinel 2 tile (`51KTV` in the example file name; `54LZP` is another example)
- Version (`v2`)
- Date range (`2015-2024` for version 2)
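The naming convention can be parsed with a regular expression, as in the sketch below. The pattern is inferred from the single example file name, so details such as the allowed characters in the style field or the `.tif` extension being mandatory are assumptions; `parse_filename` is a hypothetical helper.

```python
import re
from typing import Dict

# Pattern inferred from the example file name; field formats are assumptions.
FILENAME_PATTERN = re.compile(
    r"^(?P<dataset>AU_AIMS_MARB-S2-comp)"
    r"_(?P<algorithm>p\d+)"
    r"_(?P<style>[A-Za-z0-9]+)"
    r"_(?P<tile>\d{2}[A-Z]{3})"
    r"_(?P<version>v\d+)"
    r"_(?P<start_year>\d{4})-(?P<end_year>\d{4})\.tif$"
)


def parse_filename(name: str) -> Dict[str, str]:
    """Split a composite file name into its named components."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"File name does not follow the convention: {name}")
    return match.groupdict()


parts = parse_filename("AU_AIMS_MARB-S2-comp_p15_TrueColour_51KTV_v2_2015-2024.tif")
```

A parser like this makes it easy to group downloaded files by tile or to check that only `v2` composites are being used.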
References:
Google (n.d.) Sentinel-2: Cloud Probability. Earth Engine Data Catalog. Accessed 10 April 2021 from https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_CLOUD_PROBABILITY
Zupanc, A., (2017) Improving Cloud Detection with Machine Learning. Medium. Accessed 10 April 2021 from https://medium.com/sentinel-hub/improving-cloud-detection-with-machine-learning-c09dc5d7cf13
Lawrey, E., & Hammerton, M. (2022). Coral Sea features satellite imagery and raw depth contours (Sentinel 2 and Landsat 8) 2015 – 2021 (AIMS) [Data set]. eAtlas. https://doi.org/10.26274/NH77-ZW79
Data Location:
This dataset is filed in the eAtlas enduring data repository at: data\custodian\2023-2026-NESP-MaC-3\3.17_Northern-Aus-reef-mapping
The source code is available on [GitHub](https://github.com/eatlas/AU_NESP-MaC-3-17_AIMS_S2-comp).
Provider:
Australian Ocean Data Network



