SDO 2H Machine Learning Dataset
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10465436
Description:
This dataset provides a compact, machine-learning-ready set of SDO EUV and HMI medium-resolution (1024x1024 pixels) images, for a total of 56,664 samples from May 14, 2010, to April 18, 2023, with a temporal cadence of 2 hours.
EUV images are provided at the following wavelengths: 1600 Å, 304 Å, 211 Å, 193 Å, 171 Å and 94 Å. They are processed from the level 1.5 AIA-synoptic dataset (http://jsoc.stanford.edu/data/aia/synoptic/) and are successively:
corrected for instrument degradation
normalised by exposure time
log-transformed (x->log(1+x)), symmetrically on positive and negative values
saturated at the 99.9th percentile of the pixel values of the dataset up to 2020*, for each channel
linearly scaled between 0 and 255, converted to 8-bit integers and compressed as JPEGs
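The last three steps of this list can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the sketch assumes that the saturation value is expressed in the original units (DN) and that pixel value 0 corresponds to zero intensity, so negative inputs are clipped. The description does not state either point explicitly for the EUV channels.

```python
import numpy as np

def symlog(x):
    """Symmetric log transform: sign(x) * log(1 + |x|)."""
    x = np.asarray(x, dtype=np.float64)
    return np.sign(x) * np.log1p(np.abs(x))

def euv_to_uint8(x, saturation):
    """Map corrected intensities to 0..255.

    Assumed convention (not confirmed by the source):
    0 DN -> 0 and `saturation` DN -> 255.
    """
    y = symlog(x) / np.log1p(saturation)  # equals 1.0 at the saturation value
    y = np.clip(y, 0.0, 1.0)              # saturate; negatives clipped (assumption)
    return np.round(y * 255.0).astype(np.uint8)

# Synthetic intensities, using the 1600 channel saturation value of 9,360 DN.
sample = np.array([0.0, 10.0, 1000.0, 9360.0, 50000.0])
encoded = euv_to_uint8(sample, saturation=9360.0)
```

Values at or above the saturation level all map to 255, which is why intensities beyond the 99.9th percentile cannot be recovered from the compressed images.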
The HMI line-of-sight magnetograms (blos.zip) are retrieved from JSOC from the level 1.5 45-second line-of-sight series and are successively:
downscaled to 1024x1024 pixels
standardized to a resolution of 2.4 arcsec per pixel (equal to that of the EUV images)
aligned with the EUV images
log-transformed (x->log(1+x))
saturated at the 99.9th percentile of the pixel values of the dataset up to 2020*
linearly scaled between 0 and 255, with 127 representing original null values, and 0 and 255 representing the negative and positive saturation values respectively (approximately 4644 G before log-transformation)
converted to 8-bit integers and compressed as JPEGs
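As a hedged sketch of how this encoding could be inverted, the function below maps 8-bit magnetogram pixels back to an approximate field strength in gauss. It is not the authors' code: the function name is hypothetical, "null" is taken to mean zero field, and the exact quantisation is an assumption (a piecewise-linear map chosen so that the three stated anchor points, 0, 127 and 255, land on -4644 G, 0 G and +4644 G).

```python
import numpy as np

B_SAT = 4644.0  # saturation value in gauss, from the description

def blos_from_uint8(pixels, b_sat=B_SAT):
    """Approximate line-of-sight field (G) from 8-bit magnetogram pixels."""
    p = np.asarray(pixels, dtype=np.float64)
    # Piecewise-linear normalisation so that 0 -> -1, 127 -> 0, 255 -> +1
    # (assumed quantisation; the source only gives the three anchor points).
    y = np.where(p >= 127.0, (p - 127.0) / 128.0, (p - 127.0) / 127.0)
    # Invert the symmetric log transform sign(x) * log(1 + |x|).
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(b_sat))

field = blos_from_uint8(np.array([0, 127, 255], dtype=np.uint8))
```

Because of the logarithmic scaling, one grey level near 127 spans a fraction of a gauss while one grey level near 0 or 255 spans hundreds of gauss, and JPEG compression adds further error on top of that.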
Downscaled and cropped images (224x448 pixels) used in Francisco et al., 2023 are also provided in pcnn_images.zip.
An outlier study is also provided in anomalies.zip, from which '{wavelength}_anomalies_grades.csv' files can be used to exclude the dates where abnormal samples of a given type ('anomalies_grade_scale.txt') are identified.
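A possible way to apply such a file is sketched below. The layout of the CSV files is not described here, so the column names `date` and `grade`, the grade values, and the helper name are all hypothetical placeholders; check the actual files and 'anomalies_grade_scale.txt' before adapting this.

```python
import csv
import io

def dates_to_exclude(csv_file, excluded_grades, date_col="date", grade_col="grade"):
    """Return the set of dates whose anomaly grade is in `excluded_grades`.

    `date_col` and `grade_col` are assumed column names, not taken from the dataset.
    """
    reader = csv.DictReader(csv_file)
    return {row[date_col] for row in reader if row[grade_col] in excluded_grades}

# Tiny in-memory stand-in for a '{wavelength}_anomalies_grades.csv' file (made-up rows).
demo = io.StringIO(
    "date,grade\n"
    "2012-01-01T00:00,0\n"
    "2012-01-01T02:00,3\n"
    "2012-01-01T04:00,1\n"
)
bad = dates_to_exclude(demo, excluded_grades={"3"})

samples = ["2012-01-01T00:00", "2012-01-01T02:00", "2012-01-01T04:00"]
kept = [d for d in samples if d not in bad]
```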
*The percentile values are computed on the joint pixel distribution using all samples from 2010 to 2019-12 inclusive, so that the period starting from 2020-01 can be used as a completely independent test set.
The original exposure-normalised and degradation-corrected values can be retrieved using the saturation values provided below. Although the JPEG encoding results in the loss of small-scale information, the dataset processing preserves the physical intensity of the original inputs, so the provided compressed images can efficiently be used to estimate large- and medium-scale physical features of active regions.
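The date split implied by this footnote can be sketched as follows. The helper name is hypothetical, and the timestamps are assumed to be available as `datetime` objects (for example, parsed from file names, whose format is not described here).

```python
from datetime import datetime

TEST_START = datetime(2020, 1, 1)  # first instant of the held-out period

def split_by_date(timestamps):
    """Split timestamps into (train, test) around 2020-01-01."""
    train = [t for t in timestamps if t < TEST_START]
    test = [t for t in timestamps if t >= TEST_START]
    return train, test

stamps = [
    datetime(2019, 12, 31, 22),
    datetime(2020, 1, 1, 0),
    datetime(2023, 4, 18, 0),
]
train, test = split_by_date(stamps)
```

Keeping every sample from 2020-01 onward out of training matters because the saturation values were fitted only on data up to 2019-12, so the later period is untouched by any dataset-level statistic.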
Channel       Saturation value
HMI / BLOS    ±4644 G
1600          9,360 DN
304           44,488 DN
171           29,599 DN
193           81,139 DN
211           8,179 DN
94            6,099 DN
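Using the saturation values listed above, an approximate inverse of the EUV encoding might look like the sketch below. It is not the authors' code: the names are hypothetical, and it assumes that pixel value 0 corresponds to 0 DN and 255 to the channel's saturation value, which the description does not state explicitly.

```python
import numpy as np

# Saturation values in DN, copied from the values listed above.
EUV_SATURATION_DN = {
    "1600": 9360.0,
    "304": 44488.0,
    "171": 29599.0,
    "193": 81139.0,
    "211": 8179.0,
    "94": 6099.0,
}

def euv_from_uint8(pixels, channel):
    """Approximate corrected intensity (DN) from 8-bit pixels of one EUV channel."""
    # Assumed convention: pixel 0 -> 0 DN, pixel 255 -> saturation value.
    p = np.asarray(pixels, dtype=np.float64) / 255.0
    # Invert x -> log(1 + x) after undoing the linear 0..255 scaling.
    return np.expm1(p * np.log1p(EUV_SATURATION_DN[channel]))

approx = euv_from_uint8(np.array([0, 128, 255], dtype=np.uint8), "171")
```

The recovered values are approximate: quantisation to 8 bits and JPEG compression both lose information, and the loss is largest near the bright end of the logarithmic scale.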
Created: 2024-05-08



