SEN2VENµS, a dataset for the training of Sentinel-2 super-resolution algorithms
Download link:
https://zenodo.org/record/6514158
1 Description
SEN2VENµS is an open dataset for the super-resolution of Sentinel-2 images by leveraging simultaneous acquisitions with the VENµS satellite. The dataset is composed of 10m and 20m cloud-free surface reflectance patches from Sentinel-2, with their reference spatially-registered surface reflectance patches at 5 meters resolution acquired on the same day by the VENµS satellite. This dataset covers 29 locations with a total of 132 955 patches of 256x256 pixels at 5 meters resolution, and can be used for the training of super-resolution algorithms to bring spatial resolution of 8 of the Sentinel-2 bands down to 5 meters.
Changelog with respect to version 1.0.0 (https://zenodo.org/records/6514159)
All patches are now stored in individual GeoTIFF files with proper geo-referencing, regrouped in zip files per site and per category.
The dataset now includes 20 meter resolution SWIR bands B11 and B12 from Sentinel-2 (L2A from Theia). Note that there is no HR reference for those bands, since the VENµS sensor has no SWIR band.
2 Files organization
The dataset is composed of separate sub-datasets embedded in separate zip files, one for each site, as described in table 1. Note that there might be slight variations in the number of patches and the number of pairs with respect to version 1.0.0, due to an incorrect count of samples in the previous version (an empty tensor was still accounted for).
Table 1: Number of patches and pairs for each site, along with VENµS viewing zenith angle
| Site | Number of patches | Number of pairs | VENµS zenith angle (°) |
|----------|-------|----|-----------|
| FR-LQ1 | 4888 | 18 | 1.795402 |
| NARYN | 3813 | 24 | 5.010906 |
| FGMANAUS | 129 | 4 | 7.232127 |
| MAD-AMBO | 1442 | 18 | 14.788115 |
| ARM | 15859 | 39 | 15.160683 |
| BAMBENW2 | 9018 | 34 | 17.766533 |
| ES-IC3XG | 8822 | 34 | 18.807686 |
| ANJI | 2312 | 14 | 19.310494 |
| ATTO | 2258 | 9 | 22.048651 |
| ESGISB-3 | 6057 | 19 | 23.683871 |
| ESGISB-1 | 2891 | 12 | 24.561609 |
| FR-BIL | 7105 | 30 | 24.802892 |
| K34-AMAZ | 1384 | 20 | 24.982675 |
| ESGISB-2 | 3067 | 13 | 26.209776 |
| ALSACE | 2653 | 16 | 26.877071 |
| LERIDA-1 | 2281 | 5 | 28.524780 |
| ESTUAMAR | 911 | 12 | 28.871947 |
| SUDOUE-5 | 2176 | 20 | 29.170244 |
| KUDALIAR | 7269 | 20 | 29.180855 |
| SUDOUE-6 | 2435 | 14 | 29.192055 |
| SUDOUE-4 | 935 | 7 | 29.516127 |
| SUDOUE-3 | 5363 | 14 | 29.998115 |
| SO1 | 12018 | 36 | 30.255978 |
| SUDOUE-2 | 9700 | 27 | 31.295256 |
| ES-LTERA | 1701 | 19 | 31.971764 |
| FR-LAM | 7299 | 22 | 32.054056 |
| SO2 | 738 | 22 | 32.218481 |
| BENGA | 5857 | 28 | 32.587334 |
| JAM2018 | 2564 | 18 | 33.718953 |
Each site zip file contains a subfolder with the site name. This subfolder contains secondary zip files for each date, named after the pair id, which follows the convention {site_name}_{acquisition_date}_{mgrs_tile}. For each date, 5 zip files are available, as shown in table 2. Each zip file contains a subfolder {bands}/{resolution}/ in which one GeoTIFF file per patch is stored, with the following naming convention: {site_name}_{idx}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.tif. Pixel values are encoded as 16-bit signed integers and should be converted back to floating point surface reflectance by dividing every value by 10 000 upon reading.
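The naming convention and the reflectance scaling described above can be sketched in Python as follows. This is a minimal illustration rather than an official reader: the helper names parse_patch_name and to_reflectance are hypothetical, the example filename uses made-up index, date and tile values, and the parser assumes that none of the five trailing fields contains an underscore.

```python
from typing import NamedTuple


class PatchName(NamedTuple):
    """The six fields of the patch naming convention."""
    site_name: str
    idx: str
    acquisition_date: str
    mgrs_tile: str
    bands: str
    resolution: str


def parse_patch_name(filename: str) -> PatchName:
    """Split a patch filename (or path) into its naming-convention fields."""
    stem = filename.rsplit("/", 1)[-1]
    if stem.endswith(".tif"):
        stem = stem[: -len(".tif")]
    # Split from the right: the five trailing fields are assumed to be
    # underscore-free, so whatever remains on the left is the site name.
    parts = stem.rsplit("_", 5)
    if len(parts) != 6:
        raise ValueError(f"unexpected patch filename: {filename!r}")
    return PatchName(*parts)


def to_reflectance(raw, scale=10_000):
    """Convert stored integer values to floating point surface reflectance.

    Works on a plain number or on any array type that supports true
    division by a scalar (for instance a NumPy int16 array).
    """
    return raw / scale


# Hypothetical filename: the index, date and tile values are made up.
name = parse_patch_name("FR-LQ1_12_20190514_31TCJ_b2b3b4b8_05m.tif")
print(name.site_name, name.bands, name.resolution)  # FR-LQ1 b2b3b4b8 05m
print(to_reflectance(2500))  # 0.25
```

Applying to_reflectance right after reading a patch keeps the rest of a processing chain in reflectance units.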
Table 2: Naming convention for zip files associated to each date.
| File | Content |
|------|---------|
| {id}_05m_b2b3b4b8.zip | 5m patches (256×256 pix.) for S2 B2, B3, B4 and B8 (from VENµS) |
| {id}_10m_b2b3b4b8.zip | 10m patches (128×128 pix.) for S2 B2, B3, B4 and B8 (from Sentinel-2) |
| {id}_05m_b5b6b7b8a.zip | 5m patches (256×256 pix.) for S2 B5, B6, B7 and B8A (from VENµS) |
| {id}_20m_b5b6b7b8a.zip | 20m patches (64×64 pix.) for S2 B5, B6, B7 and B8A (from Sentinel-2) |
| {id}_20m_b11b12.zip | 20m patches (64×64 pix.) for S2 B11 and B12 (from Sentinel-2) |
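The patch sizes in table 2 imply that every patch covers the same ground footprint, and they fix the upscaling factor a super-resolution model has to achieve for each band group. The short sketch below works through that arithmetic; the names PATCH_GRIDS, footprint_m and upscaling_factor are hypothetical and only serve the illustration.

```python
# (pixels per side, pixel size in metres), taken from table 2.
PATCH_GRIDS = {
    "05m": (256, 5),
    "10m": (128, 10),
    "20m": (64, 20),
}


def footprint_m(resolution: str) -> int:
    """Side length of a patch on the ground, in metres."""
    pixels, pixel_size = PATCH_GRIDS[resolution]
    return pixels * pixel_size


def upscaling_factor(lr_resolution: str, hr_resolution: str = "05m") -> int:
    """Ratio between the low-resolution and high-resolution pixel sizes."""
    return PATCH_GRIDS[lr_resolution][1] // PATCH_GRIDS[hr_resolution][1]


for res in PATCH_GRIDS:
    print(res, footprint_m(res))  # each resolution gives 1280

print(upscaling_factor("10m"))  # 2 (B2, B3, B4, B8)
print(upscaling_factor("20m"))  # 4 (B5, B6, B7, B8A)
```

Since all three grids span 1280 m per side, a 10m or 20m patch and its 5m reference can be paired pixel block by pixel block without any cropping.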
Each file comes with a master index.csv file (CSV, Comma Separated Values), with one row for each pair sampled in the given site. Columns are named after the {bands}_{resolution} pattern, and contain the full path to the corresponding GeoTIFF within the corresponding zip file:
{site}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.zip/{bands}/{resolution}/{site}_{idx}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.tif
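As a rough sketch of how the index could be consumed, the snippet below reads rows with the standard csv module and splits each path into the zip archive and the GeoTIFF member inside it. The function names, the default comma delimiter and the sample row are assumptions made for illustration; the column names follow the {bands}_{resolution} pattern described above, and the actual delimiter should be checked against the real file.

```python
import csv
import io


def split_index_path(path: str) -> tuple[str, str]:
    """Split '<archive>.zip/<member>' into (archive path, member path)."""
    archive, sep, member = path.partition(".zip/")
    if not sep:
        raise ValueError(f"no zip archive component in {path!r}")
    return archive + ".zip", member


def iter_patch_pairs(index_file, lr_column, hr_column, delimiter=","):
    """Yield ((lr_zip, lr_tif), (hr_zip, hr_tif)) for each index row.

    The delimiter is an assumption; adjust it if the file differs.
    """
    reader = csv.DictReader(index_file, delimiter=delimiter)
    for row in reader:
        yield split_index_path(row[lr_column]), split_index_path(row[hr_column])


# Hypothetical index content: the idx, date and tile values are made up.
pair_id = "FR-LQ1_20190514_31TCJ"
lr = f"{pair_id}_b2b3b4b8_10m.zip/b2b3b4b8/10m/FR-LQ1_12_20190514_31TCJ_b2b3b4b8_10m.tif"
hr = f"{pair_id}_b2b3b4b8_05m.zip/b2b3b4b8/05m/FR-LQ1_12_20190514_31TCJ_b2b3b4b8_05m.tif"
sample = io.StringIO(f"b2b3b4b8_10m,b2b3b4b8_05m\n{lr},{hr}\n")

for (lr_zip, lr_tif), (hr_zip, hr_tif) in iter_patch_pairs(
    sample, "b2b3b4b8_10m", "b2b3b4b8_05m"
):
    print(lr_zip, "->", lr_tif)
    print(hr_zip, "->", hr_tif)
```

Keeping the archive and member parts separate makes it possible to open each secondary zip once and read many patches from it.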
3 Licensing
3.1 Sentinel-2 patches
3.1.1 Copyright
Value-added data processed by CNES for the Theia data centre www.theia-land.fr using Copernicus products. The processing uses algorithms developed by Theia's Scientific Expertise Centres. Note: Copernicus Sentinel-2 Level 1C data is subject to this license: https://theia.cnes.fr/atdistrib/documents/TC_Sentinel_Data_31072014.pdf
3.1.2 Licence
Files *_b2b3b4b8_10m.tif, *_b5b6b7b8a_20m.tif and *_b11b12_20m.tif are distributed under the original licence of the Sentinel-2 Theia L2A products, which is the Etalab Open Licence Version 2.0 [2].
3.2 VENµS patches
3.2.1 Copyright
Value-added data processed by CNES for the Theia data centre www.theia-land.fr using VENµS satellite imagery from CNES and Israeli Space Agency. The processing uses algorithms developed by Theia's Scientific Expertise Centres.
3.2.2 Licence
Files *_b2b3b4b8_05m.tif and *_b5b6b7b8a_05m.tif are distributed under the original licence of the VENµS products, which is Creative Commons BY-NC 4.0 [3].
3.3 Remaining files
All remaining files are distributed under the Creative Commons BY 4.0 licence [4].
4 Note to users
Note that even if the SEN2VENµS dataset is sorted by sites and by pairs, we strongly encourage users to apply the full set of machine learning best practices when using it: keeping randomly selected pairs (or even sites) separate for testing purposes, and randomizing patches across sites and pairs in the training and validation sets.
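One possible way to follow this advice is sketched below: whole pairs are held out for testing, and the remaining patches are shuffled across pairs. The function name split_by_pair, the 20% test fraction and the fixed seed are illustrative assumptions, not recommendations from the dataset authors. Passing site names instead of pair ids as the grouping key would hold out whole sites.

```python
import random
from collections import defaultdict


def split_by_pair(patches, test_fraction=0.2, seed=0):
    """Hold out whole groups for testing, then shuffle the training patches.

    `patches` is an iterable of (pair_id, patch) tuples. Returns
    (train_patches, test_patches, test_pair_ids).
    """
    by_pair = defaultdict(list)
    for pair_id, patch in patches:
        by_pair[pair_id].append(patch)

    rng = random.Random(seed)
    pair_ids = sorted(by_pair)  # sort first so the split is reproducible
    rng.shuffle(pair_ids)

    # At least one pair goes to the test set.
    n_test = max(1, round(len(pair_ids) * test_fraction))
    test_ids = pair_ids[:n_test]
    train_ids = pair_ids[n_test:]

    test = [p for pid in test_ids for p in by_pair[pid]]
    train = [p for pid in train_ids for p in by_pair[pid]]
    rng.shuffle(train)  # mix patches across pairs (and sites)
    return train, test, set(test_ids)


# Hypothetical pair ids and patch names, for illustration only.
patches = [(f"pair{k}", f"pair{k}/patch{i}") for k in range(5) for i in range(3)]
train, test, test_ids = split_by_pair(patches)
print(len(train), len(test))  # 12 3
```

Because the split is made on pair ids rather than on individual patches, no acquisition date contributes patches to both the training and the test set.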
5 Citing
Please cite the following data paper (published in MDPI Data) and the Zenodo DOI when publishing work derived from this dataset:
Michel, J.; Vinasco-Salinas, J.; Inglada, J.; Hagolle, O. SEN2VENµS, a Dataset for the Training of Sentinel-2 Super-Resolution Algorithms. Data 2022, 7, 96. https://doi.org/10.3390/data7070096
Zenodo DOI: 10.5281/zenodo.14603764
Footnotes:
[1] https://pytorch.org/
[2] https://theia.cnes.fr/atdistrib/documents/Licence-Theia-CNES-Sentinel-ETALAB-v2.0-en.pdf
[3] https://creativecommons.org/licenses/by-nc/4.0/
[4] https://creativecommons.org/licenses/by/4.0/
Created: 2025-01-17



