A dataset of Earth Observation Data for Lithological Mapping using Machine Learning
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/7806929
Resource description:
Dataset Information
Machine Learning (ML) algorithms have contributed successfully to the creation of automated methods for recognizing patterns in high-dimensional data. Remote sensing data cover wide geographical areas and can reduce the need for various kinds of in-situ data. Lithological mapping using remotely sensed data is one of the most challenging applications of ML algorithms. In the framework of the "AI for Geoapplications" project, ML and especially Deep Learning (DL) methodologies are investigated for the identification and characterization of lithology based on remote sensing data in various pilot areas in Greece. To train and test the various ML algorithms, a dataset was created consisting of 30 ROIs, selected mainly from low-vegetation areas, that cover 2% of the total area of Greece.
Dataset Preprocessing
Dataset preprocessing was executed using a combination of SNAP, QGIS and ENVI tools.
Preprocessing steps:
Defining areas with the following properties:
Zero cloud and snow coverage
No water bodies
Minimum vegetation
For the ASTER images:
Subset on defined areas
Mosaic images when needed
Digitising clouds
For the Labels:
The soil map was obtained from YPEN (https://ypen.gov.gr/)
Subset on defined areas
All categories are represented in reasonable proportions
Clip label files with digitised clouds
Rasterize
The labels comprise eighteen categories across the twenty-eight areas for which we collected data. We use the following coding for the labels of our dataset:
Category                        Code
Alluvial deposits               0
Limestone colluvial deposits    1
Limestones                      2
Schists                         3
Quaternary sediments            4
Gneiss                          5
Slope fan debris                6
Mixed flysch                    7
Flysch shale and cherts         8
Dolomites                       9
Granite                         10
Sandstone flysch                11
Flysch colluvial deposits       12
Peridotite and Gabbro           13
River bed deposits              14
Gneiss colluvial deposits       15
Not available                   -100
Cloud coverage                  -999
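As an illustration of the coding above, the following Python sketch maps the codes to class names and counts labelled pixels per class while skipping the two special codes. The names `LITHOLOGY_CLASSES`, `INVALID_CODES` and `class_counts` are hypothetical helpers and are not taken from the dataset or its repository; the example values are made up.

```python
from collections import Counter

# Label codes as listed in the dataset description.
LITHOLOGY_CLASSES = {
    0: "Alluvial deposits",
    1: "Limestone colluvial deposits",
    2: "Limestones",
    3: "Schists",
    4: "Quaternary sediments",
    5: "Gneiss",
    6: "Slope fan debris",
    7: "Mixed flysch",
    8: "Flysch shale and cherts",
    9: "Dolomites",
    10: "Granite",
    11: "Sandstone flysch",
    12: "Flysch colluvial deposits",
    13: "Peridotite and Gabbro",
    14: "River bed deposits",
    15: "Gneiss colluvial deposits",
}

# Special codes that do not denote a lithology class.
NOT_AVAILABLE = -100
CLOUD_COVERAGE = -999
INVALID_CODES = {NOT_AVAILABLE, CLOUD_COVERAGE}


def class_counts(labels):
    """Count label values per class name, skipping the special codes.

    `labels` is any iterable of label codes (hypothetical input format).
    """
    counts = Counter()
    for code in labels:
        if code in INVALID_CODES:
            continue  # ignore "Not available" and "Cloud coverage" pixels
        counts[LITHOLOGY_CLASSES[int(code)]] += 1
    return counts


# Made-up flattened label values, for illustration only.
example = [0, 2, 2, -999, 7, -100, 2]
print(class_counts(example))
```

Counting classes in this way is one simple means of checking how evenly the categories are represented in a given area before training.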
The following table lists the available areas and the categories that each contains: Lithology_Dataset
For the Sentinel-2 images, we applied the following steps:
Resampling to 10 m
Subset on defined areas
The Sentinel-2 map shows a Sentinel-2 false colour composite (bands 11/8/4) over an OSM background.
The final step is the collocation of the above into a datacube, i.e. a multidimensional array with 25 bands (the datacube dimensions differ for each area), using the ASTER image as the base (15 m spatial resolution).
Bands 1-14: ASTER
Bands 15-24: Sentinel-2
Band 25: Label
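The band layout above can be sketched in Python with NumPy as follows. This is a minimal illustration, not the authors' preprocessing code: it assumes a band-first array of shape (25, height, width), which the description does not state, and it uses a synthetic cube rather than real data. The function name `split_datacube` is hypothetical.

```python
import numpy as np

# Band layout from the description, converted from 1-based bands to 0-based slices.
ASTER_BANDS = slice(0, 14)    # bands 1-14
S2_BANDS = slice(14, 24)      # bands 15-24
LABEL_BAND = 24               # band 25
INVALID_CODES = (-100, -999)  # "Not available" and "Cloud coverage"


def split_datacube(cube):
    """Split a datacube into ASTER bands, Sentinel-2 bands, labels and a validity mask.

    Assumes axis order (band, row, column); this is an assumption of the sketch.
    """
    if cube.shape[0] != 25:
        raise ValueError(f"expected 25 bands on axis 0, got {cube.shape[0]}")
    aster = cube[ASTER_BANDS]
    s2 = cube[S2_BANDS]
    labels = cube[LABEL_BAND].astype(np.int64)
    valid = ~np.isin(labels, INVALID_CODES)  # True where a lithology class is present
    return aster, s2, labels, valid


# Synthetic 25-band cube of 4 x 5 pixels, for illustration only.
rng = np.random.default_rng(0)
cube = rng.random((25, 4, 5)).astype(np.float32)
cube[LABEL_BAND] = 2               # every pixel labelled "Limestones"
cube[LABEL_BAND, 0, 0] = -999      # one cloud-covered pixel
cube[LABEL_BAND, 1, 1] = -100      # one pixel without a label

aster, s2, labels, valid = split_datacube(cube)

# Per-pixel feature matrix (valid pixels x 24 bands) and matching label vector.
X = np.concatenate([aster, s2], axis=0)[:, valid].T
y = labels[valid]
print(X.shape, y.shape)
```

Masking out the codes -100 and -999 before building `X` and `y` keeps unlabelled and cloud-covered pixels out of training and evaluation.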
The code for preprocessing the dataset for use with machine learning algorithms can be found at the following link:
https://github.com/georgegiannop/Lithology
Citation
If you use this dataset in your work, please cite our paper:
Vernikos, I., Giannopoulos, G., Christopoulou, A., Begaj, A., Stefouli, M., Bratsolis, E., and Charou, E.: A dataset of Earth Observation Data for Lithological Mapping using Machine Learning, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-17570, https://doi.org/10.5194/egusphere-egu23-17570, 2023.
Created:
2024-07-12



