Crowds & Machines Next Level: Mediterranean wheat classification labels from gamified crowd-sourcing
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7849548
Description:
Machine learning algorithms, and deep learning algorithms in particular, need large training and validation datasets, which are often unavailable. Creating on-ground datasets is costly and time-consuming. Within the European Space Agency-funded project ‘Crowds & Machines Next Level’ (by Blackshore B.V., 52impact B.V. and The Hague Centre for Strategic Studies), we aimed to solve this issue by generating labelled data effectively with an innovative gamified, crowdsourcing-based method.
The objective of the project was to generate labelled data for training and validating machine learning algorithms that classify wheat. We make these labelled datasets freely available as open data to organisations that use machine learning in their activities, mainly companies and knowledge institutes. As part of the project, we developed example scripts (Jupyter notebooks) that help organisations use the crowdsourced data in their own machine learning systems.
BlackShore has developed the online platform Cerberus to enable large-scale generation of labelled datasets. Cerberus is deployed at twenty locations around the Mediterranean Sea to generate labelled datasets of wheat and other land cover classes (see table). These locations span a diversity of climate regions, harvest cultures and crop calendars, which poses a challenge for training machine learning algorithms. Gamers click on hexagons plotted on top of very high resolution satellite imagery (captured during the harvest period in 2021); by combining 3 different hexagon grids, those clicks are converted into triangles. Each triangle carries a number of clicks (by different users) per land cover category, which provides a measure of the label's accuracy.
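The per-category click counts of a triangle can be reduced to a single label plus a simple agreement score. The sketch below is only an illustration and not the project's documented method: the dictionary layout, the `min_clicks` threshold, and the tie handling are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


def label_from_clicks(
    clicks: Dict[str, int], min_clicks: int = 3
) -> Optional[Tuple[str, float]]:
    """Return (majority category, share of clicks agreeing with it).

    `clicks` maps a land cover category to the number of clicks one
    triangle received for it. The mapping layout and the `min_clicks`
    threshold are assumptions made for this sketch.
    Returns None when the triangle has too few clicks or when the two
    most-clicked categories are tied.
    """
    total = sum(clicks.values())
    if total < min_clicks:
        return None
    ranked = sorted(clicks.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no unambiguous majority
    category, count = ranked[0]
    return category, count / total


# Hypothetical click counts for a single triangle.
triangle = {"Wheat": 4, "Cattle": 1}
print(label_from_clicks(triangle))
```

A higher agreement share suggests a more reliable label; triangles below a chosen share could be dropped before training.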
52impact developed example tutorials that use the data to train pixel-based (Random Forest) and segmentation-based (U-Net) machine learning models on Sentinel-2 imagery (provided in the data folder). The tutorials can be forked here: https://bitbucket.org/52impact/crowds-machines.
Overview of locations
| ID | location_id | Country | Region | Shape | Harvest period | VHR image date | S-2 pre-harvest | S-2 harvest | S-2 post-harvest |
|---|---|---|---|---|---|---|---|---|---|
| 01 | portugalAlentejo | Portugal | Alentejo | 01_Portugal_Alentejo_SELECTION | 10 Jul - 1 Aug | 07/07/2021 | 14/05/2021 | 13/07/2021 | 22/08/2022 |
| 02 | spainAndalusia | Spain | Andalusia | 02_Spain_Andalusia_SELECTION | 10 Jul - 1 Aug | 02/07/2021 | 16/05/2021 | 15/07/2021 | 03/09/2021 |
| 03 | spainAragon | Spain | Aragon | 03_Spain_Aragon_SELECTION | 10 Jul - 1 Aug | 26/10/2021 | 20/05/2021 | 19/07/2021 | 05/09/2021 |
| 04 | franceAude | France | Aude | 04_France_Aude_SELECTION | 1 Jul - 1 Oct | 22/09/2021 | 12/05/2021 | 10/08/2021 | 18/11/2021 |
| 05 | franceCamargue | France | Camargue | 05_France_Camargue_SELECTION | 1 Jul - 1 Oct | 07/10/2021 | 12/05/2021 | 10/08/2021 | 18/11/2021 |
| 06 | franceProvence | France | Provence | 06_France_Provence_SELECTION | 1 Jul - 1 Oct | 26/10/2021 | 19/05/2021 | 17/08/2021 | 20/11/2021 |
| 07_08 | italyMarche | Italy | Marche (East and West) | 07_08_Italy_Marche_SELECTION | 1 Jul - 1 Sept | 09/08/2021 | 26/05/2021 | 25/07/2021 | 20/11/2021 |
| 09 | italySardinia | Italy | Sardinia | 09_Italy_Sardinia_SELECTION | 1 Jul - 1 Sept | 31/08/2021 | 26/05/2021 | 22/07/2021 | 10/10/2021 |
| 10 | italySicily | Italy | Sicily | 10_Italy_Sicily_SELECTION | 1 Jul - 1 Sept | 19/09/2021 | 22/05/2021 | 26/07/2021 | 10/10/2021 |
| 11 | italyPugliaNorth | Italy | Puglia (North) | 11_Italy_PugliaNorth_SELECTION | 1 Jul - 1 Sept | 06/10/2021 | 11/06/2021 | 31/07/2021 | 04/10/2021 |
| 12 | italyPuglia | Italy | Puglia | 12_Italy_Puglia_SELECTION | 1 Jul - 1 Sept | 19/08/2021 | 03/06/2021 | 02/08/2021 | 21/10/2021 |
| 13 | greeceWest | Greece | West | 13_Greece_West_SELECTION | 1 Sept - 1 Nov | 02/09/2021 | 27/07/2021 | 05/10/2021 | 14/12/2021 |
| 14 | greeceThessaly | Greece | Thessaly | 14_Greece_Thessaly_SELECTION | 1 Sept - 1 Nov | 14/07/2021 | 27/07/2021 | 25/09/2021 | 19/12/2021 |
| 15 | greeceMacedoniaCentral | Greece | Macedonia (Central) | 15_Greece_MacedoniaCentral_SELECTION | 1 Jun - 1 Aug | 22/07/2021 | 13/05/2021 | 22/07/2021 | 15/09/2021 |
| 16 | greeceMacedoniaEast | Greece | Macedonia (East) | 16_Greece_MacedoniaEast_SELECTION | 1 Jun - 1 Aug | 05/08/2021 | 25/05/2021 | 29/07/2021 | 27/10/2021 |
| 17 | greeceRhodes | Greece | Rhodes | 17_Greece_Rhodes_SELECTION | 15 May - 1 Jul | 09/05/2021 | 25/03/2021 | 24/05/2021 | 22/08/2021 |
| 18 | cyprusLarnaca | Cyprus | Larnaca | 18_Cyprus_Larnaca_SELECTION | 15 May - 1 Jul | 05/06/2021 | 19/03/2021 | 07/06/2021 | 21/08/2021 |
| 19 | turkeyCyprus | Cyprus (T) | Farmagusta | 19_Turkey_Cyprus_SELECTION | 15 May - 1 Jul | 05/06/2021 | 29/03/2021 | 17/06/2021 | 26/08/2021 |
| 20 | egyptBehera | Egypt | Behera | 20_Egypt_Behera_SELECTION | 1 Apr - 1 Jul | 06/03/2021 | 26/01/2021 | 07/03/2021 | 19/08/2021 |
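The dates in the table are written day/month/year. When pairing the crowd labels (drawn on the VHR image) with Sentinel-2 scenes, it can help to check how far apart the acquisitions are. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the sample values are copied from the rows for IDs 01 and 03.

```python
from datetime import date, datetime


def parse_table_date(text: str) -> date:
    """Parse a DD/MM/YYYY string as used in the locations table."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def days_between(vhr_date: str, s2_harvest_date: str) -> int:
    """Days from the VHR image date to the S-2 harvest acquisition.

    The result is negative when the Sentinel-2 scene predates the VHR image.
    """
    return (parse_table_date(s2_harvest_date) - parse_table_date(vhr_date)).days


# ID 01 (Alentejo): VHR 07/07/2021, S-2 harvest 13/07/2021
print(days_between("07/07/2021", "13/07/2021"))
# ID 03 (Aragon): VHR 26/10/2021, S-2 harvest 19/07/2021
print(days_between("26/10/2021", "19/07/2021"))
```

A large gap, as for Aragon, may mean the land cover seen by the gamers differs from what the Sentinel-2 harvest scene shows.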
The following data is provided:
Triangulated_data.zip: contains, per region and per category, a GeoPackage (.gpkg) file of triangular polygons with the number of clicks per polygon. The file name depends on the location and category. For example, the file with the triangles for Cattle in Alentejo, Portugal, is called 01_Portugal_Alentejo_Cattle.gpkg.
Data.zip: all data necessary to run the Jupyter notebooks, i.e., location data, cropped Sentinel-2 satellite imagery (for training location IDs 01, 02, 12 and 15, and validation locations near IDs 02 and 15) and also the triangulated polygons.
Models.zip: pre-trained Random Forest and U-Net models based on the data; the same models can be generated by running the Jupyter notebooks.
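The file naming in Triangulated_data.zip can be sketched in code. This is an assumption extrapolated from the single example 01_Portugal_Alentejo_Cattle.gpkg: the name appears to be the table's Shape value with `_SELECTION` replaced by the category. The helper name is hypothetical, and the actual archive contents should be checked before relying on it.

```python
def triangle_filename(shape: str, category: str) -> str:
    """Build a triangle file name from a table 'Shape' value and a category.

    Assumes (from the one example in the text) that the file name is the
    Shape value with its '_SELECTION' suffix replaced by the category.
    """
    suffix = "_SELECTION"
    if not shape.endswith(suffix):
        raise ValueError(f"unexpected Shape value: {shape!r}")
    return f"{shape[: -len(suffix)]}_{category}.gpkg"


print(triangle_filename("01_Portugal_Alentejo_SELECTION", "Cattle"))
```

The resulting path could then be opened with any GeoPackage-aware reader, for example `geopandas.read_file`.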
Created:
2023-08-14



