Transfer Learning Dataset for Metal Oxide Semiconductor Gas Sensors
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/6821339
Description:
The "Transfer Learning Dataset for Metal Oxide Semiconductor Gas Sensors" can be used to test how well machine learning approaches interpret the sensor patterns of a commercially available MOS gas sensor, the SGP40 (Sensirion AG, Stäfa, Switzerland), in order to predict multiple gas concentrations and the relative humidity. Furthermore, the dataset can be used to test the transferability of models between sensors.
The dataset was recorded with a custom-built gas mixing apparatus (GMA), which allows well-known gas mixtures to be applied to multiple gas sensors. For this experiment, three SGP40 sensors with four sub-sensors each were exposed to 900 unique gas mixtures (UGMs) consisting of ten different gases. In detail, each mixture consists of eight volatile organic compounds (VOCs) (acetic acid, acetone, ethanol, ethyl acetate, formaldehyde, isopropanol, toluene, and xylene), two background gases (carbon monoxide and hydrogen), and the relative humidity at 20 °C.
During exposure, the sensors are run in temperature-cycled operation. The temperature cycle alternates between high- and low-temperature phases. The high-temperature phases are set to 400 °C and last 5 seconds each, while the low-temperature phases increase in 25 °C steps from 100 °C to 375 °C and last 7 seconds each. The only exception is sub-sensor 4, whose temperature only alternates between 250 °C and 300 °C. The total duration of the temperature cycle is 144 seconds, and during this time, the logarithmic sensor resistance is read out at 10 Hz. Each gas mixture was recorded for ten temperature cycles to ensure that stable gas mixtures were applied to the sensor. Only the stable samples 6 (not always), 7, 8, and 9 were used for further evaluation.
The 900 UGMs can be separated into three parts; for each part, the mixtures were generated by Latin hypercube sampling within the ranges specified in Table 1.
Table 1: Uniformly distributed ranges for all gases within the gas mixtures
Component           UGM 1-200        UGM 201-500      UGM 501-900
Carbon monoxide     100 - 2000 ppb   100 - 2000 ppb   100 - 2000 ppb
Hydrogen            400 - 2000 ppb   400 - 2000 ppb   400 - 2000 ppb
Relative humidity   25 - 80 %        25 - 80 %        25 - 80 %
Acetic acid         1 - 50 ppb       1 - 150 ppb      1 - 500 ppb
Acetone             3 - 50 ppb       3 - 150 ppb      3 - 500 ppb
Ethanol             1 - 50 ppb       1 - 150 ppb      1 - 500 ppb
Ethyl acetate       1 - 50 ppb       1 - 150 ppb      1 - 500 ppb
Formaldehyde        1 - 50 ppb       1 - 150 ppb      1 - 300 ppb
Isopropanol         1 - 50 ppb       1 - 150 ppb      1 - 500 ppb
Toluene             1 - 75 ppb       1 - 75 ppb       1 - 250 ppb
Xylene              2 - 150 ppb      2 - 150 ppb      2 - 500 ppb
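To illustrate how mixtures of this kind can be drawn, the following is a minimal sketch of Latin hypercube sampling over the UGM 1-200 ranges. It is not the authors' generation script: the function name `latin_hypercube`, the dictionary keys, and the seed are assumptions made for illustration. Each component's range is divided into as many equal strata as there are mixtures, one value is drawn per stratum, and the strata are shuffled independently for each component.

```python
import random

# Ranges for UGM 1-200 as listed in Table 1 (ppb, relative humidity in %).
# The key names are illustrative and do not come from the dataset.
RANGES_PART1 = {
    "carbon_monoxide": (100, 2000),
    "hydrogen": (400, 2000),
    "relative_humidity": (25, 80),
    "acetic_acid": (1, 50),
    "acetone": (3, 50),
    "ethanol": (1, 50),
    "ethyl_acetate": (1, 50),
    "formaldehyde": (1, 50),
    "isopropanol": (1, 50),
    "toluene": (1, 75),
    "xylene": (2, 150),
}


def latin_hypercube(ranges, n_samples, seed=None):
    """Draw n_samples mixtures so that every component covers each of its
    n_samples equal-width strata exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # One uniformly placed point inside each stratum of the unit interval.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired randomly across components.
        rng.shuffle(points)
        columns[name] = [low + p * (high - low) for p in points]
    # Re-assemble the per-component columns into one dict per mixture.
    return [
        {name: columns[name][i] for name in ranges}
        for i in range(n_samples)
    ]


mixtures = latin_hypercube(RANGES_PART1, 200, seed=0)
print(len(mixtures), "mixtures, first one:", mixtures[0])
```

Compared with independent uniform draws, this stratification spreads the 200 mixtures evenly along every concentration axis, which is the usual motivation for the method.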
To make the dataset usable for transfer learning, it contains data from three different SGP40 sensors: two are from the same batch (sensor A and sensor B), and sensor C is from a different batch.
The dataset consists of the sensor data and the targets for evaluation. The data is already split into training and test sets and is stored in one cell per sensor and set:
sensorA_train
sensorA_test
sensorB_train
sensorB_test
sensorC_train
sensorC_test
Each sensor cell contains four arrays, one for each sub-sensor within one SGP40. The number of rows in the arrays represents the number of observations (693 for test and 2401 for training), and the number of columns represents the number of samples per observation (1440).
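The 1440 samples per observation follow directly from the temperature cycle: twelve low-temperature steps, each paired with one high-temperature phase, sampled at 10 Hz. The short sketch below reconstructs the set-point sequence for sub-sensors 1 to 3 and checks that arithmetic. The ordering (high phase first, then the low phase) and all variable names are assumptions; the description only states that the phases alternate.

```python
HIGH_TEMP_C = 400        # high-temperature phase set point
HIGH_DURATION_S = 5      # duration of each high-temperature phase
LOW_DURATION_S = 7       # duration of each low-temperature phase
SAMPLE_RATE_HZ = 10      # read-out rate of the logarithmic resistance

# Low-temperature set points: 100 °C to 375 °C in 25 °C steps.
low_temps = list(range(100, 376, 25))

# Assumed order: each low-temperature step is preceded by a high phase.
cycle = []
for low in low_temps:
    cycle.append((HIGH_TEMP_C, HIGH_DURATION_S))
    cycle.append((low, LOW_DURATION_S))

total_seconds = sum(duration for _, duration in cycle)
samples_per_cycle = total_seconds * SAMPLE_RATE_HZ

# Set-point temperature for every sample index within one cycle.
setpoints = [
    temp
    for temp, duration in cycle
    for _ in range(duration * SAMPLE_RATE_HZ)
]

print(len(low_temps), "low-temperature steps")
print(total_seconds, "s per cycle")               # 12 * (5 + 7) = 144
print(samples_per_cycle, "samples per cycle")     # 144 * 10 = 1440
```

Under these assumptions, `setpoints[j]` gives the nominal heater temperature that corresponds to column `j` of an observation, which can help when selecting features from specific temperature phases.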
The targets, i.e., the concentrations of each gas, are given in the target_train and target_test structs. Since the data were recorded simultaneously, those structs can be used as targets for all sensors. The ten different gases, the relative humidity, and TVOCsens are the actual targets, while the range parameter represents the specific unique gas mixture ID.
Although the dataset is stored as a .mat file, it can be opened as an HDF5 file.
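As a hedged illustration, the sketch below first checks whether a file carries the HDF5 signature and then reads one sensor cell with the third-party `h5py` package. Several points are assumptions rather than facts from the dataset description: the file path is a placeholder supplied by the caller, the cells are assumed to be stored as HDF5 object references (typical for MATLAB v7.3 files), and the arrays are assumed to appear transposed relative to MATLAB, hence the `.T`. The helper names `looks_like_hdf5` and `load_sensor_cell` are hypothetical.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the HDF5 signature is found at one of the first
    allowed superblock offsets. MATLAB v7.3 files usually place a
    512-byte text header in front of the HDF5 data."""
    with open(path, "rb") as fh:
        for offset in (0, 512, 1024, 2048):
            fh.seek(offset)
            if fh.read(8) == HDF5_SIGNATURE:
                return True
    return False


def load_sensor_cell(path, cell_name):
    """Read one sensor cell (e.g. "sensorA_train") as a list of arrays,
    one per sub-sensor. Assumes the cell is a dataset of object
    references; adjust if the file is laid out differently."""
    import h5py          # third-party, imported lazily
    import numpy as np   # third-party, imported lazily

    with h5py.File(path, "r") as f:
        refs = f[cell_name][()].ravel()
        # Transpose so rows are observations and columns are samples
        # (assumption: HDF5 stores the MATLAB arrays in reversed order).
        return [np.array(f[ref]).T for ref in refs]
```

A call such as `load_sensor_cell(path_to_file, "sensorA_train")` would then be expected to return four arrays, each with one row per observation. Listing `f.keys()` first is a good way to confirm the actual variable names in the file.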
Created:
2022-09-30



