Low cost sensor in field calibrations (training and test data) - Beijing 2017
Source: DataCite Commons. Updated: 2025-02-04. Indexed: 2026-05-07
Download link:
https://pure.york.ac.uk/portal/en/datasets/1a0c64b0-433b-4eec-b5c7-64d3de0a0351
Resource description:
DESCRIPTION
A sensor box containing a multi-sensor ensemble was located in Beijing alongside other reference instruments that were part of the AIRPOLL/AIRPRO campaign. There were 50 low-cost sensors inside the sensor instrument:
• 6 x NO2 electrochemical
• 6 x OX electrochemical
• 6 x CO electrochemical
• 32 x Total VOC metal oxide sensors
• 2 x Humidity and temperature probes
The data from the sensor instrument were recorded every two seconds, with collection and storage on a Latte Panda micro-computer. The file contains the data used for training machine learning algorithms. Both the sensor data and the reference measurements are included.

COLUMN NAMES
TheTime: 1-minute averaged date and time, local time for Beijing, China.
Temp: Temperature of the air that comes into contact with the sensors (°C).
RH: Relative humidity of the air that comes into contact with the sensors (%).
##_ref_ppb: The reference measurements for NO2, OX and CO (all in units of ppb). A Teledyne CAPS instrument was used for the NO2 reference, ozone was measured with a TEI 49 UV absorption monitor, and [OX] was calculated by adding [NO2] and [O3]. CO was measured with an Aerolaser VUV fluorescence instrument and a sample inlet located 100 m above ground.
Orig_typ_ECname: The median of the six EC sensors (ECname can be NO2, OX or CO) after the standard conversion factors were applied to each one (ppb).
Orig_ind_ECname: The median of the six EC sensors (ECname can be NO2, OX or CO) after each sensor had its unique factory conversion factors applied (ppb). These conversion factors were supplied by the sensor company after calibration at the company's factory.
ECname_gbtree_pred: The prediction made using the XGBoost boosted regression trees ML algorithm for the different EC sensors. Each prediction was made using all of the sensor instrument data. During training, the respective reference measurements were used as the target for the algorithms. The CO predictions start a day late because there was less CO data.
ECname_gblinear_pred: The prediction made using the XGBoost boosted linear regression ML algorithm for the different EC sensors. Each prediction was made using all of the sensor instrument data. During training, the respective reference measurements were used as the target for the algorithms. The CO predictions start a day late because there was less CO data.
NO2_GP_pred: The Gaussian Process prediction for the NO2 sensor. Each prediction was made using all of the sensor instrument data. During training, the NO2 reference measurements were used as the target for the algorithm.
NO2_GP_std: One standard deviation from the NO2 Gaussian Process prediction.

PEOPLE RESPONSIBLE FOR DATA COLLECTION
Sensor data: Kate Smith and Pete Edwards
NO2, O3 and CO reference data: James Lee and Freya Squires

Data embargoed until June 25 2019 due to funder requirements.
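The derived quantities named in the column descriptions above (the median over six EC sensors, [OX] as the sum of [NO2] and [O3], and the one-standard-deviation Gaussian Process value) can be illustrated with a minimal Python sketch. The function names and all numeric values here are hypothetical and chosen for illustration; they are not taken from the dataset, and this is not necessarily how the data providers computed the columns.

```python
import math
from statistics import median


def ensemble_median(readings_ppb):
    """Median of the converted readings from one group of EC sensors (ppb)."""
    return median(readings_ppb)


def ox_reference(no2_ppb, o3_ppb):
    """[OX] reference as the sum of the NO2 and O3 reference values (ppb)."""
    return no2_ppb + o3_ppb


def gp_band(pred_ppb, std_ppb):
    """Interval of one standard deviation either side of a GP prediction."""
    return (pred_ppb - std_ppb, pred_ppb + std_ppb)


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values for illustration only; not taken from the dataset.
six_no2_sensors = [21.0, 19.5, 22.4, 20.1, 35.0, 20.7]
no2_median = ensemble_median(six_no2_sensors)   # the 35.0 outlier has little effect
ox_ref = ox_reference(no2_ppb=20.0, o3_ppb=31.5)
band = gp_band(pred_ppb=20.0, std_ppb=1.5)
error = rmse([20.0, 22.0], [21.0, 25.0])
```

A median is a reasonable way to summarise the six-sensor group because a single drifting sensor (the 35.0 reading above) barely shifts the result. The same `rmse` helper could be used to compare any of the prediction columns against the matching reference column once the file is loaded.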
Provider:
University of York
Created:
2018-08-28



