Synthetic Datasets for Investigating Additive Feature Attribution for Regression
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10115806
Resource description:
Synthetic datasets were generated as benchmarks that capture the intrinsic characteristics of the original data, in order to investigate the performance of additive feature attribution methods for regression tasks. The synthetic datasets were generated from 2, 6 and 8 clusters formed from the original data. The 6-cluster dataset was used for the primary analysis and the other two were used for sensitivity analysis.
The synthetic datasets were generated from original data acquired from the Aviation Data for Research Repository. EUROCONTROL collected and processed these data from Enhanced Tactical Flow Management System (ETFMS) flight data messages covering all flights in Europe from May to October 2019. The original dataset included fundamental flight details, flight status, preceding flight legs, ATFM regulations, weather conditions, calendar information, etc.
A brief description of the columns in the synthetic data files is presented in the file 'data_description.pdf' and a more detailed discussion on features can be found in the works of Koolen and Coliban [1] and Dalmau et al. [2].
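As background for readers new to the topic, additive feature attribution methods (SHAP is a well-known example) explain a single prediction as a base value plus one contribution per feature. The sketch below illustrates that additivity property on a toy linear regression model, where the contribution of feature i can be written as w_i * (x_i - mean_i) if features are treated as independent. Everything in it is an assumption made for illustration: the random features, the weights, and the helper name `linear_attributions` are hypothetical and are not taken from the dataset or from the authors' analysis.

```python
import numpy as np


def linear_attributions(weights, x, background_mean):
    """Per-feature contributions for a linear model f(x) = w.x + b.

    Uses phi_i = w_i * (x_i - mean_i), the closed form for a linear
    model when features are treated as independent (an assumption).
    """
    return weights * (x - background_mean)


# Toy data: random features, NOT the synthetic flight datasets.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

# Hypothetical linear regression model.
weights = np.array([2.0, -1.0, 0.5])
bias = 3.0


def predict(features):
    """Return model predictions for a 2-D array of feature rows."""
    return features @ weights + bias


x = X[0]                     # the instance to explain
mean = X.mean(axis=0)        # background (reference) feature values
phi = linear_attributions(weights, x, mean)

# For a linear model, the prediction at the mean equals the mean prediction.
base_value = predict(mean[None, :])[0]

# Additivity: base value plus all contributions recovers the prediction.
assert np.isclose(base_value + phi.sum(), predict(x[None, :])[0])
```

A benchmark with known structure makes it possible to compare the contributions reported by an attribution method against what the data-generating process implies, which is harder to do on the original data alone.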
References
[1] H. Koolen and I. Coliban, Flight Progress Messages Document, EUROCONTROL, Brussels, Belgium, Tech. Rep., 2020.
[2] R. Dalmau, F. Ballerini, H. Naessens, S. Belkoura, and S. Wangnick, An Explainable Machine Learning Approach to Improve Take-off Time Predictions, Journal of Air Transport Management, vol. 95, p. 102090, Aug. 2021. doi: 10.1016/j.jairtraman.2021.102090.
Created:
2024-07-10



