
Wavelet feature screening

Taylor & Francis Group. Updated 2024-05-20; indexed 2026-04-16.
Download link:
https://tandf.figshare.com/articles/dataset/Wavelet_feature_screening/25605970/1
Description:
An initial screening of which covariates are relevant is a common practice in high-dimensional regression models. The classic feature screening selects only a subset of covariates correlated with the response variable. However, many important features might have a relevant albeit highly nonlinear relation with the response. One screening approach that handles nonlinearity is to compute the correlation between the response and nonparametric functions of each covariate. Wavelets are powerful tools for nonparametric and functional data analysis but are still seldom used in the feature screening literature. We propose a wavelet feature screening method that can be easily implemented, and we prove that, under suitable conditions, it captures the true covariates with high probability. Simulation results also show that our approach outperforms other screening methods in highly nonlinear models. We apply feature screening to two data sets about ozone concentration and epilepsy. In both applications, the proposed method selects features that match findings in the literature of their respective research fields, illustrating the applicability of feature screening. Supplementary material for this article is available online.
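
The screening idea described above (score each covariate by how well a nonparametric function of it tracks the response, then keep the top-ranked covariates) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the Haar basis, the `levels=3` resolution, the R² score, and the names `haar_design` and `wavelet_screen` are all hypothetical choices made for this example.

```python
import numpy as np

def haar_design(x, levels=3):
    """Evaluate a constant column plus Haar wavelets on x rescaled to [0, 1).

    Hypothetical helper: the Haar family and the number of levels are
    assumptions for this sketch, not choices taken from the paper.
    """
    u = (x - x.min()) / (x.max() - x.min() + 1e-12)
    u = np.clip(u, 0.0, 1.0 - 1e-9)  # keep the largest value inside the last bin
    cols = [np.ones_like(u)]
    for j in range(levels):
        for k in range(2 ** j):
            t = (2 ** j) * u - k
            # Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere
            psi = ((t >= 0) & (t < 0.5)).astype(float) - ((t >= 0.5) & (t < 1)).astype(float)
            cols.append(2 ** (j / 2) * psi)
    return np.column_stack(cols)

def wavelet_screen(X, y, levels=3):
    """Score each covariate by the R^2 of a least-squares fit of y on its Haar expansion."""
    yc = y - y.mean()
    tss = yc @ yc
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        B = haar_design(X[:, j], levels)
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        resid = y - B @ coef
        scores[j] = 1.0 - (resid @ resid) / tss
    return scores

# Synthetic illustration (made-up data): only covariate 0 matters, and its
# effect is symmetric, so its linear correlation with y is close to zero.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 50))
y = np.cos(2 * np.pi * X[:, 0]) + 0.3 * rng.standard_normal(400)

scores = wavelet_screen(X, y)
top = np.argsort(scores)[::-1][:5]  # indices of the five highest-scoring covariates
print("top-ranked covariates:", top)
print("Pearson correlation of covariate 0 with y:", np.corrcoef(X[:, 0], y)[0, 1])
```

In this toy setting a purely linear correlation screen would tend to overlook covariate 0, whereas a basis-expansion score can pick it up. A smoother wavelet family or a data-driven resolution level may be preferable in practice.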

Provided by:
Pinheiro, Aluísio; Fonseca, Rodney; Morettin, Pedro
Created:
2024-04-15