
Contains the following files: Fig A. Frequency that each variable is selected for the stepwise variable selection method when sample size changes.

NIAID Data Ecosystem, indexed 2026-03-08
Download link:
https://figshare.com/articles/dataset/_Improved_Variable_Selection_Algorithm_Using_a_LASSO_Type_Penalty_with_an_Application_to_Assessing_Hepatitis_B_Infection_Relevant_Factors_in_Community_Residents_/1494886
Resource description:
Plot based on 100 simulations using various sample sizes (n = 100, 200, 300 and 500). Left panel: the number of true predictors r = 8; right panel: the number of true predictors r = 12. Red bars represent the selection frequency of significant variables (true non-zero predictors) and grey bars represent that of noise variables (true zero predictors) in the simulated data.

Fig B. Frequency that each variable is selected for the stability selection method when sample size changes. Plot based on 100 simulations using various sample sizes (n = 100, 200, 300 and 500). Left panel: the number of true predictors r = 8; right panel: the number of true predictors r = 12. Red bars represent the selection frequency of significant variables (true non-zero predictors) and grey bars represent that of noise variables (true zero predictors) in the simulated data.

Fig C. Frequency that each variable is selected for the LASSO variable selection method when sample size changes. Plot based on 100 simulations using various sample sizes (n = 100, 200, 300 and 500). Left panel: the number of true predictors r = 8; right panel: the number of true predictors r = 12. Red bars represent the selection frequency of significant variables (true non-zero predictors) and grey bars represent that of noise variables (true zero predictors) in the simulated data.

Fig D. Frequency that each variable is selected for the Bolasso variable selection method when sample size changes. Plot based on 100 simulations using various sample sizes (n = 100, 200, 300 and 500). Left panel: the number of true predictors r = 8; right panel: the number of true predictors r = 12. Red bars represent the selection frequency of significant variables (true non-zero predictors) and grey bars represent that of noise variables (true zero predictors) in the simulated data.

Fig E. Frequency that each variable is selected for the two-stage hybrid variable selection method when sample size changes. Plot based on 100 simulations using various sample sizes (n = 100, 200, 300 and 500). Left panel: the number of true predictors r = 8; right panel: the number of true predictors r = 12. Red bars represent the selection frequency of significant variables (true non-zero predictors) and grey bars represent that of noise variables (true zero predictors) in the simulated data.

Fig F. Frequency that each variable is selected for the bootstrap ranking variable selection method when sample size changes. Plot based on 100 simulations using various sample sizes (n = 100, 200, 300 and 500). Left panel: the number of true predictors r = 8; right panel: the number of true predictors r = 12. Red bars represent the selection frequency of significant variables (true non-zero predictors) and grey bars represent that of noise variables (true zero predictors) in the simulated data.

Fig G. Sensitivity analysis based on the AUC metric, evaluating the performance of the compared methods as the number of true predictors (r = 8, 12, 16 and 20) and the sample size (n = 50, 100, 200, 300 and 500) change, with respect to the small-effect predictors of group 1. A total of t = 100 variables were simulated. Six methods were compared: stepwise, stability selection, LASSO, Bolasso, two-stage hybrid and bootstrap ranking procedures.

Fig H. Sensitivity analysis based on the AUC metric, evaluating the performance of the compared methods as the number of true predictors (r = 8, 12, 16 and 20) and the sample size (n = 50, 100, 200, 300 and 500) change, with respect to the small-effect predictors of group 2. A total of t = 100 variables were simulated. Six methods were compared: stepwise, stability selection, LASSO, Bolasso, two-stage hybrid and bootstrap ranking procedures.

Fig I. The estimation of the tuning parameter λ and the coefficients for the LASSO model. (A): The deviance, with error bars, of the LASSO logistic regression model under 10-fold cross-validation across different values of the tuning parameter (log scale). The optimal model is the one with a deviance of 0.2301, reached when the tuning parameter is 0.0017. (B): The path of the estimated coefficients over a grid of values for λ, and the selected variables corresponding to the optimal λ.

Table A. R code for the two-stage hybrid procedure. The R function TSLasso was used to implement the two-stage hybrid procedure.

Table B. R code for the bootstrap ranking procedure. The R function Bootranking was used to implement the bootstrap ranking procedure.

Table C. A de-identified dataset from this work, made publicly available.

(DOCX)
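
The quantity plotted in Figs A to F, the selection frequency of each variable across repeated simulations, can be illustrated with a minimal sketch. This is not the authors' R code (TSLasso or Bootranking); the function name `selection_frequency` and the toy input are hypothetical, and the sketch assumes each simulation run has already produced a set of selected variable indices by whichever method is being evaluated.

```python
from collections import Counter


def selection_frequency(selected_per_run, n_variables):
    """Fraction of runs in which each variable index 0..n_variables-1 was selected.

    selected_per_run: iterable of collections of selected variable indices,
    one collection per simulation run (hypothetical input format).
    """
    runs = list(selected_per_run)
    if not runs:
        raise ValueError("need at least one simulation run")
    counts = Counter()
    for selected in runs:
        # Count each variable at most once per run.
        counts.update(set(selected))
    return [counts[j] / len(runs) for j in range(n_variables)]


# Hypothetical toy input: 4 runs over 5 candidate variables.
runs = [{0, 1}, {0, 2}, {0, 1, 4}, {0}]
freq = selection_frequency(runs, n_variables=5)
print(freq)
```

In a plot like those described above, the entries of `freq` for true non-zero predictors would be drawn as red bars and those for noise variables as grey bars.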
Created:
2015-07-27