Refined NHANES 2007-2012 spirometry dataset for the comparison of segmented (piecewise) linear models to that of GAMLSS
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/dwjykg3xww
Description:
Background
Current guidelines recommend using Generalized Additive Models for Location, Scale, and Shape (GAMLSS) to create reference equations for lung function. However, these models are complex and require additional spline tables for use. This study aimed to demonstrate that simpler methods, namely simple linear regression for the ratio of forced expiratory volume in 1 second to forced vital capacity (FEV1/FVC ratio) and segmented linear regression (SLR) for forced expiratory volume in 1 second (FEV1) and forced vital capacity (FVC), can achieve prediction accuracy similar to that of GAMLSS in pulmonary function diagnostics.
Data
This study used secondary data from the National Health and Nutrition Examination Survey (NHANES) conducted between 2007 and 2012. The dataset includes spirometry measurements from 16,596 participants (from an initial pool of 31,451) aged 6-80 years, representing diverse racial and ethnic backgrounds. Participants' weights ranged from 16.4-218.2 kg, heights from 104.6-203.8 cm, and BMI from 12.5-84.9 kg/m². The refined dataset includes only participants who met the minimum technical quality standards for spirometry maneuvers outlined by the American Thoracic Society (2005), specifically those performing A and B quality maneuvers. The data file also provides a secondary analysis of the z-scores calculated from the developed GAMLSS and piecewise regression models, including whether participants were classified as having a restrictive respiratory pattern or an airway obstruction based on the calculated lower limit of normal.
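To illustrate how a z-score and a lower limit of normal (LLN) can yield such classifications, here is a minimal Python sketch. It rests on assumptions not stated in this record: the LLN is taken as the 5th percentile (z = -1.645), the z-score is computed as for a linear model with a constant residual standard deviation, and the decision rules and function names are hypothetical. The rules actually used to build the data file may differ.

```python
# Assumption: LLN is the 5th percentile of a standard normal distribution.
LLN_Z = -1.645


def z_score(observed, predicted, residual_sd):
    """Standardized residual for a model with constant residual SD (assumed)."""
    return (observed - predicted) / residual_sd


def classify(z_ratio, z_fvc):
    """Hypothetical two-step rule; not necessarily the rule used in the dataset.

    z_ratio: z-score of the FEV1/FVC ratio
    z_fvc:   z-score of FVC
    """
    if z_ratio < LLN_Z:
        return "airway obstruction"
    if z_fvc < LLN_Z:
        return "restrictive pattern"
    return "normal"


# Illustrative values only, not taken from the NHANES data.
print(classify(z_ratio=-2.1, z_fvc=0.3))
print(classify(z_ratio=0.2, z_fvc=-1.9))
```

This sketch covers only the two patterns named above; it does not attempt the mixed or PRISm categories.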
Methods
Reference equations for FEV1, FVC, and the FEV1/FVC ratio were developed by G.Z. using different modeling techniques: simple linear regression for the FEV1/FVC ratio and segmented linear regression (SLR) for FEV1 and FVC. Initially, all races/ethnicities were grouped together, as the primary hypothesis was to compare GAMLSS with SLR models rather than to compare different biological ancestries. K-fold cross-validation was applied to calculate the 95% confidence interval (CI) for the root-mean-square error (RMSE), which served as an indicator of prediction accuracy. Additionally, the agreement between the two modeling approaches in classifying spirometric patterns [normal, airflow obstruction, restrictive, mixed disorder, or preserved ratio impaired spirometry (PRISm)] was assessed using an unweighted kappa statistic.
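The combination of segmented regression and K-fold cross-validation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the study's implementation: it uses a single predictor with one breakpoint chosen by grid search, synthetic data, and a normal-approximation interval across folds. The record does not say how many breakpoints, which predictors, or which CI method were used, and all function names are hypothetical.

```python
import math
import random
import statistics


def solve(A, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def features(x, knot):
    """Intercept, slope, and a hinge term that changes the slope after the knot."""
    return [1.0, x, max(0.0, x - knot)]


def predict(x, knot, beta):
    return sum(b * f for b, f in zip(beta, features(x, knot)))


def fit_segmented(xs, ys, knots):
    """Try each candidate knot, fit by least squares, keep the lowest SSE.

    Candidate knots must lie strictly inside the range of xs.
    """
    best = None
    for knot in knots:
        X = [features(x, knot) for x in xs]
        XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
        Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(3)]
        beta = solve(XtX, Xty)
        sse = sum((y - predict(x, knot, beta)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, knot, beta)
    return best[1], best[2]


def kfold_rmse(xs, ys, knots, k=5, seed=0):
    """Return the held-out RMSE of each of k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    rmses = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        knot, beta = fit_segmented([xs[i] for i in train], [ys[i] for i in train], knots)
        mse = sum((ys[i] - predict(xs[i], knot, beta)) ** 2 for i in fold) / len(fold)
        rmses.append(math.sqrt(mse))
    return rmses


# Synthetic data (not NHANES): values rise until age 20, then decline slowly.
rng = random.Random(1)
ages = [float(a) for a in range(6, 81)]
values = [1.0 + 0.2 * a - 0.23 * max(0.0, a - 20.0) + rng.gauss(0, 0.1) for a in ages]

fold_rmse = kfold_rmse(ages, values, knots=range(15, 31), k=5)
mean_rmse = statistics.mean(fold_rmse)
# Assumption: normal-approximation 95% CI across folds (crude for small k).
half_width = 1.96 * statistics.stdev(fold_rmse) / math.sqrt(len(fold_rmse))
print(f"RMSE {mean_rmse:.3f} (95% CI {mean_rmse - half_width:.3f} to {mean_rmse + half_width:.3f})")
```

The hinge term `max(0, x - knot)` keeps the fitted line continuous at the breakpoint, which is what distinguishes segmented regression from fitting two unrelated lines.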
Results
The RMSE values and correlation coefficients for FEV1, FVC, and the FEV1/FVC ratio were similar between the two modeling techniques. The agreement between the models in classifying spirometric patterns was also high, with kappa values ranging from 0.78 to 0.80 (95% CI).
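For readers unfamiliar with the statistic, an unweighted (Cohen's) kappa compares the observed agreement between two sets of labels with the agreement expected by chance. The sketch below is a minimal illustration with made-up labels; the function name is hypothetical and the labels are not drawn from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    n = len(labels_a)
    # Proportion of cases where the two classifications agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each classifier's marginal frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up classifications from two hypothetical models.
model_1 = ["normal", "normal", "normal", "obstruction", "obstruction", "restrictive"]
model_2 = ["normal", "normal", "obstruction", "obstruction", "obstruction", "restrictive"]
print(round(cohen_kappa(model_1, model_2), 3))
```

Because the statistic is unweighted, every disagreement counts equally regardless of which two categories are confused.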
Conclusions
Simple linear regression (FEV1/FVC ratio) and segmented linear regression (FEV1, FVC) provide prediction accuracies comparable to those of GAMLSS models. These simpler methods are more straightforward and accessible, making them a practical alternative for broader use in pulmonary function diagnostics.
Created:
2024-09-16



