Comparison of clinical and demographic traits of clustered subjects in the discovery and validation stages
 | Discovery stage (n=196) | | | | | Validation stage (n=194) | | | | |
Trait | Cluster 1 | Cluster 2 | Cluster 3 | P value | N used | Cluster 1 | Cluster 2 | Cluster 3 | P value | N used |
n subjects in cluster | 64 | 95 | 37 | | | 52 | 101 | 41 | | |
Age (years) (mean, SD) | 67.8 (8.9) | 66.9 (10.2) | 68.8 (9.4) | 0.592 | 188 | 67.1 (8.1) | 68.5 (7.6) | 66.2 (8.6) | 0.239 | 194 |
Male (%) | 52 (81.3%) | 66 (69.5%) | 23 (62.2%) | 0.091 | 196 | 38 (73.1%) | 72 (71.3%) | 34 (82.9%) | 0.347 | 194 |
European ancestry (%) | 17 (81.0%) | 29 (82.9%) | 3 (75.0%) | 0.883 | 60 | 51 (98.1%) | 91 (90.1%) | 38 (92.7%) | 0.196 | 194 |
Ever smoker (%) | NA | 15 (62.5%) | 18 (78.3%) | 0.389 | 47 | 11 (57.9%) | 21 (60.0%) | 17 (85.0%) | 0.114 | 74 |
Death observed during study (%) | NA | 6 (25.0%) | 16 (66.7%) | 0.009 | 48 | 16 (48.5%) | 13 (19.7%) | 12 (57.1%) | 0.001 | 120 |
FVC % predicted (median, IQR) | 63.0 (35.0) | 70.5 (30.1) | 60.1 (23.4) | 0.342 | 154 | 64.3 (23.6) | 65.0 (24.3) | 63.1 (15.3) | 0.467 | 193 |
DLCO % predicted (median, IQR) | 35.0 (30.0) | 45.0 (29.2) | 34.4 (17.3) | 0.009 | 133 | 42.1 (26.4) | 48.2 (21.1) | 43.4 (20.3) | 0.069 | 194 |
FEV1 % predicted (median, IQR) | NA | 74.9 (23.1) | 65.4 (22.7) | 0.216 | 48 | 74.8 (21.7) | 75.2 (22.2) | 75.4 (17.7) | 0.913 | 75 |
GAP index (mean, SD) | 4.9 (1.4) | 3.9 (1.5) | 4.4 (1.7) | 0.006 | 132 | 4.1 (1.6) | 4.0 (1.5) | 4.3 (1.5) | 0.753 | 193 |
MUC5B genotype: GG (%) | 5 (29.4%) | 11 (27.5%) | 14 (51.9%) | 0.230 | 84 | 2 (11.8%) | 6 (19.4%) | 4 (25.0%) | 0.780 | 64 |
MUC5B genotype: GT (%) | 10 (58.8%) | 26 (65.0%) | 10 (37.0%) | | | 14 (82.4%) | 24 (77.4%) | 12 (75.0%) | | |
MUC5B genotype: TT (%) | 2 (11.8%) | 3 (7.5%) | 3 (11.1%) | | | 1 (5.9%) | 1 (3.2%) | 0 (0%) | | |
Data are presented as count (percentage), mean (SD) or median (IQR). GAP index: gender, age and physiology index for IPF mortality [20]. P values are from a χ² test for count data, analysis of variance for means and the Kruskal-Wallis rank test for medians. Significant p values (p<0.05) are highlighted in bold. For percentages, the denominator was the number of participants in that cluster with non-missing data for that trait.
DLCO, diffusing capacity for carbon monoxide; FEV1, forced expiratory volume in 1 second; FVC, forced vital capacity; IPF, idiopathic pulmonary fibrosis; MUC5B genotype, genotype for the MUC5B promoter polymorphism rs35705950; NA, data not available.
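
As a rough check on how the count-data P values can be derived, the sketch below recomputes the discovery-stage P value for the "Male" row from the counts in the table. It assumes a Pearson χ² test without continuity correction on the 2 x 3 table of male and non-male counts per cluster; the source does not state the software or the exact options used, and the function names `chi_square_statistic` and `p_value_df2` are hypothetical helpers introduced here for illustration only.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic (no continuity correction) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail p value of a chi-square distribution with exactly 2 degrees of freedom.

    For 2 degrees of freedom the survival function has the closed form exp(-x / 2),
    so no statistics library is needed. This shortcut is NOT valid for other df.
    """
    return math.exp(-stat / 2.0)


# Discovery stage, "Male" row: male counts and cluster sizes taken from the table.
male = [52, 66, 23]
cluster_n = [64, 95, 37]
not_male = [n - m for n, m in zip(cluster_n, male)]

# A 2 x 3 table has (2 - 1) * (3 - 1) = 2 degrees of freedom.
stat = chi_square_statistic([male, not_male])
p = p_value_df2(stat)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

Under these assumptions the P value rounds to 0.091, which agrees with the value reported for that row. Rows with missing data would need the per-cluster non-missing denominators rather than the full cluster sizes, and the genotype rows form a 3 x 3 table with 4 degrees of freedom, for which the closed-form shortcut used here does not apply.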