Background Treatment and preventative advances for chronic obstructive pulmonary disease (COPD) have been slow due, in part, to limited subphenotypes. We tested if unsupervised machine learning on CT images would discover CT emphysema subtypes with distinct characteristics, prognoses and genetic associations.
Methods New CT emphysema subtypes were identified by unsupervised machine learning on only the texture and location of emphysematous regions on CT scans from 2853 participants in the Subpopulations and Intermediate Outcome Measures in COPD Study (SPIROMICS), a COPD case–control study, followed by data reduction. Subtypes were compared with symptoms and physiology among 2949 participants in the population-based Multi-Ethnic Study of Atherosclerosis (MESA) Lung Study and with prognosis among 6658 MESA participants. Associations with genome-wide single-nucleotide polymorphisms were examined.
Results The algorithm discovered six reproducible (interlearner intraclass correlation coefficient, 0.91–1.00) CT emphysema subtypes. The most common subtype in SPIROMICS, the combined bronchitis-apical subtype, was associated with chronic bronchitis, accelerated lung function decline, hospitalisations, deaths, incident airflow limitation and a gene variant near DRD1, which is implicated in mucin hypersecretion (p=1.1×10⁻⁸). The second, the diffuse subtype, was associated with lower weight, respiratory hospitalisations and deaths, and incident airflow limitation. The third was associated with age only. The fourth and fifth visually resembled combined pulmonary fibrosis emphysema and had distinct symptoms, physiology, prognosis and genetic associations. The sixth visually resembled vanishing lung syndrome.
Conclusion Large-scale unsupervised machine learning on CT scans defined six reproducible, familiar CT emphysema subtypes that suggest paths to specific diagnosis and personalised therapies in COPD and pre-COPD.
- Imaging/CT MRI etc
- COPD epidemiology
Data availability statement
SPIROMICS and MESA data are available to the scientific community as described in the Acknowledgements section and on the study websites.
EDA and JY contributed equally.
AFL and RGB contributed equally.
Contributors EA, JY, WS, AL and RGB contributed to the machine learning; PB, YS, DC, JSK, DM and RGB contributed to the epidemiological analyses; EAH, NA, CSC, MTD, CKG, EH, MKH, NNH, DRJ, JDK, JL, FJM, EO, RP, MRP, WP, BS, KEW, PGW and RGB contributed to data collection or funding; AWM, ERB, RB, MC, SK, TL, VEO, TP, SSR and EKS contributed to the genomic analyses; JHMA and AJS provided radiologist interpretations; EA and JY drafted the manuscript; all authors contributed to revisions and provided final approval.
Funding This work was supported by NIH/NHLBI R01-HL121270, R01-HL077612, R01-HL093081, R01-HL142028, R01-HL130506, R01-HL131565, R01-HL103676 and T32-HL144442. MESA and the MESA SHARe project are conducted and supported by the National Heart, Lung and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts HHSN268201500003I, N01-HC-95159-69, UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1-TR-001881 and DK063491. Funding for SHARe genotyping was provided by NHLBI Contract N02-HL-64278. SPIROMICS was supported by contracts from NIH/NHLBI (HHSN268200900013C-20C), which were supplemented by contributions made through the Foundation for the NIH and COPD Foundation from AstraZeneca; Bellerophon Pharmaceuticals; Boehringer-Ingelheim Pharmaceuticals; Chiesi Farmaceutici SpA; Forest Research Institute; GSK; Grifols Therapeutics; Ikaria Nycomed; Takeda Pharmaceutical Company; Novartis Pharmaceuticals Corporation; Regeneron Pharmaceuticals and Sanofi. The COPD Gene Study was supported by NIH grants K12HL120004, R01HL113264, U01HL089856 and P01HL105339. The COPD Gene Study is also supported by the COPD Foundation through contributions made to an Industry Advisory Board comprised of AstraZeneca, Boehringer Ingelheim, GlaxoSmithKline, Novartis, Pfizer, Siemens and Sunovion.
Competing interests EDA, PPB, AM, YS, WS, JHMA, MHC, DC, EH, DRJ, SK, JDK, TL, JL, ECO, WP, MRP, SSR, EKS, KEW and AFL report receiving grants from the National Institutes of Health (NIH). JY performed the work at Columbia University but is now an employee of Google. EAH reports receiving grants from the NIH; being a founder and shareholder of VIDA Diagnostics; and holding patents for an apparatus for analysing CT images to determine the presence of pulmonary tissue pathology, an apparatus for image display and analysis, and a method for multiscale meshing of branching biological structures. EBA reports receiving grants from the American Heart Association and the NIH. CBC reports receiving personal fees from GlaxoSmithKline. MTD reports receiving a grant from the NHLBI and personal fees from AstraZeneca, GlaxoSmithKline, Pulmonx, PneumRx/BTG and Quark. MKH reports consulting for GlaxoSmithKline, AstraZeneca and Boehringer Ingelheim and receiving research support from Novartis and Sunovion. NNH reports receiving grants from the NIH, Boehringer Ingelheim, and the COPD Foundation. JDK reports receiving grants from the US Environmental Protection Agency and the NIH. FJM reports serving on COPD advisory boards for AstraZeneca, Boehringer Ingelheim, Chiesi, GlaxoSmithKline, Sunovion and Teva; serving as a consultant for ProterixBio and Verona; serving on the steering committees of studies sponsored by the NHLBI, AstraZeneca, and GlaxoSmithKline; and having served on data safety and monitoring boards of COPD studies supported by Genentech and GlaxoSmithKline. BMS reports receiving grants from the NIH, Canadian Institutes of Health Research (CIHR), Fonds de la recherche en santé du Québec (FRQS), the Research Institute of the McGill University Health Centre, the Quebec Lung Association and AstraZeneca. PGW reports receiving personal fees for consultancy from Theravance, AstraZeneca, Regeneron, Sanofi, Genentech, Roche and Janssen. RGB reports receiving grants from the COPD Foundation, the US Environmental Protection Agency (EPA), the American Lung Association and the NIH.
Provenance and peer review Not commissioned; externally peer reviewed.