Original article
Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts
  1. Peter J Castaldi (1, 2),
  2. Marta Benet (3, 4, 5),
  3. Hans Petersen (6),
  4. Nicholas Rafaels (7),
  5. James Finigan (8),
  6. Matteo Paoletti (9),
  7. H Marike Boezen (10),
  8. Judith M Vonk (10),
  9. Russell Bowler (8),
  10. Massimo Pistolesi (9),
  11. Milo A Puhan (11),
  12. Josep Anto (3, 5, 4, 12),
  13. Els Wauters (13, 14, 15),
  14. Diether Lambrechts (13, 14),
  15. Wim Janssens (15),
  16. Francesca Bigazzi (9),
  17. Gianna Camiciottoli (9),
  18. Michael H Cho (1, 16),
  19. Craig P Hersh (1, 16),
  20. Kathleen Barnes (7),
  21. Stephen Rennard (17, 18),
  22. Meher Preethi Boorgula (7),
  23. Jennifer Dy (19),
  24. Nadia N Hansel (20, 21),
  25. James D Crapo (8),
  26. Yohannes Tesfaigzi (6),
  27. Alvar Agusti (22),
  28. Edwin K Silverman (1, 17),
  29. Judith Garcia-Aymerich (3, 5, 4)
  1. Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, USA
  2. Division of General Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, USA
  3. ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
  4. Universitat Pompeu Fabra (UPF), Barcelona, Spain
  5. CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
  6. COPD Program, Lovelace Respiratory Research Institute, Albuquerque, New Mexico, USA
  7. Center for Biomedical Informatics and Personalized Medicine, University of Colorado Anschutz Medical Center, Aurora, Colorado, USA
  8. Department of Medicine, National Jewish Health, Denver, Colorado, USA
  9. Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
  10. Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
  11. Epidemiology, Biostatistics & Prevention Institute, University of Zurich, Zurich, Switzerland
  12. IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
  13. Vesalius Research Center (VRC), VIB, Leuven, Belgium
  14. Laboratory for Translational Genetics, Department of Oncology, KU Leuven, Leuven, Belgium
  15. Respiratory Division, University Hospital Gasthuisberg, KU Leuven, Leuven, Belgium
  16. Pulmonary and Critical Care Division, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
  17. Division of Pulmonary and Critical Care Medicine, University of Nebraska Medical Center, Omaha, Nebraska, USA
  18. Clinical Discovery Unit, AstraZeneca, Cambridge, UK
  19. Department of Computer Science, Northeastern University, Boston, Massachusetts, USA
  20. Department of Medicine, School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
  21. Department of Environmental Health Sciences, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
  22. Respiratory Institute, Hospital Clinic, University of Barcelona, IDIBAPS and CIBERES, Barcelona, Spain
Correspondence to Dr Peter J Castaldi, Channing Division of Network Medicine, 181 Longwood Ave, Boston, MA 02115, USA; peter.castaldi{at}


Background COPD is a heterogeneous disease, but there is little consensus on specific definitions for COPD subtypes. Unsupervised clustering offers the promise of ‘unbiased’ data-driven assessment of COPD heterogeneity. Multiple groups have identified COPD subtypes using cluster analysis, but there has been no systematic assessment of the reproducibility of these subtypes.

Objective We performed clustering analyses across 10 cohorts in North America and Europe in order to assess the reproducibility of (1) correlation patterns of key COPD-related clinical characteristics and (2) clustering results.

Methods We studied 17 146 individuals with COPD using identical methods and common COPD-related characteristics across cohorts (FEV1, FEV1/FVC, FVC, body mass index, Modified Medical Research Council score, asthma and cardiovascular comorbid disease). Correlation patterns between these clinical characteristics were assessed by principal components analysis (PCA). Cluster analysis was performed using k-medoids and hierarchical clustering, and concordance of clustering solutions was quantified with normalised mutual information (NMI), a metric that ranges from 0 to 1 with higher values indicating greater concordance.
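To make the concordance metric concrete, the following is a minimal sketch of how NMI between two clustering solutions for the same individuals could be computed. It is illustrative only and is not the authors' code. The abstract does not state which normalisation was used; this sketch assumes the mutual information is divided by the arithmetic mean of the two label entropies. The function names (`entropy`, `mutual_information`, `nmi`) are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same individuals."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    joint = Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log(c * n / (count_a[x] * count_b[y]))
    return mi


def nmi(a, b):
    """Normalised mutual information in [0, 1].

    ASSUMPTION: normalises by the arithmetic mean of the two entropies;
    other normalisations (min, max, geometric mean) are also in common use.
    """
    if len(a) != len(b):
        raise ValueError("both labelings must cover the same individuals")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        # Both solutions put everyone in a single cluster: treat as identical.
        return 1.0
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, mutual_information(a, b) / ((h_a + h_b) / 2))


# Two solutions that agree up to a relabelling of the clusters
same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])
# Two solutions that carry no information about each other
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Because NMI depends only on how individuals are grouped, not on the cluster names, the first pair scores at the top of the range even though the labels are swapped, while the second pair scores at the bottom.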

Results The reproducibility of COPD clustering subtypes across studies was modest (median NMI range 0.17–0.43). For methods that excluded individuals who did not clearly belong to any cluster, agreement was better but still suboptimal (median NMI range 0.32–0.60). Continuous representations of COPD clinical characteristics derived from PCA were much more consistent across studies.

Conclusions Identical clustering analyses across multiple COPD cohorts showed modest reproducibility. COPD heterogeneity is better characterised by continuous disease traits coexisting in varying degrees within the same individual, rather than by mutually exclusive COPD subtypes.

  • COPD epidemiology

  • Contributors Conception and design: PJC, JGA; acquisition, analysis and/or interpretation: PJC, MB, HP, JF, MP, HMB, JMV, MAP, EW, DL, WJ, MHC, KB, SR, MPB, JDC, YT, EKS; drafting the manuscript for important intellectual content: all authors.

  • Funding CLIPCOPD was funded by the Ministry of the University and the Ministry of Health of Italy. The COPDGene study (NCT00608764) was supported by Award Number R01HL089897 (JDC), R01HL089856 (EKS) and R01 HL075478 (EKS) from the National Heart, Lung, and Blood Institute. The COPDGene project is also supported by the COPD Foundation through contributions made to an Industry Advisory Board comprising AstraZeneca, Boehringer-Ingelheim, Novartis, Pfizer, Siemens and Sunovion. This work was supported by the US National Institutes of Health (NIH) grants R01 HL124233 and R01 HL126596 (PJC), R01 HL113264 and the Alpha-1 Foundation (MHC). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, and Blood Institute or the National Institutes of Health. The ECLIPSE study was funded by GSK (NCT00292552). The ICE COLD ERIC study was supported by the Swiss National Science Foundation (grant 3233B0/115216/1), Dutch Asthma Foundation (grant and Zurich Lung League (unrestricted grant). LifeLines has been funded by a number of public sources, notably the Dutch Government, The Netherlands Organization of Scientific Research NWO, the Northern Netherlands Collaboration of Provinces (SNN), the European fund for regional development, Dutch Ministry of Economic Affairs, Pieken in de Delta, Provinces of Groningen and Drenthe, the Target project, BBMRI-NL, the University of Groningen and the University Medical Center Groningen, The Netherlands. The Lovelace Smokers Cohort was funded by the State of New Mexico (appropriation from the Tobacco Settlement Fund) and by institutional funds. The Lung Health Study was supported by GENEVA (U01HG004738) and by contract NIH/N01-HR-46002. The NJH cohort was supported by National Jewish Health internal funds.
The PAC-COPD study was supported by grants from the Fondo de Investigación Sanitaria (grants PI020541, PI052486, PI052302 and PI060684), Ministry of Health, Madrid, Spain; the Agència d’Avaluació de Tecnologia i Recerca Mèdiques (grant 035/20/02), Catalonia Government, Barcelona, Spain; the Spanish Society of Pneumology and Thoracic Surgery (grant 2002/137); the Catalan Foundation of Pneumology (grant 2003 Beca Marià Ravà); the Red Respira (grant C03/11); the Red de Centros de Investigación Cooperativa en Epidemiología y Salud Pública (grant C03/09); the Fundació La Marató de TV3 (grant 041110) and Novartis Farmacèutica, Barcelona, Spain. The CIBERESP is funded by the Instituto de Salud Carlos III, Ministry of Health, Madrid, Spain.

  • Competing interests Over the past 3 years, PJC has received research support and consulting fees from GSK. Other authors have no competing interests to declare.

  • Ethics approval All participating institutional review boards.

  • Provenance and peer review Not commissioned; externally peer reviewed.
