Background COPD is a heterogeneous disease, but there is little consensus on specific definitions for COPD subtypes. Unsupervised clustering offers the promise of ‘unbiased’ data-driven assessment of COPD heterogeneity. Multiple groups have identified COPD subtypes using cluster analysis, but there has been no systematic assessment of the reproducibility of these subtypes.
Objective We performed clustering analyses across 10 cohorts in North America and Europe to assess the reproducibility of (1) correlation patterns of key COPD-related clinical characteristics and (2) clustering results.
Methods We studied 17 146 individuals with COPD using identical methods and common COPD-related characteristics across cohorts (FEV1, FEV1/FVC, FVC, body mass index, Modified Medical Research Council score, asthma and cardiovascular comorbid disease). Correlation patterns between these clinical characteristics were assessed by principal components analysis (PCA). Cluster analysis was performed using k-medoids and hierarchical clustering, and concordance of clustering solutions was quantified with normalised mutual information (NMI), a metric that ranges from 0 to 1 with higher values indicating greater concordance.
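As an illustration of the concordance metric described above, the following is a minimal sketch of how NMI might be computed for two cluster labelings of the same individuals. It is not the authors' code. The function name `normalized_mutual_information` is hypothetical, and normalising by the arithmetic mean of the two entropies is an assumption, since several normalisations are in common use and the abstract does not say which was applied.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two cluster labelings of the same individuals.

    Assumption: mutual information is divided by the arithmetic mean of
    the two label entropies; other normalisations exist.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same individuals")
    n = len(labels_a)
    if n == 0:
        raise ValueError("labelings must not be empty")

    count_a = Counter(labels_a)                   # cluster sizes in solution A
    count_b = Counter(labels_b)                   # cluster sizes in solution B
    count_ab = Counter(zip(labels_a, labels_b))   # joint cluster membership

    # Mutual information: sum over observed cluster pairs of
    # p(a, b) * log(p(a, b) / (p(a) * p(b))).
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        mi += (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))

    def entropy(counts):
        # Shannon entropy of one labeling, in nats.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    mean_entropy = (entropy(count_a) + entropy(count_b)) / 2
    if mean_entropy == 0:
        # Both solutions place everyone in a single cluster.
        return 1.0
    return mi / mean_entropy
```

Under this sketch, two solutions that define the same partition score 1 even when the cluster labels are permuted (for example `[0, 0, 1, 1]` against `[1, 1, 0, 0]`), and two statistically independent partitions (for example `[0, 0, 1, 1]` against `[0, 1, 0, 1]`) score 0.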
Results The reproducibility of COPD clustering subtypes across studies was modest (median NMI range 0.17–0.43). For methods that excluded individuals who did not clearly belong to any cluster, agreement was better but still suboptimal (median NMI range 0.32–0.60). Continuous representations of COPD clinical characteristics derived from PCA were much more consistent across studies.
Conclusions Identical clustering analyses across multiple COPD cohorts showed modest reproducibility. COPD heterogeneity is better characterised by continuous disease traits coexisting in varying degrees within the same individual, rather than by mutually exclusive COPD subtypes.
- COPD epidemiology
Contributors Conception and design: PJC, JGA; acquisition, analysis and/or interpretation: PJC, MB, HP, JF, MP, HMB, JMV, MAP, EW, DL, WJ, MHC, KB, SR, MPB, JDC, YT, EKS; drafting the manuscript for important intellectual content: all authors.
Funding CLIPCOPD was funded by the Ministry of the University and the Ministry of Health of Italy. The COPDGene study (NCT00608764) was supported by Award Numbers R01HL089897 (JDC), R01HL089856 (EKS) and R01 HL075478 (EKS) from the National Heart, Lung, and Blood Institute. The COPDGene project is also supported by the COPD Foundation through contributions made to an Industry Advisory Board comprising AstraZeneca, Boehringer-Ingelheim, Novartis, Pfizer, Siemens and Sunovion. This work was supported by US National Institutes of Health (NIH) grants R01 HL124233 and R01 HL126596 (PJC), and by R01 HL113264 and the Alpha-1 Foundation (MHC). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, and Blood Institute or the National Institutes of Health. The ECLIPSE study was funded by GSK (NCT00292552). The ICE COLD ERIC study was supported by the Swiss National Science Foundation (grant 3233B0/115216/1), Dutch Asthma Foundation (grant 3.4.07.045) and Zurich Lung League (unrestricted grant). LifeLines has been funded by a number of public sources, notably the Dutch Government, The Netherlands Organization of Scientific Research (NWO), the Northern Netherlands Collaboration of Provinces (SNN), the European fund for regional development, Dutch Ministry of Economic Affairs, Pieken in de Delta, Provinces of Groningen and Drenthe, the Target project, BBMRI-NL, the University of Groningen and the University Medical Center Groningen, The Netherlands. The Lovelace Smokers Cohort was funded by the State of New Mexico (appropriation from the Tobacco Settlement Fund) and by institutional funds. The Lung Health Study was supported by GENEVA (U01HG004738) and by contract NIH/N01-HR-46002. The NJH cohort was supported by National Jewish Health internal funds.
The PAC-COPD study was supported by grants from the Fondo de Investigación Sanitaria (grants PI020541, PI052486, PI052302 and PI060684), Ministry of Health, Madrid, Spain; the Agència d’Avaluació de Tecnologia i Recerca Mèdiques (grant 035/20/02), Catalonia Government, Barcelona, Spain; the Spanish Society of Pneumology and Thoracic Surgery (grant 2002/137); the Catalan Foundation of Pneumology (grant 2003 Beca Marià Ravà); the Red Respira (grant C03/11); the Red de Centros de Investigación Cooperativa en Epidemiología y Salud Pública (grant C03/09); the Fundació La Marató de TV3 (grant 041110) and Novartis Farmacèutica, Barcelona, Spain. The CIBERESP is funded by the Instituto de Salud Carlos III, Ministry of Health, Madrid, Spain.
Competing interests Over the past 3 years, PJC has received research support and consulting fees from GSK. Other authors have no competing interests to declare.
Ethics approval Approved by all participating institutional review boards.
Provenance and peer review Not commissioned; externally peer reviewed.