Article Text

Original research
Bronchial gene expression signature associated with rate of subsequent FEV1 decline in individuals with and at risk of COPD
  1. Elizabeth J Becker1,2,
  2. Alen Faiz3,4,
  3. Maarten van den Berge4,
  4. Wim Timens5,
  5. Pieter S Hiemstra6,
  6. Kristopher Clark7,
  7. Gang Liu1,
  8. Xiaohui Xiao1,
  9. Yuriy O Alekseyev8,
  10. George O'Connor9,
  11. Stephen Lam10,
  12. Avrum Spira1,2,8,
  13. Marc E Lenburg1,2,8,
  14. Katrina Steiling1,2,9
  1. 1 Division of Computational Biomedicine, Boston University School of Medicine, Boston, Massachusetts, USA
  2. 2 Bioinformatics Program, Boston University, Boston, Massachusetts, USA
  3. 3 Respiratory Bioinformatics and Molecular Biology (RBMB), School of Life Sciences, University of Technology Sydney, Sydney, New South Wales, Australia
  4. 4 Department of Pulmonary Diseases, University Medical Center Groningen, Groningen, The Netherlands
  5. 5 Department of Pathology and Medical Biology, University Medical Centre Groningen, Groningen, The Netherlands
  6. 6 Department of Pulmonology, Leiden University Medical Center, Leiden, The Netherlands
  7. 7 Internal Medicine Residency Program, Boston Medical Center, Boston, Massachusetts, USA
  8. 8 Department of Pathology and Laboratory Medicine, Boston University School of Medicine, Boston, Massachusetts, USA
  9. 9 Division of Pulmonary, Allergy, Sleep, and Critical Care Medicine, Boston University School of Medicine, Boston, Massachusetts, USA
  10. 10 British Columbia Cancer Agency, Vancouver, British Columbia, Canada
  1. Correspondence to Dr Katrina Steiling, Division of Computational Biomedicine, Boston University School of Medicine, Boston, MA 02118, USA; steiling{at}bu.edu

Abstract

Background COPD is characterised by progressive lung function decline. Leveraging prior work demonstrating bronchial airway COPD-associated gene expression alterations, we sought to determine if there are alterations associated with differences in the rate of FEV1 decline.

Methods We examined gene expression among ever smokers with and without COPD who at baseline had bronchial brushings profiled by Affymetrix microarrays and had longitudinal lung function measurements (n=134; mean follow-up=6.38±2.48 years). Gene expression profiles associated with the rate of FEV1 decline were identified by linear modelling.

Results Expression differences in 171 genes were associated with rate of FEV1 decline (false discovery rate <0.05). The FEV1 decline signature was replicated in an independent dataset of bronchial biopsies from patients with COPD (n=46; p=0.018; mean follow-up=6.76±1.32 years). Genes elevated in individuals with more rapid FEV1 decline are significantly enriched among the genes altered by modulation of XBP1 in two independent datasets (Gene Set Enrichment Analysis (GSEA) p<0.05) and are enriched in mucin-related genes (GSEA p<0.05).

Conclusion We have identified and replicated an airway gene expression signature associated with the rate of FEV1 decline. Aspects of this signature are related to increased expression of XBP1-regulated genes, a transcription factor involved in the unfolded protein response, and genes related to mucin production. Collectively, these data suggest that molecular processes related to the rate of FEV1 decline can be detected in airway epithelium, identify a possible indicator of FEV1 decline and make it possible to detect, in an early phase, ever smokers with and without COPD most at risk of rapid FEV1 decline.

  • respiratory measurement
  • airway epithelium

Data availability statement

Data are available in a public, open access repository. The data are available on GEO as GSE37147.

Statistics from Altmetric.com

Request Permissions

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.

Key messages

What is the key question?

  • Can gene expression profiling of the airway epithelium be used to identify molecular processes associated with the rate of FEV1 decline?

What is the bottom line?

  • An airway gene expression signature associated with the rate of subsequent decline in FEV1 is identified and replicated. This signature is enriched for genes that are regulated by XBP1, a key transcription factor involved in the unfolded protein response.

Why read on?

  • The present study highlights that molecular processes associated with the rate of FEV1 decline can be detected by bronchial epithelial gene expression profiles. This work identifies a possible indicator of FEV1 decline and demonstrates the potential for bronchial gene expression to serve as an intermediate endpoint for studying the rate of FEV1 decline.

Introduction

COPD is the third leading cause of death in the world.1 In 2016, 3 million people died of COPD, which accounted for 6% of all deaths globally.1 Accelerated lung function decline is considered a feature of COPD and is most commonly measured by change in FEV1. Lower FEV1 is associated with an increased risk of death,2 and even smokers who do not yet meet the clinical definition of COPD may experience more rapid FEV1 decline.3 The rate of FEV1 decline is highly variable between individuals.4 Though some risk factors for rapid FEV1 decline have been identified, such as cigarette smoking,5 higher blood neutrophil counts,6 albuminuria7 and alpha 1-antitrypsin deficiency,8 these do not fully explain the heterogeneity in COPD and have not yet been useful in predicting FEV1 decline for individual patients. The ability to predict FEV1 decline would enable clinicians to stratify at-risk patients towards more aggressive management. It might also facilitate clinical trials of therapies to modify the natural history of COPD, specifically targeting individuals more likely to experience greater decline in FEV1. Finally, it could lead to further indications for finding therapeutic targets to slow disease progression.

Previous studies have demonstrated that bronchial epithelial gene expression is altered both by cigarette smoking and in diseases associated with cigarette smoking.9–11 We have previously described the Steiling et al 10 bronchial airway gene expression signature of COPD and disease severity as measured by FEV1. Importantly, these gene expression alterations in the more proximal airway were similar to disease-associated changes present in more distal diseased lung tissue, suggesting that bronchial airway gene expression in COPD can be used to study factors in its pathobiology.

Based on these observations, we hypothesised that bronchial airway epithelial gene expression might reflect molecular processes associated with accelerated FEV1 decline. In this study, we identified and replicated a baseline gene expression signature associated with the rate of FEV1 decline observed during subsequent follow-up. We found this signature to be significantly enriched for genes with binding sites for the transcription factor encoded by XBP1, which is involved in the unfolded protein response (UPR) to endoplasmic reticulum (ER) stress.

Some of the results reported here have been previously published in abstract form.12–15

Methods

Primary dataset and longitudinal FEV1

The individuals included in the primary dataset were recruited as part of the British Columbia Lung Health Study16 and Pan-Canadian Lung Health Study. Additional information about the study population can be found in the supplement. We previously profiled 267 bronchial airway brushings obtained from current and former smokers in this cohort using Affymetrix Human Gene 1.0 ST Arrays.10 In the current study, we used this existing gene expression data together with spirometry data that were collected during longitudinal follow-up subsequent to bronchoscopy. FEV1, FEV1% predicted and FVC were measured using a flow-sensitive spirometer. The ratio between FEV1 and FVC were used to determine COPD status as previously described in Steiling et al. 10 We excluded samples from individuals who did not have a spirometry recording within 1 year of their bronchoscopy (n=8), did not have at least two spirometry measurements at least 4 years apart (n=104) or who developed cancer (n=19). As previously reported, two samples were excluded due to sample labelling errors.10 Data from the remaining 134 current and former smokers were included in the analysis. Because spirometry was not performed at regular intervals, the rate of FEV1 decline (∆FEV1) for each study participant was estimated using linear regression with all available spirometry measurements from that individual subsequent bronchoscopy. The relationship between the rate of FEV1 decline and other clinical variables was evaluated by analysis of variance (ANOVA) of linear models.

Identification of a rate of FEV1 decline gene expression signature

Genes associated with the future rate of FEV1 decline were identified using the following linear models calculated using the lm function and the anova function using R statistical software V.3.4.017 and RStudio V.1.0.143.18

Embedded Image (1)

Embedded Image (2)

where ge is the expression level of a single gene; age is the age at the time of bronchoscopy, pack years is the calculated cumulative cigarette smoke exposure at the time of bronchoscopy and smoke status is the smoking status at the time of bronchoscopy (participants were considered former smokers if they had quit for at least a year). Baseline FEV1 is the FEV1 within 1 year of bronchoscopy. The rate of FEV1 decline (∆FEV1) is calculated as described above. ε is an error term. The false discovery rate (FDR) was calculated from the ANOVA p values.19 Genes with FDR <0.05 were considered to be associated with the rate of FEV1 decline and included in the signature. The signature was divided into genes that are increased or decreased with more rapid FEV1 decline by hierarchical clustering which segregated genes according to the sign of the linear model coefficient. The rate of FEV1 decline signature was compared with the previously published Steiling et al 10 airway gene expression signature of COPD severity by determining the number of genes overlapping between the signatures, and by using Gene Set Enrichment Analysis (GSEA).20

To evaluate the extent to which the FEV1 decline signature is a reflection of other factors that correlate with COPD, we first summarised the expression of the FEV1 decline signature per sample as the sample loading on the first principal component which we refer to as the FEV1 decline signature. We then tested the association between the signature score and the rate of FEV1 decline in several subsets of the data. A linear model adjusting for age, sex, smoking status, pack years and baseline FEV1 was performed to test the association between the FEV1 decline signature score and the rate of lung function decline in current smokers who continued to be current smokers throughout the follow-up period, former smokers, individuals with COPD, individuals without COPD and individuals who were not using inhaled medications.

Replication of the gene expression signature of rate of FEV1 decline in GLUCOLD

We investigated the association between the expression of genes in the rate of FEV1 decline signature and the observed rate of FEV1 decline in a previously published independent dataset of individuals with COPD who were enrolled in the GLUCOLD trial, a placebo controlled randomised double-blind clinical trial of fluticasone with or without salmeterol21 22 (GSE36221). Briefly, these participants underwent bronchoscopy with endobronchial biopsy followed by spirometry every 3 months during the 2.5-year trial. After the 2.5-year drug treatment trial, participants performed spirometry every year up to a total of 7.5 years (mean=6.91). The rate of FEV1 decline was estimated by the coefficient from a linear model for each individual using their baseline spirometry measurement (t=year 0), excluding their time on treatment (t=0.25 to t=2.5 years) and including measurements from 3.5 years forward, to control for treatment effect. In the GLUCOLD participants, the gene expression signature associated with subsequent FEV1 decline was calculated using principal component analysis (PCA). First, the eigenvector for the first principal component of the signature genes in the z-score normalised discovery set was calculated using the prcomp function in R. A summarised signature score for each sample in the discovery set and the GLUCOLD dataset was then calculated from the eigenvector and the z-score normalised expression data using the predict method of prcomp. The relationship between summarised signature score at baseline and the rate of FEV1 decline was evaluated using the linear model and ANOVA strategy outlined above for the gene expression analysis.

Identification of enriched biological pathways

To identify transcription factors enriched in the FEV1 decline signature, we used the Molecular Signature Database (MSigDB)20 to search computationally derived datasets of transcription factor binding sites. We divided the genes into two clusters when searching: genes that increase with worse FEV1 decline and genes that decrease with worse FEV1 decline. For each cluster of genes, we identified transcription factor binding sites with an FDR <0.05. The transcription factor binding sites were based on Xie et al’s work23 and TRANSFAC V.7.4.24

The transcription factor XBP1, which was identified by the above method, was selected for in silico validation because it has previously been implicated in COPD25 and because it is a well-studied transcription factor with several publicly available knockout and overexpression datasets. The Gene Expression Omnibus (GEO) was searched using the key terms “XBP1 knock out” and “XBP1 overexpression” to identify potentially useful datasets. This identified 38 datasets.

After searching GEO to identify publicly available datasets investigating the gene expression effects of modulating XBP1 activity, we identified two datasets which we explored further: a dataset examining the effects of XBP1 overexpression in mouse adipocytes (GSE46178)26 and a dataset examining the effects of XBP1 knockout in mouse hepatocytes (GSE64824).27 Using t-statistics from a linear model, we ranked genes by their change in expression following XBP1 overexpression in mouse adipocytes (n controls=4, n overexpression=4) using data from Affymetrix Mouse Genome 430A 2.0 Arrays. For the XBP1 knockout study, we ranked genes according to their change in expression between wild-type (WT) and XBP1-knockout hepatocytes by subtracting the gene expression of the controls (n=2) from the knockout hepatocytes (n=2) which had been profiled by RNA-seq using an Illumina HiSeq2000.27 Before sequencing, the replicates were pooled. We explored the distribution of the two gene clusters from the rate of FEV1 decline signature in these ranked lists using GSEA.20 Importantly, neither of these in silico datasets were included in the TRANSFAC data.28–31

Because previous work has also identified association of a T helper type 2 cell (Th2) signature in a subset of individuals with COPD,32 we sought to evaluate the association of this Th2 signature with the FEV1 decline signature. We used a three-gene Th2 signature derived in patients with asthma,33 which included POSTN, SERPINB2 and CLCA1. Expression levels of these genes were z-score normalised, and the first principal component was computed to summarise their expression. The association between the Th2 score and FEV1 decline was assessed using a linear model controlling for age, sex, smoking status, pack years and baseline FEV1. We also tested the association between the first principal component of the FEV1 decline signature and the Th2 score.

To determine if genes related to mucus hypersecretion were associated with FEV1 decline, we used GSEA to evaluate the enrichment of two mucin-related gene sets among a ranked list of all genes ranked by association with FEV1 decline. The gene sets used included the mucin type O-glycan biosynthesis gene set from KEGG 2019,34 and the mucin granule gene set from Jensen Compartments.35 In order to determine if there was an association between ER stress and mucin hypersecretion, Gene Set Variation Analysis36 scores were calculated for the XPB1 target gene set, the mucin-related gene sets and the FEV1 signature score. Linear models were used to test pairwise associations between the three scores.

To determine if genes previously identified in genome-wide association studies (GWAS) of COPD were enriched in the FEV1 decline signature, we used GSEA to evaluate the enrichment of a gene set comprised of COPD GWAS genes among the ranked list of all genes ordered by association with FEV1 decline. The gene set of COPD GWAS genes was derived from a review of GWAS in COPD.37

Results

Participant demographics

A total of 134 current and former smokers with (n=49) and without COPD (n=85) were included in this analysis. Clinical and demographic characteristics of the study cohort are provided in table 1. Demographic characteristics separated by GOLD status are available in online supplemental table 1. The correlations between demographic variables are listed in online supplemental table 2.

Supplemental material

Table 1

Characteristics of the study participants

The average baseline FEV1 for participants with COPD was significantly lower than in participants without COPD. The rate of FEV1 decline is significantly higher in individuals with lower baseline FEV1 and/or COPD (p<0.05) (online supplemental table 1). Only 5 of the 59 current smokers quit smoking during follow-up. The initial spirometry measurements were performed within 1 year of bronchoscopy, with 85% performed within 6 months of bronchoscopy and 70% within 90 days (online supplemental figure 1 and online supplemental table 3).

Bronchial airway gene expression signature of FEV1 rate of decline

The expression levels of 171 genes were significantly associated with the rate FEV1 decline (FDR <0.05) (figure 1 and online supplemental table 4). A total of 120 genes had higher expression in individuals with faster FEV1 decline (cluster 1), while 51 genes had lower expression in individuals with faster FEV1 decline (cluster 2).

Figure 1

Heatmap of 171 genes associated with change in FEV1. One hundred and seventy-one genes were associated with the change in the rate of FEV1 decline using a linear model controlling for age, sex, smoking status, pack years and baseline FEV1 (FDR <0.05). The participants (columns) are arranged from slowest FEV1 decline (white) to most rapid FEV1 decline (black). These genes were grouped into two clusters based on unsupervised hierarchical clustering. Cluster 1 consists entirely of genes expressed at higher levels in patients with more rapid FEV1 decline, and cluster 2 consists entirely of genes expressed at lower levels in patients with more rapid FEV1 decline. FDR, false discovery rate.

We next wanted to determine the sensitivity of the results we obtained to using individuals with 4 or more years of longitudinal spirometry. Therefore, we performed the same analysis using individuals with a shorter duration of follow-up spirometry to determine if the increase in the number of individuals in the analysis resulting from requiring less follow-up would outweigh the potentially less accurate estimates of the rate of FEV1 decline. No genes were significantly associated with FEV1 decline when we required a minimum of 2 years or 3 years of spirometry follow-up. We compared the t-statistics obtained with the 4-year estimate of FEV1 decline and the shorter estimates for the 171 genes in the FEV1 decline signature. We did not detect a significant correlation of the t-statistics using the estimates of FEV1 decline based on 4 years and 2 years of follow-up (p=0.369), but did identify a significant association between the t-statistics obtained using the 4-year and 3-year estimates of FEV1 decline (p=2.0×10−16; online supplemental figure 2).

We also explored the sensitivity of our results to subsetting the cohort based on potential confounders. Toward this end, we summarised the expression of the genes associated with FEV1 decline per sample as the first principal component from PCA and termed this value the FEV1 decline signature score. Not surprisingly, using all samples, there is a strong association between the FEV1 decline signature score and FEV1 decline across the entire cohort (p=1.73×10−9; figure 2A). We also observed significant associations between the FEV1 decline signature score and FEV1 decline in current smokers, smokers who remained current smokers during follow-up, former smokers, individuals with COPD, individuals without COPD and individuals not on inhaled medication (figure 2). We also found that the association between the FEV1 decline signature score and FEV1 decline in former smokers remained even after correcting for the years of smoking cessation. Similarly, the association of the FEV1 signature score and FEV1 decline remained after correcting for time interval between spirometry and bronchoscopy.

Figure 2

Sensitivity analysis of the FEV1 decline signature. The expression levels of the 171 genes associated with FEV1 decline were summarised into a single value using the eigenvector for the first principal component in the discovery dataset. This summarised expression is termed the FEV1 decline signature score. The scatter plot and best-fit line of each panel show the association between the FEV1 decline signature score and the change in FEV1 in various subsets of the dataset. (A) All the individuals included in the training analysis, (B) only former smokers, (C) only current smokers, (D) only individuals not on inhaled medications, (E) only individuals without COPD, (F) only individuals with COPD.

We next examined the overlap between the longitudinal FEV1 decline signature and the Steiling et al bronchial airway 98-gene expression signature of COPD severity we had previously identified in a cross-sectional analysis.10 A total of 10 genes were shared by the two signatures. Nine genes that were increased in association with worse FEV1 decline were also increased in COPD. One gene that was decreased with more rapid FEV1 decline was also decreased in COPD airway (online supplemental table 4). Next, GSEA was also used to evaluate for enrichment between these signatures. There was significant concordant enrichment of the 171 gene signatures associated with rate of FEV1 decline and COPD status ranked list (positive FDR q<0.0001, negative FDR q<0.0001).

Replication of the airway gene expression signature of rate of FEV1 decline in the GLUCOLD trial

We next sought to determine whether the rate of FEV1 decline signature is significantly associated with rate of FEV1 change in an independent dataset using the signature score derived from PCA. Demographics of this replication cohort are listed in online supplemental table 5. In the discovery dataset, higher signature scores are associated with a more rapid decrease in FEV1 (p=1.73×10−9; figure 3A). Signature scores generated in an independent dataset of individuals with COPD who were enrolled in a placebo controlled study of inhaled fluticasone±salmeterol21 showed that the scores are significantly associated with future FEV1 decline in the independent dataset (p=0.018; figure 3B). These findings suggest that the airway gene expression signature for the rate of decline in FEV1 is similarly associated with rate of FEV1 decline in an independent dataset of participants with COPD.

Figure 3

Airway gene expression signature associated with rate of FEV1 decline replicates in an independent dataset of patients with COPD. The principal component 1 eigenvector used to generate the FEV1 decline signature score in the training data was also used to generate signature scores from baseline bronchial biopsies of patients with COPD who were followed for subsequent change in FEV1. The signature scores in the discovery dataset (A) and the independent dataset (B) are each significantly correlated with the rate of FEV1 decline (p=1.73×10−9 and p=0.018, respectively).

Enrichment of transcription factor binding sites in the FEV1 rate of decline signature

To explore the potential regulators of the genes associated with the rate of FEV1 decline, we queried MSigDB to identify transcription factors whose predicted binding sites are over-represented among the signature genes. Genes whose expression levels are increased in individuals with more rapid FEV1 decline are enriched for genes with binding sites for XBP1 (FDR q=0.0302) among other transcription factors. A full list of all the MSigDB results can be found in online supplemental table 6.

We further investigated XBP1 due to its role in the UPR which has been implicated in COPD,25 and we are able to identify publicly available datasets profiling the gene expression effects of modulating XBP1 activity. Using GEO, we identified a dataset examining the effect of XBP1 overexpression in mouse adipocytes (GSE46178)26 and a dataset examining the effect of XBP1 knockout in mouse hepatocytes (GSE64824)27 that we used in further analysis.

We ranked the genes that change with XBP1 overexpression in mouse adipocytes (n=4 controls, n=4 XBP1 overexpression) and used GSEA to examine the distribution of genes in the FEV1 decline signature in this ranked list. Genes whose expression is increased in individuals with more rapid decline in FEV1 are enriched among the genes that are induced by XBP1 overexpression (p<0.0001) (figure 4A). We next selected the subset of genes contributing the most to this significant enrichment, which are also known as the leading edge genes. We used these leading edge genes to plot a heatmap across the human bronchial airway gene expression data and the mouse adipocyte data. The leading edge genes that increased with more rapid FEV1 decline were also increased with XBP1 overexpression (figure 4A).

Figure 4

Genes increased in individuals with faster FEV1 decline are among the genes most induced by XBP1 overexpression and genes increased in individuals with faster FEV1 decline are among the genes most decreased by XBP1 knockout. (A) Genes that increase in individuals with more rapid FEV1 decline are significantly enriched among the genes that are most induced by XBP1 overexpression in murine adipocytes (GSEA p<0.0001). The vertical lines are the position of the genes with increased expression in individuals with more rapid FEV1 decline in a list of all genes ranked from most induced by XBP1 overexpression to most repressed. The height of the vertical line represents the running enrichment score, and the lines highlighted in red represent the leading edge. (B) Genes that increase in individuals with more rapid FEV1 decline are significantly enriched among the genes that are most repressed in murine hepatocytes deleted for XBP1 (GSEA p=0.025). We previously found the gene highlighted in green to have increased expression in bronchial epithelium from individuals with COPD.10 The gene highlighted in purple is involved in mucin type O-glycan biosynthesis. The genes in black are predicted targets of XBP1. GSEA, Gene Set Enrichment Analysis.

We also created a ranked list of genes based on expression changes in XBP1-knockout versus WT-control hepatocytes from 16-week-old mice (n=2 controls, n=2 XBP1 knockout).26 Genes whose expression is increased in individuals with more rapid FEV1 decline are enriched among the genes that decreased in XBP1 knockout (p=0.025; figure 4B). There were 22 leading edge genes from the XBP1 overexpression analysis and 35 in the knockout analysis. Fifteen of these genes overlapped between the two sets. We also compared the leading edge genes to the signature genes that were in the XBP1 transcription binding site list (n=133). Of the four genes that overlap between the FEV1 decline signature and the XBP1 transcription factor binding site gene set, three of them were also in the leading edge of the analyses (GALE, SEC61A1 and ARMCX3). Together, these data suggest that XBP1-regulated genes are among the genes with increased expression in bronchial epithelial cells of individuals with more rapid FEV1 decline.

In addition to investigating the potential role of XBP1 in the FEV1 decline signature, we used a similar approach to evaluate the potential role of other significantly enriched transcription factors. We focused on ATF6, SP1 and FOS/JUN because they have been previously implicated in COPD, and we found datasets examining the gene expression effects of modulating these transcription factors (GSE124797, GSE87298, GSE37935, GSE31628, GSE97226, GSE7742). We examined the perturbation of genes either increased or decreased with more rapid FEV1 decline among the genes whose expression was most altered by transcription factor perturbation and found no significant association in the datasets we examined which modulated ATF6, SP1, c-FOS or JNK (data not shown).

Exploration of other biological pathways

Given the possibility that elevated XBP1 activity might be due to ER stress resulting from elevated protein secretion, we next sought to determine if the FEV1 decline signature might be associated with genes involved in mucin production and secretion, given that airway mucus hypersecretion is a well-established feature of COPD.37 We evaluated two curated gene sets consisting of genes involved in mucin biosynthesis, including the KEGG 2019 mucin type O-glycan biosynthesis pathway35 and the Jensen Compartments mucin granule gene set.34 We identified significant enrichment of both gene sets among the genes expressed more highly in individuals with more rapid FEV1 decline (KEGG 2019 FDR q<1×10−4; Jensen FDR q=0.008). The leading edge genes contributing most to the enrichment of these gene sets included several polypeptide N-acetylglucosaminyltransferase genes, several mucin genes and carcinoembryonic antigen cell adhesion molecule 5 (online supplemental table 7). We also examined the potential associations between ER stress (using the expression of predicted targets of XBP1 as a surrogate of ER stress-driven XBP1 activity), mucus hypersecretion and the signature of FEV1 decline. We found that the association between the summarised expression of the mucin gene sets and the summarised expression of the predicted targets of XBP1 is less pronounced (p=0.00234) than the associations between the summarised expression of the predicted targets of XBP1 or the summarised expression of the mucin-related genes and the FEV1 decline signature (p=2.0×10−16 and p=4.82×10−12, respectively).

We also examined the potential association between markers of Th2 inflammation and FEV1 decline based on previous work describing an association of a Th2 signature in a subset of individuals with COPD and without a clinical history of asthma.32 Using a three-gene Th2 score that had previously been developed in patients with asthma,33 we found a significant association between Th2 inflammation and rate of change in FEV1 when controlling for age, sex, smoking status, pack years and baseline FEV1 (p=0.045). We also found a significant correlation between the FEV1 decline signature and the three-gene Th2 score (r2=0.1187; p=2.71×10−5).

We were also interested whether the genes in the FEV1 decline signature included genes previously identified in GWAS of COPD. We found that a gene set composed of genes previously identified in COPD GWAS38 was significantly enriched among genes that are decreased in individuals with more rapid FEV1 decline (GSEA p=0.035). Genes in the leading edge of this analysis included ARMC2, DLG2, THSD4, KLHL7, CCDC101, ANKH and CHRNA3.

Discussion

We have identified gene expression differences at baseline in bronchial epithelium from current and former smokers that are associated with the subsequent rate of change in FEV1. We have replicated the airway gene expression signature of FEV1 decline in an independent dataset of participants with COPD. There is significant enrichment of the FEV1 decline signature among genes ranked according to the presence or severity of COPD, suggesting that the FEV1 decline signature is related to the Steiling et al 10 airway gene expression signature of COPD. Interestingly, we found that a subset of the airway gene expression changes associated with more rapid FEV1 decline may be in part explained by increased activity of the transcription factor XBP1 and mucus hypersecretion. There are also significant associations between the FEV1 decline signature and a signature of Th2 inflammation.

The replication of the gene expression signature of FEV1 decline in endobronchial biopsies from participants with COPD in the GLUCOLD trial is notable in two regards. First is the replication of the signature in a different sample type (bronchial biopsies vs brushes). Second, as GLUCOLD is comprised of only patients with COPD, replication in this cohort suggests that the signature can be relevant to COPD progression. Further studies should be conducted to determine whether the gene expression signature of FEV1 decline can be used to predict which individuals are at risk of more rapid disease development or progression.

The transcription factor binding sites enriched in the rate of FEV1 decline signature support protein response39 is hypothesised to play a role in the development of COPD.25 40–42 Bronchial airway expression of XBP1 targets is increased in advance of FEV1 decline in our data, even when controlled for smoking status.43 XBP1 has been previously shown to increase in human bronchial epithelial cells derived from patients with COPD compared with cells derived from smokers without COPD and non-smokers.44 Elevated XBP1 increases the expression of cytokines (interleukin 8 (IL-8) and IL-1β) and Th1 chemokines.25 These cytokines have been shown to be elevated in COPD.45

XBP1’s potential role to contribute to the gene expression changes associated with more rapid FEV1 decline is interesting given the role of XBP1 in the UPR to ER stress.25 41 42 46 The UPR is triggered by protein misfolding and/or ER overload, that may result from oxidative stress caused by cigarette smoking. The UPR is mediated by three families of signal transducers, including the IRE1/XBP1 pathway.47 XBP1 binds to ER stress elements and increases the transcription of chaperone proteins that assist in protein folding and reducing protein synthesis.39 Both ER stress and XBP1 expression levels are increased by cigarette smoking.40 41 Dysregulation of proteostasis and the UPR have previously been described in patients with smoking-associated COPD and have been implicated in COPD progression.48 While potential reducers of XBP1 such as IRE1α inhibitors49 and a synthetic analogue of TRPC150 have previously been investigated in the setting of ER stress in cancer, this is the first study to our knowledge to identify XBP1 as a potential regulator of gene expression changes associated with the rate of FEV1 decline. We have found that gene expression consequences of XBP1 perturbation in a cell line and a mouse model recapitulate components of the FEV1 decline signature, supporting a potential regulatory role for XBP1 in the processes that contribute to the rate of FEV1 decline. However, further studies are needed to evaluate the effects of perturbing XBP1 in human airway epithelial cells, and the possible role of XBP1 in COPD pathogenesis and disease progression.

We also found the expression of several mucin-related genes (GALNT4, GALNT5, GALNT12, MUC2) to be increased in association with more rapid FEV1 decline.51 52 The observation of increased mucin-associated gene expression with more rapid FEV1 decline is further supported by the enrichment of two mucin-related gene sets34 35 among genes increased with more rapid FEV1 decline. Further studies are needed to determine if the apparent increase in XBP1 activity associated with more rapid FEV1 decline might be a consequence of increased mucin production. We also identified enrichment of COPD GWAS genes among genes whose airway expression decreases with more rapid FEV1 decline. These genes include CCDC101, also known as SGF29, which is a subunit of a histone acetyltransferase complex; THSD4, which has metalloendopeptidase activity; and CHRNA3, which is a member of the nicotinic acetylcholine receptor family of proteins.53–55 Together, these findings support the biological relevance of the FEV1 decline signature.

We have also identified a significant association between a Th2 score and the FEV1 decline signature in current and former smokers with and without COPD. Th2 cells mediate the inflammatory response that drives a subtype of asthma which is associated with a more favourable response to corticosteroids.33 While Th2 inflammation is traditionally associated with asthma, previous work has shown that a subset of patients with COPD and without a clinical history of asthma have increased expression of Th2-associated genes.32 This suggests that a similar process leads to airflow obstruction in asthma, and in a subgroup of patients with COPD without a clinical history of asthma. In this study, we show that expression of Th2-associated genes is also associated with more rapid decline in FEV1. This finding lends additional support to the hypothesis that a subgroup of patients with COPD have more ‘asthma-like’ molecular features which also place them at risk of faster decline in FEV1.

Though the data described here support a replicable gene expression signature of FEV1 decline, there are limitations to the cohort and samples from which it was derived. First, the signature was derived in an older population of individuals with and at risk of COPD and may not represent a wider spectrum of disease or normal airway biology. Furthermore, it is possible that FEV1 decline may be quicker in the earlier stages of COPD, and that our FEV1 decline signature reflects disease severity (despite our having controlled for FEV1 in the analysis), or that factors other than those represented by our gene expression signature might influence the rate of FEV1 decline at other points on the disease-severity spectrum. Second, the signature was derived from a cohort that was initially recruited as part of cancer-related studies. As a result, details about some COPD-related traits are unavailable, and the frequency of spirometry was variable. We found that a minimum of 4 years of follow-up spirometry was required to detect gene expression differences associated with the rate of FEV1 decline, presumably because of variability in the measurement of FEV1, and this may have limited our study power, especially in subgroup analyses. For example, we were unable to evaluate the effect of smoking cessation on the signature due to the low number of individuals who quit smoking during follow-up, or gene expression differences associated with the rate of change of other lung function metrics. Similarly, we were unable to evaluate the association of bronchodilator responsiveness, symptoms or exacerbations as these data were not available for the cohort. Race is known to impact FEV1 decline,56 but we were unable to evaluate the effect of race on FEV1 decline as the majority of samples were obtained from individuals who were white which may limit the generalisability of the signature. Additionally, while bronchoscopy brushings are composed of predominantly epithelial cells,21 57 we do not know the exact cell types that lead to the observed gene expression patterns, and variations in the cellular composition of the bronchial airway epithelium might contribute to the FEV1 decline signature. Finally, the signature was also derived using data from microarrays generated as part of a previous study. Gene expression profiling by RNA-seq could potentially improve signal-to-noise characteristics and allow for the identification of novel transcripts.

While we have derived a signature of FEV1 decline from smokers with and at risk of COPD and validated these findings in an independent cohort of individuals with COPD, our analysis indicates that there are individuals who have a pattern of gene expression that is not entirely consistent with their observed rate of FEV1 decline. Beyond the potential issues of confounders and other limitations of our study described above, it is also likely that the biological process or processes that contribute to the observed gene expression signature may not be perfect predictors of FEV1 decline. FEV1 has been found to associate with the rate of FEV1 decline.58 Because we have previously described a significant number of genes associated with baseline FEV1,21 our analysis included baseline FEV1 as a covariate. However, low baseline FEV1 may reflect low peak FEV1 in early adulthood, or accelerated FEV1 decline up to the time of baseline FEV1 measurement and controlling for baseline FEV1 may obscure the impact of FEV1 on the rate of FEV1 decline. The FEV1 decline signature was derived using samples from smokers with and at risk of COPD, and was validated in a population of subjects with COPD. While this suggests a specific utility of the signature for assessing progression of COPD, we were unable to assess whether the signature reflects FEV1 decline in other clinical contexts.

In exploring the implications of the gene expression alterations associated with FEV1 decline, we largely focused our analysis on targets of known transcription factors given the potential to generate straightforward hypotheses about the molecular regulation of the observed gene expression changes. However, there is the potential to gain additional insights from future studies focused on other molecular enrichments. The XBP1 knockout and overexpression datasets where we observed significant enrichment of the FEV1 decline signature genes are derived from adipose cells and hepatocytes. Transcription factors have been shown to have effects that are similar in non-lung cells59 and these changes have been used to identify new targets.60 Other studies have suggested that there are cell-specific effects.61 The changes from knockout or overexpression in these datasets may not apply fully in lung cells.

Despite these limitations, we have identified and replicated gene expression differences associated with the rate of subsequently observed FEV1 decline using baseline gene expression profiling of bronchoscopy brushings. Further studies using larger sample sizes are needed to determine whether airway gene expression profiling can prospectively identify individuals who will experience more rapid FEV1 decline and whether this would also apply in more severe COPD, which may present with more parenchymal emphysema than primary airway disease. A previous study from Boudewijn et al identified a nasal gene expression signature associated with severe COPD versus controls that was also similar to the previously identified bronchial airway gene expression signature of COPD from Steiling et al. 10 62 It may therefore be possible to identify a nasal gene expression signature of FEV1 decline, which would be less invasive than a bronchial signature. Such markers could be used to stratify patients with and at risk of COPD, and to potentially evaluate the response to therapies aimed at diminishing the rate of FEV1 decline.

Data availability statement

Data are available in a public, open access repository. The data are available on GEO as GSE37147.

Ethics statements

Patient consent for publication

Ethics approval

This study was approved by the ethics committees and institutional review boards of all participating study sites and informed written consent was acquired from all individuals included in the study.

References

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • MEL and KS are joint senior authors.

  • MEL and KS contributed equally.

  • Contributors KS, AS and MEL conceived the idea. KS, AS, MEL, MvdB, AF and GO provided guidance during the analysis. EJB, AF and KC performed the computational analysis. WT, PSH, SL, MvdB and AF collected the original data. GL, XX and YOA profiled the data. EJB drafted the manuscript and all the authors read and provided feedback.

  • Funding National Institutes of Health/National Heart, Lung, and Blood Institute (R01HL095388 and R01HL118542-01) and Dutch Longfonds Foundation (4.2.16.132JO).

  • Competing interests GO reports unrelated personal fees from AstraZeneca and grants from Janssen Pharmaceuticals. PSH reports unrelated grants from Boehringer Ingelheim and Galapagos. MvdB reports unrelated research grants from GlaxoSmithKline, TEVA Pharmaceuticals and Chiesi. AS reports unrelated founder’s equity from Metera Pharmaceuticals as well as grants and personal fees from Janssen Research and Development. KS reports grants from CHEST Foundation, unrelated grants from Lungevity Foundation Early Detection Award and royalties from UpToDate. MEL reports unrelated founder’s equity from Metera Pharmaceuticals and unrelated grants from Janssen Research and Development. WT reports unrelated personal fees from Pfizer, GSK, Roche Diagnostics/Ventana, Merck Sharp Dohme, Novartis, Lilly Oncology, Boehringer Ingelheim, AstraZeneca, Bristol-Myers-Squibb and AbbVie. KS, MEL and AS have US patent 9,677,138 issued. EJB, KS, MEL and AS have a relevant patent pending (application no. 62/916,431).

  • Provenance and peer review Not commissioned; externally peer reviewed.