Article Text
Abstract
Background Alpha-1 antitrypsin deficiency (AATD) is a genetic condition that causes early onset pulmonary emphysema and airways obstruction. The complete mechanisms via which AATD causes lung disease are not fully understood. To improve our understanding of the pathogenesis of AATD, we investigated gene expression profiles of bronchoalveolar lavage (BAL) and peripheral blood mononuclear cells (PBMCs) in AATD individuals.
Methods We performed RNA-Seq on RNA extracted from matched BAL and PBMC samples isolated from 89 subjects enrolled in the Genomic Research in Alpha-1 Antitrypsin Deficiency and Sarcoidosis (GRADS) study. Subjects were stratified by genotype and augmentation therapy. Supervised and unsupervised differential gene expression analyses were performed using Weighted Gene Co-expression Network Analysis (WGCNA) to identify gene profiles associated with subjects’ clinical variables. The genes in the most significant WGCNA module were used to cluster AATD individuals. Gene validation was performed by NanoString nCounter Gene Expression Assay.
Result We observed modest effects of AATD genotype and augmentation therapy on gene expression. When WGCNA was applied to BAL transcriptome, one gene module, ME31 (2312 genes), correlated with the highest number of clinical variables and was functionally enriched with numerous immune T-lymphocyte related pathways. This gene module identified two distinct clusters of AATD individuals with different disease severity and distinct PBMC gene expression patterns.
Conclusions We successfully identified novel clusters of AATD individuals where severity correlated with increased immune response independent of individuals’ genotype and augmentation therapy. These findings may suggest the presence of previously unrecognised disease endotypes in AATD that associate with T-lymphocyte immunity and disease severity.
- alpha1 antitrypsin deficiency
- COPD mechanisms
Statistics from Altmetric.com
Key messages
What is the key question?
What are the effects of alpha-1 antitrypsin genotype and augmentation therapy on bronchoalveolar lavage (BAL) and peripheral blood mononuclear cells (PBMCs) gene expression profiles?
What is the bottom line?
While these effects of genotype and augmentation therapy are not strong, we identified a signature of genes that distinguish two clusters of patients that differ in extent of emphysema, and lymphocytic infiltration, but not in therapy or genotype, potentially reflecting novel endotypes of disease driven by inflammation.
Why read on?
This is the largest study that describes the transcriptome of BAL or PBMC samples from well-characterised individuals with alpha-1 antitrypsin deficiency and the first study that aims to use PBMC and BAL gene expression to understand the effects of the causal genetic variant and augmentation therapy on their connection to patient clinical characteristics.
Introduction
Alpha-1 antitrypsin deficiency (AATD) is a genetic condition that causes early-onset airway obstruction, emphysema and liver cirrhosis. It is estimated that 60 000–100 000 individuals in the USA have AATD.1 AATD is caused by a mutation in the gene that encodes the alpha-1 antitrypsin protein (AAT), serpin peptidase inhibitor, clade A, member 1 (SERPINA1). Multiple AATD mutations have been identified that associate with low AAT serum levels, but the Z-allele is the most common point mutation in SERPINA1 (rs28929474) associated with COPD. This mutation results in a substitution of glutamic acid for lysine at position 342 in the AAT protein,2 and in the homozygous state produces an 85%–90% serum and lung reduction in this antiprotease that neutralises neutrophil elastase.3 As a consequence, there is an inability of AAT to neutralise neutrophil elastase, and this contributes to tissue destruction and matrix remodelling that underlie the pathogenesis of COPD.4
Currently, the only specific treatment for AAT-related lung diseases is augmentation therapy. The rationale for the AAT augmentation therapy is that replenishing AAT in patients deficient for this protein protects the lungs from excessive neutrophil elastase activity.2 While augmentation therapy has been shown to slow the decline in CT lung tissue density, it is not a cure for this disease.5–9 Augmentation therapy does not fully reverse accelerated lung function decline and has no proven effect on COPD exacerbations.5 Therefore, identifying other mechanisms by which AATD causes COPD and emphysema is necessary to develop therapies to better treat this disease10–12 while taking into consideration highly heterogenous AATD clinical presentations.13 14
To determine the extent and diversity of effects that AAT genotype and augmentation therapy have on gene expression, we analysed gene expression profiles of paired bronchoalveolar lavage (BAL) and peripheral blood mononuclear cell (PBMC) samples based on the Genomic Research in Alpha-1 Antitrypsin Deficiency and Sarcoidosis Study (GRADS). This multicentre cohort study provided clinical data and RNAseq profiling of BAL cells and PBMC on a well-characterised cohort of AATD individuals.13
Methods
Study participants
The GRADS Alpha-1 Study was a prospective, multicentre study of adults older than age 35 years with PiZZ or PiMZ alpha-1 antitrypsin genotypes.13 The study protocol, including study design, recruitment and measurements of clinical data, has been previously described.13 Briefly, 130 individuals with AATD were recruited through the Alpha-1 Foundation Research Registry at the Medical University of South Carolina and directly through physician offices. Written informed consent for genomic research was obtained from all participants. Figure 1 shows the overview of the study design, enrolment process and the computational analysis flow. The final population consists of 89 individuals with matched BAL and PBMC samples,13 including individuals with PiZZ not receiving augmentation therapy (n=29), individuals with PiZZ receiving augmentation therapy (n=22) and individuals with PiMZ not receiving augmentation therapy (n=38, see table 1). All patients with pulmonary diseases were more than 6 weeks from an exacerbation and 4 weeks from any antibiotics.
RNAseq
Detailed RNA isolation, library construction, sequencing, normalisation and analysis methods are described in an online supplement. After sequencing, reads were mapped to human genome (hg38) as described15 16 and normalised to address systematic variation that resulted from the sequencing process.17 18 Differential gene expression analyses were performed to evaluate the association of PiZ genotype and augmentation therapy status on BAL and PBMC gene expression patterns. For the test of augmentation therapy effect, due to possible confounding between treatment and disease severity that requires treatment, we adjusted for age, sex and disease severity as measured by the FEV1 % predicted. See the online supplemental data for additional details on differential expression analyses.
Supplemental material
Weighted Gene Co-expression Network Analysis (WGCNA) and clustering analysis
We performed WGCNA19 on BAL to identify gene modules and gene networks significantly correlated with clinical phenotypes (demographics, pulmonary function tests, chest CT findings and augmentation therapy effects). Genes expressed in over 5% of total BAL samples, and with coefficients of variation greater than 1 (n=10 718), were analysed in WGCNA.19 The default WGCNA setting was applied for data cleaning, outlier detection, network construction and module detection. A Module was defined as a group of co-expressed genes across the samples.20 Module eigengenes (first principle component of the gene expressions within the module) were taken as the representative of the module and were correlated with AATD health data. Table 2 shows the variables used in WGCNA. Gene module network functional annotation and gene set enrichment analysis were performed using GeneMANIA.21 GeneMANIA searches large publicly available data sets for connectivity patterns within a module based on coexpression, protein–protein interaction, pathways and other connections that might suggest any functional relations. Gene Ontology pathway enrichment analysis was performed based on FDR-corrected hypergeometric tests. K-means clustering analysis was performed using genes from the chosen WGCNA gene module to create a heatmap and reveal visually distinct clusters. Additional details for our clustering analysis are provided in the online supplemental data.
NanoString validation
To validate the results from WGCNA analysis, we measured the gene expression of 14 genes in BAL by NanoString nCounter Gene Expression Assay, which is generally considered a more accurate technology, in particular for low-quality RNA samples.22 Among the genes selected were 14 genes with the highest expression level and fold change from WGCNA clustering analysis (CCL5, IL32, LY9, IFITM1, CD3E, PDCD1, ZAP70, LCK, TGFBR3, FOXP3, IL12RB2, IL18RAP, PRF1 and GZMA) and genes related to genotype (EGR3) and therapy effect (CCDC40, MORN2 and SPA17) in BAL and PBMC samples. Analysis was performed on 77 available samples following manufacturer instructions. nSolver V.3.0 digital analyser software was used to analyse data (see online supplemental material).
Results
Supervised analysis for genotypes and augmentation therapy
The subject demographics are provided in table 1. There were no significant differences between the three groups (PiZZ on therapy, PiZZ off therapy and PiMZ) in basic demographic characteristics other than age (table 1, also see table 2), but individuals not on augmentation therapy (group 1, PiZZ off) were younger, healthier and had better lung function than individuals on AAT augmentation therapy (group 2, PiZZ on). BAL cell counts were similar among three groups, although a trend for PiZZ on having more neutrophils was seen.
We examined the genotype effects in gene expression profiles in BAL and PBMC isolated from individuals with PiZZ off therapy and PiMZ (group 1 vs 3, table 1). Overall, we did not observe significant changes in gene expression profiles. A total of 113 genes in BAL and 181 in PBMC were differentially expressed between PiZZ off therapy and PiMZ (FDR <0.05, see online supplemental tables 2 and 3 and online supplemental figure 1). The 113 BAL genes were enriched for inflammatory, cellular chemotaxis and antiviral responses (online supplemental table 3). In addition, miR-5010–3 p, miR-6807–3 p and miR-6797–3 p were significantly downregulated in PiMZ group in our BAL data set, while their predicted target genes (RNF11, SUMO2, CYCS and MCMDC2) were significantly upregulated (online supplemental table 4). Downregulated miRs: -let-7b, -9, -30, -34, -3653, -4537, 5196 and 6516 were also identified in PBMC data set. While miR-9 expression is downregulated, its target gene BIK expression was shown upregulated in PiMZ (online supplemental table 5).
Supplemental material
Supplemental material
Supplemental material
Supplemental material
Supplemental material
Supplemental material
We also compared gene profiles of individuals with PiZZ not receiving augmentation therapy (group 1) with PiZZ receiving augmentation therapy (group 2, table 1) the majority of differentially expressed genes were low expressed genes (see online supplemental figure 1 and online supplemental tables 6 and 7). A total of 733 and 64 genes were found to be differentially expressed in BAL and PBMC data set, respectively. These genes were enriched for cilium morphogenesis, chemotaxis and inflammatory response (online supplemental table 8). Many of these genes were predicted to be target genes of following significantly differentially expressed miRs: miR-155-5 p, miR-27-3 p, miR-10b-5p, miR-5001-5 p, miR-1909-5 p, miR-3605-5 p (online supplemental table 9).
Supplemental material
Supplemental material
Supplemental material
Supplemental material
To investigate commonality between gene expression profiles in lung and blood within genotypes and therapy effect, we performed comparisons of all expressed genes in BAL and PBMC samples (online supplemental figure 1). While we observed high overall correlation in gene expression between BAL and PBMC (r=0.91, see online supplemental figure 2), the overlap of differentially expressed genes across the two tissues was minimal, with only five and six genes (FPKM >1) significantly differentially expressed in both BAL and PBMC for genotype effect and therapy effect, respectively (online supplemental figure 1 and online supplemental tables 10 and 11).
Supplemental material
Supplemental material
Supplemental material
Weighted Gene Co-expression Network Analysis
Since we observed rather low effect of genotype and therapy on BAL and PBMC gene expression in clinically defined groups, we applied an unsupervised approach to analyse gene expression patterns from BAL, the more disease-relevant tissue compartment in the study of AATD lung diseases. We performed WGCNA analysis on 89 BAL samples to identify gene networks and modules associated with AATD genotypes and augmentation therapy samples. A total of 10 718 genes (CV >1%) were included in the WGCNA. This analysis aimed at identifying modules that are significantly correlated with the measured clinical traits for AATD individuals. Online supplemental figure 3 shows the cluster dendrogram. Thirty-one modules were identified, of which 10 were significantly correlated with multiple clinical variables (figure 2 and online supplemental figure 4). Some of the strongest correlations were identified for the module ME 31 (2312 genes) that correlated with emphysema presence (dichotomised variable; emphysema present defined as greater than 5% of lung voxels less than a −950 HU threshold, r=0.24, p=0.03), bronchiectasis presence as measured by a visual scoring system, r=0.2, p=0.07), alveolar macrophage %, r=−0.6, p=1E-08) and alveolar lymphocyte % (r=0.57, p=6E-08).
Supplemental material
Supplemental material
Clustering analysis
To examine if the genes in gene module ME31 (2312 genes) could cluster AATD individuals into clinically meaning groups, we performed K-means clustering (figure 3) that distinguished three clusters of AATD individuals. Analysis of individuals’ demographics, clinical and CT chest data (table 3) among the three clusters revealed a small group of seven individuals in cluster 1 with slightly increased emphysema. Cluster 2 had younger individuals (p=0.007) compared with cluster 3, as well as slightly higher FEV1 and diffusing capacity of the lungs for carbon monoxi (DLCO). Cluster 3 had more subjects with a PD15 value above the median and emphysema presence than observed in cluster 2. Cluster 3 had higher BAL lymphocyte % than cluster 2 (11.0±7.6 vs 2.8±2.3) as well as lower macrophage % (85.7±9.3 vs 95.9±2.3), while neutrophil % was not significantly different. There was no significant difference in genotype, therapy, gender or smoking status between clusters 2 and 3 (table 3). Despite the strong associations between these clusters and non-transcriptomic variables, our attempt to cluster patients based on non-transcriptomic variables did not replicate these clusters (data not shown). These findings all together suggest the gene expression signature from module ME31 identified two well-separated groups of individuals in clusters 2 and 3 with distinct clinical emphysema characteristics that are not primarily driven by either AATD genotype or therapy.
Network analysis and functional enrichment analysis
As the WGCNA module ME 31 (2312 genes) showed significant correlation with key AATD clinical variables and identified two well-defined clusters of AATD individuals, we aimed to further investigate the biological function of these genes. We focused on a subset of 666 genes that were downregulated in cluster 2 and visually differentiated clusters 2 and 3 (marked in red in the lower half of figure 3, online supplemental tables 12 and 13). Figure 4 shows 135 genes in the module that belong to the multiple T cell and immune response enriched pathways (exp: PDCD1, CD3, LCK, ZAP70, CCL5, BTLA, GRAP2 and CD247). Functional enrichment analysis revealed that several immune system pathways were over-represented in this module, including T cell activation, regulation of immune response and antigen receptor-mediated and plasma membrane signalling pathways (see figure 4 and online supplemental table 14). Together these findings suggest an active immune T cell response related to presence of emphysema in AATD individuals.
Supplemental material
Supplemental material
Supplemental material
Differential gene expression for PBMC between newly identified AATD clusters
To better understand biological differences between the individuals in clusters 2 and 3, we performed differential gene expression analysis on PBMC expression profiles using edgeR. We found 125 differentially expressed genes between clusters 2 and 3 (see figure 5 and online supplemental table 15). These genes were enriched for plasma lipoproteins, lipid transport and inflammatory responses (online supplemental table 16). Most notably, some of these genes, including CCL18, FABP4, FN1 and miR-9, have been implicated in COPD and were indeed highest in cluster 3 characterised by more severe disease.
Supplemental material
Supplemental material
NanoString validation
Validation by NanoString confirmed that EGR3, a key negative regulator of T cell activation,23 was upregulated in PiZZ versus PiMZ in both BAL and PBMC (figure 6), and CCDC40, MORN2 and SPA17 genes related to cilia dysfunction in lung disease24 were upregulated in PiZZ on therapy in BAL (figure 6). In addition, validation of genes associated with emphysema (CCL525 and IL3226), IFN pathway (IFITM1)27 and T cell activation and PD1- immune signalling (LY9,28 CD3E, PDCD1, ZAP70, LCK, TGFBR3, FOXP3, IL12RB2, IL18RAP, PRF1 and GZMA)29 revealed complete concordance with RNA-Seq data (figure 6).
Discussion
In this study, we applied supervised and unsupervised methods on gene expression profiling of BAL isolated from AATD individuals, and we identified genes associated with genotype and augmentation therapy as well as a gene module enriched for T cell pathways and immune response that correlated with severity of disease independent of genotype and therapy in AATD affected individuals.
Our most impressive finding was a previously unrecognised endotype of individuals with AATD that regardless of genotype or augmentation therapy had more severe respiratory disease and an altered pulmonary inflammatory and immune response driven by lymphocytic influx (cluster 3, table 3). Although this AATD cluster had individuals with increased alveolar lymphocyte %, the cluster was not significantly correlated with AATD therapy or changes in lungs such as: FEV1 % predicted, DLCO, bronchiectasis or airway wall thickness. Instead, the chest CT analysis using PD15 and emphysema presence were the most informative30–32 of disease severity.33 We show that lymphocyte % was elevated in individuals with PD15 lower than the median (p=0.05) and in individuals with emphysema presence (p=0.009, data not shown) suggesting that increase in lymphocyte % is a biological signal of AATD disease severity.
A potential mechanistic clue of AATD lies in the genes that characterised this cluster. Among the most expressed and upregulated BAL genes from cluster 3 were genes implicated in immune responses via T cell activation and PD-1 signalling suggesting the presence of a PD1 immunosuppressive and proinflammatory endotype.29 Recently, it has been suggested that alpha-1 antitrypsin protein (A1AT) has roles that extend beyond its antiproteolytic effects. For instance, it can regulate inflammatory milieu by inhibiting proliferation of T helper cells and by controlling antigen presentation.34–36 While our study did not study T cell functions, the identification of an AATD individual cluster characterised by emphysema and T cells suggest the need for detailed immunophenotyping of AATD patients and exploration of specific therapies. This is also supported by our PBMC gene expression analysis. The PBMC of the individuals in cluster 3 exhibited distinct gene expression patterns with increased expression of genes such as CCL18, FABP4, FN1 and miR-9, which were previously associated with COPD. CCL18 was shown to be upregulated in COPD and predict risk for exacerbations.37 38 FABP4 was shown increased in COPD and was negatively correlated with lung function.39 FN1 was also show to be a gene correlated with progression of COPD.40 Mir-9 was shown to be associated with the loss of muscle force in patients during an acute exacerbation of COPD.41 Our unsupervised analysis may have revealed a subgroup of AATD individuals that cluster together based on gene expression and segregates a group of individuals with biological changes in lungs and lung function during development of emphysema that goes beyond clinical phenotyping described previously.42 The challenge will be to develop and validate diagnostic parameters that characterise these individuals that could be used in the clinic. There also will need to be much work done to exclude environmental stimuli that correlate with these genomic signatures of AATD.
Our study has several limitations that should be considered when interpreting the results. The first is that it contains a relatively small cohort of individuals with AATD, with no independent replication. While this is true, our study in fact represents the first and largest study that describes the BAL and PBMC transcriptome in a carefully phenotyped cohort of AATD individuals. We believe that our findings will encourage others to follow-up on our results and potentially focus on the subgroup we identified to get detailed mechanistic understanding of the mechanisms regulating emphysema in A1AT. Another limitation is the lack of PiMM individuals in the cohort. While we did not find significant difference between gene expression profiles of PiMZ and PiZZ individuals, more difference might have been detected had PiMM individuals been included. Finally, the obvious limitation of bulk RNAseq of BAL cells and PBMC is that we do not necessarily know whether the changes we observed are derived from changes in cell content, transcriptional regulation or both. For example, it appears that the separation between subphenotypes clusters 2 and 3 is primarily driven by lymphocyte %. However, a comparison with lymphocyte marker genes in single cell RNAseq data from control BAL (data not shown) suggests that there are signals specific to the subphenotypes not explained by lymphocyte %. While gene expression values could have been normalised for cell proportions, doing so would have hidden potentially relevant signals. Therefore, we decided to consider the cell counts as clinical phenotypic attributes. The fact that we found very few differential expression genes for genotype and therapy effects suggests that the changes in gene expression here were not primary driven by differences in cell counts. As detailed immunophenotyping was beyond the scope of our study, we believe that we have shown convincingly that gene expression associated with T cell inflammation is common in patients with emphysema. We hope that this work will motivate other more detailed studies as well as the development of interventions that target the immune aberrations that we observed.
In conclusion, using unsupervised data analysis methods, we identified a subgroup of AATD individuals characterised by more severe disease, and increased T cell inflammation, that is independent of genotype or augmentation therapy. This finding may represent a novel endotype in AATD. Further studies that apply advanced immunophenotyping approaches, including longer follow-up, and focus on emphysema may help to validate this endotype and potentially identify specific interventions.
References
Supplementary materials
Footnotes
Twitter @jenhwachu, @yingzezhang, @maorsauler, @kaminskimed
J-hC, WZ and MV contributed equally.
Presented at Parts of this work were presented as abstracts in American Thoracic Society International Conference in San Diego, California, May 2018.
Contributors NK, RS, FCS, SRW and CS conceived and designed the experiments. ESC, RGC, RS, CS, MJB, HH, KFG, ELH and JKL participated in subject phenotyping and classification. KCP, RGC, NK, RS, CS, EM, MB, HH, KFG, ELH, RS and SRW supervised sample and data collection, MV, TA, and GD performed the RNA sequencing experiments. JCS performed the single cell RNA sequencing experiments. JC, WZ, NK, MV, XY, BH, AM and MS analysed the data. NK, CS and FCS supervised the analytic plan. JC, WZ and MV wrote the manuscript with input from all other authors. All authors have read and approved the manuscript.
Funding This work is supported by NIH/NHLBI grant U01HL112707.
Patient consent for publication Not required.
Ethics approval The recruitment and consenting procedures were approved by the local Institutional Review Board.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available in a public, open access repository. RNAseq data described in this paper has been deposited in the NCBI Gene Expression Omnibus (GEO) under accession code GSE109515.