Chronic obstructive pulmonary disease (COPD) is defined as airflow obstruction that is not fully reversible. It results from abnormal inflammation following exposure to noxious particles or gases.1 This is typically exposure to cigarette smoke but may also include exposure to biomass fuels and some industrial dusts. COPD clusters within families, suggesting that heritable factors play a role in the pathogenesis of this disease.2 3 The only genetic factor that is widely accepted to be associated with COPD is severe deficiency of α1-antitrypsin.4 This depletes an important member of the antiproteinase screen and causes excessive intrapulmonary inflammation.5 6 Indeed, even asymptomatic non-smoking heterozygotes for the Z allele (PiMZ) without airflow obstruction have increased intrapulmonary inflammation.7
Severe α1-antitrypsin deficiency is found in only 1–2% of all individuals with COPD. There is growing interest in the genetic factors that predispose to COPD in individuals who do not have α1-antitrypsin deficiency. Early papers focused on association studies that compared genetic variation in individuals with COPD with that in individuals who were matched for all factors associated with COPD (most importantly age, smoking history and ethnic background) but who did not have airflow obstruction. The studies were typically small (100–150 cases and controls) and were often confounded by failure to match cases and controls carefully. Correction for multiple comparisons was often an issue, as was the inherent complexity of the COPD phenotype.8 Larger family-based studies have shown the independent clustering of the airway disease and emphysema components of COPD within families.9 This suggests that different genetic factors predispose to each of these components of the phenotype. The only way of overcoming the inherent variation in COPD is to focus on groups of well-characterised individuals with components of the COPD phenotype or to undertake studies with large sample sizes and then to replicate any positive findings in other cohorts. This is now the case with candidate gene studies and there is good evidence that heterozygotes for α1-antitrypsin deficiency (PiMZ) and polymorphisms in genes involved in oxidative stress (microsomal epoxide hydrolase, glutathione S-transferase, haem oxygenase and superoxide dismutase 3) are associated with an increased risk of COPD.10 Furthermore, polymorphisms in transforming growth factor-beta, a protein important in maintaining the extracellular matrix, are also likely to be important in the pathogenesis of this disease. More recently, a minor allele of a single nucleotide polymorphism (SNP) in matrix metalloproteinase-12 (MMP-12) has been shown to protect against COPD in adult smokers.11
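The multiple-comparisons problem noted above for the early association studies can be illustrated with a minimal sketch. It assumes a simple Bonferroni correction, which is only one of several possible approaches and is not necessarily what any cited study used; the function names and the five p values are hypothetical and chosen purely for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Return one boolean per p value: True if it survives Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p values for five SNPs tested in one small case-control study.
example_p_values = [0.004, 0.03, 0.20, 0.049, 0.011]
flags = significant_after_bonferroni(example_p_values)
print(flags)
```

With five tests the per-test threshold falls from 0.05 to 0.01, so three of the four nominally significant results in this made-up example no longer count as significant.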
The limitation of association studies is that genes are assessed in pathways that are already recognised to be associated with COPD—the proteinase–antiproteinase and oxidative stress pathways—and genes that maintain the integrity of the extracellular matrix. In recent years the collection of large cohorts has allowed unbiased genome-wide association studies (GWAS) in individuals with COPD. The largest study was undertaken in a cohort from Bergen, Norway, and then replicated in the International COPD Genetics Network, the National Emphysema Treatment Trial with controls from the Normative Ageing Study and then finally in the Boston Early Onset COPD cohort.12 Top hits from this analysis were SNPs in the alpha-nicotinic acetylcholine receptor and the hedgehog interacting protein (HHIP). The SNP in the alpha-nicotinic acetylcholine receptor (rs8034919) was also identified in three GWAS of lung cancer and is thought to be important in peripheral vascular disease and nicotine addiction.13–16 It is likely that this SNP functions as a marker for an addiction gene. Individuals who carry this SNP may require more cigarettes to satisfy nicotine addiction, may inhale more deeply and may find it more difficult to withdraw from cigarette smoking. Alternatively, the association may result from linkage disequilibrium with SNPs in the iron responsive element binding protein 2 (IREB2).17 This was identified by expression analysis in lung tissue from individuals with COPD and then confirmed in three separate COPD cohorts. IREB2 is localised to the human epithelial cell surface and may play a role in protecting against epithelial damage from oxidative stress. Finally, the analysis of even larger numbers of individuals with COPD has identified a SNP in FAM13A as being associated with COPD. 
The role of FAM13A in the disease is unclear but its expression has been associated with hypoxia.18 The gene has also been associated with lung function in a second independent study.19 A detailed analysis of these genes in well-characterised cohorts showed that SNPs in the α-nicotinic acetylcholine receptor are associated with smoking intensity, airflow obstruction and emphysema,20 21 SNPs in the hedgehog interacting protein are associated with systemic features of COPD (low body mass index) and exacerbations,21 while SNPs in FAM13A are associated with airflow obstruction.21
Taken together, there is clear evidence that genes associated with the protease–antiprotease pathway, oxidative stress and the integrity of the extracellular matrix are involved in the susceptibility of smokers to COPD. GWAS have identified genes that are likely to have a role in addiction and other genes that may protect against oxidative stress. Gene mining has only just begun, with good prospects for identifying further novel genes associated with COPD. This is important as these genes will allow us to stratify individuals with COPD and identify pathways that provide new insights into the mechanisms of disease. The long-term aim is to intervene in these pathways and so prevent the relentless progression of the pulmonary and systemic manifestations of COPD. Further collaboration is required to bring together even larger numbers of well-phenotyped cohorts to identify more genes that are associated with COPD. The potential value of this approach has been demonstrated in two large meta-analyses examining lung function in general population samples. These studies, run by the SpiroMeta and CHARGE consortia, identified a number of novel genes which predict forced expiratory volume in 1 s (FEV1) and/or the ratio of FEV1 to forced vital capacity. Four genes were common to both studies: the previously observed HHIP, GSTCD (glutathione S-transferase C-terminal domain), HTR4 (the serotonin 4 receptor subtype) and AGER (the receptor for advanced glycosylation end products).19 22 While these studies concentrated on general population samples, it is highly likely that genetic factors predicting FEV1 will be strong risk factors for COPD and other diseases where airflow obstruction is present; analyses addressing this issue are ongoing. The primary discovery samples in each consortium included >20 000 individuals.
It seems to us therefore that the future for COPD genetics is very bright, but only if we think big!
Many more studies have addressed the genetics of asthma than of COPD, although the overall conclusion is the same: very large studies are required to identify reproducible effects. Initial attempts at hunting an asthma gene used predominantly family-based linkage approaches, which resulted in multiple loci on almost every chromosome being identified in different populations. Candidate gene studies have also been widely performed but, as with COPD, the ability to replicate findings has been very limited. The genes which have attracted the most interest from these studies include ADAM33, PHF11, GPRA and DPP10. While a number of studies have replicated initial associations, the effect size of individual SNPs for asthma risk has generally been small and functional mutations remain to be identified. For example, in a study which involved genotyping the complete 1958 birth cohort for risk SNPs in these genes prioritised on the basis of linkage data, the strongest signals seen were for alleles with odds ratios of <1.2 in ADAM33 and PHF11, with little evidence being observed for major effects at each locus.23
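To make concrete what an odds ratio below 1.2 means in a modest sample, the following sketch computes an allelic odds ratio with an approximate 95% confidence interval from a 2×2 table of allele counts. It assumes the standard log odds ratio (Woolf) interval; the function name and the counts are hypothetical and are not taken from the cited study.

```python
import math


def odds_ratio_with_ci(case_risk, case_other, control_risk, control_other, z=1.96):
    """Allelic odds ratio and approximate 95% CI (Woolf's log method).

    Arguments are allele counts: risk and non-risk alleles in cases and controls.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts: 500 cases and 500 controls (1000 alleles each).
or_estimate, ci_low, ci_high = odds_ratio_with_ci(330, 670, 300, 700)
print(f"OR = {or_estimate:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

In this made-up example the point estimate is about 1.15 but the interval includes 1, which is the kind of result that makes small effects hard to replicate in samples of this size.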
It is possible that the key SNPs in these genes remain to be defined or that the contribution of these genes is at best minimal. More recently, GWAS approaches have taken over as the method of choice to identify the genetic factors underlying asthma risk. The first two studies to examine this area resulted in the identification of the region containing the gene for ORMDL3 as an asthma locus, and a number of regions which were associated with eosinophilia in patients with and without asthma, including regions containing genes such as IL5, IL1RL1 and GATA2.24 25 There have been a number of recent publications using GWAS approaches in small studies but these have generally failed to identify signals which reach conventional genome-wide significance, probably because the studies have been underpowered. Greater clarity on the key genetic regions involved in determining asthma risk will undoubtedly come from larger studies which are shortly to be published, such as that organised by the GABRIEL consortium. Again, it seems ‘biggest is best’ when it comes to asthma genetics. Finally, the ability to interrogate the human genome is about to take another step forward with the use of deep sequencing approaches such as whole exome sequencing, which could allow us to move from identifying common variants associated with disease to the identification of rare or intermediate variants of interest.
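The point about underpowered studies can be illustrated with a rough sample size sketch. It assumes a simple two-proportion allelic test under the normal approximation, a control risk allele frequency of 0.30, an odds ratio of 1.2, 80% power and the conventional genome-wide threshold of 5 × 10⁻⁸; all of these values and the function names are illustrative assumptions rather than figures from the studies discussed.

```python
import math
from statistics import NormalDist


def case_allele_frequency(control_freq: float, odds_ratio: float) -> float:
    """Risk allele frequency in cases implied by the control frequency and the OR."""
    return odds_ratio * control_freq / (1 - control_freq + odds_ratio * control_freq)


def individuals_per_group(control_freq, odds_ratio, alpha, power=0.80):
    """Approximate cases (and equal controls) needed for a two-sided allelic test."""
    p0 = control_freq
    p1 = case_allele_frequency(control_freq, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each individual contributes two alleles.
    return math.ceil(alleles_per_group / 2)


n_nominal = individuals_per_group(0.30, 1.2, alpha=0.05)
n_genome_wide = individuals_per_group(0.30, 1.2, alpha=5e-8)
print(n_nominal, n_genome_wide)
```

Under these assumptions the required number of cases rises from roughly a thousand at a nominal threshold to several thousand at genome-wide significance, which is consistent with the argument that only very large or pooled cohorts can detect effects of this size.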
Funding Work in both authors' laboratories is supported by the MRC. In addition, IPH is supported by Asthma UK and the Nottingham Biomedical Research unit and DAL receives support from the Wellcome Trust, EPSRC, BBSRC, the Alpha-1 Foundation and Papworth Hospital.
Competing interests None.
Provenance and peer review Commissioned; not externally peer reviewed.