Translating genomics into risk prediction
  1. Emily S Wan,
  2. Dawn L DeMeo
  1. Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
  1. Correspondence to Dr Emily S Wan, Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; emily.wan{at}

Cigarette smoking is a leading risk factor in the development of cardiovascular, pulmonary and malignant diseases worldwide.1 Yet, it remains one of the most challenging environmental exposures to quantify. Rudimentary categorisations of ‘never’, ‘former’ and ‘current’ smoker capture only a fraction of the complexity associated with the exposure. Attempts to quantify cumulative cigarette smoke exposure, for example by calculating ‘pack-years’, are notoriously imprecise, partially because of a failure to account for variations in behaviour such as (1) depth of inhalations, retention time before exhalation and timing between puffs, (2) intermittent periods of abstinence or relapse, and (3) a reliance on self-reported recall of smoking habits over extended periods of time (often years to decades). Additionally, biological differences related to age, gender, body size, and variations in absorption and metabolism among smokers further complicate our attempts to quantify the ‘effective exposure’ to cigarette smoke for any given individual.
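As an illustration of why 'pack-years' is a coarse summary, the sketch below computes it under the conventional definition (packs smoked per day multiplied by years smoked, with one pack taken as 20 cigarettes). The function name and the three smoking histories are hypothetical and chosen only to show that very different patterns of behaviour can collapse to the same number.

```python
CIGARETTES_PER_PACK = 20  # conventional pack size assumed in the pack-year definition


def pack_years(periods):
    """Sum pack-years over (cigarettes_per_day, years) smoking periods.

    Periods of abstinence are simply left out of the list, so they
    contribute nothing to the total.
    """
    total = 0.0
    for cigarettes_per_day, years in periods:
        if cigarettes_per_day < 0 or years < 0:
            raise ValueError("cigarettes_per_day and years must be non-negative")
        total += (cigarettes_per_day / CIGARETTES_PER_PACK) * years
    return total


# Hypothetical histories: none of these values comes from the article.
steady = pack_years([(20, 20)])                # one pack a day for 20 years
heavy_short = pack_years([(40, 10)])           # two packs a day for 10 years
intermittent = pack_years([(20, 5), (30, 10)]) # two periods separated by abstinence

print(steady, heavy_short, intermittent)
```

All three histories yield the same total, and none of the behavioural or biological factors listed above (depth of inhalation, timing between puffs, metabolism) enters the calculation at all.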

Given these complexities, the use of biomarkers, both qualitative and quantitative, as an objective measurement of ‘effective exposure’ is conceptually appealing. To date, the most widely used smoking biomarkers are cotinine and carbon monoxide (CO). Measurement of CO, either in exhaled breath or in blood as carboxyhaemoglobin, has been shown to correlate with smoking intensity; however, the relatively short half-life of CO (t1/2 1–6 hours) limits its utility to assessments of recent smoking behaviours.2 Cotinine, a direct metabolite of nicotine which is measurable in the blood, urine and tissues, is another well-validated biomarker of cigarette smoke exposure. The slightly longer half-life (t1/2 16–20 hours), as well as the assessment of cotinine in slower growing tissues such as hair and nail clippings, extends the potential period of monitoring for this assay to the order of weeks to months.3 However, neither cotinine nor CO is suitable for assessing long-term smoke exposure or the associations between such exposure and epidemiological outcomes such as mortality and cancer incidence.
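To make the half-life argument concrete, the following sketch estimates how much of each biomarker would remain some hours after the last exposure. It assumes simple first-order (exponential) elimination, which is a simplification not stated in the text; only the half-life ranges are taken from the paragraph above, and the function names are hypothetical.

```python
# Half-life ranges in hours, as quoted in the text.
HALF_LIVES = {"CO": (1.0, 6.0), "cotinine": (16.0, 20.0)}


def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of a biomarker left, assuming first-order elimination."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (hours_elapsed / half_life_hours)


def remaining_range(marker, hours_elapsed):
    """Return (lowest, highest) fraction remaining across the quoted half-life range."""
    shortest, longest = HALF_LIVES[marker]
    return (
        fraction_remaining(hours_elapsed, shortest),
        fraction_remaining(hours_elapsed, longest),
    )


# Fraction remaining one day and one week after the last cigarette.
for marker in HALF_LIVES:
    for hours in (24, 168):
        low, high = remaining_range(marker, hours)
        print(f"{marker:8s} after {hours:3d} h: {low:.2e} to {high:.2e}")
```

Under this assumption, at most about 6% of CO remains after 24 hours even at the longest quoted half-life, and both markers are close to undetectable in blood within a week, which is consistent with neither being informative about exposure accumulated over years.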

DNA methylation, a process in which methyl groups are covalently bonded to cytosine residues in genomic DNA, can reflect environmental exposures over long periods of time and has drawn interest as a possible mediator of the extended latency between cigarette smoke exposure and the development of chronic diseases such as atherosclerosis, COPD and lung cancer. Prior to the genomics revolution, evaluation of DNA methylation was resource intensive and largely focused on a small number of candidate genes such as imprinted loci and tumour suppressors. By modifying the technology employed in genome-wide genetic studies, quantitative assessments of DNA methylation at tens to hundreds of thousands of sites throughout the genome became possible4 and led to the identification of numerous, previously unsuspected smoking-associated loci.5–7

A site located in the third intron of the aryl hydrocarbon receptor repressor (AHRR) gene (OMIM 606517), known by its industry-assigned name ‘cg05575921’, has been consistently identified in epigenome-wide association studies of blood and lung-derived tissues as one of the sites most strongly associated with smoking.6 Active smoking causes progressive demethylation at cg05575921 in blood-derived leucocytes.6,8,9 In addition to reflecting current smoking status, methylation levels show a strong inverse correlation with self-reported measures of smoking intensity, which can be corroborated through established biomarkers such as serum cotinine.10 In former smokers, methylation levels at cg05575921 in blood are correlated with time since quitting,8–10 with evidence of active remethylation detectable as early as a few months following cessation.9 Notably, remethylation at this locus is often incomplete, even after years of abstinence from smoking; thus methylation at cg05575921 is unique among smoking biomarkers in its ability to distinguish former smokers from never smokers.10

Dynamic changes in methylation at cg05575921 in blood may provide additional information on the biological impact of chronic cigarette smoke exposure on the risk of future disease. Associations between methylation at cg05575921 and subclinical carotid atherosclerosis11 and cardiovascular mortality12 have been reported in observational cohorts. Case-control studies have found that methylation levels at AHRR, both alone and in conjunction with data from other smoking-related methylation sites, were strongly associated with the future development of lung cancer13 and lung cancer-specific mortality.14 Importantly, in many of these studies, AHRR methylation levels remained predictive even after adjustment for self-reported smoking parameters.

In Thorax, Bojesen et al15 examine the associations between methylation at AHRR and morbidity and mortality associated with smoking-related diseases in a large cohort with extended longitudinal follow-up (median 19 years). The authors begin by assessing the performance of a targeted, PCR-based assay for methylation at cg05575921 in blood. The validity of the assay is supported by its ability to detect differential methylation associated with self-reported smoking variables as well as a biologically plausible interaction between methylation levels and a genetic variant (rs1051730) within the nicotinic acetylcholine receptor gene (CHRNA3) which has been associated with nicotine addiction and smoking intensity.16 The authors then identify associations between methylation at cg05575921 in blood and all-cause mortality, COPD hospitalisations and, among ever-smokers, lung cancer incidence. Among these, the association with COPD hospitalisations represents the most novel finding, though potential limitations of this analysis are the lack of spirometric data and the possible misdiagnosis of COPD exacerbations in patients presenting clinically with acute dyspnoea.

Lastly, the authors present an intriguing analysis in a subset of high-risk ever smokers eligible for CT screening which demonstrates that the incorporation of cg05575921 methylation data allows for additional risk stratification in subjects with similar risk estimates based on the modified Prostate, Lung, Colorectal and Ovarian (PLCOM2012) Score.17 Although too cumbersome for clinical applications, the PLCOM2012 Score, which is calculated using detailed information on smoking behaviour, demographic and socioeconomic factors, and personal and family history, has been shown to have increased sensitivity and comparable specificity relative to the less selective criteria typically employed for CT screening based upon the National Lung Screening Trial.18 Thus, the potential utility of a simple biological assay to provide meaningful data which could obviate unnecessary screening in low-risk individuals as well as target more intensive screening in higher-risk individuals is obvious. However, several steps remain to be taken before we can truly translate these findings into clinical practice. First, the findings reported by Bojesen et al15 should be validated, and their generalisability demonstrated, in independent cohorts. The population studied in this manuscript, though large, is relatively isolated and genetically homogeneous. Several studies have demonstrated that differential methylation by ancestry exists, with at least one study demonstrating reduced sensitivity of AHRR methylation in identifying current smokers among populations not of European ancestry.19 Second, the values reported in this manuscript are relative (quintiles) and do not take into account the possibility of tissue-type and/or cellular heterogeneity. DNA methylation patterns differ by tissue type and, because whole blood is composed of distinct cell types in variable proportions, assessing the impact of differences in cell count will be critical in the development of a standardised assay for which defined absolute cut-offs and performance attributes can be obtained. Finally, randomised clinical trials to test the added value of an AHRR methylation assay relative to current standards of care will need to be conducted.

A significant lag between variant discovery and translation into clinical practice exists for most genomics-based discoveries due to the modest effect sizes and the relative paucity of knowledge on how such variants are mechanistically related to disease development; most discoveries require years of additional work in functional laboratories before potential therapeutic strategies are developed. In contrast, the effect sizes from methylation studies at the AHRR locus are larger than those observed in most genetic studies and also demonstrate a direct association with public health outcomes. These attributes, which are applied towards several clinically relevant end points in the manuscript by Bojesen et al, may allow for future application of AHRR methylation data towards clinical care guidelines, although more research is needed.

Given the accelerating pace of translation of genomics-based discoveries into clinical care, practitioners will likely be called upon to administer, apply and interpret these discoveries in the care of their patients in the near future. However, while the majority of healthcare providers believe that genomics-based discoveries and developments will directly impact future clinical practice and general interest in personalised medicine is high, genomics knowledge in both healthcare providers and the lay public is low.20 Increasing the number of opportunities and types of forums for education to elevate the level of genomic literacy in both patients and healthcare providers will be an essential component in realising the promise of the precision medicine movement.



  • Competing interests None declared.

  • Provenance and peer review Commissioned; internally peer reviewed.
