Appraising the causal role of smoking in idiopathic pulmonary fibrosis: a Mendelian randomization study
Jiahao Zhu1, Dan Zhou2,3, Min Yu4, Yingjun Li5
  1. Department of Epidemiology and Health Statistics, Hangzhou Medical College, Hangzhou, Zhejiang, China
  2. Department of Big Data in Health Science, School of Public Health, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
  3. Vanderbilt University Medical Center, Nashville, Tennessee, USA
  4. Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, Zhejiang, China
  5. Department of Epidemiology and Health Statistics, School of Public Health, Hangzhou Medical College, Hangzhou, Zhejiang, China
Correspondence to Prof. Min Yu, Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, 310051, Zhejiang, China; mycdc1234{at}; Dr Yingjun Li, Department of Epidemiology and Health Statistics, School of Public Health, Hangzhou Medical College, Hangzhou, 310053, Zhejiang, China; 2016034036{at}


Smoking has been considered a risk factor for idiopathic pulmonary fibrosis (IPF) in observational studies. To assess whether smoking plays a causal role in IPF, we performed a Mendelian randomization study using genetic association data of 10 382 cases with IPF and 968 080 controls. We found that genetic predisposition to smoking initiation (based on 378 variants) and lifetime smoking (based on 126 variants) were associated with a higher risk of IPF. Our study suggests a potential causal effect of smoking on increasing IPF risk from a genetic perspective.

  • Idiopathic pulmonary fibrosis
  • Tobacco and the lung


Idiopathic pulmonary fibrosis (IPF) is a lethal lung disease characterised by progressive fibrosis of the lung parenchyma, for which there is currently no cure.1 Although the cause of IPF is unclear, tobacco smoking is thought to play a part in its pathogenesis.1 However, whether smoking is a causal determinant remains uncertain, because the available evidence is scarce and comes mainly from observational studies, which are vulnerable to confounding and reverse causation. Mendelian randomization (MR) is an increasingly used approach that mitigates these problems by using genetic variants, which are randomly allocated at conception, as instruments for causal inference. Here, we conducted an MR study to investigate the causal association between smoking and the risk of IPF.


In this study, we applied a two-sample design based on summary data from genome-wide association studies (GWASs). As genetic instruments, 378 independent, genome-wide significant (p<5×10⁻⁸) single-nucleotide polymorphisms (SNPs) associated with smoking initiation (ever vs never being a regular smoker) were identified from the GWAS and Sequencing Consortium of Alcohol and Nicotine use (GSCAN), involving 1.2 million individuals.2 In a secondary analysis, we used 126 independent SNPs associated with lifetime smoking (a continuous measure that takes into account smoking initiation, duration, heaviness, and cessation) as genetic instruments from the UK Biobank with 462 690 participants.3 These instrumental SNPs explained 2.3% of the variance in smoking initiation and 1.3% in lifetime smoking and have been treated as robust instruments with F-statistics>10 in prior MR studies.4 GWAS summary data for IPF were derived from a meta-analysis of 5 cohorts (UK, Chicago, Colorado, UUS [US, UK, and Spain], and Genentech Study) by the International IPF Genetics Consortium (4125 cases and 20 464 controls)5 and another meta-analysis of 9 biobanks (BioVU, Colorado Centre for Personalised Medicine, Estonian Biobank, FinnGen, HUNT Study, Michigan Genomics Initiative, Mass General Brigham, UCLA Precision Health Biobank, and UK Biobank) by the Global Biobank Meta-analysis Initiative (6257 cases and 947 616 controls).6 IPF was defined using European Respiratory Society/American Thoracic Society guidelines in the International IPF Genetics Consortium and using International Classification of Diseases codes (515 and 515.0 for the 9th Revision and J84.1, J84.8, J84.89, J84.17, J84.1, and J84.10 for the 10th Revision) in the Global Biobank Meta-analysis Initiative. The GWAS models in the original studies were adjusted for age, sex, and study-specific covariates where possible. All participants included in the study were of European ancestry.
The smoking and IPF studies involved some overlapping participants: 43% of subjects in the Global Biobank Meta-analysis Initiative came from the UK Biobank, where the instruments for lifetime smoking were identified. However, this sample overlap is not expected to introduce substantial bias, because strong instruments (eg, F-statistics>10) for smoking were used and the overlapping samples came from large biobanks (eg, UK Biobank).
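As a rough illustration of the instrument-strength criterion mentioned above, the sketch below applies the commonly used approximation F = R²(N − k − 1) / ((1 − R²)k) to the variance explained, sample sizes, and SNP counts quoted in the text. The function name is hypothetical, the 1.2 million figure is a rounded sample size, and this is not necessarily how the F-statistics were computed in the cited studies.

```python
def approximate_f_statistic(r2: float, n: int, k: int) -> float:
    """Approximate F-statistic for a set of k instruments.

    r2: proportion of variance in the exposure explained by the instruments
    n:  sample size of the exposure GWAS
    k:  number of instrumental SNPs
    """
    return (r2 * (n - k - 1)) / ((1.0 - r2) * k)


# Inputs taken from the text; N for smoking initiation is rounded (assumption).
f_initiation = approximate_f_statistic(r2=0.023, n=1_200_000, k=378)
f_lifetime = approximate_f_statistic(r2=0.013, n=462_690, k=126)

print(f"Smoking initiation: F = {f_initiation:.1f}")
print(f"Lifetime smoking:   F = {f_lifetime:.1f}")
```

Under these assumptions both values sit well above the conventional threshold of 10, which is consistent with the statement that the instruments are strong.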

The principal analysis was performed using the inverse-variance weighted (IVW) method. Sensitivity analyses robust to pleiotropy were conducted, including the weighted median, MR pleiotropy residual sum and outlier (MR-PRESSO), and MR-Egger methods (summarised in table 1). The estimates from the two IPF datasets were combined using random-effects meta-analysis. We examined horizontal pleiotropy and heterogeneity using the MR-Egger intercept and Cochran's Q statistic, respectively. Reverse MR was performed to test for potential reverse causation, using 23 SNPs associated with IPF as genetic instruments.5 Characteristics of the instrumental SNPs are given in online supplemental tables 1–3. Statistical analyses were conducted using R (version 3.6.3) with the "TwoSampleMR" and "MRPRESSO" packages. The significance threshold was a two-tailed p<0.05.
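The estimators named above can be illustrated with a minimal Python sketch. It computes a fixed-effect IVW estimate with Cochran's Q from per-SNP Wald ratios, and an MR-Egger intercept and slope by weighted regression. The function names and the input values are hypothetical, and the code is not the "TwoSampleMR" implementation used in the study, which may differ in detail (for example, in how standard errors are handled under heterogeneity).

```python
import numpy as np


def ivw_and_q(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate, its standard error, and Cochran's Q."""
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    se = np.asarray(se_out, dtype=float)
    ratio = by / bx                 # Wald ratio for each SNP
    w = (bx / se) ** 2              # first-order inverse variance of each ratio
    est = float(np.sum(w * ratio) / np.sum(w))
    se_est = float(np.sqrt(1.0 / np.sum(w)))
    q = float(np.sum(w * (ratio - est) ** 2))  # compare with chi-square, df = n_snps - 1
    return est, se_est, q


def egger_intercept_and_slope(beta_exp, beta_out, se_out):
    """MR-Egger: weighted regression of SNP-outcome on SNP-exposure effects."""
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    se = np.asarray(se_out, dtype=float)
    sign = np.where(bx < 0, -1.0, 1.0)
    bx, by = bx * sign, by * sign   # orient every SNP to a positive exposure effect
    w = 1.0 / se ** 2
    X = np.column_stack([np.ones_like(bx), bx])
    XtW = X.T * w                   # apply weights to each observation
    coef = np.linalg.solve(XtW @ X, XtW @ by)
    return float(coef[0]), float(coef[1])  # intercept (pleiotropy term), slope


# Hypothetical summary statistics for four SNPs (not data from this study).
bx = [0.020, 0.030, -0.025, 0.040]   # SNP-exposure effects
by = [0.006, 0.009, -0.0075, 0.012]  # SNP-outcome effects (log odds)
se = [0.010, 0.012, 0.011, 0.015]    # standard errors of SNP-outcome effects

est, se_est, q = ivw_and_q(bx, by, se)
intercept, slope = egger_intercept_and_slope(bx, by, se)
print(f"IVW log OR = {est:.3f} (SE {se_est:.3f}), Q = {q:.3f}")
print(f"MR-Egger intercept = {intercept:.4f}, slope = {slope:.3f}")
```

An MR-Egger intercept that differs from zero suggests directional horizontal pleiotropy, while a Q statistic that is large relative to its degrees of freedom suggests heterogeneity among the per-SNP estimates.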

Supplemental material

Table 1

Summary of applied MR methods


The IVW results showed that genetic predisposition to smoking initiation (odds ratio (OR)=1.29; p=0.002) and lifetime smoking (OR=1.63; p<0.001) was associated with an increased risk of IPF in the meta-analysis of the two datasets (table 2). The direction of effect was consistent across datasets, although the association in the International IPF Genetics Consortium did not reach the significance threshold. There were indications of horizontal pleiotropy or heterogeneity for some associations (table 3). MR-PRESSO identified only rs6948707 at MAD1L1 (a known IPF susceptibility signal) as an outlier and, after correction, showed an effect directionally consistent with the IVW method (table 2). Results were relatively stable in the other sensitivity analyses, although with wide confidence intervals. In the reverse MR analysis, genetic predisposition to IPF showed a null association with smoking initiation (OR=1.00; p=0.720) and lifetime smoking (beta=0.002; p=0.378), supporting the unidirectionality of the inferred relationship (online supplemental table 4). The leave-one-out analysis demonstrated that no single SNP (including rs6948707) substantially influenced the overall estimate (online supplemental figures 1–4). Scatter plots are presented in online supplemental figures 5–8.
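To make the pooling step concrete, the following sketch combines dataset-specific log odds ratios with a DerSimonian-Laird random-effects model. The choice of this particular estimator is an assumption, since the text states only that a random-effects meta-analysis was used, and the input values are hypothetical rather than the estimates reported in table 2.

```python
import math


def dersimonian_laird(estimates, ses):
    """Pool estimates (eg, log ORs) with a DerSimonian-Laird random-effects model."""
    w = [1.0 / s ** 2 for s in ses]                      # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, estimates)) / sw
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, estimates))
    df = len(estimates) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]          # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


# Hypothetical log ORs and standard errors from two datasets (not study results).
log_ors = [0.20, 0.30]
std_errs = [0.12, 0.10]

pooled, se, tau2 = dersimonian_laird(log_ors, std_errs)
lower, upper = pooled - 1.96 * se, pooled + 1.96 * se
print(f"Pooled OR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lower):.2f} to {math.exp(upper):.2f}), tau^2 = {tau2:.4f}")
```

With only two datasets the between-study variance is estimated imprecisely, so a pooled estimate of this kind is best read alongside the dataset-specific results.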

Table 2

Associations of genetic predisposition to smoking initiation and lifetime smoking with the risk of IPF

Table 3

Heterogeneity, horizontal pleiotropy, and outlier tests


Because of the low incidence of IPF, only two cohort studies to date have evaluated the longitudinal association of smoking with IPF.7 8 Both studies showed that smoking could increase the risk of IPF in a dose-response manner. However, a recent one-sample MR study reported that smoking is unlikely to be a causal factor for IPF, based on 871 IPF cases from the UK Biobank and an instrument comprising 52 SNPs (explaining 0.7% of the variance in smoking volume).9 The limited power of that study may have prevented detection of a causal effect. In contrast, our two-sample MR analysis had a much larger sample size (10 382 IPF cases) and used stronger instruments that explain more of the variance in smoking, providing additional evidence for a potential causal effect of smoking on IPF. Several potential pathways may explain the role of cigarette smoke in the pathogenesis of IPF, including oxidative stress, inflammation, and telomere shortening.10

A common limitation of MR is horizontal pleiotropy, which cannot be fully ruled out even with sensitivity analyses based on different assumptions, because pleiotropy is widespread across the genome. Moreover, because we used summary data, we were unable to conduct an analysis stratified by smoking status or a nonlinear analysis to explore a threshold effect.

In conclusion, this study provides evidence for the potential causal effect of smoking on IPF. Further well-designed MR studies with more clinically diagnosed IPF cases are warranted to confirm our findings.

Ethics statements

Patient consent for publication


The authors thank the International IPF Genetics Consortium and Global Biobank Meta-analysis Initiative for sharing the summary statistics for IPF.


Supplementary materials


  • Contributors JZ contributed to analysis and interpretation of data and drafting the work. DZ contributed to data interpretation. MY contributed to study design and analysis plan. YL contributed to acquisition and interpretation of data. All authors participated in revisions and approved the final version.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
