Original article
Mucin 5B promoter polymorphism is associated with idiopathic pulmonary fibrosis but not with development of lung fibrosis in systemic sclerosis or sarcoidosis
  1. Carmel J Stock1,
  2. Hiroe Sato1,
  3. Carmen Fonseca2,
  4. Winston A S Banya3,
  5. Philip L Molyneaux1,
  6. Huzaifa Adamali1,
  7. Anne-Marie Russell1,
  8. Christopher P Denton2,
  9. David J Abraham2,
  10. David M Hansell4,
  11. Andrew G Nicholson5,
  12. Toby M Maher1,
  13. Athol U Wells1,
  14. Gisela E Lindahl1,
  15. Elisabetta A Renzoni1
  1. Interstitial Lung Disease Unit, Royal Brompton Hospital and National Heart and Lung Institute, Imperial College London, London, UK
  2. Centre for Rheumatology and Connective Tissue Diseases, University College London Medical School, London, UK
  3. Clinical Trials and Evaluation Unit, Royal Brompton and Harefield NHS Foundation Trust, London, UK
  4. Department of Radiology, Royal Brompton Hospital and National Heart and Lung Institute, Imperial College London, London, UK
  5. Histopathology Department, Royal Brompton Hospital and National Heart and Lung Institute, Imperial College London, London, UK

Correspondence to Carmel Stock, Interstitial Lung Disease Unit, Royal Brompton Hospital and National Heart and Lung Institute, Imperial College London, Emmanuel Kaye Building, 1B Manresa Road, London SW3 6LR, UK; c.stock{at}


Background A polymorphism (rs35705950) 3 kb upstream of MUC5B, the gene encoding Mucin 5 subtype B, has been shown to be associated with familial and sporadic idiopathic pulmonary fibrosis (IPF). We set out to verify whether this variant is also a risk factor for fibrotic lung disease in other settings and to confirm the published findings in a UK Caucasian IPF population.

Methods Caucasian UK healthy controls (n=416) and patients with IPF (n=110), sarcoidosis (n=180) and systemic sclerosis (SSc) (n=440) were genotyped to test for association. The SSc and sarcoidosis cohorts were subdivided according to the presence or absence of fibrotic lung disease. To assess correlation with disease progression, time to decline in forced vital capacity and/or lung carbon monoxide transfer factor was used in the IPF and SSc groups, while a persistent decline at 4 years since baseline was evaluated in patients with sarcoidosis.

Results A significant association of the MUC5B promoter single nucleotide polymorphism with IPF (p=2.04×10⁻¹⁷; OR 4.90, 95% CI 3.42 to 7.03) was confirmed in this UK population. The MUC5B variant was not a risk factor for lung fibrosis in patients with SSc or sarcoidosis and did not predict more rapidly progressive lung disease in any of the groups. Rather, a trend for a longer time to decline in forced vital capacity was observed in patients with IPF.

Conclusions We confirm the MUC5B variant association with IPF. We did not observe an association with lung fibrosis in the context of SSc or sarcoidosis, potentially highlighting fundamental differences in genetic susceptibility, although the limited subgroup numbers do not allow a definitive exclusion of an association.

  • Interstitial Fibrosis
  • Sarcoidosis
  • Systemic disease and lungs

Key messages

What is the key question?

  • Is the MUC5B rs35705950 polymorphism, recently reported as strongly associated with idiopathic pulmonary fibrosis, a risk factor for lung fibrosis in the context of systemic sclerosis and sarcoidosis?

What is the bottom line?

  • In our Caucasian UK based cohorts, we confirm a significant association of the MUC5B variant with idiopathic pulmonary fibrosis. By contrast, we found no significant association with lung fibrosis in the context of systemic sclerosis or sarcoidosis, although fibrotic subgroup numbers may have resulted in insufficient power to detect an association, especially if less pronounced than in IPF.

Why read on?

  • The lack of an association with lung fibrosis in the context of systemic sclerosis and sarcoidosis suggests that the MUC5B variant is associated with an idiopathic pulmonary fibrosis-specific pathway, differing from SSc- and sarcoidosis-associated lung fibrosis which are likely to have distinct genetic and pathogenetic risk factors.


Introduction

A single nucleotide polymorphism (SNP) (rs35705950) located 3 kb upstream of the transcriptional start site of MUC5B, the gene encoding Mucin 5 subtype B, has been shown to be strongly associated with both familial interstitial pneumonia1 and sporadic idiopathic pulmonary fibrosis (IPF) in US populations.1,2 MUC5B is a gel-forming mucin and a major component of mucus in the respiratory tract.3 The IPF-associated SNP is located within a region highly conserved across vertebrate species and, based on a prediction algorithm, is potentially involved in gene regulation.4

By playing important roles in airway mucus rheology5 and mucosal immune defence,6 MUC5B is essential in protecting the surface epithelium of the airways. MUC5B overexpression was observed in IPF lungs, with patchy staining of the epithelial cells lining honeycomb cysts.1 Although the mechanisms through which MUC5B dysregulation plays a part in the development of IPF are currently unknown, the strength of the genetic association suggests a pathogenetic role of mucins and/or mucin-producing cells which was hitherto unsuspected. However, whether the MUC5B variant is a prognostic marker of IPF is currently unknown.

Fibrotic lung disease occurs in a large proportion of patients with systemic sclerosis (SSc) and, together with pulmonary hypertension, represents the main cause of death in these patients.7 In sarcoidosis the development of irreversible lung fibrosis, seen in 10–20% of patients, is associated with poorer quality of life, the need for long-term treatment and a worse prognosis.8 A small number of genes associated with the development of lung fibrosis in the context of SSc9 or of sarcoidosis10–12 have been reported, highlighting a genetic component likely to involve several genes with small individual effects. Although SSc-associated interstitial lung disease (SSc-ILD) and pulmonary sarcoidosis differ from IPF in many respects, including better survival for equivalent disease severity,13 there are many similarities in the fibrotic process and alveolar epithelium abnormalities occur in all three diseases. In particular, the membrane tethered mucin MUC1 (also known as KL-6), a member of the mucin family expressed by type II alveolar epithelial cells, has been suggested as a biomarker for disease activity in IPF,14 pulmonary sarcoidosis15 and SSc-ILD,16 ,17 although large prospective studies are yet to be performed. We therefore investigated whether the MUC5B polymorphism increased the risk of lung fibrosis in the context of SSc and sarcoidosis and set out to confirm the reported association with IPF. We also assessed whether this polymorphism was associated with the severity of lung fibrosis and the likelihood of disease progression in the three fibrotic lung diseases.

Materials and methods

Study populations

DNA samples were collected from consecutive newly presenting patients with IPF (n=110) and sarcoidosis (n=180) attending the respective clinics at the Royal Brompton Hospital, London, and from patients with SSc (n=440) attending clinics at the Royal Brompton and Royal Free Hospitals, London. The diagnoses were made from well-defined criteria for IPF,18 ,19 sarcoidosis20 and SSc.21 The majority of patients with IPF received combination immunosuppressive treatment and low-dose prednisolone.22 None were included in trials with novel agents which could have influenced functional decline. Patients with SSc-ILD and sarcoidosis were treated with intended ‘best management’ with introduction/changes in immunosuppression dictated by progression of disease and side effects. The control population (n=416) comprised healthy blood donors mainly collected from the south-east of the UK. Individuals were judged as healthy based on a self-administered questionnaire and by the routine laboratory investigations performed on blood donors. Only Caucasian individuals of Northern European descent were included.

As both the Royal Brompton Hospital and the Royal Free Hospital Units are tertiary referral centres, patient cohorts are likely to be a mixture of incident and prevalent cases. To indirectly assess whether there could have been a selection bias favouring prevalent cases in the MUC5B variant carriers in the IPF cohort, we compared duration of dyspnoea at presentation between carriers and non-carriers. Details on duration of dyspnoea were retrievable from the clinical notes of 98/110 patients with IPF. Duration of dyspnoea at first presentation to the Royal Brompton Hospital was not significantly different between MUC5B T allele carriers and non-carriers (32.1 months in carriers vs 28.3 months in non-carriers, p=0.5 Mann–Whitney U test), suggesting there was not a significant imbalance between prevalent/incident cases according to MUC5B allele carriage.

Clinical assessment

Pulmonary function tests and high-resolution CT (HRCT) were performed as previously reported.23 All investigations were performed as part of a prospective routine clinical protocol. Pulmonary function tests (expressed as percent predicted) from the time of first presentation at the Royal Brompton Hospital were available for patients with IPF (n=107), SSc-ILD (n=206) and sarcoidosis (n=178). Among patients with SSc, ILD was defined as a forced vital capacity (FVC) <75% and/or the presence of fibrosis on chest imaging (chest x-ray or HRCT).

Patients with SSc were also classified into autoantibody subgroups by previously described standard methods.24 ,25 The patients with SSc for whom antibody status was known (n=417) were grouped according to status for the two most common antibodies found in the cohort, anti-DNA topoisomerase 1 antibodies (ATA), which is known to be tightly linked to the presence of ILD, and anti-centromere antibodies (ACA), which is associated with limited skin disease and pulmonary hypertension but protective for ILD.26

Quantification of disease severity and progression in IPF and SSc-ILD

The composite physiological index (CPI), a functional index of disease severity, was calculated as: CPI = 91.0 − (0.65 × Tlco% predicted) − (0.53 × FVC% predicted) + (0.34 × FEV1% predicted), where Tlco is lung carbon monoxide transfer factor, and was used as a marker of lung disease severity in patients with IPF and SSc-ILD.27
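The CPI formula can be written as a short function. The following is a minimal sketch rather than the authors' code; the function name and the example patient values are hypothetical, and all three inputs are assumed to be expressed as percent predicted.

```python
def composite_physiological_index(tlco_pct: float, fvc_pct: float, fev1_pct: float) -> float:
    """Return the CPI from Tlco, FVC and FEV1, each given as percent predicted.

    CPI = 91.0 - (0.65 x Tlco%) - (0.53 x FVC%) + (0.34 x FEV1%)
    Higher values indicate more severe functional impairment.
    """
    return 91.0 - 0.65 * tlco_pct - 0.53 * fvc_pct + 0.34 * fev1_pct


# Hypothetical patient, for illustration only (not study data).
example_cpi = composite_physiological_index(tlco_pct=40.0, fvc_pct=70.0, fev1_pct=75.0)
print(round(example_cpi, 1))
```

Because Tlco and FVC enter with negative coefficients, lower Tlco or FVC values raise the index.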

Time to decline was quantified using serial pulmonary functional indices starting from first presentation to the Royal Brompton Hospital. Significant functional deterioration was defined as a decline (quantified as percentage change from baseline) of ≥10% in FVC and/or ≥15% in Tlco, according to established criteria.18 ,19 In IPF, a relentlessly progressive disease, one subsequent time point from baseline was the minimum requirement for follow-up, and time to first observed decline was used. In SSc-ILD, to allow for possible response to treatment or spontaneous fluctuations, at least two follow-up time points from baseline were required. For SSc-ILD, time to irreversible decline was used, defined as time to the first significant change in FVC and/or Tlco (as above) observed on at least two consecutive occasions, as previously described.28 When follow-up ended with functional decline on a single occasion, this was accepted as a significant decline provided there was symptomatic or radiological evidence of worsening. Data at a sufficient number of time points were available to calculate time to decline in 95 patients with IPF and 177 with SSc-ILD.
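The two decline definitions above can be sketched in code. This is an illustration under stated assumptions, not the procedure actually used: each series is assumed to be a list of (months since baseline, value) pairs with the baseline first, decline is taken as relative percentage change from baseline, and all names and example values are hypothetical. The additional rule that accepts a single final decline when there is symptomatic or radiological worsening is not modelled.

```python
from typing import List, Optional, Tuple

FVC_DECLINE_PCT = 10.0   # threshold for FVC, percent change from baseline
TLCO_DECLINE_PCT = 15.0  # threshold for Tlco, percent change from baseline

Visit = Tuple[float, float]  # (months since baseline, measured value)


def percent_decline(baseline: float, value: float) -> float:
    """Relative fall from baseline, as a percentage (positive means worse)."""
    return 100.0 * (baseline - value) / baseline


def time_to_first_decline(visits: List[Visit], threshold_pct: float) -> Optional[float]:
    """IPF-style rule: time of the first visit meeting the decline threshold."""
    baseline = visits[0][1]
    for months, value in visits[1:]:
        if percent_decline(baseline, value) >= threshold_pct:
            return months
    return None  # no decline observed during follow-up


def time_to_sustained_decline(visits: List[Visit], threshold_pct: float) -> Optional[float]:
    """SSc-ILD-style rule: first visit where the threshold is met on that
    visit and on the next consecutive visit."""
    baseline = visits[0][1]
    flags = [(m, percent_decline(baseline, v) >= threshold_pct) for m, v in visits[1:]]
    for (months, met), (_, next_met) in zip(flags, flags[1:]):
        if met and next_met:
            return months
    return None


# Hypothetical FVC series, for illustration only.
fvc_series = [(0, 80.0), (6, 76.0), (12, 71.0), (18, 74.0), (24, 70.0), (30, 69.0)]
print(time_to_first_decline(fvc_series, FVC_DECLINE_PCT))
print(time_to_sustained_decline(fvc_series, FVC_DECLINE_PCT))
```

In this example the isolated dip at 12 months counts under the first-decline rule but not under the sustained rule, which is the distinction the text draws between IPF and SSc-ILD.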

Quantification of disease severity and progression in sarcoidosis

Based on the chest x-ray, staging of pulmonary disease was classified into stage 0 (normal chest x-ray), stage I (bilateral hilar lymphadenopathy (BHL)), stage II (BHL with pulmonary infiltrates), stage III (pulmonary infiltrates without BHL) and stage IV (pulmonary fibrosis), according to established criteria.20 Patients with sarcoidosis for whom lung function tests at presentation and 4 years were available (n=140) were also categorised into patients who showed a significant decline in lung function at the 4-year time point and those who did not. In order for the decline to be deemed clinically significant, it had to be confirmed at the following clinic visit. A significant decline in lung function was defined as a reduction in one or more of: Tlco by ≥15%, forced expiratory volume in 1 s (FEV1) by ≥10% or FVC by ≥10%.


Genotyping

Genotyping was carried out according to the manufacturer's instructions using a commercially available TaqMan assay and TaqMan universal PCR master mix, no AmpErase UNG (Applied Biosystems, Carlsbad, California, USA) on a Rotor-Gene 6000 real-time PCR machine (Qiagen Inc, Valencia, California, USA). Quality control and genotype determination were performed using Rotor-Gene 6000 Series Software 1.7 (Corbett Research, Sydney, Australia).

Statistical analysis

To test for deviation from Hardy–Weinberg equilibrium (HWE), genotype frequencies were determined by direct counting and the χ² statistic or Fisher exact test used as appropriate. χ² analyses for association were carried out in Unphased V.3.1,29 and the non-parametric Mann–Whitney test was used in GraphPad Prism V.4 to test for differences in median lung function measures according to carriage of the minor T allele. p Values <0.05 were considered significant. The current study had 96% power to detect an association with an OR of at least 3 in the IPF cohort, well below the published OR for the association between the MUC5B SNP and IPF in two separate populations.1,2 The study had 80% power to detect an association with an OR of at least 2.0 with SSc-ILD and of at least 2.6 with sarcoidosis with lung fibrosis. The prognostic value of MUC5B genotype and allele carriage was assessed in relation to disease progression. Based on the number of patients, rate of decline in FVC and Tlco and length of follow-up, we estimated that this study had a power of 80% to detect a HR of at least 2 (or 0.5) according to MUC5B allele carriage in both the IPF and SSc-ILD cohorts. Cox proportional hazards analysis was used to evaluate time to decline in FVC and/or time to decline in Tlco, and the proportional hazards assumption was checked using the non-zero slope test developed by Therneau and Grambsch as implemented in Stata (Computing Resource Centre, Santa Monica, California, USA). In both the IPF and SSc-ILD univariate and multivariate Cox analyses, the slope was not significantly different from zero, indicating that the proportional hazards assumption was not violated. Stepwise multivariate analysis was used to adjust for potential confounding factors including age, gender, smoking status and disease severity (CPI at first presentation), with retention of significant variables in the final regression equations.
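As an illustration of the HWE check, the χ² goodness-of-fit test for a biallelic SNP can be sketched with the standard library alone. This is a sketch under assumptions, not the software used in the study: the function name and the genotype counts are hypothetical, only the χ² variant (1 degree of freedom) is shown, and the Fisher exact alternative used for small counts is not covered.

```python
import math
from typing import Tuple


def hwe_chi_square(n_gg: int, n_gt: int, n_tt: int) -> Tuple[float, float]:
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Takes observed genotype counts for a biallelic SNP (alleles G and T) and
    returns (chi-square statistic, p value). Both alleles must be observed,
    otherwise an expected count is zero and the statistic is undefined.
    """
    n = n_gg + n_gt + n_tt
    p_t = (2 * n_tt + n_gt) / (2 * n)  # T allele frequency by direct counting
    p_g = 1.0 - p_t
    if p_t == 0.0 or p_g == 0.0:
        raise ValueError("monomorphic sample: HWE test is undefined")
    expected = (n * p_g ** 2, 2 * n * p_g * p_t, n * p_t ** 2)
    observed = (n_gg, n_gt, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only (not study data).
chi2, p = hwe_chi_square(n_gg=337, n_gt=75, n_tt=4)
print(f"chi2={chi2:.3f}, p={p:.3f}")
```

A p value above 0.05 would be read as no evidence of deviation from HWE, in line with the significance threshold stated above.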


Results

Demographic and clinical data of the IPF, SSc and sarcoidosis cohorts are presented in table 1. Genotyping was successful for all samples except for one control sample. This ungenotyped sample is not included in the reported cohort size (n=416). The genotype distribution in all of the analysed cohorts, including all patient subgroups, conformed to HWE. The minor MUC5B T allele previously associated with IPF1,2 was present at a significantly higher frequency in the IPF patient cohort than in the healthy controls (36% vs 10%, p=2.04×10⁻¹⁷, table 2). The genotype p value was 9.11×10⁻¹⁷ (see online supplementary table S1), and the ORs for IPF in individuals heterozygous or homozygous for the T allele were 6.62 (95% CI 4.10 to 10.67) and 11.81 (95% CI 4.26 to 33.72), respectively.
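For readers unfamiliar with how an allelic OR and its confidence interval are obtained from a 2×2 table of allele counts, a minimal sketch follows. The CI method is not stated in the text, so the Woolf (log-based) interval used here is an assumption, and the counts are purely illustrative; they are not the study's data and do not reproduce the reported ORs.

```python
import math
from typing import Tuple


def allelic_odds_ratio(case_t: int, case_g: int,
                       control_t: int, control_g: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio for carrying the T allele in cases versus controls.

    Inputs are allele counts (two alleles per individual). Returns
    (OR, lower, upper), where the interval is the Woolf log-based CI;
    z=1.96 gives an approximate 95% interval. All counts must be non-zero.
    """
    odds_ratio = (case_t * control_g) / (case_g * control_t)
    se_log_or = math.sqrt(1 / case_t + 1 / case_g + 1 / control_t + 1 / control_g)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative allele counts only (hypothetical, not taken from table 2).
or_value, ci_low, ci_high = allelic_odds_ratio(case_t=40, case_g=60,
                                               control_t=10, control_g=90)
print(f"OR={or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 corresponds to a significant association at the chosen level.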

Table 1

Patient demographics and clinical characteristics at presentation

Table 2

Allele frequency in control and ILD patient cohorts

By contrast, there was no significant difference in the MUC5B allele or genotype frequencies between the healthy controls and patients with SSc as a whole, or according to presence or absence of ILD (table 2 and online supplementary table S1). Similarly, no difference was found between the control cohort and patients with sarcoidosis as a whole or according to the presence of non-fibrotic (stages 0–III) or fibrotic (stage IV) pulmonary involvement (table 2 and online supplementary table S1).

Association with disease severity and rate of decline

Idiopathic pulmonary fibrosis

Median FVC% predicted, Tlco% predicted and CPI did not differ according to carriage or non-carriage of the minor T allele (table 3). With a median follow-up time of 13.7 months, 61 of 95 patients (64.2%) had a decline in FVC (median time to decline 10.7 months) and 60 of 95 patients (63.3%) had a decline in Tlco (median time to decline 10.8 months). No significant difference in time to decline in FVC (HR 0.78, 95% CI 0.47 to 1.30, p=0.3) or Tlco (HR 1.03, 95% CI 0.61 to 1.73, p=0.9) was seen according to carriage of the disease-associated T allele on univariate analysis (see online supplementary figures S1A,B). However, after multivariate stepwise regression leaving only significant covariates in the equation (CPI), the minor T allele was associated with a longer time to decline in FVC, although this fell just short of conventional statistical significance (HR 0.59, 95% CI 0.35 to 1.005, p=0.052), while no association was seen with time to decline in Tlco (HR 0.96, 95% CI 0.56 to 1.62, p=0.9), even after adjustment for CPI.

Table 3

Lung function impairment in IPF and SSc-ILD according to carriage of the T allele

Systemic sclerosis-associated interstitial lung disease

Median FVC% predicted, Tlco% predicted and CPI did not differ according to carriage or non-carriage of the minor T allele (table 3). With a median follow-up time of 98.4 months, 104 of 177 patients (58.8%) had a decline in FVC (median time to decline 35.1 months) and 124 of 177 patients (70.1%) had a decline in Tlco (median time to decline 42 months). Time to decline in FVC, Tlco or either measure did not correlate with carriage of the T allele, both on univariate and multivariate stepwise regression analysis, after adjustment for age, gender, smoking status and CPI (adjusted HR for time to decline in FVC 1.16, 95% CI 0.7 to 1.9, p=0.5; adjusted HR for time to decline in Tlco 1.19, 95% CI 0.7 to 1.9, p=0.5) (see online supplementary figures S2A and S2B). With regard to autoantibody status, no significant difference was observed in allele frequencies between controls and patients with SSc according to the presence of ATA or ACA antibodies (table 2).


Sarcoidosis

The MUC5B SNP was not significantly associated with severity of lung function impairment at first presentation or at 2 and 4 years of follow-up (table 4). Furthermore, no significant association was observed with lung disease progression in sarcoidosis, with no significant difference seen between the minor allele frequency (MAF) in controls and that in patients who did (n=27, MAF=0.06, p=0.67) or did not (n=113, MAF=0.11, p=0.28) show a substantial decline in lung function at 4 years of follow-up.

Table 4

Pulmonary function in sarcoidosis according to carriage of the minor T allele


Discussion

We found no association between the MUC5B promoter region polymorphism rs35705950 and the presence of lung fibrosis in the context of SSc or sarcoidosis, while we confirm, in a UK Caucasian population, the association with sporadic IPF. The lack of an association with lung fibrosis in the context of SSc and sarcoidosis suggests that the MUC5B variant is not related to shared fibrotic mechanisms across fibrotic lung diseases but is instead associated with an IPF-specific pathway, differing from SSc- and sarcoidosis-associated lung fibrosis which are more likely to be related to immunological/inflammatory triggers.

In IPF, experimental evidence suggests that the respiratory epithelial abnormalities have an important pathogenetic role.30–33 Bronchiolisation of the distal lung epithelium with aberrant expression of MUC5B has been observed in the lungs of patients with IPF and may be related to abnormal differentiation of the respiratory epithelium.33 Overexpression of MUC5B is preferentially observed in honeycombed lesions;1 interestingly, honeycombing is present, if at all, only to a minimal degree in the context of idiopathic or SSc-associated fibrotic non-specific interstitial pneumonia (NSIP). One might therefore speculate that aberrant expression of MUC5B is involved in the development of architectural distortion not seen to a similar degree in SSc-ILD, mostly characterised by an NSIP pattern.34 If this were the case, one could expect the MUC5B polymorphism not to be associated with idiopathic fibrotic NSIP, a hypothesis which will require a multicentre collaborative approach in view of its relative rarity.

Limitations of this study include the relatively small numbers in each disease group and the lack of a replication cohort for the positive association observed in IPF. However, our IPF study is in effect a replication cohort, as the association has previously been described by two separate groups.1 ,2 Regarding SSc-ILD and sarcoidosis, it is possible that we were unable to detect a weaker association than the one seen in IPF due to insufficient power. However, the lack of association with SSc-ILD is in accordance with the recent publication by Peljto et al35 where the cohort was approximately half the size of that used in this study. Nevertheless, in view of the limitations in terms of power, further studies using larger patient cohorts of non-IPF patients are required to confirm the findings of this study.

Subgroup analyses of putative associations with disease progression within each disease cohort were hampered by relatively small numbers, limiting the power to detect an association. Nevertheless, following adjustment for disease severity, we observed a trend bordering on statistical significance between the MUC5B variant and slower decline in FVC in patients with IPF. Interestingly, an association between the MUC5B variant and less severe pathological changes was recently reported in familial interstitial pneumonia.36 Although this observation will require further evaluation in separate populations, it provides a possible link between this polymorphism and disease behaviour in IPF, suggesting the need for further exploration. In both the SSc-ILD and sarcoidosis groups the analysis was potentially underpowered to pick up a more subtle effect on rate of decline. Furthermore, it is not possible to exclude the possibility that the MUC5B variant has an effect on the natural untreated history of the disease as, in both groups, patients observed to decline would have had immunosuppressive treatment initiated or altered. As the choice of treatment, timing and duration varied significantly during the years of follow-up, it is not possible to correct for treatment effects in the analysis. However, we can at least conclude that the MUC5B variant is not associated with marked disease progression in a group of unselected SSc-ILD and sarcoidosis patients receiving intended ‘best management’.

Seibold et al showed that, in the lungs of healthy individuals, high mRNA expression levels of MUC5B were significantly associated with carriage of the IPF-associated T allele. Interestingly, although IPF lungs expressed 14.1 times higher levels of MUC5B RNA than control lungs, there was no relationship seen between MUC5B genotype and expression levels, in contrast to control lungs.1 It is possible that, in IPF patients homozygous for the G allele, MUC5B expression is upregulated as a result of aberrant expression of an upstream/downstream factor. The role played by MUC5B overexpression in the pathogenesis of IPF is currently unknown, although a number of possible mechanisms have been proposed.1 ,37 In order to better understand the pathobiology of IPF, further work to investigate the effect of excess MUC5B on injury response in the lung is needed. Evaluation of MUC5B expression in lung tissue of patients with SSc-ILD, fibrotic sarcoidosis and idiopathic NSIP would also be informative. Overexpression could suggest involvement of genetic variants upstream or downstream of MUC5B, or at least the importance of this pathway across fibrotic lung diseases, while the absence of MUC5B overexpression would highlight MUC5B and related pathways as IPF-specific.

Despite the evidence that MUC5B expression correlates with rs35705950,1 further work is required to establish a direct relationship between this SNP and regulation of expression. It could be acting as a marker for another polymorphism with which it is in linkage disequilibrium. Although Seibold et al demonstrated that rs35705950 shows an independent effect on susceptibility to IPF, it is possible that unscreened genetic variants, particularly in unscreened repetitive regions which the authors were unable to assess, may be in strong linkage disequilibrium with rs35705950 and be the true disease-associated variant. This is particularly true in the SSc and sarcoidosis cohorts, in which MUC5B polymorphisms have not been previously screened. A broader gene-focused rather than a single SNP-focused study would be necessary to determine if other MUC5B variants are involved in the pathogenesis of SSc-ILD or sarcoidosis.

In conclusion, we confirm that the MUC5B polymorphism rs35705950 is significantly associated with susceptibility to IPF in this UK Caucasian population. However, in this population, the MUC5B variant is not a risk factor for development of lung fibrosis in SSc or sarcoidosis. We found that it is not a marker of more rapidly progressive disease in any of the three groups and, rather, there was a trend for a slower decline in FVC observed in the patients with IPF which will require further investigation.


Acknowledgements

We are grateful to all of the patients for allowing us to collect blood samples for genetic association studies over the years.


  • Contributors CJS and EAR contributed to the study design, data acquisition, data analysis, interpretation and drafting of the manuscript. CF, GEL and DJA contributed to analysis and interpretation. HS, PLM, HA, AMR, CPD and TMM contributed to data acquisition and critical revision of the manuscript. DMH and AGN contributed to critical revision of the manuscript. AUW contributed to the analysis and interpretation of the data and to critical revision. WASB contributed to statistical analysis. All authors approved the final version of the manuscript. EAR is guarantor of the manuscript.

  • Ethics approval The ethics committees of the Royal Brompton Hospital and of the Royal Free Hospital gave authorisation for the study.

  • Patient consent All participants gave written informed consent.

  • Funding This work was funded by the Raynaud's and Scleroderma Association (BR10), the Asmarley Trust (B0498), Arthritis Research UK (19291 and 19427) and the Scleroderma Society. AGN was partly supported by the NIHR Respiratory Disease Biomedical Research Unit at the Royal Brompton and Harefield NHS Foundation Trust and Imperial College London. The project was supported by the NIHR Respiratory Disease Biomedical Research Unit at the Royal Brompton and Harefield NHS Foundation Trust and Imperial College London.

  • Competing interests None.

  • Provenance and review Not commissioned; externally peer reviewed.
