Background The American and European cystic fibrosis (CF) guidelines recommend different diagnostic criteria. This study assessed diagnostic concordance between these recommendations.
Methods Subjects with single organ manifestations suggestive of CF (chronic sinopulmonary disease (RESP), chronic/recurrent pancreatitis (PANC) or obstructive azoospermia (AZOOSP)) were prospectively evaluated by sweat test, nasal potential difference and genotyping. Concordance in diagnostic outcomes between the two algorithms was measured using observed agreement and κ statistics.
Results A total of 208 subjects were evaluated. Observed agreement was 84.8% and level of agreement was excellent (κ=0.87) between the American and European recommendations. The RESP phenotype was associated with the highest degree of concordance (observed agreement ≥90%, κ=0.92) compared with the PANC (observed agreement 86%, κ=0.65) and AZOOSP (observed agreement 80%, κ=0.87) phenotypes. Incorporation of nasal potential difference into the American algorithm failed to improve the overall degree of concordance (good agreement level; κ=0.75); the level of agreement was unchanged in RESP and PANC subjects, but reduced in AZOOSP subjects (from excellent to good). Extensive genotyping had limited clinical utility in the diagnosis of CF in both algorithms.
Conclusions Despite inconsistencies between the American and European diagnostic recommendations, concordance in diagnostic outcomes among subjects presenting with single organ manifestations of CF was good to excellent. These diagnostic guidelines provide guidance and promote rigorous evaluation for the diagnosis of CF but neither guideline should be regarded as dogma.
- Cystic fibrosis
- cystic fibrosis transmembrane conductance regulator
- diagnostic tests
- clinical epidemiology
- respiratory infection
What is the key question?
Do individuals with single organ manifestations of cystic fibrosis (CF) have different diagnostic outcomes when assessed by American or European diagnostic guidelines for CF?
What is the bottom line?
Overall, there was ‘good to excellent’ concordance between the American and European diagnostic algorithms for CF.
Why read on?
Approximately 15% of patients had discordant diagnostic outcomes, a third of which were related to different cutoffs for borderline sweat chloride concentration (30 vs 40 mmol/litre). Concordance was not improved when adjunctive nasal potential difference testing was included in the American algorithm.
Cystic fibrosis (CF) was previously thought to be a multisystem disease that manifests either at birth (with intestinal obstruction) or in infancy/early childhood (with growth failure and recurrent sino-pulmonary symptoms). It is now recognised that a broad spectrum of conditions is associated with mutations in the CF transmembrane conductance regulator (CFTR) gene. This spectrum includes older children and adults presenting with manifestations in one organ, including sino-pulmonary diseases, pancreatitis or obstructive azoospermia.1–6 In many of these individuals, a diagnosis of CF is difficult to establish or exclude.
The US CF Foundation and European CF Society each convened expert panels to establish consensus on the diagnostic criteria for CF.7 8 Different terminologies and the lack of objective evidence to support or counter expert opinions resulted in differences in the application and interpretation of diagnostic tests. The American report recommended that diagnostic terminology be limited to ‘CF’, ‘CFTR-related disorder’ or ‘unlikely CF’. The European guidelines label individuals as having ‘classic CF’, ‘CFTR dysfunction’ (within which ‘non-classic/atypical CF’ or an item from the WHO diagnostic list can be ascribed), ‘inconclusive’ or ‘unlikely CF’.
For initial testing, the American guidelines recommended sweat chloride with CFTR genotyping, whereas the European guidelines begin with sweat chloride or CFTR genotyping. The European guideline recommends sweat chloride of 30 mmol/litre as the lower cutoff for the intermediate range at all ages. However, the American guideline uses 30 mmol/litre as the cutoff up to 6 months of age; thereafter the cutoff is raised to 40 mmol/litre.7 8 Both reports recommended initial genotyping using a limited CFTR mutation panel. The American report limited the list to 23 mutations that were established by the American College of Medical Genetics (ACMG).9 In contrast, the European algorithm recommended mutations that reflect the distribution and frequency of the local population and classified patients according to the number of CFTR mutations identified (0, 1 or 2) without providing guidance concerning their consequences (CF-causing vs CFTR-related disorder vs no consequence). Both guidelines reserve extensive genotyping for cases with diagnostic uncertainty.
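The difference in lower sweat chloride cutoffs can be illustrated with a minimal Python sketch. The function name `sweat_category` and the handling of boundary values (exactly 30, 40 or 60 mmol/litre, and an age of exactly 6 months) are assumptions made for illustration; the upper limit of 60 mmol/litre is taken from the statement later in the text that both guidelines associate values >60 mmol/litre with CF.

```python
def sweat_category(chloride_mmol_l: float, age_months: float, guideline: str) -> str:
    """Classify a sweat chloride value as normal, intermediate or abnormal.

    guideline is "american" or "european". Boundary handling is an assumption.
    """
    if guideline == "european":
        lower = 30.0  # lower cutoff of the intermediate range at all ages
    elif guideline == "american":
        # 30 mmol/litre up to 6 months of age, 40 mmol/litre thereafter
        lower = 30.0 if age_months <= 6 else 40.0
    else:
        raise ValueError("guideline must be 'american' or 'european'")

    upper = 60.0  # values above this are treated as abnormal by both guidelines
    if chloride_mmol_l < lower:
        return "normal"
    if chloride_mmol_l > upper:
        return "abnormal"
    return "intermediate"


# An adult with a sweat chloride of 35 mmol/litre is handled differently:
print(sweat_category(35, 30 * 12, "american"))  # normal
print(sweat_category(35, 30 * 12, "european"))  # intermediate
```

Values between 30 and 40 mmol/litre after 6 months of age are the only range in which the two lower cutoffs lead to different categories.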
The two reports offer differing recommendations concerning the role of ancillary tests, such as the nasal potential difference (NPD) test. The American report recommended that NPD be used to provide supportive evidence of CF when a diagnosis remains uncertain. The European report incorporated NPD into its diagnostic algorithm, thereby advocating it as the definitive test in cases of diagnostic uncertainty.10 Neither report offered recommendations concerning standard operating procedures and diagnostic reference values.
We tested concordance between the European and American recommendations by evaluating the diagnostic outcomes of prospectively ascertained, undiagnosed individuals, referred to the Toronto CF clinics with single organ manifestations of CF.
This study was approved by the research ethics boards of all participating institutions (#0020020091 Hospital for Sick Children, #02-156 St Michael's Hospital and #03-0084-E Mt Sinai Hospital). Written consent was obtained from all subjects.
Undiagnosed individuals with single organ manifestations of CF were prospectively and consecutively enrolled (1994–2008) into a study cohort designed to re-evaluate the diagnostic parameters of CF disease. This included subjects with idiopathic chronic sinopulmonary disease (RESP), idiopathic recurrent, acute or chronic pancreatitis (PANC) or men with infertility due to obstructive azoospermia (AZOOSP). Idiopathic sinopulmonary disease was defined as recurrent or chronic sinusitis (including sinusoidal pain, nasal discharge, postnasal drip), nasal polyps, recurrent or chronic bronchitis, recurrent pneumonia and/or bronchiectasis for at least 6 months. All enrolled subjects with sinopulmonary disease had three or more of these symptoms. If not done prior to referral, RESP subjects were tested for immunodeficiency, α-1-antitrypsin deficiency, allergic bronchopulmonary aspergillosis, non-tuberculous mycobacteria, and primary ciliary dyskinesia. Patients were also screened for conditions known to be associated with bronchiectasis (eg, rheumatoid arthritis, other collagen vascular diseases and inflammatory bowel disease). Patients diagnosed as having any of these disorders were excluded from the study. A diagnosis of idiopathic recurrent acute pancreatitis was accepted following at least two episodes of abdominal pain associated with raised serum amylase and/or lipase (more than two times the upper limit of the reference range), and/or imaging evidence of acute pancreatitis such as pancreatic oedema, haemorrhage or necrosis. Patients with chronic pancreatitis had chronic pain in association with pancreatic calcifications and/or characteristic ductal changes. A diagnosis of obstructive azoospermia (congenital unilateral or bilateral absence of vas deferens) was confirmed by physical examination, transrectal ultrasound and evidence of azoospermia on two separate occasions. No patients were excluded on the basis of sex or race (defined by patient self-report). 
Exocrine pancreatic function was assessed using one or more tests, including 72 h faecal fat, faecal elastase-1 and/or serum cationic trypsinogen. Seventeen PANC, 60 AZOOSP and 72 RESP subjects have been reported elsewhere in a different context.3 6 11
Ion channel measurements (sweat test and NPD)
Sweat testing (Gibson and Cooke12 (before 2005) or Macroduct13 methods) and NPD were performed on the same day. NPD was performed according to Knowles and colleagues by a single operator masked to other test results.14 The change in CFTR-mediated chloride diffusion following perfusion with a chloride-free solution and isoproterenol (ΔCl-free+Iso) was used as the diagnostic parameter. The reference range was determined from measurements in cohorts of healthy controls (n=84), obligate heterozygotes (n=48) and patients with established CF (n=112); ΔCl-free+Iso was interpreted as normal (<−12 mV), intermediate (−12 to −7.7 mV), and abnormal (>−7.7 mV) (figure 1).
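The NPD reference ranges quoted above can be expressed as a short Python sketch. The function name `npd_category` is hypothetical, and assigning the exact boundary values (−12 and −7.7 mV) to the intermediate band is an assumption based on the wording of the ranges.

```python
def npd_category(delta_cl_free_iso_mv: float) -> str:
    """Interpret the chloride-free + isoproterenol response (in mV)."""
    if delta_cl_free_iso_mv < -12.0:
        return "normal"        # more negative than -12 mV
    if delta_cl_free_iso_mv > -7.7:
        return "abnormal"      # less negative than -7.7 mV
    return "intermediate"      # -12 to -7.7 mV inclusive (assumed)


for value in (-20.0, -10.0, -3.0):
    print(value, npd_category(value))
```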
The 23 CFTR mutations recommended by the ACMG were used as the initial screening test for both algorithms.9 For the American algorithm, the second step of CFTR mutation analysis included a conservative list of additional mutations which fulfilled the requirements of two consensus criteria as CF-causing mutations (figure 2, footnote).7 18 For the European diagnostic process, interpretation of results from extensive genotyping was based on the number of mutations identified.
The American and European diagnostic processes were independently applied to all subjects. The European algorithm with sweat testing was used as the initial assessment (figure 2). Concordance between the two diagnostic algorithms was examined using observed agreement and κ statistics. For the American algorithm, concordance was measured with and without NPD testing. The following assumptions were made regarding the differing terminologies used by American and European reports, respectively: ‘CF’ = ‘classic CF’, ‘CFTR-related’ = ‘CFTR dysfunction’ and ‘unlikely CF’ = ‘unlikely CF’. Patients with an ‘inconclusive’ outcome in the European algorithm were excluded from concordance analyses. Observed agreement was calculated as the number of patients with the same diagnosis divided by the total number of patients. κ values can range from –1 (complete disagreement) to 1 (perfect agreement) and were interpreted by the degree of agreement: κ<0.20 is poor, κ=0.21–0.40 is fair, κ=0.41–0.60 is moderate, κ=0.61–0.80 is good and κ=0.81–1.00 is excellent.19
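As an illustration of these two measures, the following Python sketch computes observed agreement and an unweighted Cohen's κ for two lists of harmonised diagnostic labels, then maps κ onto the agreement bands listed above. The text does not state whether a weighted or unweighted κ was used, so the unweighted form here is an assumption and the sketch is not expected to reproduce the reported values. The function names, the treatment of κ≤0.20 as 'poor' and the six-patient data set are all hypothetical.

```python
from collections import Counter


def observed_agreement(a, b):
    """Fraction of patients given the same diagnosis by both algorithms."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the marginal frequencies of each label
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def agreement_label(kappa):
    """Map kappa onto the bands quoted in the text (boundaries assumed)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"


# Hypothetical data: European labels already mapped to the American terms
american = ["CF", "CF", "CFTR-related", "unlikely CF", "unlikely CF", "unlikely CF"]
european = ["CF", "CF", "CFTR-related", "CFTR-related", "unlikely CF", "unlikely CF"]

kappa = cohen_kappa(american, european)
print(observed_agreement(american, european), kappa, agreement_label(kappa))
```

In this toy example five of six diagnoses match, chance agreement is 1/3, and κ is 0.75, which falls in the 'good' band.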
We prospectively recruited 208 subjects, consisting of 72 (34.6%) RESP, 44 (21.2%) PANC and 92 (44.2%) AZOOSP. The mean age (SD; range) at the time of evaluation of RESP, PANC and AZOOSP subjects was 38.5 (15.9; 9.9–66.7), 24.3 (13.2; 7.9–59.9) and 34.8 (5.3; 25.4–56.6) years, respectively. Fifty-one (70.8%) RESP and 26 (59.1%) PANC subjects were women. The number and type of identified mutations is summarised in table 1 and in the supplemental material.
According to the American recommendations, 37 (17.8%), 45 (21.6%) and 126 (60.6%) subjects could be diagnosed as having ‘CF’, ‘CFTR-related disorder’ and ‘unlikely CF’, respectively (table 2 and figure 2). Of note, second tier extensive genotyping for those with borderline sweat tests failed to confirm a ‘CF’ diagnosis in any additional subjects. When NPD testing was incorporated into the American algorithm, 35 out of 45 patients with an initial outcome of ‘CFTR-related disorder’ were reclassified as either ‘unlikely CF’ (n=16) or ‘CF’ (n=19). Among the 35 patients, there were 6, 2 and 27 RESP, PANC and AZOOSP subjects, respectively. Hence, when NPD was included, 56 (26.9%), 10 (4.8%) and 142 (68.3%) subjects could be diagnosed as having ‘CF’, ‘CFTR-related disorder’ and ‘unlikely CF’, respectively.
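The reclassification arithmetic in this paragraph can be checked with a small Python sketch. The counts are those reported above; the function name `reclassify` and the dictionary layout are illustrative assumptions.

```python
def reclassify(before, to_cf, to_unlikely):
    """Move patients out of 'CFTR-related disorder' after NPD testing."""
    after = dict(before)
    after["CFTR-related disorder"] -= to_cf + to_unlikely
    after["CF"] += to_cf
    after["unlikely CF"] += to_unlikely
    return after


# American algorithm without NPD (counts from the text)
before = {"CF": 37, "CFTR-related disorder": 45, "unlikely CF": 126}
# 35 reclassified: 19 to 'CF' and 16 to 'unlikely CF'
after = reclassify(before, to_cf=19, to_unlikely=16)

total = sum(after.values())
for label, n in after.items():
    print(f"{label}: {n} ({100 * n / total:.1f}%)")
```

The totals before and after reclassification both sum to 208, and the resulting counts (56, 10 and 142) match those reported when NPD was included.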
Using the European algorithm, 35 (16.8%), 30 (14.4%), 11 (5.3%) and 132 (63.5%) patients were diagnosed as having ‘classic CF’, ‘CFTR dysfunction’, ‘inconclusive’ and ‘unlikely CF’, respectively (table 2 and figure 2). All 35 patients with ‘classic CF’ would have been diagnosed as having ‘CF’ using American recommendations. Two subjects who were not diagnosed as having ‘classic CF’ by the European recommendations but considered to have ‘CF’ based on the American recommendation had PANC phenotypes, borderline sweat tests, and two CF-causing mutations (F508del/3849+10 kb C>T and sweat chloride 31 mmol/litre; F508del/3659delC and sweat chloride 55 mmol/litre).
Three (1.4%) patients with pancreatic insufficiency (all RESP) were diagnosed as having ‘CF’ and ‘classic CF’ according to the American and European diagnostic algorithms, respectively.
Concordance between American and European guidelines
Concordance analysis between the American algorithm without NPD testing and the European algorithm with NPD testing was performed after excluding patients whose condition was labelled as ‘inconclusive’ by the European report. Observed agreement was 84.8% (167/197); that is, 30 patients (15.2%) had discrepant diagnoses. The corresponding κ was 0.87 (95% CI 0.82 to 0.92), suggesting an ‘excellent’ level of agreement (table 3). Discrepancies in 10 of 30 (33.3%) subjects were due to differences in the lower limit of the borderline sweat chloride concentration (30 vs 40 mmol/litre) (table 4). Extensive genotyping demonstrated that two and eight subjects carried one and two CFTR mutations, respectively. Nine of 10 patients were considered to have ‘unlikely CF’ according to the American criteria, whereas the European recommendations yielded a diagnosis of ‘CFTR dysfunction’. According to the American criteria, the remaining subject (PANC) was diagnosed as having ‘CF’ due to the identification of two CF-causing mutations (F508del and 3849+10 kb C>T). In contrast, this subject was considered to have ‘CFTR dysfunction’ by the European criteria due to borderline sweat chloride of 31 mmol/litre and identification of two mutations following initial mutation screening. Discrepant diagnoses in the remaining 20 subjects were due to differences in interpreting genotypes and recommendations that led to NPD testing.
When NPD testing was incorporated into the American algorithm, more subjects (33/197, 16.8%) had discordant diagnoses than with sweat testing and genotyping alone. The observed agreement was 83.2% (164/197), which can be interpreted as a ‘good’ level of agreement (κ=0.75; 95% CI 0.67 to 0.83).
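The observed agreement percentages for the two comparisons follow directly from the reported counts, as the following Python sketch shows. The helper `pct` is a hypothetical name; all counts are taken from the text.

```python
def pct(k, n):
    """Percentage of k out of n, rounded to one decimal place."""
    return round(100 * k / n, 1)


n_recruited = 208
n_inconclusive = 11          # excluded from the concordance analyses
n_analysed = n_recruited - n_inconclusive

# American algorithm without NPD versus European algorithm
agree_without_npd = 167
# American algorithm with NPD versus European algorithm
agree_with_npd = 164

print(n_analysed)
print(pct(agree_without_npd, n_analysed), pct(n_analysed - agree_without_npd, n_analysed))
print(pct(agree_with_npd, n_analysed), pct(n_analysed - agree_with_npd, n_analysed))
```

The denominator of 197 is the 208 recruited subjects minus the 11 labelled ‘inconclusive’ by the European algorithm.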
The degree of concordance varied according to phenotype (table 3), with the RESP phenotype demonstrating the greatest degree of concordance (>90% observed agreement; ‘excellent’ agreement). The degree of concordance between the American and European guidelines did not change among RESP and PANC subjects when NPD testing was included in the American algorithm. Conversely, the level of agreement reduced from excellent to good for AZOOSP subjects when NPD testing was added to the American algorithm. Eleven patients whose condition was labelled as ‘inconclusive’ by the European criteria were categorised as having a ‘CFTR-related disorder’ (n=7) and ‘unlikely CF’ (n=4) by the American criteria (see online supplemental material). No subject was diagnosed as having ‘CF’ by the American criteria and conversely classified as ‘unlikely CF’ by the European guidelines, and vice versa.
There was ‘good to excellent’ concordance between the American and European recommendations. This is reassuring because the patient population we evaluated represents those that are most likely to be associated with diagnostic challenges. The greatest concordance was observed in subjects with the RESP phenotype, which is also reassuring since this phenotype is associated with the highest morbidity and mortality. In addition, all patients diagnosed as having ‘classic CF’ using European recommendations were classified as having ‘CF’ by the American guidelines. No subject was diagnosed as having ‘CF’ by the American guideline but concurrently classified as ‘unlikely CF’ by the European recommendations, and vice versa. This outcome was not surprising for the following reasons: both guidelines universally accepted that sweat chloride concentrations >60 mmol/litre and <30 mmol/litre were associated with ‘CF’ and ‘unlikely CF’, respectively; we applied the same 23 mutations recommended by the ACMG to both algorithms; and extensive genotyping failed to confirm the diagnosis of CF in any patients with a normal or borderline sweat chloride concentration (American guidelines) despite expansion of the list of CF-causing mutations.
Nevertheless, there was a notable ‘real-life’ problem of discrepant diagnoses among 15% of subjects. One-third of the discrepant diagnoses were due to differences in the lower cutoff for sweat chloride concentration (30 vs 40 mmol/litre) after 6 months of age. Mishra et al20 determined that the upper limit of the sweat chloride concentration in healthy 5–9 year olds was 39.5 mmol/litre, which is consistent with the American cutoff of 40 mmol/litre. In healthy subjects ≥10 years old, the upper limit of sweat chloride concentration overlapped into the intermediate range. The lower cutoff (30 mmol/litre) recommended in the European guidelines was based on case observations that sweat chloride values <40 mmol/litre can occur in a small subgroup of individuals with CF.3 6–8 21 The lower cutoff will subject a larger number of individuals without CF to diagnostic testing. In this study, there were nine subjects who would be discharged from the American algorithm as ‘unlikely CF’ due to sweat chloride <40 mmol/litre and the absence of two CF-causing mutations, but considered to have ‘CFTR dysfunction’ in the European algorithm. These nine individuals would possibly have a different clinical outcome and follow-up depending on which diagnostic guideline was applied. Since all nine subjects had abnormal NPD measurements in the CF range (table 4), the use of the higher intermediate sweat chloride cutoff of 40 mmol/litre (in conjunction with mutation screening) may miss individuals who have CFTR-related disorders or CF. Nonetheless, it could be argued that the choice of lower cutoff merely represents an entry point into the diagnostic algorithm for patients who may otherwise be missed by sweat testing and mutation screening.
Discrepancies in the remaining subjects arose from differences in the recommended sequence and/or interpretation of genotyping and NPD testing. While the American guidelines recommend sweat chloride testing and screening for the most common CF-causing mutations at the first stage of testing, the European guidelines recommend sweat chloride testing (or genotyping) alone. It is well recognised that some confirmed CF-causing mutations can be associated with a normal or borderline sweat test (eg, 3849+10 kb C>T). Thus, there is also the risk of a false-negative diagnostic outcome should sweat testing be performed in isolation. Extensive genotyping in both diagnostic algorithms did not significantly aid in the diagnosis of CF, consistent with findings from previous studies.3 6 Hopefully, the clinical utility of extensive genotyping will improve as more information (correlating specific mutations with functional and/or clinical data) emerges.22
Interestingly, the concordance between the American and European recommendations did not improve when NPD testing was included in the American algorithm, thus raising questions regarding the role of NPD. However, when NPD testing was incorporated into the American algorithm, 35 patients with an initial diagnosis of ‘CFTR-related disorder’ were reclassified as having ‘CF’ or ‘unlikely CF’. Therefore, it is arguable that NPD testing played a role in clarifying the diagnosis of CF in these 35 patients, especially in patients with normal NPD results, which make the diagnosis of CF very unlikely. False-positive results can occur for reasons not intrinsic to the NPD test, including technically incorrect catheter location and perturbations of the nasal epithelium from allergies, infections and smoking.3 6 7 23 24 However, the clinical utility of NPD (and alternative ex vivo intestinal current measurements25) remains limited by the lack of standardisation, validated reference values and a clear cutoff point for differentiating between individuals with CF and those with CFTR-related disorders.7 23 26
The vast majority of individuals in whom NPD testing suggested a diagnosis of CF were men with obstructive azoospermia (16 out of 19). Hence, clinical consideration is necessary concerning the diagnosis of CF in ‘healthy’ men with obstructive azoospermia. Follow-up should be offered to these patients because the presence of CFTR dysfunction may indicate the presence of subclinical disease and/or risk for future disease development in other affected organs (eg, pulmonary disease). Subclinical pulmonary disease has been reported in men with obstructive azoospermia associated with intermediate or abnormal sweat tests,27 but long-term pulmonary outcomes in this cohort are unknown.
Despite the ‘good to excellent’ concordance between the two guidelines, the use of different terminologies and definitions of disease is confusing for clinicians and patients, and may influence research study design and outcomes. When terminologies such as ‘mild/non-classic/atypical’ are used, there are risks of misinterpretation by both clinicians and patients as to whether an individual has CF disease or not, and of false reassurance about the potential impact of disease, since older patients presenting with broncho-pulmonary disease may progress to pulmonary failure and premature death or a life-saving lung transplant.7 28 Thus, there are benefits in having a unified guideline. To this end, a recent consensus document by European and North American experts recommended the term ‘CFTR-related disorders’ to describe subjects with CF-like manifestations in one or more organs, with evidence of CFTR dysfunction/mutation(s) that is insufficient to fulfil the current diagnostic criteria for CF.26
The limitations of this study include lack of follow-up clinical outcomes and repeat diagnostic test results, and limiting enrolment to older children and adults. Borderline sweat test results are known to occur in asymptomatic newborn screen-positive infants, giving rise to considerable diagnostic uncertainty.29 Another potential limitation was the use of two different sweat test techniques, which occurred due to a change in technique by our laboratory. We demonstrated that sweat chloride concentrations measured by Macroduct correlated highly with the Gibson and Cooke method in all ranges, including values in the intermediate range (r=0.93, p≤0.0001).30 Hammond et al reached similar conclusions in a large number of healthy controls and CF subjects.31 Furthermore, in a subanalysis of individuals with sweat chloride concentrations <60 mmol/litre, the 95% CI reduced to ±13 mmol/litre, well within the range recommended by the Clinical and Laboratory Standards Institute.32
To conclude, despite differences between the American and European diagnostic recommendations, statistical analyses demonstrated ‘good to excellent’ concordance among subjects who present with single organ manifestations of CF. Concordance was excellent among subjects with chronic sinopulmonary disease. Approximately 15% of patients had discordant diagnoses. While diagnostic algorithms support a rigorous approach to the evaluation of complex diseases such as CF, they should be regarded as general guidelines rather than dogma. We encourage efforts towards developing unified consensus diagnostic criteria for patients with CF.
The authors thank the many research subjects who gave up their time to participate in the project. The authors acknowledge the support and assistance of Louise Taylor, Susan Carpenter, Thora St Cyr, Rachel Paul, Debbie Ryan, Lenny Chong, Leia Spencer, Xiao-Wei Yuan, Qiuju Huang, Satti Beharry and Wan Ip.
Funding PD and ET were supported by research grants from the Canadian Cystic Fibrosis Foundation and Genome Canada through the Ontario Genomics Institute as per research agreement 2004-OGI-3-05, the Ontario Research Foundation and from the Lloyd Carr-Harris Foundation. CYO and TG were funded by the Canadian Cystic Fibrosis Foundation Fellowship Awards and CYO received a Canadian Child Health Clinician Scientist Program Career Enhancement Award. RD was funded by a CIHR-Ontario Women's Health Council joint Fellowship.
Competing interests None.
Ethics approval Ethics approval was provided by the ethics boards of the Hospital for Sick Children, St Michael's Hospital and Mt Sinai Hospital, Toronto.
Provenance and peer review Not commissioned; externally peer reviewed.