Background: The potential of autofluorescence bronchoscopy (AFB) to detect precancerous lesions in the central airways and its role in lung cancer screening is uncertain. A study was undertaken to evaluate the prevalence of moderate/severe dysplasia (dysplasia II–III) and carcinoma in situ (CIS) using a newly developed AFB system in comparison with conventional white light bronchoscopy (WLB) alone.
Methods: In a prospective randomised multicentre trial, smokers ⩾40 years of age (⩾20 pack-years) were stratified into four different risk groups and investigated with either WLB+AFB (arm A) or WLB alone (arm B).
Results: 1173 patients (916 men) of mean age 58.7 years were included. Overall (arms A and B), preinvasive lesions (dysplasia II–III and CIS) were detected in 3.9% of the patients. The prevalence of patients with preinvasive lesions in the WLB arm was 2.7% compared with 5.1% in the WLB+AFB arm (p = 0.037). For patients with dysplasia II–III, WLB+AFB increased the detection rate by a factor of 2.1 (p = 0.03), while for CIS the factor was only 1.24 (p = 0.75). The biopsy based sensitivity of WLB alone and WLB+AFB for detecting dysplasia II–III and CIS was 57.9% compared with 82.3% (1.42-fold increase). The corresponding specificity was 62.1% compared with 58.4% (0.94-fold decrease).
Conclusions: This first randomised study of AFB showed that the combination of WLB+AFB was significantly superior to WLB alone in detecting preneoplastic lesions. Our findings do not support the general use of AFB as a screening tool for lung cancer, but suggest that it may be of use in certain groups. The precise indications await further study.
- AFB, autofluorescence bronchoscopy
- CIS, carcinoma in situ
- WLB, white light bronchoscopy
- white light bronchoscopy
- autofluorescence bronchoscopy
- lung cancer
- preinvasive carcinoma
In spite of many advances in the detection and treatment of lung cancer in the last decade, the overall survival rate remains less than 15%.1,2 Prognosis strongly depends on the stage of the disease at diagnosis. Five-year survival rate in patients with stage I disease is about 70% and exceeds 90% in stage Ia disease. This indicates the need for early diagnosis in a preclinical stage.
Auerbach and co-workers first noted that preinvasive lesions of different grades of severity were associated with lung tumours of squamous cell histology, and these findings led to the hypothesis that squamous cell carcinoma arose from these preinvasive changes.3 Saccomano and co-workers found cells with increasing malignant features in sequential sputum specimens from miners who subsequently developed lung cancer.4 Also, animal studies mimicking human carcinogenesis support the hypothesis that squamous cell carcinoma develops along the pathway: squamous metaplasia → dysplasia → carcinoma in situ (CIS) → invasive carcinoma.5
Several investigative tools have been proposed for the detection of preinvasive lesions and early cancers. Spiral computed tomographic (CT) scanning is not suitable for detecting such findings in the central airways, especially the early stages of squamous cell carcinoma which comprise 17–29% of all lung cancers.6 Attempts to detect these centrally located malignancies by conventional sputum analysis had a low sensitivity for radiologically occult stages.7 Newly developed methods such as immunostaining or automated sputum cytometry are more sensitive but less specific.8,9 Conventional white light bronchoscopy (WLB) was also deemed insufficient since a previous study showed that only 29% of carcinomas in situ detected by sputum cytological examination could be localised by WLB.10
Autofluorescence bronchoscopy (AFB) was developed to detect preinvasive lesions, based on the observation that moderate to severe dysplasia and CIS show less fluorescence than normal tissue when excited by blue light (wavelength 380–460 nm). However, recent studies have reported conflicting results with a high variability of both the prevalence of preinvasive lesions in study populations and sensitivity of AFB in detecting these lesions. WLB+AFB increased the relative sensitivity of detecting dysplasia and CIS by a factor of 1.1–6.3 compared with WLB alone.11,12 In contrast, Kurie and co-workers in a small study failed to show an advantage of AFB over WLB.13
The aim of this study was to investigate the prevalence of moderate/severe dysplasia and CIS in different risk groups using a newly developed D-light AFB system compared with conventional WLB. At the same time, the sensitivity and specificity of AFB and WLB were compared by referring to histological analyses of biopsy specimens.
The clinical trial was conducted at eight institutions in Germany, Austria, Hungary, Italy and Switzerland from April 1999 to January 2003. The protocol was approved by local ethics committees. Written informed consent was obtained from all patients.
Current smokers (⩾40 years old) with a cigarette smoking history of at least 20 pack-years were included and stratified into four different risk groups (box 1). Smokers with symptoms such as change in the characteristics of their cough, occurrence of bloody sputum, increased dyspnoea, or a combination of these symptoms were enrolled into the trial with a clinical suspicion of bronchogenic carcinoma. Smokers with lesions suspicious for malignancy on the basis of chest radiographs or CT scans were recruited with a radiological suspicion of bronchogenic carcinoma. Imaging features included peripheral tumours, hilar masses or enlargement, and areas of consolidation or collapse.
Box 1 Inclusion criteria
Smokers ⩾40 years old (⩾20 pack-years) who belong to at least one of the following risk groups:
I: Known bronchogenic carcinoma; follow up of patients after surgical resection of bronchogenic carcinoma.
II: Radiological or clinical suspicion of bronchogenic carcinoma.
III: Abnormal cytological findings, normal radiograph.
IV: Evidence of COPD and/or occupational exposure.
589 patients were recruited to the WLB+AFB group and 584 patients to the WLB group. The distribution of the patients in all four risk groups was homogeneous (WLB+AFB v WLB: group I, 30.2% v 31%; group II, 55.7% v 55.1%; group III, 4.6% v 4.3%; group IV, 9.5% v 9.6%). Exclusion criteria were pregnancy, psychiatric disease, extensive purulent sputum, severe airway mucosal inflammation, severe haemoptysis, abnormal laboratory values (thromboplastin time <50, leukocytes <2000/μl, thrombocytes <100000/μl), intolerance of local/general anaesthetics, previous myocardial infarction or ischaemic event within the past 6 months, unstable angina pectoris, uncontrollable arrhythmia or severe ventricular arrhythmia.
The study was designed as a two group prospective, randomised, multicentre trial (fig 1). Block randomisation stratified by the four prognostic risk groups (box 1) was used to balance both study arms as follows. Once the declaration of consent had been signed by the patient, risk stratified block randomisation to one of the two study groups was carried out at a single call centre. The randomisation list generated by the statistician for each centre was kept on file at the study secretariat. When making a call the study physician had to allocate patients to one of the risk groups. Before the start of the clinical trial at least 15 patients were examined in advance at each clinic by AFB. Experience in interpreting the visual findings was exchanged at a research meeting.
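The allocation scheme described above can be illustrated with a minimal sketch of permuted block randomisation, with one list per risk group. The block size of 4, the number of blocks, the seeds and the function name are assumptions made for illustration only; the text does not report them, and this is not the trial's actual randomisation procedure.

```python
import random

def permuted_block_list(n_blocks, block_size=4, arms=("A", "B"), seed=None):
    """Build an allocation list from shuffled blocks with equal arms per block.

    block_size=4 is a hypothetical choice; the source does not state it.
    """
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    allocations = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm   # e.g. ["A", "B", "A", "B"]
        rng.shuffle(block)             # random order within the block
        allocations.extend(block)
    return allocations

# One independent list per risk group (strata I to IV from box 1)
risk_groups = ["I", "II", "III", "IV"]
lists = {
    group: permuted_block_list(n_blocks=5, seed=index)
    for index, group in enumerate(risk_groups)
}

for group, allocation in lists.items():
    print(group, "".join(allocation))
```

Because every block contains each arm equally often, the two arms stay balanced within each risk group regardless of when recruitment stops, to within half a block.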
The bronchoscopic procedures were carried out under local anaesthesia with or without sedation using a fibreoptic bronchoscope or under general anaesthesia using rigid bronchoscopes combined with flexible bronchoscopy using Karl Storz fibrescopes 11001 BI, 11004 BI and 11009 BI.
In arm A WLB was performed first. The macroscopic findings during WLB were documented and classified into two categories:
“Non-suspicious”: normal appearance or non-specific changes such as general inflammation (acute reddening, thickening, swelling), scars, granulomas, especially due to known (previous) biopsy sites; or
“Suspicious”: changes suspicious of “pre” or “early” malignant changes such as irregularity of the bronchial mucosa, nodular or polypoid lesions, thickening of a carina.
With the bronchoscope still in place in the trachea, the examination was then repeated after changing the white light illumination to the blue light illumination fluorescence mode. During the AFB the endoscopist had to identify the macroscopic findings and to classify them into two categories:
“Non-suspicious”: normal appearance (for example, green colour of the mucosa with regular anatomical structure) or non-specific changes (for example, slight changes in colour without structural changes, circumscript bluish/brownish changes in areas of scars such as bronchial stump, known previous biopsy sites); or
“Suspicious”: changes suspicious of “pre” or “early” malignant changes or tumours (for example, bluish/brownish areas of reduced light intensity with disturbance of the anatomical structure and/or thickening of the carina).
In both WLB and AFB modes the category “non-suspicious” was classified as negative (–) and the category “suspicious” was classified as positive (+).
In study group B only WLB was performed.
In both groups biopsy samples were taken from all areas graded as suspicious or from visible tumours and from at least two areas of non-suspicious appearance (random biopsy). Biopsy samples taken from visible tumours or next to visible tumours (⩽2 cm) were documented and excluded from analysis.
The biopsy slides were first evaluated by local pathologists and then reviewed by a reference pathologist, all blinded to the clinical findings. In case of disagreement between the pathologists, the reference pathologist was asked to re-evaluate the histological slides and to make a final decision.
The pathologists classified the findings as:
“normal”: normal appearance;
“abnormal”: non-specific changes (for example, inflammation, scar, granulomas, metaplasia, mild dysplasia);
“pre” or “early” malignant changes: moderate dysplasia (dysplasia II), severe dysplasia (dysplasia III), CIS; or
“malignant tumour”: invasive tumour of any special characteristics or lymphangitis carcinomatosa.
Preinvasive lesions were graded according to the consensus classification of the WHO/IASLC grading system for preinvasive squamous lesions of the bronchus.14 Microinvasive tumours were classified as invasive tumours, because tissue invasion cannot be completely excluded by biopsy.
The D-Light AF system (Karl Storz, Tuttlingen, Germany) is capable of two different illumination modes—conventional white light mode and autofluorescence mode—and is described in detail elsewhere.15 The system is based on a 300-W Xenon lamp with special optics to focus high intensities of light into a liquid light guide, which is optimised for blue light transmission. The modes can easily be switched at any time during the procedure by using a footswitch. The output wavelength in the autofluorescence mode is between 380 and 460 nm. Blue light output power for autofluorescence at the distal end of a Karl Storz bronchofibrescope is typically 50 mW. Critical to this method of tumour detection is the use of the correct observation technique which requires an endoscope of specific design that includes a filter wheel with two different positions for the white light mode (a) and for the autofluorescence mode (b). Position (a) does not contain any filter to allow standard WLB while position (b) contains an observation filter that blocks the incident blue light for autofluorescence. For optimal contrast, specificity, illumination and plasticity, a small amount of the blue excitation light bypasses the detection filters.
The planned sample size was 1496 eligible patients in the full analysis set in order to have a 90% chance of detecting a difference between arms A and B of 4–8% with a 5% level test (relative risk (RR) = 2). Two potential interim analyses were incorporated at the design stage to allow for early termination if a significant difference was detected between arms A and B. The significance level was based on the type I error spending function proposed by Lan and DeMets16 with a Pocock type boundary leading to a nominal two sided significance level of 4.5%. The Cochran-Mantel-Haenszel test statistic controlled for risk groups was used for primary confirmatory analysis. The efficacy analysis was based on the full analysis set according to the intent-to-treat principle including all randomised patients with documented bronchoscopy. Relative risks and associated 95% confidence intervals were calculated. For further descriptive purposes the statistical analysis was also conducted from a lesion based perspective. Data management was carried out using Access 8.0 (Microsoft Corp, USA). Statistical analyses were performed using SPSS 10.0 for Windows (SPSS Inc, Chicago, USA), SAS 8.1 (Cary, NC, USA), and EaSt 3 (Cytel Software Corporation, USA).
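As a rough plausibility check on the planned sample size, the following sketch applies the standard fixed-sample normal approximation for comparing two proportions. It assumes that "4–8%" means a prevalence of 4% in arm B versus 8% in arm A (consistent with RR = 2), a two sided 5% level and 90% power. It ignores the group sequential design, so it is not expected to reproduce the figure of 1496 exactly; the function name is illustrative.

```python
import math
from statistics import NormalDist

def n_per_arm(p_control, p_test, alpha=0.05, power=0.90):
    """Fixed-sample size per arm for comparing two proportions
    (normal approximation, unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    n = (z_alpha + z_beta) ** 2 * variance / (p_test - p_control) ** 2
    return math.ceil(n)

# Assumed reading of the design: 4% (WLB) versus 8% (WLB+AFB)
n = n_per_arm(0.04, 0.08)
print(f"per arm: {n}, total: {2 * n}")
```

Under these assumptions the total comes out somewhat below 1496. A modest inflation would be expected for a design with interim analyses, although the text does not state how the planned figure was derived.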
1173 subjects (916 men) of mean age 58.7 years (range 40–75) were enrolled into the study. In arm A (WLB+AFB, n = 589 patients) a total of 1978 biopsy samples were taken with a mean number of 3.4 biopsies per patient (range 1–10). In arm B (WLB only, n = 584 patients) 1792 biopsy samples were taken with a mean number of 3.1 biopsies per patient (range 1–9). According to the study protocol, biopsy specimens taken next to histologically confirmed invasive tumours were excluded from analysis, resulting in 1531 and 1376 evaluable biopsies, respectively. The median duration was 20 minutes (range 5–97) for WLB+AFB and 15 minutes (range 5–77) for WLB only.
Prevalence of patients with preinvasive lesions
Table 1 shows the prevalence of histologically confirmed preinvasive lesions in the full set of 1173 patients.
Overall (arm A and arm B), preinvasive lesions (moderate/severe dysplasia and CIS) were detected in 3.9% of patients (n = 46). The prevalence in the WLB arm was 2.7% (n = 16) compared with 5.1% in the WLB+AFB arm (n = 30). Cochran-Mantel-Haenszel analysis showed statistically significant superiority of WLB+AFB (p = 0.037). Since the primary study goal was achieved at this point, further patient recruitment was stopped after this first interim analysis. WLB+AFB identified nearly twice as many patients with preinvasive lesions in an at risk patient population (RR = 1.9, 95% CI 1.03 to 3.38). Further exploratory analysis showed a tendency towards superiority of the WLB+AFB arm over the WLB arm in risk groups I, II and III (test of homogeneity, p = 0.62), with detection rates increased by a factor of 1.4, 2.5 and 2.8, respectively. However, the superiority was only statistically significant in risk subgroup II. The highest prevalence was found in patients with abnormal sputum cytological findings and a normal radiograph (risk group III). In risk group IV no preinvasive lesions were detected.
The prevalence of patients with dysplasia II–III in the WLB arm was 2.1% (n = 12) and in the WLB+AFB arm 4.2% (n = 25). The prevalence of patients with CIS in the WLB arm was 0.7% (n = 4) and in the WLB+AFB arm 0.9% (n = 5). The combination of WLB and AFB increased the detection rate for dysplasia II–III by a factor of 2.1 (95% CI 1.05 to 4.07, p = 0.03) and for CIS by a factor of 1.24 (95% CI 0.34 to 4.59, p = 0.75).
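The relative risks quoted above can be approximated from the reported patient counts. The sketch below computes an unadjusted relative risk with a Wald interval on the log scale. This is an assumption about method: the trial's primary analysis used a Cochran-Mantel-Haenszel statistic stratified by risk group, so small differences from the published intervals are to be expected. The function name is illustrative.

```python
import math

def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted relative risk (arm A versus arm B) with a log-scale Wald interval."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Patient counts from the text: arm A (WLB+AFB, n = 589), arm B (WLB, n = 584)
reported = [
    ("all preinvasive lesions", 30, 16),
    ("dysplasia II-III", 25, 12),
    ("CIS", 5, 4),
]
for label, events_a, events_b in reported:
    rr, lower, upper = relative_risk(events_a, 589, events_b, 584)
    print(f"{label}: RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The wide interval for CIS reflects the small number of events (five versus four patients), which is why no conclusion about CIS detection can be drawn from these data.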
Biopsy based prevalence of preinvasive lesions
Table 2 shows the prevalence of preinvasive lesions analysed for the 2907 evaluable biopsy samples.
Overall, preinvasive lesions were detected in 1.8% of the biopsy samples. The detection rate was increased by a factor of 1.6. The prevalence of dysplasia II–III in the WLB+AFB arm was increased by a factor of 1.7 and that of CIS by a factor of 1.3.
The absolute number of visually positive biopsy specimens was 651 in arm A and 525 in arm B; the corresponding number of visually negative biopsies (visually negative in WLB+AFB = control biopsies) was 880 in arm A and 851 in arm B. If the control samples were excluded from statistical analysis, the prevalence of preinvasive lesions was 2.1% in arm B and 4.3% in arm A (RR = 2.1 (95% CI 1.05 to 4.04), p = 0.036; data not shown).
Sensitivity, specificity, and positive predictive value of WLB and AFB
Figure 2 shows the biopsy based sensitivity of WLB and WLB+AFB for detecting dysplasia II–III and CIS. The sensitivity of WLB+AFB was 82.3% (95% CI 69.5 to 95.1) compared with 57.9% (95% CI 35.7 to 80.1) for WLB alone, giving a 1.42-fold increase (95% CI 0.94 to 2.15, p = 0.054). Fourteen dysplasias II–III and CIS were found by “random biopsy” only (six in arm A and eight in arm B). The positive predictive value was 4.3% for WLB+AFB and 2.1% for WLB (p = 0.036). Thus, the positive predictive value is doubled by AFB.
Figure 3 shows the biopsy based specificity of WLB and WLB+AFB for dysplasia II–III and CIS. The specificity of WLB alone was 62.1% (95% CI 59.5 to 64.7) compared with 58.4% (95% CI 55.8 to 60.9) for WLB+AFB. This corresponds to a 0.94-fold decrease (95% CI 0.89 to 0.99, p = 0.04).
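The biopsy based sensitivity, specificity and positive predictive value can be reconstructed from the counts given in the text. In the sketch below, the arm A figures (34 preinvasive lesions, six of them found by random biopsy only) come from the 6/34 quoted later in the text. The arm B figures (19 lesions, 11 visually positive) are not stated explicitly. They are inferred here from the reported sensitivity of 57.9% together with the eight lesions found by random biopsy, and should be read as an assumption.

```python
def diagnostic_metrics(lesions, lesions_flagged, flagged, evaluable):
    """Build a 2x2 table from biopsy counts and return the accuracy measures.

    lesions          preinvasive lesions confirmed by histology
    lesions_flagged  lesions that were visually suspicious (true positives)
    flagged          all visually suspicious biopsies
    evaluable        all evaluable biopsies
    """
    tp = lesions_flagged
    fn = lesions - lesions_flagged
    fp = flagged - tp
    tn = evaluable - flagged - fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

arm_a = diagnostic_metrics(lesions=34, lesions_flagged=28, flagged=651, evaluable=1531)
# Arm B lesion counts are inferred, not reported directly (see the note above)
arm_b = diagnostic_metrics(lesions=19, lesions_flagged=11, flagged=525, evaluable=1376)

for name, arm in (("WLB+AFB", arm_a), ("WLB", arm_b)):
    print(name, {key: f"{value:.1%}" for key, value in arm.items()})
print(f"relative sensitivity: {arm_a['sensitivity'] / arm_b['sensitivity']:.2f}")
```

With these inputs the three measures agree with the published percentages to within rounding, which supports the inferred arm B counts without proving them.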
This is the first randomised, two armed, multicentre study to compare prospectively the combination of WLB+AFB with WLB only for detecting dysplasia II–III and CIS. Overall, WLB+AFB was significantly superior to WLB alone (p = 0.037, table 1). For dysplasia II–III the detection rate was improved by a factor of 2.1 (p = 0.03), but the detection rate for CIS was not significantly increased (p = 0.75). Further subgroup analysis revealed that the WLB+AFB arm was significantly superior to the WLB arm in risk group II. For risk groups I and III a similar tendency was detected, although the statistical power of the study was too low to assess this homogeneity. A comparison of our results with previous studies is limited because of different study populations, the lack of a widely accepted pathological consensus classification system, different study designs, and use of different autofluorescence systems.
The low overall prevalence of 3.9% patients with preinvasive lesions in our study is in contrast to previous studies. This may be due partly to the lower sensitivity of WLB in arm B which reduces the overall prevalence, or to a different patient population with a lower prevalence of preinvasive bronchial lesions, or the prevalence of preinvasive lesions may be reduced by the exclusion of biopsy specimens next to invasive tumours.
In our study the risk group with abnormal sputum cytology comprised the lowest number of patients and the highest prevalence of patients with biopsy proven preinvasive lesions (11.1%, table 1). This is confirmed by previous studies which also found the highest prevalence of preinvasive lesions in study populations comprised mainly of heavy smokers with abnormal sputum cytology, ranging from 13.3%17 to 52.7%18 up to 56%.19,20 Sato et al11 investigated AFB exclusively in patients with abnormal sputum cytology and reported an extremely high prevalence of 80% premalignant findings which, however, included mild dysplasia. In contrast, in a study by Kurie and co-workers,13 only 3% of the biopsy specimens showed metaplasia and/or mild dysplasia. No biopsy showed moderate to severe dysplasia or CIS. The patients in this study had 20 pack-years of smoking but lacked additional risk factors for malignancy such as abnormal sputum cytology or obstructive lung disease. This points out the impact of the patient population on the prevalence of preinvasive lesions. This is also supported by a recent study which reported a prevalence of preinvasive lesions of 1.4% in smokers compared with 12.5% in a group of patients participating in postoperative follow up of completely resected lung cancer.21
The higher prevalence of premalignant lesions in other studies may also be due to the fact that different study populations comprised a higher percentage of patients with invasive carcinomas.12,19 Biopsy specimens in these patients were often taken next to visible invasive tumours. Invasive tumours in marginal zones often show decreasing malignancy with transitional epithelial changes into CIS and dysplasia. However, in our study we were not looking for these carcinoma associated lesions so they have been excluded from further evaluation.
From a biopsy related perspective, the overall detection rate was improved by a factor of 1.6. In clinical practice some investigators may take only visual positive biopsy specimens. In this case the detection will be significantly increased by a factor of 2.1. From this (economical) point of view, in routine bronchoscopy when random biopsies are not taken, WLB+AFB will be even more effective.
In our study the biopsy related sensitivity in arm A (WLB+AFB) was 1.4 times higher than in arm B (WLB, fig 2). Single arm studies have reported a high variability in the relative sensitivities with an improvement in the detection rate (relative sensitivity) with WLB+AFB of 1.1–6.3 compared with WLB alone (table 3). The increase in the relative sensitivity of AFB depends on the sensitivity of WLB and is inversely related to it. The extremely high increase in the relative sensitivity of AFB by a factor of 6.3 reported by Lam et al12 is due to the surprisingly low sensitivity of WLB of only 8.8%. A comparison of recent studies shows that most of them found considerably higher sensitivity rates for WLB ranging from 21%30 up to 85% in the study by Sato et al.11 In our study the relatively high sensitivity of WLB (57.9%) may be explained by the fact that only experienced bronchoscopists in pulmonary care centres trained in detecting subtle bronchial lesions by WLB alone participated. The increase in the detection rate of premalignant lesions detected by AFB may therefore be even higher in primary and secondary care centres.
Another factor which might increase the relative sensitivity of AFB is the preceding WLB examination in single arm studies. However, a recent study by Hirsch,18 who randomised the sequences of bronchoscopists and the examinations in a very small number of patients, did not show a significant influence of expertise and memory bias. In our study this methodological bias was completely excluded by the parallel group study design which allowed for an unbiased estimation of the specific effect of AFB.
The sensitivity of many studies is also limited by the fact that up to 33% of preinvasive lesions were detected by “random biopsies”, which were taken from abnormal but visually negative classified areas—that is, inflammation, granulation tissue, hyperplasia, metaplasia, and mild dysplasia.12 In our study the rate of visually false negative biopsies in arm A (WLB+AFB) was lower (6/34 (17.6%), fig 2). This is due primarily to the study design with visual classification into two categories only (suspicious/non-suspicious), which we recommend to be the basis for future study designs in this field.
An important limitation of AFB may be the high rate of false positive results. In the multicentre trial reported by Lam and co-workers12 the positive predictive value of AFB in the detection of lesions corresponding to moderate dysplasia and CIS or worse was 23%. The positive predictive value is dependent on the prevalence of dysplasia II–III and CIS in addition to sensitivity and specificity. Recalculating the positive predictive value of our study on the basis of the prevalence of 14.5% reported by Lam, our positive predictive values were 25.1% in arm A (WLB+AFB) and 20.6% in arm B (WLB). Although the specificity of WLB+AFB in our study was statistically significantly lower than that of WLB alone, we do not think that this difference (58.4% v 62.1%) is clinically relevant (fig 3). The lower specificity of AFB may lead to more biopsies being evaluated at greater cost. However, the high sensitivity and low specificity of AFB is similar to other imaging modalities such as low dose CT scanning in the diagnosis of small malignant nodules.32 High sensitivity and low specificity also characterise other cancer screening tools such as occult blood testing, mammography, and PSA.33
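The recalculation of the positive predictive value at a different prevalence follows from Bayes' theorem. The sketch below uses the sensitivity and specificity reported for each arm and the prevalence of 14.5% attributed to Lam and co-workers; the function name is illustrative.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value for a given prevalence (Bayes' theorem)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)

prevalence = 0.145  # prevalence reported by Lam and co-workers

ppv_a = ppv_at_prevalence(sensitivity=0.823, specificity=0.584, prevalence=prevalence)
ppv_b = ppv_at_prevalence(sensitivity=0.579, specificity=0.621, prevalence=prevalence)

print(f"WLB+AFB: {ppv_a:.1%}")
print(f"WLB:     {ppv_b:.1%}")
```

The same function shows how strongly the positive predictive value depends on prevalence: at the biopsy based prevalence of about 2% observed in this trial, both arms fall to a few per cent, as reported in the results.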
The inter-observer variability with respect to the pathology reports may also explain conflicting results.34 Until the latest publication of the WHO/IASLC Histological Typing of Lung and Pleural Tumours in 1999,14 there was no consensus classification system for grading preinvasive lesions. A study of the impact of inter-observer and intra-observer variation on the reproducibility of the consensus WHO/IASLC grading system for preinvasive squamous lesions of the bronchus assessed by six histopathologists showed that inter-observer variation was relatively minor, although intra-observer variation was higher among trainee pathologists.35 There was no significant difference in either inter-observer or intra-observer agreement between the five point grading system of metaplasia, dysplasia I–III, and CIS. In our study both pathologists were experts in pulmonary pathology and based their decisions on the consensus grading system restricted to WHO and IASLC.
To date, nearly all published AFB studies used the LIFE system (Xillix Technologies, Vancouver, Canada) which was the first autofluorescence system on the market. Other systems have been developed, including D-Light AF and Pentax-SAFE 1000. No studies have compared all three available systems prospectively, so we cannot exclude the possibility that technical factors may have a significant impact on the findings. At present only two studies have compared LIFE and Pentax-SAFE 1000 and LIFE and D-Light AF, respectively, with similar results.36,37 Furthermore, the use of the AFB has not been evaluated relative to video-assisted bronchoscopy. The newer videobronchoscopes have higher resolution than conventional fibreoptic bronchoscopes and some of the relative benefit over WLB could probably be reduced.38
In our randomised study AFB detected significantly more dysplasias II–III, while the benefit for CIS was not significant. This raises the question of the clinical relevance of these very early lesions.39 Clinical studies to answer this question are limited by small numbers of patients and by their relatively short duration of follow up. They show that severe dysplasia and even CIS have the capacity to progress and regress spontaneously.40–42 Given the limited data, it remains to be demonstrated whether preinvasive lesions detectable by AFB are likely to shorten the patient’s life if left untreated.39
In conclusion, this first randomised AFB study including an independent WLB control arm confirmed the superiority of AFB over WLB in detecting preneoplastic lesions. However, the high expectations raised by many previous studies have not been entirely supported by our findings. Firstly, the prevalence of preinvasive isolated lesions in comparable risk groups was lower than in most former studies which, however, usually also included tumour associated lesions. Secondly, the superiority of AFB over WLB in detecting preneoplastic lesions was statistically significant only for dysplasia II–III and not for CIS.
All authors besides UP and KMM were responsible for patient recruitment, performing the bronchoscopies, data capturing, and analysis of the overall results. Statistical analysis was done by UP and pathological analysis by KMM. KH, MK and ChTB wrote the paper.
Organisation (meetings of organisers and participants), collection of data, work up of pathology specimens, and some technical equipment (bronchoscopes and D-light devices) were supported by Karl Storz GmbH & Co KG, Mittelstr. 8, 78532 Tuttlingen, Germany. There was no other funding source and no financial support for the authors or investigators.