Accuracy of transbronchial needle aspiration for mediastinal staging of non-small cell lung cancer: a meta-analysis
- 1 VA Palo Alto Health Care System, Palo Alto, CA and Stanford University School of Medicine, Stanford, CA, USA
- 2 Center for Primary Care and Outcomes Research, Stanford University, Stanford, CA, USA
- Correspondence to:
Dr J-E C Holty
Center for Primary Care and Outcomes Research, Stanford University, 117 Encina Commons, Stanford, CA 94305-6019, USA
- Received 27 January 2005
- Accepted 8 June 2005
- Published Online First 30 June 2005
Background: The reported accuracy of transbronchial needle aspiration (TBNA) for mediastinal staging in non-small cell lung cancer (NSCLC) varies widely. We performed a meta-analysis to estimate the accuracy of TBNA for mediastinal staging in NSCLC.
Methods: Medline, Embase, and the bibliographies of retrieved articles were searched for studies evaluating TBNA accuracy with no language restriction. Meta-analytical methods were used to construct summary receiver-operating characteristic curves and to pool sensitivity and specificity.
Results: Thirteen studies met inclusion criteria, including five studies that surgically confirmed all TBNA results and enrolled at least 10 patients with and without mediastinal metastasis (tier 1). Methodological quality varied but did not affect diagnostic accuracy. In tier 1 studies the median prevalence of mediastinal metastasis was 34%. Using a random effects model, the pooled sensitivity and specificity were 39% (95% CI 17 to 61) and 99% (95% CI 96 to 100), respectively. Compared with tier 1 studies, the median prevalence of mediastinal metastasis (81%; p = 0.002) and pooled sensitivity (78%; 95% CI 71 to 84; p = 0.009) were higher in non-tier 1 studies. Sensitivity analysis confirmed that the sensitivity of TBNA depends critically on the prevalence of mediastinal metastasis. The pooled major complication rate was 0.3% (95% CI 0.01 to 4).
Conclusions: When properly performed, TBNA is highly specific for identifying mediastinal metastasis in patients with NSCLC, but sensitivity depends critically on the study methods and patient population. In populations with a lower prevalence of mediastinal metastasis, the sensitivity of TBNA is much lower than reported in recent lung cancer guidelines.
Non-small cell lung cancer (NSCLC) is the most common malignancy in the world and accounts for an estimated 1 million deaths each year.1 The overall 5 year survival is approximately 15%.2 However, the survival rate approaches 70% in some patients with resectable disease.3 Metastasis to the mediastinal lymph nodes is one of the most important factors in determining resectability and prognosis.4 Careful mediastinal staging is essential to identify appropriate candidates for surgery and to avoid futile thoracotomy in patients with more advanced disease.
Currently, computed tomography (CT) is the most frequently used preoperative staging modality. However, large benign hyperplastic lymph nodes are commonly found in patients with NSCLC5 and normal sized lymph nodes frequently harbour metastases.6 Preoperative clinical staging with CT differs from surgical staging in as many as 40% of cases.7,8 Newer imaging modalities such as positron emission tomography (PET) have limitations in diagnostic accuracy as well.9 Given the limitations of CT and PET, invasive surgical staging techniques such as mediastinoscopy are often used to exclude or confirm mediastinal lymph node metastasis, especially in patients who are candidates for surgical resection. However, mediastinoscopy is associated with a complication rate of 2–3% and a surgical mortality rate of around 0.1%.10–12
Transbronchial needle aspiration (TBNA) using a flexible fibreoptic bronchoscope was developed in the early 1980s to obviate the need for more risky surgical staging procedures. Compared with mediastinoscopy, TBNA is generally more convenient, less risky, and less expensive.13 A recent systematic review of mediastinal staging with TBNA found the sensitivity to be similar to that of mediastinoscopy (76% v 81%).14 This analysis, however, was not restricted to patients with NSCLC, did not assess study methodological quality, and did not attempt to identify sources of variation in study results. We performed a meta-analysis to estimate the diagnostic accuracy of TBNA in patients with NSCLC and to identify technical factors and patient characteristics that have an impact on accuracy.
A more detailed description of our methods is available as an online data supplement on the Thorax website at www.thoraxjnl.com/supplemental.
Literature search and identification of studies
Medline and Embase (January 1966 to July 2003; Medline updated through April 2004) were searched to identify studies that examined TBNA for mediastinal staging in NSCLC (fig S1A and B, online supplement), and reference lists of included studies and review articles were manually searched. All articles were considered, regardless of language.
Selection of studies
We included studies that (1) examined TBNA using a flexible bronchoscope for mediastinal staging in patients with NSCLC; (2) enrolled at least 10 subjects with and/or 10 subjects without mediastinal metastasis; (3) provided sufficient data to permit calculation of sensitivity and/or specificity; and (4) enrolled no more than 10% of patients with a diagnosis other than NSCLC or provided separate data for patients with NSCLC. More rigorous (tier 1) studies enrolled at least 10 subjects with and 10 subjects without mediastinal lymph node involvement, surgically confirmed all TBNA results (for example, with mediastinoscopy, mediastinotomy and/or thoracotomy), and used the patient as the unit of analysis. The authors of abstracts and studies not reporting sufficient data were contacted to request additional information.
An existing instrument was adapted to describe the methodological quality of studies,15 as reported previously (fig S2, online supplement).9,16 We developed criteria for the technical quality of TBNA based on our clinical experience and by reviewing published guidelines.17–19
One investigator abstracted primary data regarding patient characteristics and the sensitivity and/or specificity of TBNA for identifying mediastinal metastasis in patients with NSCLC.
When possible, we separated staging characteristics of TBNA for patients with and without enlarged lymph nodes on the CT scan and for biopsies performed at hilar, subcarinal, paratracheal, or other lymph node stations. We also separately tabulated test characteristics for studies using “real time” imaging—for example, CT fluoroscopy, endobronchial ultrasound, or transthoracic ultrasound.
Data synthesis and statistical analysis
We constructed a 2×2 contingency table for each study to summarise the results of TBNA and the reference test(s). For each study we calculated the true positive rate (TPR; sensitivity), the false positive rate (FPR; 1−specificity), and the log odds ratio (LOR). When necessary, we added 0.5 as a correction factor to calculate the LOR.
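The per-study quantities described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' spreadsheet: the function name and the example counts are hypothetical, and it assumes that "when necessary" means the 0.5 correction is added to all four cells only when at least one cell is zero.

```python
import math

def accuracy_from_2x2(tp, fp, fn, tn, correction=0.5):
    """Return (TPR, FPR, log odds ratio) for one study's 2x2 table.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives against the reference test.
    """
    tpr = tp / (tp + fn)  # sensitivity
    fpr = fp / (fp + tn)  # 1 - specificity
    # Assumption: correct all four cells only if a zero cell would
    # otherwise make the odds ratio zero or undefined.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    lor = math.log((tp * tn) / (fp * fn))
    return tpr, fpr, lor

# Hypothetical study: 32 patients with and 60 without mediastinal metastasis
tpr, fpr, lor = accuracy_from_2x2(tp=12, fp=0, fn=20, tn=60)
```

Passing `correction=0.1` instead of the default gives the alternative correction factor examined in the sensitivity analysis.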
Because many studies of TBNA did not confirm positive test results surgically, they were unable to report false positive rates. We therefore calculated a weighted kappa-1 coefficient which does not require information about the false positive rate to measure test accuracy with respect to avoiding false negative results.20,21
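The text does not give a formula for the kappa-1 coefficient. The sketch below assumes it is Kraemer's weighted kappa κ(1,0) = (SE − Q)/(1 − Q), where SE is sensitivity and Q is the proportion of all patients with a positive test. The function name is hypothetical, and the default `specificity=1.0` encodes the further assumption that no false positives occur when positive results were not surgically confirmed.

```python
def kappa_1_0(sensitivity, prevalence, specificity=1.0):
    """Chance-corrected sensitivity, kappa(1,0) = (SE - Q) / (1 - Q).

    Q is the overall proportion of positive tests. With the default
    specificity of 1.0, Q reduces to prevalence * sensitivity.
    """
    q = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    return (sensitivity - q) / (1 - q)

# Rounded tier 1 summary values from the Results, for illustration only
k = kappa_1_0(sensitivity=0.39, prevalence=0.34, specificity=0.99)
```

With these rounded inputs the value is roughly 0.29. The reported kappa-1 estimates were pooled across studies, so they need not equal a value computed from summary figures in this way.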
A random effects model was used to pool sensitivity, specificity, LOR and kappa-1.22 When pooling sensitivity and specificity, we excluded studies with <10 subjects with or without mediastinal lymph node involvement, respectively, in the calculations. Summary receiver operating characteristic (SROC) curves as described by Moses et al23 were constructed to summarise the results quantitatively.
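For readers unfamiliar with the SROC approach, a minimal sketch follows. It fits the regression D = a + bS, where D = logit(TPR) − logit(FPR) is the log odds ratio and S = logit(TPR) + logit(FPR) is a proxy for the test threshold, then back-transforms the fitted line into a curve. The function names are hypothetical, the fit is assumed to be unweighted, and rates of exactly 0 or 1 are assumed to have been corrected beforehand.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def fit_sroc(tprs, fprs):
    """Unweighted least-squares fit of D = a + b*S across studies."""
    d = [logit(t) - logit(f) for t, f in zip(tprs, fprs)]  # log odds ratio
    s = [logit(t) + logit(f) for t, f in zip(tprs, fprs)]  # threshold proxy
    n = len(d)
    s_mean, d_mean = sum(s) / n, sum(d) / n
    sxy = sum((si - s_mean) * (di - d_mean) for si, di in zip(s, d))
    sxx = sum((si - s_mean) ** 2 for si in s)
    b = sxy / sxx
    a = d_mean - b * s_mean
    return a, b

def sroc_tpr(fpr, a, b):
    """Expected TPR on the fitted summary curve at a given FPR."""
    x = (a + (1 + b) * logit(fpr)) / (1 - b)
    return 1 / (1 + math.exp(-x))
```

Evaluating `sroc_tpr` over a grid of FPR values between 0 and 1 traces the summary curve. A slope `b` near zero indicates that the odds ratio is roughly constant across studies.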
To assess sources of variation in study results we performed sensitivity analyses, discriminant function analyses, and meta-regressions. Sensitivity analysis included stepwise single study elimination, adjusting the correction factor, and varying the reference test result in studies that employed a suboptimal reference standard. To compare sensitivity and specificity jointly in studies grouped by tier and prevalence we used discriminant function analysis. Multivariate analysis of variance (ANOVA)24,25 was used to compare reported sensitivities and LORs in studies with high and low prevalences of lymph node metastasis (⩾60% or <60%) and year of study publication (⩾1995 or <1995). To assess for the presence of publication bias we constructed inverted funnel plots of standard error versus estimated effect size (LOR) for each individual study.26 We also assessed how the exclusion of small cell cancer cases from the included studies impacted on the accuracy of TBNA.
All biostatistical models were programmed with Excel 8.0 for Windows (Microsoft Corporation, Redmond, Washington, USA). Discriminant function analysis was performed using SAS 9.0 for Windows (SAS Corp, Cary, NC, USA). We calculated 95% confidence intervals (CIs) for the TPR and the FPR by using the quadratic method.27 A normal approximation to the binomial of the standard error was used in calculating all other confidence intervals, as appropriate. When making comparisons between groups of studies, an unpaired t test or the Mann-Whitney U test was used, as appropriate. A two tailed p value of <0.05 was considered statistically significant.
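The "quadratic method" is not spelled out in the text. The sketch below assumes it refers to the score (Wilson) interval, whose limits are the roots of a quadratic equation in the true proportion. The function name is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Score (Wilson) confidence interval for a binomial proportion.

    z = 1.96 gives an approximate 95% interval.
    """
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Hypothetical study with no false positives among 20 disease-free patients
low, high = wilson_interval(0, 20)
```

Unlike the normal approximation, this interval stays within 0 and 1 and has non-zero width when the observed proportion is 0 or 1, which matters for false positive rates close to zero.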
Literature search and study selection
Our literature search identified 525 potentially eligible studies (fig 1), of which 398 were eliminated as not relevant after careful review of their titles and abstracts. A hand search of the bibliographies of the remaining 127 articles identified 203 additional studies that were potentially relevant. A preliminary review of these 330 articles eliminated 268 studies, leaving 62 articles for detailed analysis (table S1, online supplement). After detailed review, 13 studies met the inclusion criteria (table 1).28–40 Studies were most often excluded because they provided insufficient data to calculate sensitivity or specificity (76%) or enrolled more than 10% of subjects with a diagnosis other than NSCLC (60%). Inter-rater agreement for study inclusion was high (mean kappa ∼80%; table S2, online supplement). Five authors provided additional information that enabled us to include their studies.28,29,32,33,37
The median number of participants per study was 44 (range 10–183). Six studies30,31,33,34,36,37 reported statistics about the age of participants (median age 60 years) and seven studies30,31,33–37 reported sex characteristics (median proportion male 89%). One study reported results by using individual lymph nodes as the unit of analysis.35 For the other studies that reported results by using the patient as the unit of analysis, the median prevalence of mediastinal metastasis was 70% (interquartile range 47–83). The size and type of TBNA needle used and the number of aspirate passes per lymph node station varied between studies (table S3, online supplement). None of the studies stratified results according to nodal station or lymph node size on the CT scan in patients with NSCLC. In eight studies all positive and negative TBNA results were confirmed by mediastinoscopy, mediastinotomy, or thoracotomy.28–34,37 Six studies enrolled fewer than 10 subjects without mediastinal lymph node involvement.33,34,36–39 Two studies33,37 used real-time imaging (CT or endobronchial ultrasound) to guide needle placement during TBNA. Five studies met criteria for tier 1 analysis.28–32
Studies met between 12 and 23 of the 34 prespecified criteria for methodological quality. Seven studies met at least 50% of the criteria.28–30,32,33,36,38 Table S4 (online supplement) shows selected aspects of methodological quality for each study. In general, tier 1 studies met more criteria (mean 18.8; 95% CI 15.8 to 21.8) than non-tier 1 studies (mean 15.8; 95% CI 14.0 to 17.5), but this difference was not statistically significant (p = 0.13).
Diagnostic accuracy of TBNA
Tier 1 analysis (5 studies)
In these studies the median prevalence of mediastinal metastasis was 34% (range 29–60). The median sensitivity and specificity of TBNA were 36% (interquartile range 32–38) and 98% (interquartile range 96–100), respectively (table 2). The pooled (random effects) sensitivity was 39% (95% CI 17 to 61) and the pooled specificity was 99% (95% CI 96 to 100) (table 2, fig 2). The corresponding positive and negative likelihood ratios were 29.0 and 0.62, respectively. The summary ROC curve is shown in fig 3.
The pooled (random effects) kappa-1 coefficient was 30% (95% CI 15 to 46), suggesting that the accuracy of TBNA with respect to false negative results was poor to fair in tier 1 studies.
Non-tier 1 analysis (8 studies)
Two non-tier 1 studies used real-time radiological needle guidance during TBNA.33,37 In the remaining six studies the median prevalence of mediastinal metastasis was 81% (range 55–100; p = 0.002 for comparison with tier 1 studies). None of these six studies provided sufficient information to calculate specificity (for example, they did not surgically confirm positive TBNA results). The median sensitivity of TBNA in studies not using real-time radiological needle guidance was 82% (interquartile range 79–84; table 2). The pooled (random effects) sensitivity was 78% (95% CI 71 to 84; table 2, fig 2). The pooled kappa-1 coefficient (random effects) was 40% (95% CI 19 to 62; table 2), suggesting that the accuracy of TBNA with respect to false negative results was fair in non-tier 1 studies.
The median prevalence of mediastinal metastasis in the two non-tier 1 studies that used real-time radiological needle guidance was 83% (p = 0.84 for comparison with the six other non-tier 1 studies). The pooled (85%) and median sensitivities (88%) in these two studies were not significantly different (p = 0.36 and p = 0.38, respectively) from the pooled and median sensitivities of the six non-tier 1 studies that did not use real-time radiological guidance.
Summary analysis (11 studies)
The Q statistic from the random effects model showed that there was statistically significant heterogeneity in sensitivity (p<0.001) but not in specificity (p = 0.90). Discriminant function analysis confirmed that there was a statistically significant difference in the joint sensitivity and specificity of tier 1 and non-tier 1 studies (p = 0.002, parametric Wilks’ lambda test; fig 4). We therefore did not pool the results of tier 1 and non-tier 1 studies.
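The heterogeneity statistic and random effects pooling can be illustrated as follows. The text says only that a random effects model was used. This sketch assumes the common DerSimonian-Laird estimator with inverse-variance weights, and the function name is hypothetical.

```python
import math

def random_effects_pool(estimates, variances):
    """Pool study estimates with a DerSimonian-Laird random effects model.

    Returns (pooled estimate, standard error, Cochran Q, tau squared).
    """
    w = [1 / v for v in variances]  # fixed effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran Q: weighted squared deviations from the fixed effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, q, tau2

# Two hypothetical studies with discordant sensitivities
pooled, se, q, tau2 = random_effects_pool([0.2, 0.8], [0.01, 0.01])
```

Under the null hypothesis of homogeneity, Q is compared against a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies. A large Q inflates the between-study variance and widens the pooled confidence interval.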
One study did not report complications.30 Of the remaining studies, two reported major complications,28,40 including two major bleeds and one pneumothorax requiring a chest tube. Two other cases of pneumothoraces35 and one case of pneumomediastinum28 spontaneously resolved under observation. The mean rate of major complications per patient in tier 1 and non-tier 1 studies was 0.32% (95% CI 0.01 to 6) and 0.25% (95% CI 0.01 to 6), respectively (p = 0.65). The overall major complication rate was 0.26% (95% CI 0.01 to 4).
Sensitivity analysis and meta-regressions
An inverted funnel plot showed no evidence of publication bias (fig S4, online supplement). Stepwise single study elimination did not substantially affect the magnitude of the pooled LOR or sensitivity in tier 1 or non-tier 1 studies (table S5, online supplement). In one study, one of two false positive results had scanty neoplastic cells and no lymphocytes.32 Re-categorising this result as a true negative had no effect on pooled sensitivity, specificity, LOR, or the kappa-1 coefficient. Varying the correction factor from 0.5 to 0.1 had no impact on the LOR or the kappa-1 coefficient. Using a 0.1 correction tended to shift the summary ROC curve to the left (increasing specificity), but had little discernable impact on sensitivity.
Study sensitivity was positively correlated with the prevalence of lymph node metastasis (fig 5). When the prevalence rose from 40% to 80%, sensitivity increased from 42% to 78%. For the seven studies in which the prevalence of mediastinal disease was ⩾60%, the median sensitivity (83% v 36%; p = 0.005) and pooled sensitivity (84% v 40%; p = 0.005) were higher than in the five remaining studies in which prevalence was <60% (fig S3, online supplement). Discriminant function analysis confirmed that the joint sensitivity and specificity were different in studies with high versus low prevalence (p = 0.01, parametric Wilks’ lambda test).
For the eight studies published since 1995, the pooled sensitivity (71% v 60%; p = 0.52) was not significantly different from that in the five remaining studies published before 1995. However, the median prevalence of lymph node metastasis in more recent studies (82% v 55%; p = 0.09) was higher than in the five earlier studies.
These and other potential sources of heterogeneity were assessed by a multivariate ANOVA to compare reported sensitivities and LORs in studies with respect to the prevalence of lymph node metastasis (⩾60% or <60%) and year of publication (⩾1995 or <1995). Because only two included studies used real-time radiological needle guidance, we were unable to assess this potential source of heterogeneity and excluded these two studies from the analysis. Sensitivity was higher in studies with a higher prevalence of lymph node metastasis (difference 60%; 95% CI 51 to 69) and in more recently published studies (difference 10%; 95% CI 1 to 18). The prevalence of lymph node metastasis, but not year of publication, had a significant effect on the LOR.
Excluding patients with small cell lung cancer from the included studies had no impact on the pooled sensitivity in tier 1 (39% v 41%, p = 0.92) or non-tier 1 (78% v 80%, p = 0.71) studies.
TBNA is highly specific for identifying mediastinal metastasis in patients with NSCLC, but sensitivity depends critically on the prevalence of mediastinal disease. Specificity is excellent, but not perfect. In three of eight studies that surgically confirmed all TBNA results, four false positive results were reported. One of the four false positive results would have been avoided if biopsy specimens were considered negative when they lacked nodal tissue or when the cytopathologist identified the specimen as “contaminated” or containing “atypical” cells. It is essential to avoid contamination of the bronchoscope channel and to follow stringent criteria to define positive or negative biopsy specimens in order to minimise the risk of false positive TBNA results. We found that TBNA is generally safe with a major complication rate of approximately 0.3%.
We identified several sources of variation in study results. Sensitivity was much lower in tier 1 studies than non-tier 1 studies. Tier 1 studies surgically confirmed all TBNA results, enrolled at least 10 patients with and without mediastinal metastasis, and used the patient as the unit of analysis. Sensitivity was also lower in studies with a low prevalence (<60%) of mediastinal metastasis. Not surprisingly, TBNA appears to be less sensitive than mediastinoscopy for identifying mediastinal metastasis. A recent meta-analysis of 14 studies of mediastinoscopy reported a pooled sensitivity of 81% (95% CI 76 to 85).14 In these studies the pooled prevalence of mediastinal disease was 37%, which is similar to the median prevalence (34%) of lymph node metastasis in tier 1 studies of TBNA.
The difference in diagnostic accuracy between tier 1 and non-tier 1 studies was statistically significant. We believe that this difference is probably related to a lower prevalence of mediastinal metastasis in tier 1 than in non-tier 1 studies. Higher disease prevalence and enrolment of patients with a more severe spectrum of disease are sources of variation in studies of diagnostic accuracy leading to an increase in sensitivity.41,42 We speculate that the higher prevalence of mediastinal metastasis in non-tier 1 studies may reflect enrolment of study cohorts with a more severe spectrum of mediastinal disease, resulting in more positive TBNA results. For example, non-tier 1 (high prevalence) studies may have enrolled a greater number of patients with bulky lymphadenopathy in whom TBNA was being used to confirm the diagnosis of unresectable disease. In contrast, tier 1 (lower prevalence) studies may have enrolled potential surgical candidates with less impressive lymph node enlargement. A recent meta-analysis of 39 studies comparing PET with CT scanning for mediastinal staging in NSCLC found that the median prevalence of malignant lymph nodes in enrolled patients was 32% (range 5–64), which is similar to the median prevalence of mediastinal metastasis in the tier 1 studies in our analysis.9 Most of the studies of PET and CT scanning enrolled patients with potentially resectable NSCLC. Furthermore, the bronchoscopist’s technique may be more proficient when the pretest probability of obtaining a positive result is high (higher prevalence of mediastinal disease within the study cohort). For example, more diligence may be taken to identify endobronchial landmarks, more TBNA needle passes attempted, and more aggressive sedation given to minimise cough and patient movement during the procedure.
The difference in pooled sensitivities between tier 1 and non-tier 1 studies may also be the result of methodological differences. Non-tier 1 studies used suboptimal methodological criteria by not confirming all TBNA results against a reference standard (verification bias), having insufficient numbers of participants with and without mediastinal metastasis, and/or not using the patient as the unit of analysis. Verification bias has been shown to lead to overestimates of test sensitivity.41
A previous meta-analysis showed that the pooled sensitivity of 12 studies analysing TBNA in patients with either small-cell lung cancer or NSCLC was 76%.14 Our estimates of sensitivity were lower for tier 1 studies (39%) because several studies that were included in this previous meta-analysis did not meet the criteria for our tier 1 analysis. Interestingly, the exclusion of patients with small cell lung cancer from the studies included in our analysis did not significantly affect sensitivity.
Despite the relatively low sensitivity of TBNA in detecting mediastinal metastasis compared with other invasive staging procedures, TBNA continues to be an appropriate diagnostic test in the sampling of mediastinal lymph nodes, especially if concurrently performed with routine bronchoscopic examination for suspected lung cancer. TBNA is generally more convenient, less risky, and less expensive than other invasive staging procedures such as mediastinoscopy.13 A formal assessment of the cost effectiveness of staging TBNA is beyond the scope of this analysis.
Although we were unable directly to assess how newer needles, use of on-site cytological analysis, and/or improved techniques may impact on TBNA accuracy, our multivariate ANOVA showed that more recent studies—which presumably used more up to date techniques and equipment—had a slightly higher sensitivity when we controlled for prevalence of mediastinal metastasis.
Our study has several limitations. Firstly, only a small number of studies met our inclusion criteria (five tier 1 and eight non-tier 1 studies). Most studies enrolled fewer than 100 participants and were performed at single centres where experience with TBNA is likely to be extensive. Large multicentre prospective studies of TBNA should be performed in consecutively enrolled patients with NSCLC. Studies should explicitly define inclusion criteria and should report separate results for patients with non-bulky and bulky lymphadenopathy. Secondly, because needle type and size, as well as the number of aspiration passes varied between studies, we were unable to control for these test characteristics. Likewise, because most studies did not report age or sex characteristics, we were unable to control for these demographic features. Thirdly, few of the included studies provided information on whether TBNA results altered patient management. Clearly, positive results on TBNA obviate the need for mediastinoscopy because specificity and positive predictive value are high. However, simple calculations based on our results indicate that, when prevalence is relatively low (∼35%), approximately 85% of patients will have negative TBNA results and 25% of such results will be falsely negative. Fourthly, despite an exhaustive search, we may not have identified all studies, especially those with unpublished results. We identified one potentially relevant abstract but we were unable to obtain sufficient additional information to assess it for inclusion.43 However, an inverted funnel plot showed no evidence of publication bias. 
Finally, the 13 included studies used a variety of reference tests (cervical mediastinoscopy, anterior mediastinotomy and/or thoracotomy with ipsilateral lymph node sampling), raising the possibility of differential verification bias.41 Because none of the reference tests has perfect sensitivity, the true sensitivity of TBNA may be even lower than our estimates. Future studies of the diagnostic accuracy of TBNA should require thoracotomy with systematic sampling of both normal and abnormal appearing lymph nodes at all accessible mediastinal stations to exclude the presence of lymph node metastasis.44
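The "simple calculations" mentioned under the third limitation can be reproduced from prevalence, sensitivity, and specificity alone. The sketch below uses the rounded tier 1 estimates (sensitivity 39%, specificity 99%) with a prevalence of 35%. The function name is hypothetical.

```python
def negative_test_summary(prevalence, sensitivity, specificity):
    """Return (fraction of patients testing negative,
    fraction of negative results that are false negatives)."""
    true_neg = (1 - prevalence) * specificity
    false_neg = prevalence * (1 - sensitivity)
    frac_negative = true_neg + false_neg
    return frac_negative, false_neg / frac_negative

frac_negative, frac_false = negative_test_summary(0.35, 0.39, 0.99)
# frac_negative is about 0.857 and frac_false about 0.249
```

These values agree with the approximate figures of 85% and 25% quoted in the text. Repeating the calculation at a prevalence of 80% shows how strongly the proportion of false negatives depends on the population studied.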
In conclusion, we found that TBNA is highly specific for detecting mediastinal lymph node metastasis in patients with NSCLC, but that sensitivity depends critically on the prevalence of mediastinal lymph node involvement. In patient populations with a relatively low prevalence of mediastinal disease (such as those with potentially resectable NSCLC), the sensitivity of TBNA is poor.
The authors thank Helena C Kraemer for providing statistical advice; Corinna Haberland, Annette Langer-Gould, Pavel Strnad, Kelvin Tan, and Tomasz Ziedalski for reviewing non-English language studies; and the authors of the studies included in our meta-analysis who provided additional unpublished data.
Dr Holty is supported by the VA Special Fellowships Program. Dr Gould is a recipient of an Advanced Research Career Development Award from the VA Health Services Research and Development Service.
Drs Holty, Gould and Kuschner have no financial conflict of interest or competing interests to disclose.