Introduction Interferon γ release assays (IGRAs) are increasingly used for tuberculosis (TB) infection, but their incremental value beyond patient demographics, clinical signs and conventional tests for active disease has not been evaluated in children.
Methods The incremental value of T-SPOT.TB was assessed in 491 smear-negative children from two hospitals in Cape Town, South Africa. Bayesian model averaging was used to select the optimal set of patient demographics and clinical signs for predicting culture-confirmed TB. The added value of T-SPOT.TB over and above patient characteristics and conventional tests was measured using statistics such as the difference in the area under the receiver operating characteristic curve (AUC), the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI).
Results Cough longer than 2 weeks, fever longer than 2 weeks, night sweats, malaise, history of household contact and HIV status were the most important predictors of culture-confirmed TB. Binary T-SPOT.TB results did not have incremental value when added to the baseline model with clinical predictors, chest radiography and the tuberculin skin test. The AUC difference was 3% (95% CI 0% to 7%). Using risk cut-offs of <10%, 10–30% and >30%, the NRI was 7% (95% CI −8% to 31%) but the CI included the null value. The IDI was 3% (95% CI 0% to 11%), meaning that the average predicted probability across all possible cut-offs improved marginally by 3%.
Conclusions In a high-burden setting, the T-SPOT.TB did not have added value beyond clinical data and conventional tests for diagnosis of TB disease in smear-negative children.
What is the key question?
Interferon γ release assays (IGRAs) are used for diagnosis of tuberculosis (TB) infection, but their incremental value beyond patient demographics, clinical signs and conventional tests for active disease has not been evaluated in children.
What is the bottom line?
Our study shows that the T-SPOT.TB does not have added value beyond patient demographics, clinical signs and conventional tests for diagnosis of active pulmonary TB (PTB) in smear-negative children.
Why read on?
Our findings have great relevance for paediatric clinical practice in high-burden, low-resource settings and are consistent with the WHO recommendation against the use of IGRAs for diagnosis of active PTB in low and middle income countries.
The diagnosis of pulmonary tuberculosis (PTB) in children may be challenging.1,2 Microbiological confirmation with culture is often unavailable due to the difficulty of specimen collection; even when specimens are obtained, smear results are usually negative due to paucibacillary disease. The diagnosis therefore relies on clinical and radiological findings and evidence of TB infection as measured by the tuberculin skin test (TST).3
Interferon γ release assays (IGRAs) have been developed to replace the TST for diagnosis of TB infection. Meta-analyses of IGRA performance in children show that they have higher specificity but similar sensitivity compared with the TST for active TB, although IGRA sensitivity is lower in high-incidence versus low-incidence settings.4,5
While the intended use of IGRAs is not for active TB, they may be helpful in providing evidence for a rapid diagnosis in children when combined with other routine clinical and laboratory investigations. If conventional tests such as smears are negative, then a T-SPOT.TB (Oxford Immunotec, Abingdon, UK) result could add diagnostic information while culture results are pending. Two published studies on the incremental value of IGRAs in adult patients showed limited utility of IGRAs for active TB disease.6,7
Our study aimed to investigate the incremental value of T-SPOT.TB beyond patient demographics, clinical signs and conventional tests in the diagnostic workup of hospitalised children being evaluated for PTB in a high-burden setting. Using a multivariable approach, the incremental value was measured by the difference in the area under the receiver operating characteristic curve (AUC), the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI).
The study enrolment process has been previously described.8 Briefly, hospitalised children ≤15 years with suspected PTB were consecutively enrolled from two hospitals in Cape Town from February 2009 to July 2011. To be eligible, a child had to present with cough >2 weeks and at least one of the following: exposure to a household TB case within the past 3 months, weight loss or failure to thrive within the past 3 months, positive TST result or chest radiography (CXR) suggestive of PTB. After the parent or legal guardian provided informed consent, the child underwent a physical examination and extensive diagnostic testing. The study was approved by the Faculty of Health Sciences, Research Ethics Committee at the University of Cape Town.
Two sequential induced sputum8 and two nasopharyngeal aspirate specimens9 were collected, after which concentrated fluorescent acid-fast smear microscopy and liquid culture were performed. A definite TB case was defined by at least one positive culture for Mycobacterium tuberculosis in any specimen.
Anteroposterior and lateral CXRs, taken on enrolment, were evaluated by a single experienced reviewer, blinded to other test results. The results were reported as consistent, inconsistent or inconclusive for PTB. When the result was inconclusive, a second reviewer read and classified the CXR; a third reviewer evaluated the radiograph when there was disagreement. The TST was performed using the Mantoux technique with 2 tuberculin units of purified protein derivative (PPD RT23, Statens Serum Institut) and read by a trained nurse at 48–72 h. The cut-off value for a positive TST was ≥5 mm in HIV-infected children and ≥10 mm for others. An HIV rapid test was performed on all children and confirmed by HIV ELISA in children ≥18 months or PCR in children <18 months.
The T-SPOT.TB was performed according to the manufacturer's guidelines. All of the tests were run within 3–4 h of a blood specimen taken by a single trained technologist. Peripheral blood mononuclear cells were separated, and the cell concentration was adjusted to 2.5×10^6 cells/ml. To each of four interferon γ capture antibody precoated wells, the following were added: 50 μl of Positive Control, Panel A (ESAT6), Panel B (CFP10), and Nil control. Then, 100 μl of cell suspension was added to each well. This was incubated overnight at 37°C in 5% CO2. The test result was positive if Panel A minus Nil control and/or Panel B minus Nil control ≥6 spots. The test result was negative if both Panel A minus Nil control and Panel B minus Nil control ≤5 spots. A Nil control >10 spots was considered indeterminate. Two observers, who were blinded to all other test results, reported the number of spots, and the mean number was used in the analysis. Indeterminate results were excluded from analysis. Physicians were blinded to the T-SPOT.TB results and did not use them in the clinical decision-making.
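The interpretation rule described above can be written as a small function. This is only an illustrative sketch in Python: the function name and arguments are hypothetical, the treatment of a mean spot difference between 5 and 6 (possible when two readers are averaged) as negative is an assumption, and any positive-control criteria in the manufacturer's instructions are not covered because they are not described here.

```python
def classify_tspot(panel_a, panel_b, nil):
    """Classify a T-SPOT.TB result from spot counts, following the rule in the text.

    panel_a, panel_b, nil: spot counts (or means of two readers) per well.
    """
    # A high Nil control makes the test uninterpretable.
    if nil > 10:
        return "indeterminate"
    a_net = panel_a - nil
    b_net = panel_b - nil
    # Positive if either antigen panel exceeds the Nil control by >= 6 spots.
    if a_net >= 6 or b_net >= 6:
        return "positive"
    # Otherwise negative (assumption: non-integer means between 5 and 6 land here).
    return "negative"
```

For example, `classify_tspot(12, 3, 2)` returns `"positive"` because Panel A minus Nil is 10 spots.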
Estimation of incremental value
We used Bayesian model averaging (BMA) to identify the most parsimonious set of predictors in a logistic regression model for the outcome of culture-confirmed TB. Predictors for patient demographics and clinical signs were determined a priori and included passive smoke exposure, HIV status, history of household contact, weight for age Z score, night sweats, cough >2 weeks, fever >2 weeks, malaise, loss of appetite and weight loss. We used the BMA programme in the statistical software package R that considers all possible combinations of predictors. This programme performs a model selection procedure based on the Bayesian information criterion.10 The advantage of this approach is that it does not lead to overfitting.11 Predictors that were in the top six models, which together had an 88% probability of being the best model, were included in the clinical model.
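As a rough illustration of how BIC-based model averaging ranks candidate models, the sketch below converts BIC values into approximate posterior model probabilities (proportional to exp(−BIC/2) under equal prior model probabilities) and sums them into inclusion probabilities per predictor. It is written in Python rather than R, the model list and BIC values are hypothetical, and it does not reproduce the R BMA programme's own search or pruning steps.

```python
import math

def posterior_model_probs(bics):
    """Approximate posterior model probabilities from BIC values (equal model priors)."""
    best = min(bics)  # subtract the minimum for numerical stability
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]

def inclusion_probs(models, probs):
    """Sum the probabilities of all models that contain each predictor."""
    out = {}
    for predictors, p in zip(models, probs):
        for name in predictors:
            out[name] = out.get(name, 0.0) + p
    return out

# Hypothetical candidate models and BIC values, for illustration only.
models = [("cough", "fever"), ("cough", "fever", "hiv"), ("cough",)]
bics = [410.2, 411.0, 415.9]

probs = posterior_model_probs(bics)
incl = inclusion_probs(models, probs)
```

Sorting the models by probability and accumulating the values gives the kind of "top models with a combined probability" summary used to choose the clinical predictors.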
Since children with positive smears are diagnosed with definite TB and started on treatment, the incremental value of T-SPOT.TB was evaluated only in smear-negative children. To reflect the diagnostic workup for active TB in children in our hospital setting, the clinical model was extended by the CXR result (consistent or inconsistent with PTB) and the TST result (positive or negative according to the recommended cut-off depending on HIV status). The clinical predictors, CXR and TST formed the baseline model. This model was further extended by the T-SPOT.TB result (positive or negative according to the manufacturer's recommended cut-off) to determine its incremental value. We also performed the same extension using the quantitative log-transformed spot-forming cells (SFCs) rather than the binary result. In the primary analysis using confirmed TB as the outcome, all culture-negative children were considered non-TB cases, even if they were clinically diagnosed with TB and started on treatment. In a secondary analysis, we estimated the incremental value of T-SPOT.TB using the children placed on empirical treatment for TB as the outcome.
Evaluating incremental value involves comparing prediction models with and without the new test as a covariate. The difference in AUC between the two models is the most familiar statistic for estimating added value. The AUC is formally defined as the probability that, for a randomly selected pair consisting of one individual with and one without the event, the predicted probability of disease will be higher for the individual who has the event.12
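The pairwise definition of the AUC can be computed directly by comparing every event/non-event pair. The following Python sketch is illustrative only; the function name is hypothetical, and counting tied pairs as one half is a common convention assumed here rather than something stated in the text.

```python
def auc_concordance(probs, events):
    """AUC as the share of event/non-event pairs ranked correctly (ties count 1/2)."""
    pos = [p for p, e in zip(probs, events) if e == 1]
    neg = [p for p, e in zip(probs, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(pos) * len(neg))
```

The incremental value is then the difference between this quantity for the extended model and for the baseline model, computed on the same children.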
More recently, Pencina and colleagues proposed the NRI and IDI statistics for comparing prediction models.13 Both measures quantify appropriate reclassification after a new test is added to an existing model, that is, higher predicted probability for patients with the event and lower predicted probability for those without the event.
The NRI is the net proportion of individuals with the event who are reclassified into a higher risk category, plus the net proportion of individuals without the event who are reclassified into a lower risk category: $\mathrm{NRI} = (p_{\text{up,events}} - p_{\text{down,events}}) + (p_{\text{down,nonevents}} - p_{\text{up,nonevents}})$, where lower-case $p$ is the proportion of upward or downward movement into predefined risk categories. We used the risk categories of <10%, 10–30% and >30% based on the probability distribution of our models and the categories used by Metcalfe et al.7
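A categorical NRI of this form can be sketched as follows. The Python code is illustrative rather than the 'incrisk' implementation; the function names are hypothetical, and placing probabilities of exactly 10% and 30% in the middle category is an assumption based on how the categories are written.

```python
def risk_category(p, cutoffs=(0.10, 0.30)):
    """Return 0 for <10%, 1 for 10-30%, 2 for >30% predicted risk."""
    low, high = cutoffs
    if p < low:
        return 0
    if p <= high:
        return 1
    return 2

def nri(old_probs, new_probs, events, cutoffs=(0.10, 0.30)):
    """Net reclassification improvement across predefined risk categories."""
    up_e = down_e = up_n = down_n = 0
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    for old, new, e in zip(old_probs, new_probs, events):
        move = risk_category(new, cutoffs) - risk_category(old, cutoffs)
        if e == 1:
            up_e += move > 0
            down_e += move < 0
        else:
            up_n += move > 0
            down_n += move < 0
    # Upward moves are appropriate for events, downward moves for non-events.
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents
```

Children who stay in the same category contribute nothing, so the statistic depends only on movement across the two cut-offs.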
The IDI, a continuous form of the NRI across all risk thresholds from 0% to 100%, is the sum of the increase in probability among patients with the event and the decrease in probability among patients without the event. It is formally defined by: $\mathrm{IDI} = (\bar{P}_{\text{new,events}} - \bar{P}_{\text{old,events}}) - (\bar{P}_{\text{new,nonevents}} - \bar{P}_{\text{old,nonevents}})$, where upper-case $\bar{P}$ is the average predicted probability of the outcome in the old model or in the new extended model with the additional test.
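Because the IDI is the mean change in predicted probability among events minus the mean change among non-events, it needs no risk categories. A minimal Python sketch, with hypothetical function names, is:

```python
def _mean(values):
    return sum(values) / len(values)

def idi(old_probs, new_probs, events):
    """Integrated discrimination improvement from two sets of predicted probabilities."""
    old_e = [o for o, e in zip(old_probs, events) if e == 1]
    new_e = [n for n, e in zip(new_probs, events) if e == 1]
    old_n = [o for o, e in zip(old_probs, events) if e == 0]
    new_n = [n for n, e in zip(new_probs, events) if e == 0]
    # Increase among events plus decrease among non-events.
    return (_mean(new_e) - _mean(old_e)) - (_mean(new_n) - _mean(old_n))
```

A positive value means the extended model separates the two groups more widely on the probability scale than the baseline model does.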
Tenfold cross validation was performed on the data to avoid overestimating the incremental value. All 95% CIs for the AUC, NRI and IDI were obtained using 1000 bootstrap samples. An interval that did not cross the null value of 0 would represent statistically significant added value. All incremental value analyses were performed using the user-written package ‘incrisk’ in Stata, V.12.14
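A bootstrap interval of this kind can be sketched as below. The text does not say which bootstrap interval was used, so the percentile method is an assumption, and the function name, seed and row layout are hypothetical. Each element of `data` would be one child's record, for example a tuple of baseline probability, extended-model probability and outcome, and `statistic` would compute the AUC difference, NRI or IDI from a resampled list.

```python
import random

def bootstrap_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    stats = []
    for _ in range(n_boot):
        # Resample rows with replacement, keeping the original sample size.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        stats.append(statistic(sample))
    stats.sort()
    lo = stats[int(round((alpha / 2) * (n_boot - 1)))]
    hi = stats[int(round((1 - alpha / 2) * (n_boot - 1)))]
    return lo, hi
```

An interval whose lower bound is above 0 would correspond to the statistically significant added value described above.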
As shown in figure 1, 557 children with CXR and TST results had the T-SPOT.TB performed. In 29 (5%) children, the T-SPOT.TB was indeterminate (all due to high Nil control), and they were excluded from further analysis. The demographic and clinical characteristics of the remaining 528 children are summarised in table 1. The median age was 22 months (IQR 12–53). The prevalence of HIV infection was 24%, with most children in the late clinical stages of disease as defined by WHO.15 The median CD4 cell count was 489 cells/μl (IQR 226–832). Most children had findings consistent with PTB on CXR, while most were negative on TST and T-SPOT.TB. Culture-confirmed TB was diagnosed in 91 (17%) children, while only 37 (7%) were positive on smear microscopy. Based on clinical judgment, 296 (56%) children were diagnosed with PTB, for which they were started on treatment.
Predictors that were in the top six models according to the BMA procedure included cough >2 weeks, fever >2 weeks, night sweats, malaise, HIV infection, and history of household contact. These predictors comprised the clinical model. The Hosmer–Lemeshow goodness-of-fit test indicated no significant differences between the observed and predicted probability of TB (χ2=7.84, p=0.45). The cross-validated odds ratios for the logistic regression models are presented in table 2.
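The reported Hosmer–Lemeshow p value can be checked from the χ2 statistic. The degrees of freedom are not stated in the text; the sketch below assumes the conventional 10 risk groups, giving 8 degrees of freedom, and uses the closed-form upper tail of the χ2 distribution that holds for even degrees of freedom. The function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Assumption: 10 groups, hence 8 degrees of freedom.
p_value = chi2_sf_even_df(7.84, 8)
```

Under that assumption the result rounds to 0.45, in line with the value reported above.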
The incremental value analysis was performed in 491 children with negative smear results. With culture-confirmed TB as the outcome, the AUC for the baseline model with clinical predictors, CXR and TST was 85% (95% CI 80% to 90%). Addition of T-SPOT.TB to this model did not substantially improve its discriminatory ability, as the AUC increased only to 88% (95% CI 85% to 92%) and 87% (95% CI 84% to 92%) when the binary result or quantitative result was added, respectively (table 3). When using the NRI measure, binary T-SPOT.TB results (NRI=7%; 95% CI −8% to 31%) and quantitative SFC results (NRI=8%; 95% CI −7% to 25%) reclassified patients appropriately. However, both CIs included the null value. Table 4 provides a detailed calculation of the NRI for comparing the baseline model with the expanded model with binary T-SPOT.TB results. The numbers along the diagonal represent children who remained in the same risk category even after addition of T-SPOT.TB results. In children with confirmed PTB, the T-SPOT.TB increased the probability of TB in 20% (11/54) of children, while incorrectly decreasing the probability in 17% (9/54) of children. Thus, the net improvement for confirmed PTB cases was 3% (95% CI −10% to 28%). In children without confirmed PTB, the T-SPOT.TB correctly decreased the probability of TB in 9% (38/437) of children but increased the probability in 5% (24/437) of children for a net improvement of 4% (95% CI −1% to 7%).
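As a check on the arithmetic, the overall NRI for binary T-SPOT.TB results can be recovered from the reclassification counts quoted above (11 upward and 9 downward moves among 54 confirmed cases; 38 downward and 24 upward moves among 437 children without confirmed PTB):

$$
\mathrm{NRI} = \frac{11 - 9}{54} + \frac{38 - 24}{437} \approx 0.037 + 0.032 = 0.069 \approx 7\%
$$

The component values of 3% and 4% given in the text appear to come from subtracting the rounded percentages (20% − 17% and 9% − 5%); the unrounded components are about 3.7% and 3.2%, and both routes give a total of 7%.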
Considering all possible risk categories from 0% to 100%, the IDI showed that the T-SPOT.TB did not improve the average probability of TB (IDI=3% (95% CI 0% to 11%) for binary results and IDI=4% (95% CI 0% to 12%) for quantitative results). Figure 2 depicts the probability distributions for children with and without confirmed TB and the marginal improvements with addition of binary T-SPOT.TB results. As seen in this figure, adding T-SPOT.TB to the standard workup did not cause substantial increases in predicted probabilities for confirmed TB cases (ie, shifts to the right). For non-TB cases, the probabilities are appropriately concentrated to the left even without the addition of T-SPOT.TB.
When empirical treatment for active TB was used as the outcome of our model, T-SPOT.TB results increased the AUC marginally from 84% to 86% (AUC difference=2%; 95% CI 0% to 3%). A similar finding was seen with the IDI, in which the IDI was 2% (95% CI 1% to 6%) for binary results and 3% (95% CI 1% to 6%) for quantitative results. When risk categories were used for the NRI, the magnitude of improvement was larger in children who were not treated. The T-SPOT.TB correctly decreased the probability of disease in these children. However, the CI contained the null value. Overall, T-SPOT.TB results did not substantially reclassify smear-negative children into appropriate risk categories (table 3).
Although meta-analyses have shown that IGRAs have suboptimal sensitivity for active PTB in children,4,5 diagnostic tests such as the T-SPOT.TB are not performed alone. The diagnostic workup is inherently a sequential process that begins with patient demographics, then signs and symptoms, followed by simpler tests, and extending to more expensive and/or invasive tests. Unfortunately, diagnostic research that evaluates the added value of new TB tests beyond routinely measured variables and conventional tests is limited.
Our study is the first incremental value analysis of the IGRA in children, in whom diagnosis of active TB remains a challenge. We measured the added value of T-SPOT.TB in a high-burden setting using the difference in AUC and the NRI and IDI measures that have been more recently described. Our findings show that T-SPOT.TB does not have utility in supporting a diagnosis of active PTB in hospitalised smear-negative children beyond the clinical history and standard tests. When culture-confirmed TB was used as the outcome in the analysis, T-SPOT.TB did not increase the discriminatory ability of the model according to the AUC. Furthermore, it did not correctly reclassify a substantial number of children, whether using prespecified risk categories with the NRI or continuous risk probabilities with the IDI. In a secondary analysis, the findings were similar when empirical TB treatment based on clinical judgment was used as the outcome.
Two studies have been published thus far on the incremental value of IGRAs in smear-negative adult patients using culture-confirmed TB as the gold standard.6,7 Both studies evaluated the QuantiFERON-TB (QFT, Cellestis/Qiagen, Australia) assay. Metcalfe and colleagues found that quantitative interferon γ results significantly improved discrimination according to the AUC and appropriately reclassified patients using the NRI when added to patient demographic and clinical characteristics in a low-incidence setting.7 The added value of the QFT, however, was lower when predictors based on subjective clinical judgment of risk (low, medium or high) were incorporated into their model. In the second study, Rangaka and colleagues evaluated the use of the QFT to screen for active TB in HIV-infected adults before starting isoniazid preventive therapy.6 According to the AUC, the QFT had less added value than the TST for improving upon a baseline clinical model. Based on evidence from these incremental value studies and those based on test accuracy,16 WHO recommended against the use of IGRAs for diagnosing active TB disease in low and middle income countries.17 Our findings support this negative recommendation for children in high-burden settings as well.
Strengths and limitations
In our methodological approach, we used a BMA procedure that considers all possible combinations of predictors. This model selection process and the cross-validation step avoid developing an overly optimistic model that would have worse performance when applied to other data. Furthermore, we assessed the incremental value using two newer measures that are based on risk probabilities. One major criticism of the AUC as an overall measure of model performance has been its lack of direct clinical interpretation for individual patients. The NRI and IDI are more informative than the AUC in that they calculate the increase in appropriate reclassification separately for events and non-events.
Our study also had several limitations. Only results from one reader were used for most CXRs, so that a consensus read was not possible. Furthermore, we could not evaluate specific radiological findings, such as lobar consolidation or lymphadenopathy, in our analysis. In the secondary analysis in which empirical TB treatment based on clinical judgment was used as the outcome of the model, incorporation bias may be an issue. This type of bias would generally overestimate the performance of tests that are also used in assigning the final outcome. This problem is difficult to avoid in the context of childhood TB, as the absence of an adequate reference standard requires that the entire clinical picture be taken into account when making a final diagnosis.
Our study recruited children from secondary-care and tertiary-care hospitals, where disease is more severe and treating physicians are highly specialised and/or more prone to overtreatment. Thus, our findings may not be applicable to primary-care facilities with less experienced clinicians or to community and ambulatory settings where children would present with milder forms of disease. Furthermore, the availability of resources in our hospital setting may not be representative of other settings in which certain diagnostic tests are not standard of care. In fact, smear microscopy is being phased out as the first-line diagnostic test in South Africa with the nationwide rollout of the rapid Xpert MTB/RIF test (Cepheid, USA).18 Further analysis will be required to determine the incremental value of T-SPOT.TB in children in whom Xpert MTB/RIF is negative. Moreover, our study findings may not be generalisable to settings with low prevalence of HIV coinfection.
The lack of a gold standard to confirm the diagnosis further complicates research on childhood TB. Recently, efforts have been made to develop practical reference standards based on consensus clinical case definitions19 and to suggest a standardised approach for reporting findings so that results are comparable across studies.20 These are important steps towards including children in diagnostic studies and making this group a priority for new TB technologies.
We thank the clinical study and laboratory staff, the staff at Red Cross War Memorial Children's Hospital and New Somerset Hospital, and the children and their caregivers for participating. We also thank Dick Menzies for providing helpful advice on this study.
Contributors DIL: designed the study, performed the analysis and wrote the manuscript as first author. MPN: participated in the study design, provided input on the analysis and supervised the data collection. MP: participated in the study design and provided input on the analysis. SP: performed the T-SPOT.TB assay. ND: provided guidance on the methods used in the study. HJZ: designed the study, provided input on the analysis and supervised the study.
Funding This work was supported by the National Institutes of Health, USA (grant number 1R01HD058971-01), National Health Laboratory Services Research Trust, Medical Research Council of South Africa, and The Wellcome Trust (grant number 085251/B/08/Z). DL is funded by a doctoral research studentship from the Canadian Thoracic Society and the European and Developing Countries Clinical Trials Partnership (TBNEAT grant). MP is the recipient of a Canadian Institutes of Health Research New Investigator Award and a Fonds de Recherche Quebec Santé (FRQS) Salary Award. ND is supported by a FRQS Salary Award.
Competing interests None.
Ethics approval The study was approved by the Faculty of Health Sciences, Research Ethics Committee at the University of Cape Town.
Provenance and peer review Not commissioned; externally peer reviewed.