Original research
Challenging the gold standard: the limitations of molecular assays for detection of Mycobacterium tuberculosis heteroresistance
  1. Sarah N Danchuk1,2,3,
  2. Ori E Solomon1,2,3,
  3. Thomas Andreas Kohl4,
  4. Viola Dreyer4,
  5. Ivan Barilar5,
  6. Christian Utpatel4,
  7. Stefan Niemann6,
  8. Dick van Soolingen7,
  9. Richard Anthony7,
  10. Jakko van Ingen8,
  11. Joy S Michael9,
  12. Marcel A Behr1,2,3,10
  1. 1 Department of Microbiology and Immunology, McGill University, Montreal, Quebec, Canada
  2. 2 Infectious Diseases and Immunity in Global Health Program, Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada
  3. 3 McGill International TB Centre, McGill University, Montreal, Quebec, Canada
  4. 4 Molecular and Experimental Mycobacteriology, Research Center Borstel Leibniz Lung Center, Borstel, Schleswig-Holstein, Germany
  5. 5 German Centre for Infection Research, Research Centre Borstel, Borstel, Germany
  6. 6 Research Center Borstel Leibniz Lung Center, Borstel, Schleswig-Holstein, Germany
  7. 7 RIVM, Bilthoven, The Netherlands
  8. 8 Radboudumc, Nijmegen, Gelderland, The Netherlands
  9. 9 Microbiology, Christian Medical College and Hospital Vellore, Vellore, Tamil Nadu, India
  10. 10 Department of Medicine, McGill University, Montreal, Quebec, Canada
  1. Correspondence to Dr Marcel A Behr, McGill University, Montreal, Québec H4A 3J1, Canada; marcel.behr{at}


Objectives Heteroresistant infections are defined as infections in which a mixture of drug-resistant and drug-susceptible populations is present. In Mycobacterium tuberculosis (M. tb), heteroresistance poses a challenge in diagnosis and has been linked with poor treatment outcomes. We compared the analytical sensitivity of molecular methods, such as GeneXpert and whole genome sequencing (WGS), in detecting heteroresistance with that of the ‘gold standard’ phenotypic assay: the agar proportion method (APM).

Methods Using two rounds of proficiency surveys with defined monoresistant BCG strains and mixtures of susceptible/resistant M. tb, we determined the limit of detection (LOD) of known resistance associated mutations.

Results The LOD for rifampin-R (RIF-R) detection was 1% using APM, 60% using GeneXpert MTB/RIF, 10% using GeneXpert MTB/RIF Ultra and 10% using WGS. While WGS could detect mutations beyond those associated with RIF resistance, the LOD for these other mutations was also 10%. Additionally, we observed instances where laboratories did not report resistance in the majority population, yet the mutations were present in the raw sequence data.

Conclusion The gold standard APM detects minority resistant populations at a lower proportion than molecular tests. Mycobacterium bovis BCG strains with defined resistance and extracted DNA from M. tb provided concordant results and can serve in quality control of laboratories offering molecular testing for resistance. Further research is required to determine whether the higher LOD of molecular tests is associated with negative treatment outcomes.

  • Tuberculosis
  • Respiratory Infection
  • Bacterial Infection

Data availability statement

All data relevant to the study are included in the article. Genomic sequence data are available at NCBI: SAMN38121993 and SAMN38121994.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



  • Antibiotic regimens used in Mycobacterium tuberculosis (M. tb) treatment have significant adverse effects and are ideally tailored to the susceptibility profile of the patient isolate.

  • Heteroresistant strains of M. tb pose challenges for the detection of drug resistance and are associated with negative patient outcomes.


  • This study highlights limitations in targeted (GeneXpert) and genome-wide molecular testing for drug resistance and calls for the standardisation of whole genome sequencing (WGS) in clinical microbiology.

  • By comparing GeneXpert, in vitro phenotypic testing and WGS-based approaches head-to-head, we outline and report the limitations of these methods in the context of heteroresistant M. tb isolates.


  • Clinicians who supervise tuberculosis care need to be aware of the limitations of available diagnostic tools.

  • Establishing limitations using safe and feasible quality control tests is important to ensure effective implementation of established methods and new approaches.


Tuberculosis (TB) is the second leading cause of mortality by a single infectious agent and the thirteenth leading cause of death globally.1 Eradication efforts have been largely hindered by the emergence of drug-resistant Mycobacterium tuberculosis (M. tb) (DR-MTB), the causative agent of TB.1 In 2021, there were an estimated 450 000 cases of rifampicin-resistant TB/multidrug-resistant TB (RR-TB/MDR-TB) and 119 000 deaths attributed specifically to this.1 The cornerstone of DR-TB management is detection as inadequate treatment can result in failure to cure at the individual level and onward propagation of DR-MTB isolates to their contacts.2

Since the 1960s, the reference method for detecting DR-MTB has been phenotypic testing by the agar proportion method (APM).3 The APM was developed on the premise that when a certain proportion of bacilli is resistant to the antibiotic, clinical success is unlikely. This quantitative property has been operationally reduced into a dichotomous result (resistant or susceptible) based on a threshold of 1% resistance.4 Patients in whom more than 1% of the M. tb population grows in the presence of the antibiotic are not expected to respond to treatment with that antibiotic as the resistant fraction is expected to dominate within several doubling times following initiation of treatment. The use of a 1% threshold in the M. tb world contrasts with a 0.1% threshold used in other sections of the microbiology laboratory; presumably, resistant bacteria at less than 1% can be managed by multidrug therapy.4 5

Culture-based drug-susceptibility testing (DST) is difficult and costly to implement and poses a turnaround-time challenge, as clinicians aim to start therapy promptly for newly diagnosed patients. To mitigate these challenges, molecular tests have been implemented globally for (1) the detection of M. tb and (2) the determination of first-line and second-line drug susceptibility.6 These tests span from simple tests used in peripheral laboratories (eg, the GeneXpert MTB/rifampin (RIF) assay; Cepheid, Sunnyvale, California, USA) to whole genome sequencing (WGS)-based prediction of resistance.7 Given the importance of molecular testing for DR-MTB detection, our laboratory generated a panel of first-line and second-line monoresistant Mycobacterium bovis BCG strains to serve as quality controls for both phenotypic and genotypic testing;8 the use of these strains to validate the detection of M. tb in a peripheral setting has previously been described.8 Subsequently, we used a pre-extensively drug-resistant isolate (pre-XDR-TB) as a follow-up control for molecular testing. In this study, we sought to evaluate the capacity of molecular assays to detect heteroresistance, using defined mixtures of both monoresistant BCG and M. tb strains.


Strains used

M. bovis BCG: wild type (WT), RIF-R (RpoB S450L), isoniazid (INH)-R (KatG AAdel428), fluoroquinolones (FQ)-R (GyrA D94G), clofazimine (CFZ)/bedaquiline (BDQ)-R (Rv0678c S63R), streptomycin (STR)-R (RpsL K43R), INH/FQ-R (KatG S315T+GyrA D94G)

M. tb strains: M. tb 20527 (pan-susceptible clinical isolate); M. tb 21697 (pre-XDR clinical isolate). All resistance-conferring mutations are described in table 1.

Table 1

Strains used in experimental studies with associated resistance profile

Heteroresistant samples

The development of monoresistant BCG strains has been previously published.8 As APM detects resistant organisms at 1% of the bacterial population,4 9 we created mixtures of BCG strains at different ratios, to test detection at 50%, 10% or 1%. To validate our findings against published observations about the GeneXpert MTB/RIF,10 11 we also generated a 60% RIF-R mixture (table 2).

Table 2

Phenotypic and molecular detection of RIF heteroresistance

Agar proportion method

WT M. bovis BCG and RIF-R M. bovis BCG (RpoB S450L) were grown in 7H9 complete media (0.2% glycerol, 0.1% Tween, 10% albumin, dextrose, catalase (ADC) supplement) to log phase (optical density at 600 nm (OD600) 0.5–1). WT cultures were grown in the absence of antibiotics whereas RIF-R BCG was grown in the presence of 1 µg/mL RIF. Cultures were adjusted to 0.5 McFarland standard using 7H9 complete media and serially diluted (10⁻² and 10⁻⁴). Then, 0.1 mL of each culture mix was inoculated on 7H10 agar plates supplemented with 10% oleic acid, albumin, dextrose, catalase (OADC) containing (1) no antibiotics, (2) 1 µg/mL levofloxacin (LFX), (3) 1 µg/mL INH, and (4) 1 µg/mL RIF quadrants (figure 1).4 Cultures were incubated at 37°C for 3 weeks and colony forming units (CFUs) were counted. Resistance was defined as CFUs on antibiotic quadrant >1% of antibiotic-free quadrant.4
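The resistance definition above reduces to a simple ratio of colony counts. The following is a minimal sketch of that arithmetic, not the laboratory's actual procedure: the function names, the dilution-correction step and the example counts are assumptions introduced purely for illustration.

```python
def resistant_fraction(cfu_drug, cfu_control, dilution_drug=1.0, dilution_control=1.0):
    """Fraction of the population that grows on the antibiotic quadrant.

    Counts are back-calculated to the undiluted culture by dividing each
    count by the dilution it was plated at (for example 1e-2 or 1e-4).
    """
    if cfu_control <= 0:
        raise ValueError("the antibiotic-free quadrant must show growth")
    corrected_drug = cfu_drug / dilution_drug
    corrected_control = cfu_control / dilution_control
    return corrected_drug / corrected_control


def apm_call(cfu_drug, cfu_control, dilution_drug=1.0, dilution_control=1.0,
             threshold=0.01):
    """Dichotomous call: resistant only if the fraction exceeds the 1% threshold."""
    fraction = resistant_fraction(cfu_drug, cfu_control, dilution_drug, dilution_control)
    return "resistant" if fraction > threshold else "susceptible"


# Hypothetical counts: 120 colonies on the RIF quadrant at 10^-2 and
# 50 colonies on the antibiotic-free quadrant at 10^-4.
example_call = apm_call(120, 50, dilution_drug=1e-2, dilution_control=1e-4)
```

With these illustrative counts the corrected fraction is about 2.4%, which is above the 1% threshold and so would be read as resistant.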

Figure 1

Left: Schematic of the quadrant plate used in the laboratory for phenotypic detection. INH (1 µg/mL) and levofloxacin (LFX) (1 µg/mL) were used as no-growth controls (100% drug susceptibility) whereas the ‘no antibiotics’ quadrant was used as a viability control (100% growth). Right: BCG RIF-R (RpoB S450L) and wild type (WT) strains were prepared in a 1:99 mixture. Cultures were resuspended to 0.5 McFarland standard, serially diluted to 10⁻² and 10⁻⁴, and plated on 7H10 agar + oleic acid albumin dextrose catalase (OADC), as per CLSI guidelines. Limit of detection = 1% (n=4). INH, isoniazid; RIF, rifampin.

GeneXpert MTB/RIF + GeneXpert MTB/RIF Ultra

For both GeneXpert MTB/RIF and GeneXpert MTB/RIF Ultra (hereafter referred to as Xpert and Xpert Ultra, respectively), assays were performed per manufacturer instructions (Cepheid, Sunnyvale, California, USA). Briefly, 0.5 mL of sample +1.5 mL of sample reagent (SR, included with assay) was aliquoted into a 15 mL tube, vortexed for 10 s and incubated at room temperature (RT) for 10 min. The sample was then vortexed for an additional 10 s, incubated for 5 min at RT and added to the GeneXpert cartridge.

Genomic DNA (gDNA) extraction

Strains of interest were grown in 7H9 complete media to an OD600 of 0.8–1. These single cultures were passaged twice before subsequent gDNA extraction. gDNA was extracted from RIF-R (RpoB S450L), INH-R (KatG AA428del), FQ-R (GyrA D94G) and CFZ/BDQ-R (Rv0678c S63R) monoresistant M. bovis BCG strains using the van Soolingen protocol (previously described)12 whereas gDNA was extracted from M. tb 20527, M. tb 21697 and BCG INH/FQ-R using the Qiagen UCP Pathogen Mini Kit with a modified mechanical lysis protocol as previously described.13 Concentration (ng/µL) was measured using the Quant-iT PicoGreen dsDNA assay per the manufacturer’s protocol (Life Technologies Corporation, Eugene, Oregon, USA). Samples were prepared in triplicate and one aliquot was sent to each of the three laboratories for analysis.

Whole Genome Sequencing (WGS) analysis

For our initial round of proficiency testing, gDNA from resistant BCG strains was prepared in the proportions of 50:50, 10:90 and 1:99 (tables 3 and 4) and sent to the testing laboratories, blinded to sample identity, for both first-line and second-line antibiotic assessment. In the second round, we prepared M. tb 21697 (pre-XDR) in the proportions of 25:75, 10:90 and 1:99 (M. tb 21697: M. tb 20527). Additionally, four resistant BCG strains were combined (to create a simulated XDR strain) and prepared in proportions of 25:75, 10:90 and 1:99 (resistant: WT strain) as described in table 5. Each laboratory received a single aliquot of each sample to mimic the clinical setting where patient specimens are processed and reported as stand-alone results. Heteroresistant mixtures were then evaluated by WGS using in-house bioinformatic pipelines for isolate characterisation and drug susceptibility as per laboratory-specific protocol (previously described).14–17 Reports were returned and interpreted internally. Both M. tb clinical isolates were sequenced and genomic-based resistance prediction was done using TB-profiler (V.4.4.2)18 with a median depth of coverage >250 x. Sequences of M. tb 20527 and M. tb 21697 are available from the National Center for Biotechnology Information (NCBI) with accession numbers SAMN38121993 and SAMN38121994, respectively.
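As an illustration of how defined proportions such as 10:90 could be prepared from two gDNA stocks of known concentration, the sketch below computes the volume of each stock to combine. The text does not state whether mixtures were made by mass or by volume; this example assumes mixing by DNA mass and genomes of similar size, and the function name, concentrations and total mass are hypothetical.

```python
def mix_volumes(frac_resistant, conc_resistant, conc_wt, total_ng):
    """Volumes (in µL) of resistant and WT gDNA stocks for a target mass fraction.

    frac_resistant: desired fraction of resistant gDNA by mass (0 to 1)
    conc_resistant, conc_wt: stock concentrations in ng/µL
    total_ng: total DNA mass wanted in the final mixture
    """
    if not 0.0 <= frac_resistant <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if conc_resistant <= 0 or conc_wt <= 0:
        raise ValueError("concentrations must be positive")
    mass_resistant = frac_resistant * total_ng
    mass_wt = total_ng - mass_resistant
    return mass_resistant / conc_resistant, mass_wt / conc_wt


# Hypothetical stocks: resistant at 50 ng/µL, WT at 25 ng/µL, 1000 ng in total.
for fraction in (0.25, 0.10, 0.01):
    vol_r, vol_wt = mix_volumes(fraction, 50.0, 25.0, 1000.0)
    print(f"{fraction:.0%} resistant: {vol_r:.2f} µL resistant + {vol_wt:.2f} µL WT")
```

At the 1% level the resistant volume becomes very small relative to the WT volume, so in practice an intermediate dilution of the resistant stock may be easier to pipette accurately.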

Table 3

Resistance calling following whole genome sequencing (WGS) of blinded samples

Table 4

Resistance calling following whole genome sequencing (WGS) of blinded samples

Table 5

Follow-up study of whole genome sequencing (WGS) resistance calling by laboratory A


Minority populations at the limit of 1% resistance were detected using the phenotypic assay (APM) (figure 1). As has been previously published, Xpert was unable to accurately detect RIF-R at 50%.10 11 To verify the threshold of detection, the percent of RIF-R bacteria present within the sample was increased, confirming that the assay can detect RIF-R at a proportion of ≥60%. When the same samples were run using Xpert Ultra, minority populations were detected at proportions ≥10%. When this was further diluted to 5%, the Xpert Ultra failed to detect the minority population (table 2).

Proficiency testing

To evaluate the limit of detection (LOD) and capabilities of clinical WGS, we sent mixed monoresistant populations to three reference laboratories. Samples were analysed and a resistance report was returned as would be done to guide ‘patient’ treatment by local medical practitioners (figure 2). All laboratories were consistently able to detect pyrazinamide (PZA) resistance, characteristic of M. bovis BCG due to a mutation in pncA, in addition to 100% RIF-R (RpoB I491F) and 100% STR-R (RpsL K43R) control strains. A second RIF-R (RpoB S450L) strain was mixed with an INH-R strain (KatG del428 mutant) in the proportions described in table 2. Two of the three laboratories were able to detect the INH-R mutant. The third laboratory detected the mutation in their variant calling but it was not included in the final resistance report. For RIF-R detection, all three laboratories detected minority RIF-R populations at 10% but none were able to detect at 1% (table 2).

Figure 2

Preparation of heteroresistant strains and subsequent workflow from initial assessment. For first-line antibiotics ‘strain A’ =RIF-R BCG (S450L) and ‘strain B’=INH-R BCG (KatG428del). For second-line antibiotics ‘strain A’ =FQ-R BCG (GyrA D94G) and ‘strain B’ =CFZ/BDQ-R (Rv0678c S63R). For both first-line and second-line antibiotics ‘strain A’ composed the minority population whereas ‘strain B’ was the majority. Following whole genome sequencing (WGS) and variant calling, resistance reports were generated, and drug susceptibility was determined. Figure created with BDQ, bedaquiline; FQ, fluoroquinolone; INH, isoniazid; RIF, rifampin

An attractive characteristic of clinical WGS is that numerous antibiotics can be assessed using a single assay. We evaluated second-line resistance detection using the FQ-R (GyrA D94G) and CFZ/BDQ-R (Rv0678c S63R) strains described in table 3. Concordant with results for first-line antibiotics, the LOD for second-line detection was 10%. Two out of three laboratories could accurately detect FQ-R at 10% of total sample, but not at 1%. One laboratory was unable to detect FQ-R at either 1% or 10%. Comparably, CFZ/BDQ-R was also accurately reported by two out of three laboratories. The third laboratory detected the mutation (Rv0678c S63R) in the variant calling but did not include it in the final resistance report at any proportion. This was remedied in later testing, following an update to the resistance catalogue used in their pipeline.

Follow-up testing

In subsequent proficiency testing we assessed: (1) the detection of a pre-XDR M. tb isolate, at 25%, 10% and 1% (samples A–C); and (2) the detection of an ‘XDR’ strain, composed of four monoresistant BCG strains on a WT background, again at 25%, 10% and 1% (samples D–F). Two out of the three laboratories from the first round of proficiency testing participated in this follow-up assessment (table 5).

Laboratory A

Laboratory A had a median coverage of ~159 x.


At 25% (sample A), the proper resistance call was made with a high degree of confidence, at a depth of ~90 x. At 10% (sample B), this same sample was called ‘non-MDR’. This was due to the fact that the genetic marker for RIF resistance was not detected, despite the correct calling of the other resistance-associated single nucleotide polymorphisms (SNPs) present (table 5). At 1% (sample C), resistance-associated SNPs were not detected, resulting in a final call of ‘non-MDR’. However, a PncA mutation (V131I), not present in our clinical isolates, was detected in this sample.


Samples D–F contained mixtures of five BCG strains (four monoresistant strains on a WT background). At 25% (sample D) and at 10% (sample E), the monoresistant SNPs were detected. At 1%, only one of five SNPs was detected (RpoB S450L). As a result, samples D and E were correctly called as XDR whereas sample F was called ‘MDR’ due to the identification of the RpoB S450L, as well as a BCG-associated mutation in MmaA3 that has been linked to low-level INH resistance.19

Laboratory B

Despite the DNA meeting the minimum requirements for quality and quantity, four of six sequencing runs provided insufficient depth of coverage for definitive resistance genotyping. However, first-line resistance profiles of our blinded samples were returned and interpreted (median depth of coverage ~13 x). Sequencing could not be repeated due to a lack of remaining material.


Only one first-line SNP could be confidently reported (indicated as more than four high quality mutant reads) in our M. tb clinical isolate mixtures (KatG S315T, sample A). This mutation was detected through the visual interpretation of data rather than through automated calls.


For samples D and E, resistance SNPs were correctly identified in BCG heteroresistant mixes with an LOD of 10%. 1% heteroresistance could not be interpreted due to insufficient depth of coverage (2.5 x).
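The link between depth of coverage and detection of a minority variant can be made concrete with a simple binomial calculation. This is a sketch under stated assumptions rather than any laboratory's method: it assumes each read samples the mutant allele independently at the mixture frequency, ignores sequencing and mapping error, and applies the "more than four mutant reads" criterion mentioned above (at least five reads) at other depths purely for comparison. The function name is hypothetical.

```python
from math import comb


def prob_at_least(k, depth, freq):
    """P(X >= k) where X ~ Binomial(depth, freq) is the number of mutant reads."""
    p_fewer = sum(
        comb(depth, i) * freq**i * (1.0 - freq) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


# Depths reported in the text (~13 x and ~159 x) against the tested proportions.
for depth in (13, 159):
    for freq in (0.25, 0.10, 0.01):
        p = prob_at_least(5, depth, freq)
        print(f"depth {depth:>3} x, {freq:.0%} mutant: P(>=5 mutant reads) = {p:.4f}")
```

Under these assumptions, a 10% population at ~13 x yields on average about one mutant read, so reaching five reads is unlikely, whereas at ~159 x the same population would be expected to yield about 16 mutant reads.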


Adequacy of TB treatment hinges on effective and accurate diagnosis. Recently, heteroresistance and mixed infections have become a valid and rising concern in TB diagnosis and treatment.2 Previous studies have shown that up to 20% of tested clinical isolates contain heteroresistant populations, which is particularly concerning as these mixed populations may lead to treatment failure and subsequent DR-TB.20–22 Without adequate diagnostics, patient care and public health may be compromised: not only will the patient be exposed to the adverse effects of ineffective antibiotics, but DR-MTB may be selected due to inadequate treatment. Our results confirm that phenotypic testing, such as APM, is able to detect resistant populations down to 1%, whereas current molecular tests cannot. WGS offers the promise of detecting mutations across the complete genome, beyond the limited regions covered by probe-based assays such as the GeneXpert MTB/RIF assays. However, like the GeneXpert Ultra, this method has an LOD of roughly 10%, one order of magnitude higher than APM.

Our study did not evaluate the clinical utility of WGS. Rather, given that there are laboratories offering these results, we set out to evaluate whether the methodological variances between these laboratories affect the detection and reporting of resistance-associated mutations. Through this study we observed three main caveats to clinical WGS. First, if the in-house sequencing and analysis pipelines are not optimised to include allele frequencies below 10%, these variants will be filtered from analysis despite being present. New molecular tests with increased depth of coverage, such as the Deeplex assay, have been shown to lower the LOD to 5%.23 However, as seen in our study, variant calling at a lower frequency and read depth may lead to an increase in false-positive calls due to contaminant species reads or sample cross-contamination.24 25 Second, if the mutation is non-canonical (or otherwise absent from the laboratory resistance mutation catalogue), the sample may be reported as drug susceptible, as we observed for the KatG del428 strain. For the Rv0678c S63R mutant, the role of this mutation in BDQ resistance is contested,26 which highlights the need for studies using allelic exchange to ascertain which resistance-associated mutations are the cause of the phenotype. These findings emphasise the importance of constantly updating resistance mutation catalogues when using genotypic DST. Third, implementing WGS into the clinical diagnostic pipeline requires technical proficiency (ie, tools to detect the mutations) as well as translational proficiency (ie, effective communication of results to physicians). The standardisation of reporting is crucial to the effective treatment of patients.
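The first two caveats can be illustrated with a toy reporting step. The sketch below is not any participating laboratory's pipeline: the function, the 10% cut-off as a default, the miniature catalogue and all read counts are hypothetical, chosen only to show how a variant that is present in the raw data can still be missing from the final report.

```python
def report_resistance(variants, catalogue, min_af=0.10):
    """Split called variants into reported and unreported lists.

    variants: list of dicts with 'mutation', 'alt_reads' and 'depth'
    catalogue: dict mapping known resistance mutations to a drug
    min_af: minimum allele frequency kept by the (hypothetical) pipeline
    """
    reported, unreported = [], []
    for v in variants:
        af = v["alt_reads"] / v["depth"] if v["depth"] else 0.0
        if af < min_af:
            unreported.append((v["mutation"], "below allele-frequency cut-off"))
        elif v["mutation"] not in catalogue:
            unreported.append((v["mutation"], "absent from catalogue"))
        else:
            reported.append((v["mutation"], catalogue[v["mutation"]]))
    return reported, unreported


# Hypothetical catalogue and read counts for illustration only.
catalogue = {"RpoB S450L": "RIF", "GyrA D94G": "FQ"}
variants = [
    {"mutation": "RpoB S450L", "alt_reads": 2, "depth": 200},    # 1% minority population
    {"mutation": "GyrA D94G", "alt_reads": 30, "depth": 200},    # 15% minority population
    {"mutation": "KatG del428", "alt_reads": 180, "depth": 200}, # majority, not catalogued
]
reported, unreported = report_resistance(variants, catalogue)
```

Here only the GyrA D94G variant reaches the report; the RpoB S450L reads are filtered by frequency and the KatG deletion is dropped because the catalogue does not list it, even though both are present in the data.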

Clinicians supervising TB care and laboratories offering molecular testing should be aware of these analytical limitations. For example, in laboratory A, sample B (10% pre-XDR population) was returned as ‘non-MDR’. The clinical implication of this is clear: a clinician who reads that an isolate is ‘non-MDR’ may take no further measures to confirm the validity of this call and will continue with first-line antibiotics. In this case, had patient X had a 10% heteroresistant infection, treatment with RIF, INH, PZA and ethambutol would likely be unsuccessful, given that the organism is resistant to three out of four first-line drugs, fluoroquinolones, and streptomycin.

We acknowledge that this study has limitations. Only three laboratories accepted our invitation to participate in this quality control study and were only provided a single aliquot of each sample to be analysed using their WGS pipelines. This was done to mimic conditions in a clinical laboratory setting in which (1) each patient sample is processed as a single biological unit and (2) access to patient samples may be limited. Regardless, this study underscores the importance of evaluating new molecular methods for their ability to detect minority populations and offers a straightforward method to do so. Another limitation was the observed methodological variability between the different labs, including their approach to reporting first-line and second-line antibiotic resistance. The lack of standardisation in the field means that extrapolating findings in this study to laboratories beyond those that have participated must be done cautiously. Finally, a third limitation is our use of the traditional 1% threshold for phenotypic resistance. Our data indicate that targeted and genome-wide molecular assays can only detect down to 10% resistance, but the proportion that is clinically relevant remains to be defined. With rates of DR-MTB rising worldwide, a research priority is determining the proportion of resistance that is clinically relevant, to best inform clinicians on the optimal therapy to offer their patients.

One potential solution to the limitations we have documented is for reference laboratories to retain phenotypic testing capacity, for cases not responding to therapy and/or for a random sample of isolates, to ensure concordance of molecular predictions with phenotypic results. Regulators and developers of novel diagnostic tests must also consider the importance of detecting minority or emerging resistance populations, and the consequences of ignoring them. This is exemplified by the WHO’s ‘Target Product Profile for next-generation drug susceptibility testing for M. tb at peripheral centres’, in which the minimal requirement for minor variant detection is 20%.27 The consequence of this adjustment is currently unknown. Though it is crucial to have patients started on antibiotic therapy as quickly as possible, it is also necessary for this therapy to be correct.


Molecular tests, whether targeted to resistance-associated loci or genome wide, offer newer and potentially faster means of detecting DR-, MDR- and XDR-MTB. The GeneXpert assay is commonly used in many high-burden DR-MTB settings, despite having an LOD of 60%. By comparison, GeneXpert Ultra and WGS had similar LODs at around 10%, which is 10-fold higher than standard phenotypic testing (1%). Ultimately, with the introduction of new methods, and the increasing recognition of heteroresistant infections, clinical gaps remain and must be addressed to ensure the best patient care.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.



  • SND and OES contributed equally.

  • Contributors SND, OES and MAB conceived of the study, prepared the materials, performed phenotypic testing, analysed the data and wrote the final manuscript. JSM contributed the GeneXpert Ultra analysis. TAK, VD, IB, CU, SN, DvS, RA and JvI contributed to the WGS testing and analysis. All authors reviewed the final version of the manuscript. MAB is the guarantor.

  • Funding Funding was provided through the Canadian Institutes of Health Research (FDN-148362).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
