Original article
Identifying areas and risk groups with localised Mycobacterium tuberculosis transmission in northern England from 2010 to 2012: spatiotemporal analysis incorporating highly discriminatory genotyping data
María Saavedra-Campos (1,2,3), William Welfare (4,5), Paul Cleary (2), Andrew Sails (6), Andy Burkitt (7), Daniel Hungerford (2), Ebere Okereke (8), Peter Acheson (9), Marko Petrovic (4,5)

  1. Field Epidemiology Training Programme, Public Health England, Liverpool, UK
  2. Field Epidemiology Services, Public Health England, Liverpool, UK
  3. European Programme for Intervention Epidemiology Training, European Centre for Diseases Prevention and Control, Stockholm, Sweden
  4. Greater Manchester Public Health England Centre, Public Health England, Manchester, UK
  5. Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
  6. Microbiology Services, Public Health England, Newcastle upon Tyne, UK
  7. Field Epidemiology North East, Public Health England, Newcastle upon Tyne, UK
  8. Yorkshire & Humber Public Health England Centre, Public Health England, Leeds, UK
  9. North East Public Health England Centre, Public Health England, Newcastle upon Tyne, UK
Correspondence to María Saavedra-Campos, Field Epidemiology Services, Public Health England, 5th Floor, Rail House, Lord Nelson Street, Liverpool, L1 1JF, UK; maria.saavedra-campos{at}

Abstract

Background Information on geographical variation in localised transmission of TB can inform targeting of disease control activities. The aim of this study was to estimate the proportion of TB attributable to localised transmission for the period 2010–2012 in northern England and to identify case characteristics associated with spatiotemporal-genotypical clusters.

Methods We combined genotyping data with spatiotemporal scan statistics to define an indicator of localised TB transmission and used a multivariable logistic regression model to identify factors associated with this indicator.

Results The estimated proportion of TB cases in northern England attributable to localised transmission was 10% (95% CI 9% to 12%). Clustered cases (cases which were spatiotemporally clustered with others of identical genotype) were on average younger than non-clustered cases (mean age 34 years vs 43 years; p value <0.05). Being UK born (adjusted OR (aOR) 3.6, 95% CI 2.9 to 6.0), presenting with pulmonary disease (aOR 2.2, 95% CI 1.3 to 3.6) and history of homelessness (aOR 2.8, 95% CI 1.2 to 6.8) or incarceration (aOR 2.6, 95% CI 1.2 to 5.9) were independently associated with being part of a spatiotemporal-genotypical cluster in a multivariable model. Belonging to an ethnic group other than white or mixed/other was also significantly associated with localised transmission. We identified localised transmission in 103/1958 middle super output areas mostly in urban areas.

Conclusions Incorporating highly discriminatory genotyping data into spatiotemporal analysis of TB incidence is feasible as part of routine surveillance and can provide valuable information on groups at greater risk and areas with localised transmission of TB, which could be used to inform control measures, such as intensified contact tracing.

  • Tuberculosis
  • Clinical Epidemiology
  • Respiratory Infection

Key messages

What is the key question?

  • Can we identify risk groups and geographical areas where new TB cases are likely to represent person-to-person transmission, in order to inform targeted control measures?

What is the bottom line?

  • Using a novel method that applies spatiotemporal statistics to a large UK TB data set with highly discriminatory 24-locus mycobacterial interspersed repetitive unit variable number tandem repeat (MIRU-VNTR) profiles for culture-positive cases, we estimated that one in every 10 cases of TB in the North of England was highly likely to be the result of localised transmission. The following case characteristics were significantly associated with localised TB transmission: birth in the UK, pulmonary disease, a history of homelessness or incarceration and belonging to an ethnic group other than white or mixed/other.

Why read on?

  • We describe a method using freely available statistical software to add a genetic dimension to spatiotemporal analyses, the application of which has provided valuable information on transmission of TB in the North of England, and which has potential applicability to the study of the distribution of other diseases.

Introduction

The prevention of transmission of TB is the key to reducing incidence. Most TB control programmes rely on contact tracing for the identification of clusters of TB transmission. This method may miss limited or casual contact and thus may underestimate recent transmission.1–3 Several countries have used population-based genotyping to assess recent transmission, as represented by the number of genotypically clustered cases.4 Molecular clustering may approximate recent transmission in low-incidence populations, but it is not always associated with recent transmission.5

In England, universal typing of TB isolates was introduced in 2010. The national strain typing service uses the 24-locus mycobacterial interspersed repetitive unit variable number tandem repeat (MIRU-VNTR) typing method.6 One of its objectives is to identify clustered cases that are potentially part of a recent chain of transmission as well as the characteristics associated with recent transmission. The programme defines clustered cases as two or more cases with indistinguishable MIRU-VNTR profiles, with at least one case with a complete 24-locus MIRU-VNTR profile. Additional cases in the cluster may each have one missing locus.7 The identification of clustered cases was initially followed by an epidemiological investigation if there were five or more cases resident in the same geographical area under the jurisdiction of a Health Protection Unit, or ten cases within a regional or national boundary in the last 2 years, with two in the previous 6 months.7 However, a review of the UK strain typing service in March 2013 found no evidence to suggest that cluster investigations undertaken on the basis of specific thresholds were effective or cost-effective.8 As a result, cluster investigation undertaken this way was abandoned in favour of a more locally targeted approach.

The Centers for Disease Control and Prevention in the USA recently described a method that combines genotyping data with spatial scan statistics to estimate recent TB transmission.9 Spatial scan statistics have been widely used for the identification of clusters of infectious disease.10–13 The availability of genotyping data and data on individual cases from the national Enhanced TB Surveillance (ETS) system has provided an opportunity to combine these data for the North of England. The objectives of this study were to employ this method to (1) estimate the proportion of TB in the North of England attributable to localised transmission for the period 2010–2012, (2) identify the areas where localised transmission is occurring and (3) identify the demographic, geographical, clinical and social risk factors associated with localised TB transmission.

Methods

Study population

We conducted a cross-sectional study in which cases were defined as patients resident in the North of England (defined as the Public Health England region that includes the North-East of England, Cumbria and Lancashire, Yorkshire and the Humber, Greater Manchester and Cheshire and Merseyside) at the time of diagnosis, with a positive culture of Mycobacterium tuberculosis from a clinical specimen and with a complete 24-locus MIRU-VNTR profile between 2010 and 2012. We obtained the cases from the ETS system. TB nurses and physicians routinely submit to the ETS system information on case demographics, clinical characteristics and risk factors for all TB cases notified in England, which is then linked to microbiological and molecular data. For cases with more than one isolate and discordant genotyping, we used the first isolate for the analysis. For cases that did not have a recorded date of onset of symptoms, this was approximated using other dates in the data set. For cases with no recorded residence, the postcode of the hospital reporting the case was used as a proxy measure of residence in order to reduce potential selection bias. Case locations were geocoded with ArcGIS V.10.0,14 using the postcode of residence to assign cases to middle super output areas (MSOA; MSOAs are well-defined census neighbourhoods with a population between 5000 and 15 000 people).15 The final study population included all cases with confirmed culture-positive TB with a complete 24-locus MIRU-VNTR profile, an MSOA within the North of England and onset of symptoms between January 2010 and December 2012 inclusive.

Genotype and geospatial clustering

As an indicator of localised TB transmission, we defined clustered cases as cases with an exact match on all 24 MIRU-VNTR loci within geospatial zones with statistically significant spatiotemporal scan statistics obtained from SaTScan V.9.2.16 SaTScan detects clusters in space and time using a moving cylindrical window, where the base represents space and the height represents time.17 For each window, a log-likelihood ratio is generated, comparing observed with expected cases inside and outside the window. Monte Carlo simulation is then used to assign each window a p value representing how likely the observed data for a given area and period would be if cases were occurring randomly, in accordance with a Poisson distribution and independently of each other.17 Clusters with a p value of <0.05 were regarded as statistically significant. In our study, the temporal window (the longest duration of a cluster) was set to 3 years, the full study period, and the spatial window was set to 50 km. We selected the option for no geographical overlap for reporting secondary clusters.
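The window statistic described above can be illustrated with a short sketch. It assumes the standard Poisson likelihood ratio for a scan looking for high rates and a rank-based Monte Carlo p value; it is not SaTScan's own code, and the function and parameter names are hypothetical.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate cylinder under a Poisson model.

    cases_in    -- observed cases inside the cylinder
    expected_in -- cases expected inside it if risk were uniform
    total_cases -- all cases in the study region and period
    """
    # Only windows with more cases than expected count as candidate clusters.
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr


def monte_carlo_p(observed_llr, simulated_max_llrs):
    """Rank of the observed statistic among maxima from simulated data sets."""
    at_least_as_large = sum(1 for s in simulated_max_llrs if s >= observed_llr)
    return (1 + at_least_as_large) / (1 + len(simulated_max_llrs))
```

In this sketch the expected count for a window would be the total number of cases multiplied by the window's share of population-time, and each simulated value is the largest statistic found in a data set generated under the null hypothesis.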

As the number of cases by genotype was small, we ran SaTScan with all the cases together to first define the spatiotemporal clusters. We then identified cases that had a genotype that appeared more than once within each significant spatiotemporal cluster. For the purpose of this study these cases were considered to be those more likely to represent localised TB transmission. The cases that did not fulfil these criteria were considered more likely to be cases of remotely acquired TB or reactivation of old TB.
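The second step, finding genotypes that recur within a significant spatiotemporal cluster, amounts to a grouped count. The sketch below is illustrative only: the record layout (`id`, `genotype`, `cluster`) and the sample records are hypothetical, with `cluster` set to `None` for cases outside any significant cluster.

```python
from collections import Counter


def flag_localised_cases(cases):
    """Return ids of cases sharing a genotype with another case in the same cluster."""
    # Count how often each genotype occurs within each significant cluster.
    counts = Counter(
        (c["cluster"], c["genotype"]) for c in cases if c["cluster"] is not None
    )
    return {
        c["id"]
        for c in cases
        if c["cluster"] is not None and counts[(c["cluster"], c["genotype"])] >= 2
    }


# Hypothetical records for illustration; these are not study data.
example_cases = [
    {"id": 1, "genotype": "G1", "cluster": "A"},
    {"id": 2, "genotype": "G1", "cluster": "A"},
    {"id": 3, "genotype": "G2", "cluster": "A"},   # unique genotype within cluster A
    {"id": 4, "genotype": "G1", "cluster": None},  # matching genotype, no spatiotemporal cluster
]
localised = flag_localised_cases(example_cases)
```

Cases 1 and 2 are flagged; case 4 shares their genotype but is not, because it lies outside the significant cluster.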

We applied the 2010 MSOA population estimates from the Office for National Statistics18 to all 3 years, as these were the only estimates available at the time of the analysis.

We used the ‘n–1’ method19 and genotype clustering (two or more cases with complete identical genotype) as comparator methods for estimating recent TB transmission.
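The two comparator estimates can be written out as a small calculation. This sketch assumes the usual reading of the 'n–1' method, in which one case per genotype cluster is treated as the source and subtracted; the function name and the example genotype labels are hypothetical.

```python
from collections import Counter


def clustering_proportions(genotypes):
    """Return (genotype-clustered proportion, n-1 proportion) for a list of profiles."""
    # A genotype cluster is any profile shared by two or more cases.
    cluster_sizes = [n for n in Counter(genotypes).values() if n >= 2]
    clustered_cases = sum(cluster_sizes)
    total = len(genotypes)
    crude = clustered_cases / total
    n_minus_1 = (clustered_cases - len(cluster_sizes)) / total
    return crude, n_minus_1


# Hypothetical profiles: two clusters (sizes 3 and 2) among seven cases.
crude, n_minus_1 = clustering_proportions(["A", "A", "A", "B", "B", "C", "D"])
```

Here five of seven cases are genotype clustered, and the 'n–1' estimate is (5 − 2)/7.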

Statistical analysis

We used logistic regression to examine demographic (ie, age, sex, ethnicity, country of birth and year of entry to the UK) and clinical (ie, clinical presentation, sputum smear status, previous diagnosis and patient receiving directly observed therapy (DOT)) characteristics and risk factors (ie, having a social risk factor, homelessness, imprisonment and drug or alcohol abuse) potentially associated with recent TB transmission. We performed single variable analysis using Pearson's χ2 test and Fisher's exact test. All variables with a p value of <0.2 in the single variable analysis were included in the regression model. We used a backwards stepwise approach to identify a final model, eliminating variables with the highest p values first and examining at each step for possible confounders. The analysis was performed using R V.3.0.3 (The R Foundation for Statistical Computing, Vienna, Austria).
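For a single binary characteristic, the unadjusted OR and its CI can be obtained from a 2×2 table. The sketch below uses the Woolf (log-scale Wald) interval, which is an assumption: the paper does not state how its intervals were computed, the analysis itself was run in R, and the counts shown are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf CI for a 2x2 table.

    a -- exposed, clustered        b -- exposed, not clustered
    c -- unexposed, clustered      d -- unexposed, not clustered
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); requires all four cells to be non-zero.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration; these are not study data.
or_est, or_low, or_high = odds_ratio_ci(40, 60, 20, 80)
```

Adjusted ORs such as those reported from the multivariable model would instead come from the fitted regression coefficients.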

Results

The overall annual incidence of TB for the North of England was 10.7 cases per 100 000 population in 2010, 10.8 in 2011 and 10.7 in 2012. Between January 2010 and December 2012, 4765 cases of TB were reported to ETS. Of these, 2845 (60%) were culture-positive. The proportion of culture-positive cases with a complete genotype result increased over the 3 years, from 41% (387/942) in 2010 to 57% (562/982) in 2011 and 58% (535/921) in 2012. A total of 2782 (98%) isolates were characterised using MIRU-VNTR typing. A total of 2090 (75%) had a result with at least 23 loci and 1547 (54%) had a full 24-locus MIRU-VNTR profile. Six cases had multiple isolates and discordant genotypes. We included 1484 TB cases in the analysis, representing 52% of culture-positive cases. Of those, 1146 (78%) had a unique 24-locus MIRU-VNTR. A total of 13/94 cases reporting a history of being homeless did not have a postcode. For these cases, we used the postcode of the reporting hospital or laboratory. Only one of these cases was classified as being part of a cluster. Therefore, this procedure was unlikely to cause artificial clustering around reporting hospitals or laboratories.

Estimation of the proportion of TB in the North of England attributable to localised transmission

Of the 1484 cases included in the analysis, 153 (10%; 95% CI 9% to 12%) were clustered in space and time and by genotype using the method described in this paper. Eighty-five per cent of clusters had four members or fewer and 58% of the clusters had only two members (median cluster size 2; range 2–14) (figure 1). The estimate of the proportion due to recent transmission varied depending on the method used, from 32% of all cases in our study population if only genotype clustering was used to a more conservative 10% with the method described in this paper (figure 1). Using the ‘n–1’ method, the proportion of clustered cases in our study population was 23%.
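The headline proportion and its interval can be reproduced from the counts above. The sketch assumes a normal-approximation (Wald) interval, since the paper does not say which interval method was used; the function name is hypothetical.

```python
import math


def proportion_ci(k, n, z=1.959964):
    """Proportion k/n with a normal-approximation 95% CI, bounded to [0, 1]."""
    p = k / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# 153 clustered cases among the 1484 cases included in the analysis.
p, low, high = proportion_ci(153, 1484)
print(f"{p:.0%} (95% CI {low:.0%} to {high:.0%})")  # 10% (95% CI 9% to 12%)
```

With these counts the interval runs from about 8.8% to 11.9%, which rounds to the reported 9% to 12%.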

Figure 1

Number of cases of TB reported to Enhanced TB Surveillance (ETS), including culture-positive cases, genotyped cases and clustered cases by the different methods, northern England, 2010 to 2012. †This number includes incomplete 24-locus mycobacterial interspersed repetitive unit variable number tandem repeat (MIRU-VNTR) profiles too. ‡Cases with an onset date that was not between January 2010 and December 2012 were removed. *Genotype clustering defined as two or more cases with identical 24-locus MIRU-VNTR profile identified in northern England. **Calculated using the number of cases that were part of a cluster as per genotype clustering definition minus the total number of clusters identified and divided by the total number of cases with a full 24-locus MIRU-VNTR. ***Two or more cases with the same 24-locus MIRU-VNTR profile found to be spatiotemporally clustered by SaTScan.

Areas where localised transmission was found to be occurring

Of all clustered cases, 3% (5/153) were resident in the North-East, 14% (21/153) in Cumbria and Lancashire, 37% (57/153) in Greater Manchester and 46% (70/153) in Yorkshire and Humberside. No clustering was identified in Cheshire and Merseyside. The proportion of all cases that were clustered (ie, consistent with localised transmission) varied by geographical area. In the North-East, 3% (5/164) of all cases were clustered compared with 13% (21/183) in Cumbria and Lancashire, 13% (70/591) in Yorkshire and Humberside and 15% (57/429) in Greater Manchester. We identified clustered cases in 103 MSOAs out of a total of 1958 that constitute the North of England. Bradford, Manchester and Kirklees local authorities accounted for almost half of these MSOAs (46%; 47/103) (figure 2).

Figure 2

Proportion of clustered TB cases by middle super output area (MSOA) and local authority, representing the areas where we identified localised TB transmission in northern England, 2010–2012. The map shows the local authorities in which localised transmission was identified and the proportion of clustered cases per 1000 by MSOA within each of the local authorities. The numbers of MSOAs within each local authority in which we identified localised TB transmission are: Blackburn with Darwen (7), Bolton (3), Bradford (20), Bury (2), Calderdale (3), Doncaster (1), Kirklees (13), Lancashire (7), Leeds (6), Manchester (14), Middlesbrough (3), North Yorkshire (1), Oldham (7), Rochdale (4), Salford (3), Sheffield (3) and Tameside (6).

Identification of characteristics associated with localised TB transmission

Single variable analysis

Clustered cases were on average younger than non-clustered cases (mean 34 years old compared with 43 years; p value <0.05), and almost three times more likely to be UK born (OR 2.7, 95% CI 1.4 to 33.0) (table 1). Having entered the UK in the previous 2–4 years was associated with higher odds of being part of a cluster (OR 2.2, 95% CI 1.0 to 4.7) compared with entering in the year before diagnosis (table 1).

Table 1

Demographic characteristics of clustered cases or cases involved in recent TB transmission events; North of England, 2010–2012

Presenting with pulmonary disease was associated with being part of a cluster (OR 2.3, 95% CI 1.5 to 3.7) (table 2). Cases having a social risk factor were two times more likely to be part of a cluster (OR 2.1, 95% CI 1.2 to 3.4) (table 2). Having a history of homelessness (OR 3.0, 95% CI 1.3 to 6.5), being homeless at the time of diagnosis (OR 4.1, 95% CI 1.1 to 12.8) and having been to prison (OR 2.4, 95% CI 1.1 to 5.0) were all associated with being part of a cluster (table 2). Among the clustered cases we identified nine children. Six of them were born in the UK. Of the other three, only one had entered the UK within 1 year; the rest had entered the UK more than 5 years earlier. Out of the nine, four were of Pakistani origin, three were of black African origin and two were mixed/other.

Table 2

Clinical and social risk factor characteristics of clustered cases or cases involved in recent TB transmission events; North of England, 2010–2012

Multivariable analysis

In the final multivariable model, being UK born, being homeless, past incarceration and pulmonary disease remained statistically significantly associated with being part of a cluster (table 2). Cases belonging to an ethnic group other than white or mixed/other were more likely to be clustered, especially cases of black Caribbean origin (aOR 5.6, 95% CI 1.7 to 23.0) or Pakistani origin (aOR 4.9, 95% CI 2.8 to 8.6) (table 1).

Discussion

We estimate that 10% of TB cases in the North of England in 2010–2012 were the result of localised transmission. Being UK born, homelessness, incarceration, presenting with pulmonary disease and being of an ethnic background other than white or mixed/other were associated with localised TB transmission. Our estimate of localised TB transmission showed geographical variation, although local estimates may be underestimates because of artificial geographical boundaries. Our estimate differs from the estimate using the same method in the Centers for Disease Control and Prevention study.9 This might be due to differences in TB epidemiology, but the smaller sample size and the use of 24-locus as opposed to 12-locus MIRU-VNTR may also contribute.5 ,20 In the absence of detailed epidemiological data for all cases of TB, the use of routine epidemiological data together with genotyping data offers an alternative way to assess localised TB transmission at subnational and local levels. This method obtained a more conservative estimate compared with methods previously used, allowing more targeted investigation of clusters. In addition, the method was easy to perform, could be run regularly and required minimal training.

Studies estimating recent TB transmission have mostly relied on the identification of cases with indistinguishable MIRU-VNTR profiles, assuming that they represent recent transmission. This is not necessarily the case and inferences based on crude estimates need cautious interpretation. Traditional contact tracing uncovers only a small proportion of epidemiological links between patients in the same molecular cluster.1 ,3 ,21 Intensive contact tracing can identify a much higher proportion of links,22 but its application is limited by the available resources, especially TB nurse staffing levels.

A precise estimate of recent transmission requires information on parameters including the annual rate of TB infection, existence of effective control programmes, intensity of contact tracing activities, population movements, the number of different circulating strains in the past and new introductions and the number of people infected per case, which may not be known.23 Interpretation of clustering results based only on genotypical data is challenging. A combination of molecular data with spatial scan statistics provides an innovative way of defining TB clusters. Although there is uncertainty about which definition describes localised TB transmission most accurately, adding a spatiotemporal component has the advantage of identifying areas where the number of cases is higher than expected and where transmission may be ongoing. If performed routinely the method could be used to inform and evaluate TB control activities.

The characteristics identified as associated with being part of a cluster are compatible with reports from London, Denmark and the USA.24–27 A previous US study that used similar methods to predict TB outbreaks also reported that the presence of characteristics such as homelessness and incarceration in at least one of the three first cases in a cluster indicated a likely outbreak.27 Transmission in these groups could reflect chaotic lifestyles or differences in health-seeking behaviour or access to health services that result in delayed diagnosis. They could also reflect complex transmission patterns not yet well understood. Routine contact tracing may miss transient contact in these groups.

The study has a number of limitations. First, our sample population did not include all cases reported to ETS: only 60% (2845/4765) of reported cases were culture-positive and only 54% (1547/2845) of those had a complete genotype. This is likely to have resulted in an underestimate of recent transmission, if we assume that non-cultured cases were also infectious. Second, our study period was only 3 years. Increasing the length of the study could also increase the proportion of clustered cases. However, using a longer study period can make interpretation of results challenging as some of the cases are likely to be due to reactivation of TB acquired earlier rather than to localised transmission.28 Increasing the length of the study may increase the number of cases due to localised transmission only to reach a plateau determined by the mutation rate of TB.19 The mutation rate influences the degree of clustering, but uncertainty remains regarding the mutation rate of TB, with several studies suggesting a low frequency of MIRU-VNTR genotype changes.29 ,30 This uncertainty complicates the interpretation of methods using MIRU-VNTR data and highlights the need to estimate the underlying mutation rate. Novel approaches such as whole genome sequencing are encouraging and are likely to improve our understanding of TB microevolution, heterogeneity and transmission.31–34 Third, although the definition used in our study is more complex than relying only on genotyping data, it is still only a surrogate for localised TB transmission and does not take into account other factors that affect the interpretation of the results, such as migration, the diversity of underlying circulating strains or new introductions. For instance, similar ethnic groups might settle in the same areas and introduce genotypes that may reflect common strains in the country of origin and may not reflect local transmission.35 Fourth, by limiting the study to the North of England we were likely to miss cases whose exposure or transmission to others might have occurred outside this geographical area.

Our study suggests that efforts to control TB could target recent or ongoing transmission by prioritising those with identified risk factors. Improving access to healthcare and more intense contact-tracing efforts might be needed but are likely to require additional resources. Routine use of the method described in this paper could assist TB control strategies at a local level by identifying and monitoring where TB transmission events are occurring and the risk factors associated with these events.

Acknowledgments

The authors thank Dr Maeve Lalor (Public Health England) for providing data from the national TB data set (matched ETS and laboratory data), Sanch Kanagarajah (Public Health England) for technical assistance, Sara Sarginson and Deborah Osborne (Public Health England, Newcastle Laboratory) for performing the MIRU-VNTR typing, the TB nurses and physicians who input data into ETS, and Dr Sam Bracebridge (Public Health England) and Dr Yvan Hutin (ECDC) for their helpful comments. The authors also thank the PHE National Strain Typing Programme Board for their support.



  • Authors note Dr Marko Petrovic died on 13 March 2015, between submission and publication of this paper. He worked extensively and tirelessly on TB control in North West England. We dedicate this paper to his memory.

  • Contributors WW and MP conceived the study. MS-C and PC performed the analysis. MS-C wrote the first draft. All authors supported interpretation and revised the manuscript.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
