Rationale Sensitive outcome measures applicable in different centres to quantify and track early pulmonary abnormalities in infants with cystic fibrosis (CF) are needed both for clinical care and interventional trials. Chest CT has been advocated as such a measure yet there is no validated scoring system in infants.
Objectives The objectives of this study were to standardise CT data collection across multiple sites; ascertain the incidence of bronchial dilatation and air trapping in newborn screened (NBS) infants with CF at 1 year; and assess the reproducibility of Brody-II, the most widely used scoring system in children with CF, during infancy.
Methods A multicentre observational study of early lung disease in NBS infants with CF at age 1 year using volume-controlled chest CT performed under general anaesthetic.
Main results 65 infants with NBS-diagnosed CF had chest CT in three centres. Small insignificant variations in lung recruitment manoeuvres but significant centre differences in radiation exposures were found. Despite experienced scorers and prior training, with the exception of air trapping, inter- and intraobserver agreement on Brody-II score was poor to fair (eg, interobserver total score mean (95% CI) κ coefficient: 0.34 (0.20 to 0.49)). Only 7 (11%) infants had a total CT score ≥12 (ie, ≥5% maximum possible) by either scorer.
Conclusions In NBS infants with CF, CT changes were very mild at 1 year, and assessment of air trapping was the only reproducible outcome. CT is thus of questionable value in infants of this age, unless an improved scoring system for use in mild CF disease can be developed.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
What is the key question?
Is chest CT a reliable surrogate outcome as a clinical tool or as an end-point in clinical trials in 1-year-old infants with cystic fibrosis (CF) diagnosed by newborn screening?
What is the bottom line?
No, because structural changes detected on chest CT were generally very mild and, with the exception of air trapping, inter- and intraobserver agreement on CT scores was poor using the standard Brody-II scoring system.
Why read on?
Chest CT is of questionable value in infants of this age and thus should not be used routinely; development of an improved scoring system for use in mild CF disease is urgent.
Widespread newborn screening for cystic fibrosis (CF) has resulted in early diagnosis and the potential for early intervention before changes in lung function and structure become irreversible. Sensitive outcome measures to quantify and track early abnormalities in infants and young children are needed both for clinical care and interventional trials. However, early intervention studies are likely to be of considerable duration and involve treatments with possible side effects. Such studies should therefore not be undertaken without ensuring that any risk is justified by a reasonable likelihood of obtaining useful information.
CT of the chest has been advocated as a sensitive surrogate measure of early lung disease,1–4 since bronchiectasis and gas trapping have been detected in newborn screened (NBS) infants with CF,5–7 and a recent international seminar concluded that chest CT was a useful outcome for interventional trials in very young children with CF.8 Despite increasing publications in this field,5–7 ,9 ,10 the challenges in obtaining standardised chest CTs at consistent lung volumes11 with acceptable radiation exposure in infants, and also identifying a reproducible scoring system, sensitive to very mild lung disease, which can quantify severity of changes in NBS infants with CF, have received relatively little attention. The Brody-II CT score12 is the most widely used and validated scoring system in CF2 ,4 ,13–16 which quantifies lung disease objectively in school-aged children with good interobserver agreement,2 ,12 but its usefulness in young infants with mild disease has not been established. Thus it is difficult to know whether changes identified in this population represent disease, normal variation or experimental error.
A longitudinal observational study of lung function and structure in NBS infants with CF by the London CF Collaboration (LCFC)17 ,18 in which volumetric CT scans were obtained at 1 year of age, provided the opportunity to explore these challenges. Before starting the study, discussions were held with international experts, including those from the Australian Respiratory Early Surveillance Team for CF (AREST-CF), in order to standardise data collection. In the absence of any validated scoring system for use in NBS infants with CF, the Brody-II system was selected. We hypothesised that significant changes would be detected by 1 year of age but that interobserver agreement using Brody-II would be lower in NBS infants with CF than in older children, owing to the greater proportion of subjects with no, or only subtle, abnormalities.2 ,15
The aims of this study were to (a) standardise CT data collection across multiple sites to achieve consistent data quality with an acceptable radiation exposure, (b) ascertain the incidence of bronchial dilatation and air trapping in NBS infants with CF at 1 year and (c) assess the reproducibility of Brody-II in such infants by measuring inter- and intraobserver agreement of scores.
Subjects and methods
NBS infants with CF born between 2009 and 2011 were referred by one of six specialist LCFC centres for this study.17 ,18 Chest CT scanning under general anaesthesia (GA) was performed at three of these centres using standardised protocols at ∼1 year of age as part of the study protocol. The study was approved by the North Thames multicentre research ethics committee (#09/HO71/314). Informed written parental consent was obtained (section 1, see online supplementary data).
Protocol for controlled ventilation during GA
Infants were intubated and ventilated (section 2, see online supplementary data). Atelectasis was minimised by using slow inflations to a peak inspiratory pressure (PIP) of ∼35 cmH2O while maintaining a positive expiratory pressure (PEEP) of 5 cmH2O19 before the scan. Inspiratory scans were obtained during a breath-hold at 25 cmH2O PIP, and expiratory scans at 0 cmH2O. Initial adherence to protocols was variable across centres. Consequently, a team member monitored ventilation (see online supplementary table E3) using the NICO2 respiratory monitor (Philips Respironics, USA)20 (see online supplementary figures E3 and E4).
Thin-section CT scan protocol
CT scans were performed using multidetector CT units (see online supplementary table E1). A predetermined technique for volumetric CT image acquisition was used (see online supplementary table E2; section 3, see online supplementary data). Scanning ranges for inspiratory and expiratory scans were tailored for each infant. The planned radiation dose range for the entire scan was ≤2.0 mSv with a target of ∼1.5 mSv (annual background radiation exposure in the UK ∼2.5 mSv).21–23
CT data collection was completed and scans anonymised before starting scoring. Studies were scored independently without clinical or laboratory information by two scorers (AB: 25 years’ paediatric chest CT experience, 13 years’ scoring CF lung disease; AC: 7 years’ paediatric chest CT experience, 5 years’ scoring CF lung disease) using Brody-II scores.2 ,12 Using this scoring system, comprising five components, the maximum possible subscore is 72 for bronchial dilatation, 27 for air trapping and 243 for total CT, with higher scores indicating more severe disease12 (see online supplementary figures E1 and E2).
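As a quick arithmetic check of how a raw Brody-II score maps onto a percentage of the maximum, the sketch below uses only the maxima quoted above (72, 27 and 243). The names `BRODY_II_MAX` and `percent_of_max` are illustrative and are not part of any published scoring tool.

```python
# Maximum possible Brody-II scores as quoted in the text above.
BRODY_II_MAX = {"bronchial_dilatation": 72, "air_trapping": 27, "total": 243}


def percent_of_max(score, component="total"):
    """Return a raw score as a percentage of the maximum for that component."""
    maximum = BRODY_II_MAX[component]
    if not 0 <= score <= maximum:
        raise ValueError(f"score must lie between 0 and {maximum}")
    return 100.0 * score / maximum


# A total CT score of 12 out of 243.
print(round(percent_of_max(12), 1))  # 4.9
```

A total score of 12 therefore sits at approximately 5% of the maximum possible total score.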
The two scorers scored 12 training scans provided by AREST-CF, undertaken with similar protocols in young children with CF aged 1–4 years.5–7 These ‘training scans’ were scored in two batches of six (section 7, see online supplementary data). Scoring of LCFC scans took place within 6 weeks of completing training; scores from both observers being analysed and compared by LPT who was not involved in scoring. LCFC scans with discrepant subscores were returned to both scorers (without details of prior scores allocated) for subsequent reassessment to investigate whether closer agreement might be achieved (section 7, see online supplementary data). A selection of LCFC scans was completely rescored after ∼8 months to assess inter- and intraobserver agreement over time.
Data were inspected for distribution (PASW Statistics V.18, Chicago, Illinois, USA) and summarised using number (percentage), mean (SD) or median (IQR) as appropriate. Agreement between observers was assessed using Cohen's κ statistic with linear weighting (MedCalc for Windows, statistical software V.12.3.0, Mariakerke, Belgium). κ coefficients were similar whether analysed as non-weighted (results not shown) or with linear weighting. κ results were interpreted as 0–0.2: poor agreement; 0.21–0.4: fair agreement; 0.41–0.6: moderate agreement; 0.61–0.8: strong agreement; 0.81–1.0: excellent agreement.24 Ventilatory pressures and radiation doses between the centres were compared using Kruskal–Wallis with post hoc comparison using Mann–Whitney U tests, adjusted for multiple comparisons so that the family-wise error rate remained at 0.05.
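For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's κ with linear weighting together with the interpretation bands listed above. It is not the MedCalc implementation used in the study; the function names are illustrative, and treating negative κ values as "poor" is an assumption, since the bands above start at 0.

```python
from collections import Counter


def linear_weighted_kappa(scores_a, scores_b, categories):
    """Cohen's kappa with linear weights for two raters.

    `categories` is the ordered list of possible scores (at least two).
    Undefined (division by zero) if chance agreement equals 1, for example
    when both raters give every scan the same single score.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(scores_a)
    joint = Counter((index[a], index[b]) for a, b in zip(scores_a, scores_b))
    row = Counter(index[a] for a in scores_a)
    col = Counter(index[b] for b in scores_b)

    def weight(i, j):
        # Full credit for identical scores, partial credit for near misses.
        return 1.0 - abs(i - j) / (k - 1)

    observed = sum(weight(i, j) * c / n for (i, j), c in joint.items())
    expected = sum(
        weight(i, j) * row[i] * col[j] / n**2
        for i in range(k)
        for j in range(k)
    )
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Map kappa onto the agreement bands used in this paper."""
    if kappa <= 0.2:
        return "poor"  # assumption: negative values are also labelled poor
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "strong"
    return "excellent"
```

With these bands, a κ of 0.34 is labelled "fair" and a κ of 0.82 is labelled "excellent".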
Results
The study17 ,18 ,25 was conducted between January 2009 and May 2012; chest CT scans at 1 year were performed in 65 NBS LCFC infants. Table 1 summarises clinical details of the infants. At the time of chest CT, all infants were clinically well with no respiratory symptoms.
Verification of adherence to protocols
PEEP during the recruitment inflations was slightly higher than intended (overall median (95% CI) PEEP 7.2 (5.4 to 8.8) cmH2O), and was significantly higher in centre B than C (p=0.012; see online supplementary table E4). PIP during inflation manoeuvres and end-inspiratory breath-hold was close to protocol specifications, with no significant differences between centres.
Median effective radiation exposure across all centres was 1.5 (1.2 to 1.8) mSv, with centres A and B achieving median doses close to the target dose of 1.5 mSv; exposure was significantly higher at centre C (see online supplementary figure E5 and table E5; overall Kruskal–Wallis p<0.0001). Exposures of ≤1.5 mSv were achieved in 58% of infants; 79% received an effective dose of ≤2 mSv. Three infants in centre C received ≥3 mSv; two owing to suboptimal positioning.
Training scan scoring
Interobserver agreement was, on average, fair for bronchial dilatation during training batch 1 (κ=0.27 (95% CI 0.08 to 0.46)) and, after a video conference to discuss discrepancies, moderate for training batch 2 (κ=0.45 (0.17 to 0.72)). During both training sessions greatest agreement was observed for air trapping (κ=0.82 (0.68 to 0.95) for training batch 1 and 0.79 (0.67 to 0.92) for batch 2) (see online supplementary table E6 and figure E6).
LCFC scan scoring
The first round of scoring the LCFC scans (initial LCFC, n=65) started within 6 weeks of training and was completed within a month. Complete rescoring of a selected subset of LCFC scans (rescoring LCFC; n=22) occurred ∼8 months after the initial scoring. As can be seen from table 2, changes were generally very mild, with only seven (11%) infants having a total CT score ≥12 (ie, ≥5% of maximum possible Brody score) according to scorer B, and only two (3%) according to scorer A.
Interobserver agreement between initial and rescoring LCFC rounds
Although discrepancies between scorers with respect to at least one subscore occurred in 50/65 scans, 90% of differences were between a score of 0 (normal), and 1 (minimal to mild disease) (table 2). There was fair agreement for bronchial dilatation and strong agreement for air trapping, both during initial scoring of all 65 LCFC scans and when rescoring (table 3). Scans selected for rescore were representative of those from the entire cohort for the number and severity of changes detected on CT (figure 1).
Scorer B identified more infants with CT changes and generally allocated higher scores than scorer A during initial scoring of LCFC scans, the reverse of that seen during training (see online supplementary figure E6). Scores for air trapping and total score were more similar between scorers during rescoring (figure 1). κ agreement between scorers was initially only fair for bronchial dilatation, with minimal improvement during rescoring, but agreement about the presence or absence of bronchial dilatation or air trapping was consistently achieved in >80% of the scans on initial and rescoring rounds (see online supplementary table E7).
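The apparent tension between >80% agreement on presence or absence and only fair κ can be illustrated with a small sketch. When most scans are scored as normal by both observers, much of the raw agreement is expected by chance, and κ discounts it. The 2×2 counts below are hypothetical and chosen only for illustration; they are not the study data.

```python
def raw_agreement_and_kappa(both_absent, only_a, only_b, both_present):
    """Raw agreement and unweighted Cohen's kappa for a 2x2 table.

    Arguments are counts of scans in which the finding was reported by
    neither scorer, scorer A only, scorer B only, or both scorers.
    """
    n = both_absent + only_a + only_b + both_present
    p_observed = (both_absent + both_present) / n
    a_positive = (only_a + both_present) / n
    b_positive = (only_b + both_present) / n
    # Agreement expected by chance from each scorer's marginal rates.
    p_chance = a_positive * b_positive + (1 - a_positive) * (1 - b_positive)
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa


# Hypothetical counts for 65 scans with a low prevalence of abnormality.
raw, kappa = raw_agreement_and_kappa(both_absent=50, only_a=5, only_b=6, both_present=4)
print(f"raw agreement = {raw:.2f}, kappa = {kappa:.2f}")
```

In this hypothetical table the scorers agree on more than 80% of scans, yet κ falls in the "fair" band because agreement on the many normal scans is largely what chance alone would produce.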
Intraobserver agreement between study rounds
Intraobserver agreement after ∼8 months was only fair for bronchial dilatation (scorer A: κ=0.24 (−0.13 to 0.60); B: κ=0.35 (−0.06 to 0.76)) but strong for air trapping (A: κ=0.72 (0.59 to 0.85); B: κ=0.72 (0.55 to 0.88)). For total CT score, scorer A showed strong while scorer B showed moderate agreement (A: κ=0.66 (0.42 to 0.90); B: κ=0.51 (0.29 to 0.73)) (see online supplementary figure E7). Both scorers detected an identical proportion of changes when rescoring but those identified were not necessarily for the same infants. The challenge of discriminating between normal appearances and very mild changes potentially attributable to bronchial dilatation or air trapping, even for those with considerable expertise, is illustrated in figure 2.
Discussion
This is the first study specifically to assess the reproducibility, and hence validity, of CT evaluation of lung disease in infants with CF. Despite the scoring being undertaken by experienced observers with prior training, with the exception of air trapping, the Brody-II score was not reproducible in this age range. The obvious interpretation of these results is that the mild nature of any CT changes at 1 year of age precluded reproducible evaluation of most parameters. A label of bronchial dilatation in the presence of very mild lung disease should therefore be applied cautiously, at least using current methods and definitions. These findings, together with the technical difficulties in standardising acquisition of CT scans across sites, suggest that the use of CT both clinically and as an endpoint in multicentre trials of infants remains extremely challenging.
Strengths and limitations of the study
Standardised protocols for GA, scanning parameters and image acquisition were established to ensure consistent, reliable CT data were obtained between centres. This is the first study to monitor adherence to a specific CT ventilation protocol objectively. Use of both inspiratory and expiratory volumetric scans to evaluate lung disease (the first such study in NBS infants with CF at 1 year28 ,29) reduces the risk of missing subtle abnormalities, thereby increasing the likely accuracy of the reported changes.
We evaluated Brody-II in infants, as previously undertaken in older subjects with CF, by measuring inter- and intraobserver agreement of scores by two highly experienced scorers, who underwent training using scans from young children with CF immediately before scoring the LCFC scans in an attempt to ensure consistent interpretation.
The main limitation, as with similar studies, was the lack of normal CT scans for comparison owing to concerns about radiation exposure in healthy individuals. Since clinical CT scans in children with normal lungs (eg, screening for metastases) would not include expiratory images, even this group would not provide adequate controls. In addition, at the time of study, few training scans from 1-year-old NBS infants with CF were available. Owing to the time-consuming nature of the reproducibility studies, no other CT scoring system was used, but given that most use components which at least overlap with Brody-II, it is unlikely that the results would have been very different.
Radiological evidence of structural lung disease
Although AREST-CF detected structural abnormalities in 81% of NBS infants with CF at a median age of 3.6 months, bronchial dilatation was only found in 11/57 (19%) at this age,6 and remained low through the first 2 years of life (∼8% at both 1 and 2 years of age) before increasing markedly to ∼36% by 4 years.7 In the most recent publication from this group, prevalence of bronchial dilatation in children with CF during the first 4 years of life was ∼60%,10 ∼80% of whom had evidence of bronchial dilatation at some time during the first 3 years. Bronchiectasis as classically defined refers to irreversible dilatation due to damaged bronchi. The ‘apparent improvement’ in bronchiectasis reported in some of the AREST children with CF might be associated with mild and borderline normal bronchi (see below). The AREST-CF studies also report more air trapping (67% at ∼4 months,6 62% at ∼1 year7 and 69% at ∼3 years10) than in this study. These discrepancies may be partially explained by the fact that in contrast to the AREST-CF study, LCFC children were only studied when asymptomatic. Bronchial dilatation was significantly more likely (60.0% vs 10.2% in asymptomatic) and more severe in AREST infants with CF with respiratory symptoms at the time of CT.6
Use of different scoring systems makes direct comparisons difficult, particularly when attempting to quantify severity of changes. While changes could be identified on at least one Brody-II subscore in 34/65 (52%) of the LCFC infants, the magnitude of these changes was often trivial. Important changes (defined either by visual inspection and/or a total CT score ≥5% maximum possible) were only detected in 2% of infants by scorer A and 11% by scorer B (table 2).
Comparing inter- and intraobserver agreement of CT scores with other studies
The interobserver agreement when using Brody-II in NBS infants with CF contrasts with previous studies in older subjects (including those in which scorers A and B participated, see online supplementary table E8). Previous studies have found that bronchial dilatation is the most reliably reproducible element when evaluating CF lung disease.12 ,15 ,30 The relatively poor agreement in this study probably reflects the subtlety of changes seen. A single scorer scored all the AREST-CF scans with good intraobserver agreement after a 6–12-month interval7 (see online supplementary table E8). Separate assessments for younger children in whom bronchial dilatation was infrequent and milder were not, however, reported. While use of a single dedicated observer to score all scans6 ,7 ,9 might provide more consistent outcomes, such an approach is impractical in clinical practice and unlikely to be either generalisable or feasible in large multicentre trials. In the absence of measures of repeatability, the extent to which inter- and intraobserver variation contributes to the reported CT findings cannot be established.
Definition of bronchial dilatation
Additional problems in interpreting CT scans relate to lack of international consensus on how to define bronchial dilatation, especially in infants. A bronchoarterial ratio (BAR) >1 as specified in Brody-II was used both in this study and by AREST-CF. This speeds up evaluation, as judging whether the bronchus is bigger than the adjoining vessel can be assessed subjectively more easily than calculating a ratio. It has been suggested that a threshold of 0.76, rather than 1, should be applied in children,31 ,32 but given the poor inter- and intraobserver agreement even when using BAR >1 in infants with mild CF lung disease, it is unlikely that this would be effective. Furthermore, measuring changes in small bronchial luminal size to define bronchial dilatation may be beyond current CT spatial resolving ability. The accuracy of assessing BARs, especially in health, is also critically dependent on reliably achieving full lung inflations.33
Technical challenges in acquiring standardised CTs
We experienced several challenges in performing thoracic CT in this age group. Despite clear protocols and briefing the anaesthetic and radiology teams across all centres, variability in the image acquisition parameters (namely, airway pressures and radiation doses delivered) was seen. The greater variability in radiation doses in centre C might be due to their slightly different type of scanner (see online supplementary table E1) and/or the fact that it was not possible to organise a dedicated radiographer to perform procedures within that hospital, the latter being a problem likely to be found in clinical practice as well as multicentre trials. The presence of an investigator to monitor all procedures improved compliance, but is unlikely to be feasible in clinical practice or most clinical trials.
To date there is no consensus on the optimal method of acquiring CT scans in young children to ensure maximum information with minimal radiation exposure. After discussions with the AREST-CF team, we adopted their approach of obtaining end-inspiratory scans at 25 cmH2O PIP, and end-expiratory scans at 0 cmH2O, together with recruitment manoeuvres to minimise procedure-related atelectasis. However, whereas we used a volumetric technique that images the entire lung volume, initial studies by AREST-CF consisted of three thin-slice scans during inspiration and expiration.6 ,7 ,9 Limiting the dataset to three images, compared with ≥20 for the volumetric technique, severely limits the number of airways that can be evaluated. In addition, if bronchi were sampled and imaged at the point of bifurcation, this would overestimate the size of the bronchial lumen, potentially leading to overdetection of bronchial dilatation.
Results from this study suggest that both the acquisition and interpretation of CT scans need further evaluation before being applicable either as a research outcome measure or clinical tool in NBS infants with CF at least at 1 year of age. Based on the incidence of bronchial dilatation detected by both scorers in this study, between 190 and 850 infants per group would be required if a randomised trial such as the recent Ivacaftor trial34 were to be extended to infants, in order to detect a reduction in bronchial dilatation of 50% with 90% power at a 5% significance level at 1 year of age; this number would rise further after accounting for those ineligible for such a trial or whose parents decline.35 Suggestions that such a study would be feasible with only 100/group were based on the incidence of bronchiectasis at 4, not 1 year of age.7 Since there is neither knowledge about the long-term clinical significance of mild changes detected in young infants with CF, nor any data to suggest that mild changes lead to alterations in clinical management or long-term clinical outcomes, it is questionable whether the risks of exposing young infants to additional ionising radiation outweigh the benefits. Indeed, as a result of this study, without specific clinical indications, chest CTs are no longer performed in NBS infants with CF at 1 year within the LCFC group.
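The dependence of trial size on baseline incidence can be sketched with a standard normal-approximation formula for comparing two proportions. The text does not state which method was used for the figures above, so this formula is an assumption, and the baseline incidences in the example (25% and 7%) are hypothetical values chosen for illustration rather than the incidences observed by either scorer.

```python
from math import ceil, sqrt
from statistics import NormalDist


def infants_per_group(p_control, relative_reduction=0.5, alpha=0.05, power=0.90):
    """Approximate sample size per group for a two-sided test of two proportions.

    p_control: assumed incidence of bronchial dilatation without treatment.
    relative_reduction: fractional reduction to detect (0.5 means 50%).
    """
    p_treated = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treated) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    ) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


# Hypothetical baseline incidences, for illustration only.
for incidence in (0.25, 0.07):
    print(incidence, infants_per_group(incidence))
```

Under these assumptions, a baseline incidence of 25% needs on the order of 200 infants per group, whereas an incidence of 7% needs on the order of 850, which shows how strongly disagreement between scorers about incidence affects the feasibility of such a trial.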
Before chest CT can be advocated for widespread use, especially in very young children, standardised CT scanning protocols, which demonstrably can be used in multiple centres, in combination with a reproducible scoring system with good intra- and interobserver agreement, are essential. Given the radiation burden and the expense of even limited, low-dose annual CT scans, it is essential to ensure that the information obtained is useful; indeed there is a strong case for a randomised controlled study of whether CT improves outcome, analogous to the recent Australasian bronchoscopy study.36 A more robust approach to CT scoring in infants with CF, in whom changes may be very mild, may be required; current relatively subjective methods could be augmented by publishing visual standards for comparison or by the more widespread use of formal airway measurements and quantitative assessment of air trapping.37
In conclusion, we do not believe that CT is ready for widespread clinical use or as a trial endpoint in the first year of life for NBS infants with CF. Until refinement of CT scoring has been established and validated, we recommend caution in reporting bronchial dilatation in NBS infants with CF, the incidence of which appears to be low in the first year of life.
We thank the infants and parents who participated in this study and gratefully acknowledge contributions by all members of the London Cystic Fibrosis Newborn Screening Collaboration (Ah-Fong Hoo, Ammani Prasad, Andrew Bush, Angie Wade, Anu Shankar, Catherine Owens, Caroline Pao, Colin Wallis, Deeba Ahmed, Gary Ruiz, Hilary Wyatt, Ian Balfour-Lynn, Jane Chudleigh, Jane Davies, Janet Stocks (director), John Price, Lena Thia, Lucy Brennan, Mark Rosenthal, Paul Aurora, Ranjan Suri, Richard Chavasse, Siobhan Carr, Sooky Lum and The Thanh Diem Nguyen); anaesthetists and radiographers from Great Ormond Street Hospital for Children, Royal Brompton and Harefield Hospital (RBH) and the Royal London Hospital (Angus McKewan, Reema Nandi, Sally Wilmshurst, Duncan McCrae, Carolyn Young, Yvonne Sullivan, Anna Walsh, Trupti Patel) and Elly Castellano from RBH for her input into the CT protocol set up and radiation dose measurements; Sarath Ranganathan from the Australian Respiratory Early Surveillance Team for CF (AREST-CF) for his advice on the practical technique and data collection of CT scans under general anaesthesia, Catherine Gangell and Lauren Mott from the AREST-CF team for providing the training CT scans during the training scoring sessions; finally, to Alan Brody and Alistair Calder for scoring the study images.
LPT and AC contributed equally.
Collaborators London Cystic Fibrosis Collaboration (LCFC).
Contributors JS and AB were responsible for the conception and design of study; CMO, AC and ASB provided technical advice on imaging and scoring; AM provided anaesthetic advice. JS and LPT were responsible for supervision of the study and for research governance issues, including ethics committee approval. CY and YS supervised the CT imaging. LPT supervised and audited data collection and analyses. Infants with cystic fibrosis were recruited by the paediatric respiratory consultants participating in the London Cystic Fibrosis Collaboration, including AB and CW. LPT and AW performed statistical analyses; LPT, AC, ASB, AB and JS drafted the manuscript; all remaining authors revised and approved the manuscript for intellectual content before submission.
Funding This study is supported by grants from the Cystic Fibrosis Trust, UK (grant no PJ550); Special Trustees: Great Ormond Street Hospital for Children, London, UK (grant ref V0913); Smiths Medical Ltd, UK (grant ref 1GSB); Comprehensive Local Research Network, UK (grant ref May10-01). It was also supported by the National Institute for Health Research Respiratory Disease Biomedical Research Unit at the Royal Brompton and Harefield NHS Foundation Trust and Imperial College London.
Competing interests The authors had no competing interests, except for ASB who received an institutional grant from the Cystic Fibrosis Foundation and NIH, a grant for consultancy work from PTC Therapeutics and provided expert testimony for Calderhead, Lockemeyer and Peschke for other unrelated work. JS received a peer-reviewed institutional grant from CF Trust, UK and Great Ormond Street Children's Charity for this study.
Ethics approval North Thames multicentre research ethics committee.
Provenance and peer review Not commissioned; externally peer reviewed.