Background: Measures of oxygenation have not been assessed for prognostic significance in systemic sclerosis-related interstitial lung disease (SSc-ILD).
Methods: 83 subjects with SSc-ILD performed a maximal cardiopulmonary exercise test with an arterial line. The agreement between peripheral oxygen saturation (Spo2) and arterial oxygen saturation (Sao2) was examined and survival differences between subgroups of subjects stratified on Spo2 were analysed. Cox proportional hazards analyses were used to examine the prognostic capabilities of Spo2.
Results: At maximal exercise the mean (SD) difference between Spo2 and Sao2 was 2.98 (2.98) and only 15 subjects had a difference of >4 points. The survival of subjects with SSc-ILD whose maximum exercise Spo2 (Spo2max) fell below 89% or whose Spo2max fell >4 points from baseline was worse than subjects in comparator groups (log rank p = 0.01 and 0.01, respectively). The hazard of death during the median 7.1 years of follow-up was 2.4 times greater for subjects whose Spo2max fell below 89% (hazard ratio 2.4, 95% CI 1.1 to 4.9, p = 0.02) or whose Spo2max fell >4 points from baseline (hazard ratio 2.4, 95% CI 1.1 to 5.0, p = 0.02).
Conclusion: In patients with SSc-ILD, Spo2 is an adequate reflection of Sao2 and radial arterial lines need not be inserted during cardiopulmonary exercise tests in these patients. Given the ease of measurement and its prognostic value, Spo2 should be considered as a meaningful clinical and research outcome in patients with SSc-ILD.
Measurement of static pulmonary physiology1–4 and, more recently, assessments of high-resolution CT (HRCT) scans5 have been shown to provide important prognostic information about patients with fibrosing interstitial lung disease related to systemic sclerosis (SSc-ILD). For patients with idiopathic fibrosing ILD, assessments of blood oxygenation—particularly those done while patients are exerting—have also been shown to predict outcome.6–8
Measures of peripheral oxygen saturation (Spo2) are used to estimate arterial blood saturation (Sao2); however, the validity of the relationship between Spo2 and Sao2 is dependent on a number of factors including the adequacy of peripheral perfusion.9 Given recent data questioning the reproducibility of the level of desaturation (as measured by Spo2) over short time intervals in patients with idiopathic pulmonary fibrosis (IPF)10 but continued enthusiasm to use this easily obtainable measure in patients with fibrosing ILD, we sought to investigate Spo2 in patients with SSc-ILD—a disease in which peripheral circulation is often impaired, an impairment considered to preclude reliable interpretation of peripheral oxygen saturation assessments. We performed this study to test two hypotheses: (1) Spo2 would inaccurately reflect Sao2 at rest and the inaccuracy would be even greater during maximal exercise; and (2) despite the hypothesised inaccuracy, peripheral measures of blood oxygenation during exertion should provide significant prognostic information in patients with SSc-ILD. Thus, besides assessing the usefulness of Spo2 as a prognostic marker, a goal of the study was to determine whether an arterial line is needed to accurately assess exertional oxygenation in patients with SSc-ILD.
We identified 83 patients with SSc-ILD and no signs of pulmonary hypertension on physical examination who were evaluated in the ILD Program at National Jewish Health with pulmonary function tests and a maximal cardiopulmonary exercise test between 1983 and 2005.
All subjects were evaluated clinically using a standard protocol focused on the identification of the cause of ILD. Patients with SSc met the diagnostic criteria adopted by the American College of Rheumatology;11 those with SSc sine scleroderma (ssSSc) met the criteria suggested by Poormoghim and colleagues.12 Patients with overlap syndromes were excluded. In patients with SSc the diagnosis of ILD was made by surgical biopsy (n = 17), chest radiography (n = 60) or CT scanning (n = 4).
Maximal cardiopulmonary exercise test
All subjects underwent maximal cardiopulmonary exercise testing at our institution according to a standardised protocol. A radial arterial line and a peripheral pulse oximeter were placed before commencing exercise. The peripheral pulse oximeter was placed on the index finger of the hand opposite the arterial line. If an adequate pulse oximeter signal was not obtained (inability to obtain correct pulse), an earlobe probe was used. In more than 95% of subjects, finger probes were used. Baseline measurements were collected after the patient mounted the cycle ergometer (Vmax 29; SensorMedics, Yorba Linda, California, USA) and just before beginning to pedal. Subjects pedalled for 3 min at 60 rpm and then work was incrementally added every minute; the goal was to reach a subject’s maximal exercise capacity within 6–12 min. Blood was drawn from the arterial line at baseline and after every minute of exercise, and analysed with a co-oximeter (to measure the percentage of arterial oxygen saturation (Sao2)) and blood gas analyser.
Categorical data are presented as counts or percentages. Continuous data are presented as mean values with standard deviation (SD) or median values with interquartile range (IQR). Graphical methods, including Bland-Altman plots,13 were used to display agreement between Spo2 and Sao2; for each subject the average of the Spo2 and Sao2 at maximal exercise was plotted against the difference between the Spo2 and Sao2 at maximal exercise. In certain analyses we used a difference between the Spo2 and Sao2 of >4 percentage points as the variable of interest. Four percentage points was chosen as the threshold because this difference falls outside the intrinsic variability of the pulse oximeters used and is felt to be clinically relevant.7 8 The Kaplan–Meier (product-limit) method was used to estimate survival probabilities and generate survival curves, which were compared using the log-rank test. Cox proportional hazards models were used to examine the prognostic capabilities of Spo2. We confirmed that the proportionality assumption was met for the dichotomised Spo2 variables by examining log(-log) plots. All statistical analyses were performed using SAS Version 9.1 (SAS Institute, Cary, North Carolina, USA). A p value of <0.05 was considered to be statistically significant.
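The product-limit calculation named above can be sketched in a few lines of Python. This is only an illustration: the analyses in this study were run in SAS, the function name `kaplan_meier` is hypothetical, and the follow-up times and event flags below are invented, not study data. At each distinct death time the running survival probability is multiplied by the fraction of at-risk subjects who survive that time; censored subjects leave the risk set without changing the estimate.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct death time.

    times  -- follow-up time per subject (years in this illustration)
    events -- 1 if the subject died at that time, 0 if censored
    """
    data = sorted(zip(times, events))  # order subjects by follow-up time
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects whose follow-up ends at this time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set
        n_at_risk -= removed
    return curve


# Invented follow-up data for illustration only (NOT from the study)
times = [2.0, 3.5, 3.5, 5.0, 7.1, 9.5, 12.0, 15.0]
events = [1, 1, 0, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(f"t = {t:4.1f} years  S(t) = {s:.3f}")
```

Computing one such curve per stratum (for example, Spo2max <89% versus ⩾89%) gives the curves that a log-rank test would then compare; the log-rank test itself is not implemented in this sketch.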
The baseline characteristics of the subjects are shown in table 1. Spo2 overestimated Sao2 both at rest and at maximal exercise. The mean (SD) difference between Spo2 and Sao2 (keeping overestimates as positive values and underestimates as negative values) was similar at maximal exercise and rest (1.5 (4) vs 1.5 (3), p = 1, fig 1). Similar results were obtained when absolute values of the difference between Spo2 and Sao2 were used (2.24 (1.91) at rest vs 2.98 (2.98) at maximal exercise, p = 0.06). Spo2 misclassified four subjects at maximum exercise (Spo2 observed to be >88% but Sao2 was ⩽88%). For these subjects, median (IQR) values for Spo2, Sao2 and arterial oxygen tension (Pao2) were 91.5 (89–96), 88.5 (88–90) and 61 (59.5–67), respectively, at baseline and 89.5 (89–93), 82 (79–88) and 47.5 (43.5–62.5), respectively, at maximum exercise.
Over the study period (median follow-up 10.3 years, IQR 4–17) there were 40 deaths. Truncating follow-up at 20 years, we observed 39 deaths over the study period (median follow-up after truncation 7.1 years). Median survival was 9.5 years (IQR 4–16). Subjects whose Spo2 at maximum exercise (Spo2max) fell to <89% had shorter survival than those whose Spo2max remained ⩾89% (p = 0.01, log rank test; fig 2). The results were similar when stratifying subjects on whether Spo2max fell >4 points from baseline (p = 0.01, log rank test). In Cox proportional hazards models, Spo2 was a significant predictor of mortality; over a median 7.1 years of follow-up, the risk of death was 2.4 times greater for subjects whose Spo2max fell below 89% (hazard ratio (HR) 2.4, 95% CI 1.2 to 4.9, p = 0.02) than for subjects whose Spo2max remained ⩾89%. Similarly, the risk of death was 2.4 times greater for subjects whose Spo2max fell >4 points from baseline (HR 2.4, 95% CI 1.1 to 5.0, p = 0.02) than for subjects whose Spo2max remained within 4 points of baseline values. When analysed as a continuous variable, the difference between baseline Spo2 and Spo2max remained a significant predictor (HR 1.08, 95% CI 1.03 to 1.14, p = 0.002). Controlling for FVC%, the difference between baseline Spo2 and Spo2max (continuous variable) remained a significant predictor (HR 1.07, 95% CI 1.01 to 1.14, p = 0.02). Figure 3 shows the relationship between percentage lung carbon monoxide transfer factor (Tlco%) and Spo2max.
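A hazard ratio, its 95% confidence interval and its p value are linked on the log scale, and the relationship can be sketched in Python. The sketch assumes a Wald-type interval (log HR ± 1.96 standard errors), which may not be exactly how the reported values were derived, so it is an approximate back-calculation rather than a reproduction of the SAS output. The function name `p_from_hr_ci` is hypothetical; the inputs are the reported values for a fall in Spo2max of >4 points from baseline (HR 2.4, 95% CI 1.1 to 5.0).

```python
import math


def p_from_hr_ci(hr: float, lower: float, upper: float, z_crit: float = 1.959964):
    """Back-calculate SE(log HR), the z statistic and a two-sided p value.

    Assumes a symmetric Wald interval on the log scale:
    log(HR) +/- z_crit * SE.
    """
    # Width of the CI on the log scale equals 2 * z_crit * SE
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided p value from the standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Reported values for Spo2max falling >4 points from baseline
se, z, p = p_from_hr_ci(2.4, 1.1, 5.0)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the back-calculated p value lies between 0.02 and 0.03, in line with the reported p = 0.02. Because the published HR and CI are rounded, small discrepancies are expected.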
With two goals in mind—to assess the usefulness of Spo2 as a prognostic marker in patients with SSc-ILD and to determine whether an arterial line is needed to accurately assess oxygenation—we conducted a study to first examine agreement between Spo2 and Sao2 at rest and maximal exercise and then to analyse the ability of Spo2 to predict mortality in patients with SSc-ILD. We hypothesised that, in subjects with SSc-ILD, Spo2 would inaccurately reflect Sao2 at rest and the disparity would be even greater at maximal exercise. In contrast, we found that Spo2 was an accurate reflection of Sao2 both at rest and maximal exertion in these subjects. Moreover, we observed that Spo2, a simple non-invasive and inexpensive measure to collect, was a predictor of mortality in patients with SSc-ILD.
Recently there has been a groundswell of attention on the use of non-invasive markers of exertional blood oxygenation (eg, nadir Spo2 during a 6-minute walk test (6MWT) or statistical manipulations of Spo2 over the course of a timed walk test) as outcome measures in therapeutic trials and clinical studies enrolling subjects with ILD.7 14 15 This increased attention raises three important, distinct but related questions regarding the use of Spo2 at maximal exercise as an outcome metric:
(1) Is it valid: does it in fact measure what it is purported to measure (eg, true blood oxygenation or Sao2)?
(2) Is it reliable: if it is measured at two separate time points in a subject whose clinical status has not changed, will it produce similar results?
(3) Is it responsive to underlying change: if a subject’s blood oxygenation at maximal exercise changes from baseline, will Spo2 reflect those changes?
The current study shows that Spo2 at maximal exercise is a valid measure of blood oxygenation at maximal exercise in patients with SSc-ILD, and Spo2 does in fact accurately track changes in Sao2.
The Bland-Altman plots reinforce this finding. These plots give a graphical presentation of the agreement between two methods of measurement; they depict an estimate of the bias (or systematic error which is simply the overestimation or underestimation of one measure compared with the other) as the mean difference between the two measures. The precision of that estimate is reflected in its standard deviation. Whereas correlation coefficients express the relationship between two variables, Bland-Altman plots depict agreement between them. When one is trying to determine the accuracy with which one measure (eg, Spo2) reflects another (eg, Sao2) or whether one measure might be used in place of another measure, correlation may not tell the true story—there can be extremely high correlation between two measures but, at the same time, poor agreement. This study shows that, for patients with SSc-ILD, Spo2 is an accurate reflection of Sao2 at rest or maximal exercise.
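The distinction between correlation and agreement can be made concrete with a short Python sketch. The readings below are invented for illustration and are not study data; the helper names `bland_altman` and `pearson_r` are hypothetical. The limits of agreement (bias ± 1.96 SD) are the conventional Bland-Altman summary and are not values reported in this study. The invented Spo2 series is an exact linear function of Sao2, so the correlation is perfect, yet Spo2 reads several points high and the error grows as Sao2 falls.

```python
import math
from statistics import mean, stdev


def bland_altman(a, b):
    """Return bias (mean difference), SD of differences, and limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


def pearson_r(a, b):
    """Pearson correlation coefficient computed from first principles."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Invented readings (NOT study data): Spo2 is an exact linear function of Sao2
sao2 = [96.0, 94.0, 92.0, 90.0, 88.0, 86.0, 84.0]
spo2 = [0.5 * s + 50.0 for s in sao2]

r = pearson_r(spo2, sao2)
bias, sd, limits = bland_altman(spo2, sao2)
# Count pairs differing by more than the 4-point threshold used in this study
n_over_4 = sum(1 for x, y in zip(spo2, sao2) if abs(x - y) > 4)

print(f"Pearson r = {r:.3f}")
print(f"Bias = {bias:.2f}, SD = {sd:.2f}, "
      f"limits of agreement = {limits[0]:.2f} to {limits[1]:.2f}")
print(f"Pairs differing by >4 points: {n_over_4} of {len(sao2)}")
```

In this constructed example the correlation coefficient is 1, while the mean overestimate is 5 points and four of the seven pairs differ by more than 4 points. A correlation coefficient alone would therefore miss exactly the kind of disagreement that a Bland-Altman analysis is designed to expose.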
Several studies have examined the agreement between Spo2 and Sao2,9 but to our knowledge this is the first in a cohort with SSc-ILD. The importance and clinical relevance of this study centres on the peripheral circulation issues in SSc that make most clinicians reluctant to place an arterial line (digits have been lost as a consequence) and wary of Spo2 accuracy in these patients. There is therefore a need to validate Spo2 in SSc-ILD. In general, Spo2 may either overestimate or underestimate Sao2. In a meta-analysis Jensen and colleagues9 reported that, among 23 studies for which bias and precision estimates were available, the absolute mean (SD) bias was 1.99 (0.23) (ie, on average, Spo2 overestimated Sao2 by 1.99 points). In those studies the mean (SD) difference between Spo2 and Sao2 ranged from −13.2 (8.0) to 12.0 (13.3). The authors commented that severe or rapid desaturation; hypotension, hypothermia, or other unstable haemodynamic or low perfusion states; dyshaemoglobinaemia or use of vital dyes; and motion may all confound agreement between Spo2 and Sao2. The mean differences between Spo2 and Sao2 in the current study fall well within the range mentioned in that analysis.
The results of the current study not only suggest that Spo2 is a valid surrogate for Sao2 in patients with SSc-ILD, but also suggest that desaturation, as measured by Spo2, is a significant predictor of mortality in this patient group. Our results are in line with the work by Lama and colleagues8 that suggested desaturation (as measured by Spo2) during a 6MWT is an important prognosticator in patients with idiopathic interstitial pneumonia. In so far as the 6MWT accurately reflects functional exercise capacity in patients with SSc-ILD—it does so in patients with fibrotic idiopathic interstitial pneumonia10—we hypothesise that our results would hold for measures of Spo2 collected during the 6MWT in this patient population. Not surprisingly, we found Tlco% to be a strong driver of Spo2max (data not shown); in fact, among several candidate variables including age, gender, FVC% and baseline Spo2, Tlco% was the only significant predictor. Like other investigators,1 we also found Tlco% to be a potent predictor of survival in our cohort (data not shown). Because of the strong relationship between Tlco% and Spo2max, and because our goal was merely to begin to examine Spo2max as a prognostic marker, we performed our survival analysis adjusting for FVC% and not Tlco%.
Although the results are novel and clinically relevant, this study has limitations. This is a retrospective analysis of data collected prospectively over a period of three decades. Different pulse and co-oximeters were used during different time periods; however, each instrument is purported to be accurate within two percentage points for Sao2 values from 70–100% so we can be confident in the readings. Data for this study were collected at a centre situated 5280 feet above sea level. In Denver, patients probably “live” closer to the steep portion of the oxygen dissociation curve (probably on or very close to the shoulder) than patients at lower altitudes. How this affects the results merits consideration and examination in future studies. Given the lack of systematic examinations for pulmonary hypertension and the changes in available technology to assess for pulmonary hypertension over the study period, we cannot be certain how many subjects truly had the condition. Even more complex is the issue of exercise-induced pulmonary hypertension: how many subjects had it is unknown but, as with other studies of subjects with ILD, the possibility of its presence and its effects on exercise Spo2 must be considered. Given these limitations, we believe the results should be viewed as hypothesis-generating and will hopefully spark continued investigation in this area. These results will need prospective confirmation at other altitudes. Future studies should examine whether Spo2 values collected during the 6MWT are as meaningful as those collected during cardiopulmonary exercise tests, and efforts should be made to further delineate the relationship between resting or exercise-induced pulmonary hypertension and Spo2.
In SSc-ILD, at both baseline and maximal exercise, Spo2 is an accurate reflection of Sao2. In patients with SSc-ILD, Spo2 carries prognostic value. Because of the ease with which it is assessed, consideration should be given to measuring exercise Spo2 as a marker of clinical status or as an outcome in clinical trials enrolling subjects with SSc-ILD. Future research could clarify a number of outstanding and important questions related to Spo2 in this patient population.
Competing interests: None.