What is new?
This study provides evidence of the convergent and construct validity of the Physiotherapy Evidence Database (PEDro) total score and of the construct validity of 8 of the 10 items that sum to create the PEDro total score.
The PEDro scale [1] is a tool developed to measure the methodological quality of randomized and quasi-randomized controlled trials of physiotherapy interventions. Although the scale was developed for use in trials of physiotherapy, it may be applicable to trials in many other fields. It includes 11 items: (1) inclusion criteria and source; (2) random allocation; (3) allocation concealment; (4) baseline comparability; (5) blinding of subjects; (6) blinding of therapists; (7) blinding of assessors; (8) adequate (>85%) follow-up; (9) intention-to-treat analysis; (10) between-group comparison; and (11) point estimates and variability. Responses to items 2–11 are summed to create a total score that ranges from 0 to 10. (Item 1 is not scored because it relates to external validity.)
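The scoring rule described above can be illustrated with a minimal sketch. It assumes each item is rated yes/no and contributes one point when satisfied, which is consistent with the stated 0 to 10 range; the function name `pedro_total` and the dictionary layout are hypothetical and chosen only for illustration.

```python
from typing import Mapping

# Item numbers follow the list above. Item 1 is recorded but not scored.
SCORED_ITEMS = range(2, 12)  # items 2..11 inclusive


def pedro_total(ratings: Mapping[int, bool]) -> int:
    """Return the number of scored items (2-11) rated as satisfied.

    Assumption: each rating is a yes/no judgement worth one point.
    """
    missing = [i for i in SCORED_ITEMS if i not in ratings]
    if missing:
        raise ValueError(f"missing ratings for items: {missing}")
    return sum(1 for i in SCORED_ITEMS if ratings[i])


# Hypothetical trial: items 1, 2, 4, 8, 10 and 11 are satisfied.
example = {i: False for i in range(1, 12)}
example.update({1: True, 2: True, 4: True, 8: True, 10: True, 11: True})
print(pedro_total(example))  # 5 (item 1 does not count)
```

Excluding item 1 in the constant rather than in the caller keeps the external-validity item in the record while ensuring it never affects the total.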
The PEDro scale is widely used in systematic reviews [2]. It is also used in the PEDro database [3], [4] to rank search results and guide users to trials that are more likely to be valid and interpretable [5].
A systematic review of scales used to measure the methodological quality of physiotherapy trials [6] showed that most scales have undergone limited clinimetric evaluation beyond test–retest reliability [7]. The scales typically have not been evaluated with regard to internal consistency, floor and ceiling effects, concurrent–convergent validity, and construct validity. The same review concluded that the PEDro scale appears to be one of the most promising tools for assessing the methodological quality of physiotherapy trials. The items on the PEDro scale were derived from a Delphi consensus procedure [8]; hence, they have face validity. However, other forms of validity have not been comprehensively tested. Validity testing to date has been confined to evaluating convergent validity by comparing PEDro total scores with scores on other quality scales [9].
Although the validity of the PEDro scale has not been tested comprehensively, its reliability has been tested in many studies. The PEDro total score has been found to have acceptable reliability (intraclass correlation coefficient [ICC] = 0.58–0.91) [1], [10]. The reliability of individual scale items ranges from fair to excellent (kappa = 0.50–0.88) [1], [11], [12], [13].
The purpose of this study was to further evaluate the clinimetric properties of the PEDro scale. The objectives were to test convergent validity (the extent to which scores on an instrument correlate with other measures of the same construct) and construct validity (the extent to which scores on an instrument relate to other measures in a manner consistent with theoretically derived hypotheses about the concepts being measured; it should be assessed by testing predefined hypotheses) [14]. We evaluated construct validity by assessing the degree to which higher-quality trials are published in higher-impact journals.
Trials are included in the PEDro database if they compare at least two interventions, at least one of which is currently or potentially part of physiotherapy practice; the interventions in the trial are applied to human subjects who are representative of those to whom the intervention might be applied in clinical physiotherapy practice; the allocation of subjects to interventions is random or intended to be random; and the manuscript is published in full in a peer-reviewed journal. The PEDro
The convergent validity of the PEDro scale was tested by correlating (using Spearman's rho) the PEDro total scores with the Van Tulder 1997, Van Tulder 2003, and Jadad quality scores. As the correlation between measures can be attenuated by imperfect reliability, we also corrected the correlations using the Spearman–Brown Prophecy formula [18]. Estimates for reliability were taken from the median reported reliability of the scales. Trials reporting reliability were identified from a systematic
A total of 9,456 randomized controlled trials with completed consensus ratings were extracted from the PEDro database and included in the analysis. For these trials, the median (interquartile range [IQR]) PEDro total score was 5 (4–6). In 88% of the trials, the PEDro total score ranged between 3 and 7.
We located 17 systematic reviews from the Back Review Group that were eligible for inclusion in the study. Of these 17 reviews, nine reviews (rating 154 trials) used the Van Tulder 1997 scale,
This study provides evidence for the convergent and construct validity of the PEDro total score. We also found evidence of construct validity for 8 of the 10 items that contribute to the PEDro total score.
In terms of convergent validity, the weak correlation between the PEDro total score and the Jadad scale is not surprising. A weak correlation between the total scores of these two scales was expected, because the two scales only have three items in common. On the other hand, we expected and
There is evidence for convergent and construct validity of the PEDro total score and of construct validity of 8 of the 10 items that contribute to the PEDro total score.