Untapped potential of multicenter studies: a review of cardiovascular risk prediction models revealed inappropriate analyses and wide variation in reporting

L Wynants; D M Kent; D Timmerman; C M Lundquist; B Van Calster

doi:10.1186/s41512-019-0046-9

Diagn Progn Res. 2019 Feb 22:3:6. doi: 10.1186/s41512-019-0046-9. eCollection 2019.

Authors

L Wynants¹ ², D M Kent³, D Timmerman¹ ⁴, C M Lundquist³, B Van Calster¹ ⁵

Affiliations

¹ Department of Development and Regeneration, KU Leuven, Herestraat 49, box 7003, 3000 Leuven, Belgium.
² Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, PO Box 9600, 6200 MD Maastricht, The Netherlands.
³ Predictive Analytics and Comparative Effectiveness (PACE) Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Box 63, Boston, MA 02111 USA.
⁴ Department of Obstetrics and Gynecology, University Hospitals Leuven, Herestraat 49, 3000 Leuven, Belgium.
⁵ Department of Biomedical Data Sciences, Leiden University Medical Center, PO Box 9600, Leiden, 2300RC The Netherlands.

Abstract

Background: Clinical prediction models are often constructed using multicenter databases. Such a data structure poses additional challenges for statistical analysis (clustered data) but offers opportunities for model generalizability to a broad range of centers. The purpose of this study was to describe properties, analysis, and reporting of multicenter studies in the Tufts PACE Clinical Prediction Model Registry and to illustrate consequences of common design and analysis choices.

Methods: Fifty randomly selected studies that are included in the Tufts registry as multicenter and published after 2000 underwent full-text screening. Simulated examples illustrate some key concepts relevant to multicenter prediction research.

Results: Multicenter studies differed widely in the number of participating centers (range 2 to 5473). Thirty-nine of 50 studies ignored the multicenter nature of data in the statistical analysis. In the others, clustering was resolved by developing the model on only one center, using mixed effects or stratified regression, or by using center-level characteristics as predictors. Twenty-three of 50 studies did not describe the clinical settings or type of centers from which data were obtained. Four of 50 studies discussed neither generalizability nor external validity of the developed model.
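To make "clustered data" concrete, the following is a minimal sketch, not the authors' simulation. It assumes a hypothetical setup in which each center has its own random intercept on the logit scale; the function names, the number of centers, the sample sizes, and the standard deviation `sd_center` are all illustrative choices. It compares the pooled event rate with the spread of center-specific event rates, with and without a center effect.

```python
import math
import random
import statistics


def simulate_centers(n_centers=20, n_per_center=200, intercept=-1.0,
                     sd_center=0.8, seed=1):
    """Simulate binary outcomes per center (hypothetical parameters).

    Each center gets a random intercept on the logit scale, so patients
    within a center share a baseline risk. sd_center=0 means no clustering.
    """
    rng = random.Random(seed)
    data = {}
    for center in range(n_centers):
        u = rng.gauss(0.0, sd_center)  # center effect on the logit scale
        p = 1.0 / (1.0 + math.exp(-(intercept + u)))  # center baseline risk
        data[center] = [1 if rng.random() < p else 0
                        for _ in range(n_per_center)]
    return data


def event_rates(data):
    """Observed event rate in each center."""
    return {center: sum(y) / len(y) for center, y in data.items()}


def pooled_rate(data):
    """Event rate when the center structure is ignored."""
    events = sum(sum(y) for y in data.values())
    total = sum(len(y) for y in data.values())
    return events / total


clustered = simulate_centers(sd_center=0.8)
homogeneous = simulate_centers(sd_center=0.0)

for label, data in [("clustered", clustered), ("homogeneous", homogeneous)]:
    rates = list(event_rates(data).values())
    print(label,
          "| pooled rate:", round(pooled_rate(data), 3),
          "| SD of center rates:", round(statistics.stdev(rates), 3))
```

In the clustered scenario the center-specific rates vary far more than sampling error alone would produce, which a single pooled estimate hides. Mixed effects or stratified regression, as mentioned above, are ways of accounting for that variation.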

Conclusions: Regression methods and validation strategies tailored to multicenter studies are underutilized. Reporting on generalizability and potential external validity of the model lacks transparency. Hence, multicenter prediction research has untapped potential.

Registration: This review was not registered.

Keywords: Cardiovascular disease; Clinical prediction model; Multicenter.

Publication types

Review