Why do so many prognostic factors fail to pan out?

Breast Cancer Research and Treatment

Summary

Although there can be many reasons that one study fails to confirm the results of another, the consequences of data exploration and the potential for spuriously significant results are often overlooked. A series of simulation experiments was designed to mimic the characteristics of relapse-free survival data that might be encountered in a prognostic factor study of node-negative breast cancer patients. Each simulated dataset of 500 or 250 cases was divided into a training set, used to select the “best” prognostic factor cutpoint, and a validation set, used to confirm the cutpoint. Testing multiple cutpoints on the training set markedly increased the risk of making a Type I error. The power to detect even small true differences was substantial, and it increased with the number of cutpoints tested. Regardless of the number of cutpoints tested on the training sets, the Type I error rate on an independent validation dataset was quite stable, and the power of the validation set to detect true differences was not related to the number of cutpoints. Validation power closely approximated that predicted for a simple two-group comparison. It is therefore recommended that exploratory analyses of prognostic factors formally employ some method of adjusting for increased Type I errors, such as independent validation sets, ad hoc adjustment factors, or other statistical methods of estimating the true risk.
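The training/validation design described in the summary can be illustrated with a minimal simulation sketch. It is not the authors' code: the exponential survival times, the administrative censoring time, the equal training/validation split, the evenly spaced candidate cutpoints, the hand-written log-rank test, and all function names (`logrank_p`, `simulate_dataset`, `best_cutpoint`, `error_rates`) are assumptions made for illustration. With the default `hazard_ratio=1.0` the factor has no true effect, so every "significant" result is a Type I error.

```python
import math
import random


def logrank_p(times, events, group):
    """Two-sided log-rank p-value for two groups (group entries are 0 or 1).

    Assumes no tied event times, which holds for continuous simulated data.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_at_risk = len(times)
    n1_at_risk = sum(group)
    o_minus_e = 0.0
    var = 0.0
    for i in order:
        if events[i] and n_at_risk > 1:
            p1 = n1_at_risk / n_at_risk      # share of the risk set in group 1
            o_minus_e += group[i] - p1       # observed minus expected events
            var += p1 * (1.0 - p1)           # hypergeometric variance, one event
        n_at_risk -= 1
        n1_at_risk -= group[i]
    if var == 0.0:
        return 1.0                           # one group empty: no evidence
    chi2 = o_minus_e ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 df


def simulate_dataset(n, rng, hazard_ratio=1.0, censor_time=2.0):
    """Simulate a continuous factor x and censored exponential survival times.

    Assumption: any true effect acts at x > 0.5; hazard_ratio=1.0 is the null.
    """
    xs, times, events = [], [], []
    for _ in range(n):
        x = rng.random()
        rate = hazard_ratio if x > 0.5 else 1.0
        t = rng.expovariate(rate)
        xs.append(x)
        times.append(min(t, censor_time))
        events.append(1 if t <= censor_time else 0)
    return xs, times, events


def best_cutpoint(xs, times, events, n_cutpoints):
    """Return (smallest p-value, cutpoint) over evenly spaced sample quantiles."""
    ranked = sorted(xs)
    best_p, best_cut = 1.1, None
    for j in range(1, n_cutpoints + 1):
        cut = ranked[j * len(xs) // (n_cutpoints + 1)]
        group = [1 if x > cut else 0 for x in xs]
        p = logrank_p(times, events, group)
        if p < best_p:
            best_p, best_cut = p, cut
    return best_p, best_cut


def error_rates(n_sims, n_cases, n_cutpoints, alpha=0.05, seed=1):
    """Fraction of null datasets declared significant on training and validation."""
    rng = random.Random(seed)
    train_hits = valid_hits = 0
    half = n_cases // 2                      # assumed equal split
    for _ in range(n_sims):
        xs, times, events = simulate_dataset(n_cases, rng)
        p_train, cut = best_cutpoint(xs[:half], times[:half], events[:half],
                                     n_cutpoints)
        valid_group = [1 if x > cut else 0 for x in xs[half:]]
        p_valid = logrank_p(times[half:], events[half:], valid_group)
        train_hits += p_train < alpha
        valid_hits += p_valid < alpha
    return train_hits / n_sims, valid_hits / n_sims
```

Calling, for example, `error_rates(n_sims=400, n_cases=250, n_cutpoints=10)` returns the training and validation rejection rates under the null. Under these assumptions the training rate should rise well above the nominal 0.05 as `n_cutpoints` grows, while the validation rate should stay near it, which is the pattern the summary reports.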

Additional information

We regret to report that Dr. McGuire died on March 25, 1992, while this work was in progress.

About this article

Cite this article

Hilsenbeck, S.G., Clark, G.M. & McGuire, W.L. Why do so many prognostic factors fail to pan out? Breast Cancer Res Tr 22, 197–206 (1992). https://doi.org/10.1007/BF01840833

  • DOI: https://doi.org/10.1007/BF01840833
