Original article
Feasibility of lung cancer prediction from low-dose CT scan and smoking factors using causal models
  1. Vineet K Raghu (1,2),
  2. Wei Zhao (3,4),
  3. Jiantao Pu (3),
  4. Joseph K Leader (3),
  5. Renwei Wang (5),
  6. James Herman (6),
  7. Jian-Min Yuan (5,7),
  8. Panayiotis V Benos (1,2),
  9. David O Wilson (8)
  1. Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
  2. Department of Computer Science, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
  3. Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
  4. Current affiliation: Department of Respiratory Medicine, Chinese PLA General Hospital, Beijing, China
  5. Division of Cancer Control and Population Sciences, UPMC Hillman Cancer Center, Pittsburgh, Pennsylvania, USA
  6. Division of Hematology, Oncology, Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
  7. Department of Epidemiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
  8. Division of Pulmonary, Allergy and Critical Care Medicine, Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
Correspondence to Dr Panayiotis V Benos, Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260, USA; benos{at}


Introduction Low-dose CT (LDCT) is currently used to screen high-risk populations for early diagnosis of lung cancer. However, for 96% of individuals with detected nodules, the finding is a false positive.

Methods To develop an efficient early lung cancer predictor from clinical, demographic and LDCT features, we studied a total of 218 subjects with lung cancer or benign nodules. Probabilistic graphical models (PGMs) were used to integrate demographics, clinical data and LDCT features from 92 subjects (the training cohort) drawn from the Pittsburgh Lung Screening Study cohort.

Results The learnt PGMs identified three variables directly (causally) linked to malignant nodules and the largest benign nodule. These variables were used to build the Lung Cancer Causal Model (LCCM), which was validated in a separate cohort of 126 subjects. Nodule and vessel numbers and years since the subject quit smoking were sufficient to discriminate malignant from benign nodules. Comparison with existing predictors in the training and validation cohorts showed that (1) incorporating LDCT scan features greatly enhances predictive accuracy; and (2) LCCM improves cancer detection over existing methods, including the Brock parsimonious model (p<0.001). Notably, the number of surrounding vessels, a feature not previously used in predictive models, significantly improves predictive efficiency. Based on the validation cohort results, LCCM is able to identify 30% of the benign nodules without risk of misclassifying cancer nodules.
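One way to read the last result is as a rule-out threshold: place the cut-off at the lowest risk score assigned to any cancer case, then count the benign cases that score below it. The Python sketch below illustrates that calculation only. The function name, the scores and the labels are hypothetical toy values, not data or code from the study, and the abstract does not state that the authors computed the figure this way.

```python
def benign_ruled_out_at_full_sensitivity(scores, is_malignant):
    """Return (threshold, fraction of benign cases scoring strictly below it).

    The threshold is the lowest score among malignant cases, so no
    malignant case falls below it (100% sensitivity on this sample).
    """
    malignant_scores = [s for s, m in zip(scores, is_malignant) if m]
    benign_scores = [s for s, m in zip(scores, is_malignant) if not m]
    if not malignant_scores or not benign_scores:
        raise ValueError("need at least one malignant and one benign case")
    threshold = min(malignant_scores)
    ruled_out = sum(1 for s in benign_scores if s < threshold)
    return threshold, ruled_out / len(benign_scores)


# Hypothetical toy risk scores (NOT study data): 7 benign, 3 malignant.
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.30, 0.70, 0.90]
toy_labels = [False] * 7 + [True] * 3

threshold, fraction = benign_ruled_out_at_full_sensitivity(toy_scores, toy_labels)
print(f"threshold={threshold:.2f}, benign ruled out={fraction:.1%}")
```

A threshold chosen this way is guaranteed to miss no cancers only on the sample used to set it; how well it transfers to new patients is an empirical question.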

Discussion LCCM shows promise as a lung cancer predictor, as it significantly improves on existing models. If validated in a larger, prospective study, it may help reduce unnecessary follow-up visits and procedures.

  • lung cancer risk
  • low-dose CT
  • cancer screening

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • Contributors The study was conceived by PVB and DOW. Data were provided by J-MY, WZ and JP. Analysis was performed by VKR, PVB and DOW. The manuscript was written by VKR, PVB and DOW with contributions from all other authors.

  • Funding This study was supported by the National Institutes of Health (NIH) Grants U01HL137159 and R01LM012087 to PVB, R21CA197493 and R01HL096613 to JP, and T32CA082084 to VKR; the University of Pittsburgh Specialized Program of Research Excellence (SPORE) in Lung Cancer (NCI P50CA90440); and Cancer Center Core Grant (NCI P30CA047904). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

  • Competing interests None declared.

  • Patient consent for publication Obtained.

  • Provenance and peer review Not commissioned; externally peer reviewed.