Original article
Using socio-demographic and early clinical features in general practice to identify people with lung cancer earlier
Barbara Iyen-Omofoman (1), Laila J Tata (1), David R Baldwin (2), Chris JP Smith (1), Richard B Hubbard (1, 2, 3)

  1. Department of Epidemiology and Public Health, University of Nottingham, Nottingham, UK
  2. Nottingham University Hospitals NHS Trust, Nottingham, UK
  3. Respiratory Biomedical Research Unit, University of Nottingham, Nottingham, UK

Correspondence to Dr Barbara Iyen-Omofoman, Department of Epidemiology and Public Health, University of Nottingham, Clinical Sciences Building, City Hospital, Nottingham NG5 1PB, UK; barboiyen{at}


Introduction: In the UK, most people with lung cancer are diagnosed at a late stage when curative treatment is not possible. To aid earlier detection, the socio-demographic and early clinical features predictive of lung cancer need to be identified.

Methods: We studied 12 074 cases of lung cancer and 120 731 controls in a large general practice database. Logistic regression analyses were used to identify the socio-demographic and clinical features associated with cancer up to 2 years before diagnosis. A risk prediction model was developed using variables that were independently associated with lung cancer up to 4 months before diagnosis. The model performance was assessed in an independent dataset of 1 826 293 patients from the same database. Discrimination was assessed by means of a receiver operating characteristic (ROC) curve.

Results: Clinical and socio-demographic features that were independently associated with lung cancer were patients’ age, sex, socioeconomic status and smoking history. From 4 to 12 months before diagnosis, the frequency of consultations and symptom records of cough, haemoptysis, dyspnoea, weight loss, lower respiratory tract infections, non-specific chest infections, chest pain, hoarseness, upper respiratory tract infections and chronic obstructive pulmonary disease were also independently predictive of lung cancer. On validation, the model performed well with an area under the ROC curve of 0.88.

Conclusions: This new model performed substantially better than the current National Institute for Health and Clinical Excellence referral guidelines and all comparable models. It has the potential to predict lung cancer cases sufficiently early to make detection at a curable stage more likely by allowing general practitioners to better risk stratify their patients. A clinical trial is needed to quantify the absolute benefits to patients and the cost effectiveness of this model in practice.
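
The two statistical ideas named in the Methods, a logistic risk score and discrimination measured by the area under the ROC curve, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the coefficient values, the three predictor names and the five toy patients are hypothetical and are not the published model or its data. The area under the curve is computed in its rank form, that is, the proportion of case-control pairs in which the case receives the higher predicted risk, with ties counted as one half.

```python
import math


def logistic_risk(intercept, coefficients, features):
    """Turn a linear predictor into a probability with the logistic function."""
    linear = intercept + sum(coefficients[name] * value
                             for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))


def roc_auc(scores, labels):
    """Area under the ROC curve in its rank (Mann-Whitney) form.

    Returns the fraction of case-control pairs in which the case has the
    higher score; tied pairs contribute one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical coefficients, chosen only for illustration (NOT the published model).
INTERCEPT = -6.0
COEFFS = {"age_decades": 0.5, "current_smoker": 1.2, "haemoptysis": 2.0}

# Toy patients: (features, label) with label 1 = case, 0 = control.
patients = [
    ({"age_decades": 7, "current_smoker": 1, "haemoptysis": 1}, 1),
    ({"age_decades": 6, "current_smoker": 1, "haemoptysis": 0}, 1),
    ({"age_decades": 8, "current_smoker": 1, "haemoptysis": 0}, 0),
    ({"age_decades": 5, "current_smoker": 0, "haemoptysis": 0}, 0),
    ({"age_decades": 4, "current_smoker": 1, "haemoptysis": 0}, 0),
]

scores = [logistic_risk(INTERCEPT, COEFFS, features) for features, _ in patients]
labels = [label for _, label in patients]
auc = roc_auc(scores, labels)
print(f"AUC on the toy data = {auc:.2f}")
```

Read this way, the reported validation value of 0.88 means that in about 88% of case-control pairs the model assigns the higher risk to the person who went on to be diagnosed. The pairwise loop is quadratic in the number of patients, so a dataset of the size used for validation would call for a sort-based implementation instead.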

  • Lung Cancer
  • Clinical Epidemiology
