Objective
To compare methods for the analysis of longitudinal studies with missing data due to participant dropout and with follow-up truncated by death.
Study Design and Setting
We analyzed physical functioning in an Australian longitudinal study of elderly women in which the missing data mechanism could be either missing at random (MAR) or missing not at random (MNAR). We assumed either an immortal cohort, where deceased participants are implicitly included after death, or a mortal cohort, where the target of inference is the surviving participants at each survey wave. To illustrate the methods, a covariate was included. Simulation was used to assess the effect of these assumptions.
Results
Ignoring attrition or restricting the analysis to participants with complete follow-up led to biased estimates. A linear mixed model was appropriate for an immortal cohort under MAR but not under MNAR. The linear increment model and joint modeling of the longitudinal outcome and time to death were the most robust to MNAR. For a mortal cohort, inverse probability weighting and multiple imputation could be used, but care is needed in specifying the dropout and imputation models, respectively.
Conclusion
Appropriate analysis methodology to deal with attrition in longitudinal studies depends on the target of inference and the missing data mechanism.
Introduction
What is new?
• Researchers may not be aware that for some methods of analysis, such as likelihood-based methods, deceased participants are implicitly included after death (i.e., cohorts are immortal).
• Using a longitudinal study of 12,432 Australian women aged over 70, we illustrated that appropriate methods for dealing with attrition depend on the target of inference and the missing data mechanism.
• Linear mixed models are appropriate for data missing at random for an immortal cohort but not for a mortal cohort, whereas multiple imputation and inverse probability weighting can be used for a mortal cohort.
• The linear increment model and joint modeling of the longitudinal outcome and time to death are both appropriate under missing not at random (MNAR) and an immortal cohort, but more research is needed on methods robust to MNAR for a mortal cohort.
The prevention and treatment of missing data in clinical trials was discussed recently in the New England Journal of Medicine [1], and an accompanying article suggests that the issues raised also apply to observational studies [2]. The recommendations were that missing data should not be ignored; methods such as complete case analysis or single imputation should only be used in the minority of cases where a simple approach is justified; and model-based methods or methods that include appropriate weighting are generally preferred. These high-profile publications reflect the increasing awareness of the importance of dealing appropriately with missing data in health research studies.
Longitudinal studies where participants are repeatedly measured over time are essential for estimating changes in health-related variables in populations because they estimate within-cohort change, which is not possible with repeated cross-sectional studies. Within-cohort change may represent the natural progression of health in a population and provide valuable information for clinical management of patients and governmental policy making. For these purposes, it is critical that the information provided is accurate and representative of the population of interest. In this regard, a significant limitation of many longitudinal studies is that a proportion of the participants fail to provide data at all waves of data collection. Nonresponse may be transitory, with the participant later returning to the study, or it may be due to participant dropout or death, in which case no further response is provided. This latter type of nonresponse is the focus of this article.
Participant dropout leads to missing data, whereas participant death results in truncation of follow-up. This situation is particularly problematic if the participants who drop out or die are systematically different with respect to the outcomes of interest from the participants who remain. Common sense would suggest that participants who drop out or die are likely to have poorer health on average than participants who remain in the study. Therefore, ignoring the potential impact of this type of missing data is likely to lead to biased estimates, especially at later stages of the study where greater proportions of participants have either dropped out or died. In addition, the direction of the bias is likely to be toward showing better health over time than is actually the case.
There is an important philosophical aspect of nonresponse due to death. How should deaths be taken into account when estimating mean levels of health-related variables over time? For example, how is a dead person's quality of life to be represented on a quality of life scale? Is their quality of life zero? Should their quality of life even be estimated in an analysis after they have died? If not, then how should deaths be handled in the analysis? Clearly, there are no simple answers to these questions [3]. Moreover, the way in which deaths are handled in the analysis will depend on the study aims [4]. For example, are we describing the population as defined at the initiation of the study? Or are we describing the population as it existed at each stage of the study? The former scenario implies an immortal population and a cohort where participants who die continue to be implicitly included after death, whereas the latter scenario implies a mortal population where deceased participants are excluded after death, leading to a cohort of diminishing size over time. Alternatively, we could report estimates separately for dropouts, deaths, and those participants who remain in the study.
Missing data can be classified into one of three types according to Rubin and Little [5]. Missing completely at random (MCAR) refers to situations where the missing data are a random subset of the total data and do not depend on observed or unobserved measurements. In this scenario, an analysis that ignores the missing data will lead to unbiased estimates and valid inference. Missing at random (MAR) occurs when the missing data are not a random sample of the total data; however, given the observed data, the missingness mechanism does not depend on the unobserved data. In this case, there is sufficient information in the data collected to enable unbiased estimates and valid inference, provided an appropriate analysis such as a likelihood-based method is performed. Missing not at random (MNAR) is where the missing data are systematically different from the nonmissing data, and even after accounting for the observed information, the missingness still depends on the unobserved values. In this situation, there is insufficient information contained in the observed data to ensure unbiased estimates and valid inference.
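To make the MAR/MNAR distinction concrete, the sketch below simulates declining physical functioning scores and imposes dropout that depends either on the previously observed score (MAR) or on the current, unobserved score (MNAR). It is illustrative only: the variable names, thresholds, and effect sizes are invented for this example and are not taken from the ALSWH data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, waves = 5000, 3

# Trajectories with a shared random intercept so waves are correlated:
# the immortal-cohort mean declines by 5 points per wave.
pf = (80 - 5 * np.arange(waves)
      + rng.normal(0, 8, size=(n, 1))        # between-person variation
      + rng.normal(0, 5, size=(n, waves)))   # within-person noise

def apply_dropout(pf, mechanism, rng):
    """Monotone dropout: once missing, always missing thereafter."""
    out = pf.copy()
    dropped = np.zeros(len(pf), dtype=bool)
    for t in range(1, pf.shape[1]):
        # MAR: dropout depends on the last OBSERVED score;
        # MNAR: dropout depends on the current, UNOBSERVED score.
        driver = pf[:, t - 1] if mechanism == "MAR" else pf[:, t]
        p_drop = 1 / (1 + np.exp((driver - 65) / 5))  # lower score -> more dropout
        dropped |= rng.random(len(pf)) < p_drop
        out[dropped, t] = np.nan
    return out

mar = apply_dropout(pf, "MAR", np.random.default_rng(1))
mnar = apply_dropout(pf, "MNAR", np.random.default_rng(1))

# Naive means over the observed cases overstate health under both mechanisms.
print(pf[:, 2].mean(), np.nanmean(mar[:, 2]), np.nanmean(mnar[:, 2]))
```

Under MAR, a likelihood-based analysis conditioning on the observed history can recover the immortal-cohort mean; under MNAR it cannot, which is what motivates the more robust methods compared below.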
There have been many analytical methods proposed for dealing with missing data due to participant dropout or death in longitudinal studies [6], [7], [8]. These methods include: joint models [9], [10], inverse probability weighting (IPW) [11], [12], multiple imputation (MI) [13], [14], [15], and linear increment (LI) models [16], [17]. All these methods rely on specific assumptions being met to ensure valid inference. Although these assumptions often cannot be formally tested, their plausibility can be carefully considered in the context of the study, the data collected, and the aims of the analysis. In addition, sensitivity analyses and/or simulation studies can be performed to determine the robustness of the conclusions obtained.
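As one illustration of these methods, inverse probability weighting up-weights observed participants who resemble the dropouts. The sketch below uses hypothetical two-wave data with invented parameters; for simplicity, the true dropout probabilities stand in for probabilities that would in practice come from a fitted logistic dropout model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000

x0 = rng.normal(70, 10, n)                # wave-1 score, always observed
x1 = x0 - 5 + rng.normal(0, 5, n)         # wave-2 score, subject to dropout
p_drop = 1 / (1 + np.exp((x0 - 60) / 5))  # MAR: dropout depends on x0 only
observed = rng.random(n) >= p_drop

# In practice these probabilities come from a fitted dropout model;
# here the true probabilities are used so the sketch is self-contained.
w = 1 / (1 - p_drop)                      # inverse probability of remaining

naive = x1[observed].mean()
ipw = np.average(x1[observed], weights=w[observed])
print(x1.mean(), naive, ipw)              # truth, biased, corrected
```

The weighted mean targets the mean for the full cohort at wave 2; misspecifying the dropout model would leave residual bias, which is why the dropout model needs careful specification.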
The aim of this article was to illustrate and compare a number of methods of analysis for longitudinal studies with significant participant dropout or death. In Section 2, we describe the example data used for the analysis. The methods of analysis are then described in Section 3 (with the more technical details included in an Appendix at www.jclinepi.com). Simulation studies are described in Section 4, analysis results are presented in Section 5, and we conclude with a discussion of the results in Section 6, where we also provide some general guidelines for presenting results of longitudinal studies with dropout and death.
Example data
Data were obtained from the Australian Longitudinal Study on Women's Health (ALSWH) [18]. The ALSWH began in 1996 when around 40,000 adult women in three age group cohorts were recruited by randomly sampling the nationally representative Medicare database. The study was approved by Ethics Committees at the University of Queensland and University of Newcastle. We use data from the older cohort of 12,432 participants born between 1921 and 1926. These women have been surveyed at approximately
Methods for comparison
The methods compared are described below. Methods 1 and 2 are often used in practice but would only be valid in a minority of cases. They have been included to illustrate the potential problem with using simplistic methods. Methods 3–5 assume an immortal cohort where responses are implicitly included for missing data after death. This may not be an appropriate assumption in many cases. Therefore, we have also included two methods that assume a mortal cohort (Methods 6 and 7). In the case of a
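Of the immortal-cohort methods, the linear increment model is perhaps the least familiar. The sketch below shows its key idea on invented random-walk data: accumulate the mean of the observed wave-to-wave increments rather than averaging the observed levels directly.

```python
import numpy as np

rng = np.random.default_rng(2)
n, waves = 5000, 4

# Random-walk trajectories: the immortal-cohort mean drops 4 points per wave.
y = np.empty((n, waves))
y[:, 0] = rng.normal(75, 10, n)
for t in range(1, waves):
    y[:, t] = y[:, t - 1] - 4 + rng.normal(0, 5, n)

# Monotone MAR dropout driven by the last observed score.
obs = np.ones((n, waves), dtype=bool)
for t in range(1, waves):
    p_drop = 1 / (1 + np.exp((y[:, t - 1] - 55) / 5))
    obs[:, t] = obs[:, t - 1] & (rng.random(n) >= p_drop)

# Linear increment estimator: start from the wave-1 mean and add the mean
# observed increment at each wave. Dropout is monotone, so anyone observed
# at wave t was also observed at wave t-1.
li = [y[:, 0].mean()]
for t in range(1, waves):
    li.append(li[-1] + (y[obs[:, t], t] - y[obs[:, t], t - 1]).mean())

naive = [y[obs[:, t], t].mean() for t in range(waves)]
print(y[:, -1].mean(), naive[-1], li[-1])  # truth, biased, near-unbiased
```

Because each increment (here -4 plus noise) is independent of the past values that drive dropout, the accumulated increments track the immortal-cohort mean, while the cross-sectional means of the observed cases drift upward.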
Simulation studies
We conducted simulation studies to evaluate each of the analytical methods based on MAR and MNAR missing data mechanisms. However, we did not include complete case analysis (method 2) in the evaluation because the direction of bias is clear from the analysis results alone. For the simulation studies, we identified the participants with complete data for all five waves for the variables of interest. This resulted in a total of 3,799 participants. For each simulation, we randomly chose a sample
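The simulation comparisons of the imputation-based approach rest on the same logic as the minimal sketch below: fill in MAR-missing values from a model fitted to the observed data, repeat, and pool. This is a simplified illustration with invented numbers; a single normal imputation model is used, and the parameter uncertainty that proper multiple imputation would propagate (by drawing the model parameters from their posterior) is ignored here.

```python
import numpy as np

rng = np.random.default_rng(3)
n, M = 2000, 20

y1 = rng.normal(72, 10, n)                         # wave 1, fully observed
y2 = y1 - 5 + rng.normal(0, 6, n)                  # wave 2, MAR-missing given y1
miss = rng.random(n) < 1 / (1 + np.exp((y1 - 60) / 5))
y2_obs = np.where(miss, np.nan, y2)

# Imputation model fitted to complete pairs: y2 = a + b*y1 + e.
cc = ~miss
b, a = np.polyfit(y1[cc], y2_obs[cc], 1)
sigma = (y2_obs[cc] - (a + b * y1[cc])).std()

# Draw M completed data sets with residual noise and pool the wave-2 means.
est = []
for _ in range(M):
    filled = y2_obs.copy()
    filled[miss] = a + b * y1[miss] + rng.normal(0, sigma, miss.sum())
    est.append(filled.mean())
mi_mean = float(np.mean(est))                      # pooled point estimate

print(y2.mean(), np.nanmean(y2_obs), mi_mean)      # truth, biased, corrected
```

In practice, the between- and within-imputation variances would also be combined via Rubin's rules to obtain valid standard errors, and a misspecified imputation model would leave residual bias.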
Analysis results
Fig. 1 shows the decline in PF stratified by completers, dropouts, and deaths. As expected, the dropouts and participants who died had poorer PF than completers at each wave. Trends over time are similar for the groups; however, there appears to be a slightly steeper decline between the two surveys immediately before death. Other notable features are that dropouts had higher PF than the participants who died, and PF was worst for the participants who died earlier in the study. This
Discussion
The results obtained from our comparative analysis and simulation study suggest that ignoring missing data leads to biased estimates of PF. In addition, the apparent bias increased over time as greater proportions of participants had either dropped out or died. Restricting the analysis to just the completers also produced biased estimates at all waves. Bias was in the expected direction, that is, the estimates suggest better average PF than was actually the case. These results are consistent
Conclusion
The assumptions of any method of analysis need to be carefully considered before implementation in studies with missing data due to participant dropout. In addition, the method should appropriately reflect the study aims and target of inference when follow-up has been truncated by death. Sensitivity analysis can be useful to determine the robustness of results to varying plausible missing data mechanisms. We suggest the primary analysis should be performed using a method that is robust to the
Funding: The Australian Longitudinal Study on Women's Health is funded by the Australian Department of Health. M.J. is funded by the Australian National Health and Medical Research Council (APP1000986) and G.D.M. is funded by an Australian Research Council Future Fellowship. The funding sources had no involvement in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication.