Introduction

Health-related quality of life, complementary to clinical and developmental measures and mortality, has become an essential indicator of outcome in clinical evaluation studies [1] and community health studies [2], and will find its way into medical practice [3, 4]. In pediatrics, reliable and validated measures are available to describe health status and health-related quality of life of children comprehensively [5]. However, few were designed to measure the health of pre-school children; most are intended for school-age children [5, 6]. Challenges with measurement in pre-school children include finding ways to accommodate rapid changes in children's abilities and roles over time, and wide ranges for normative growth and development [7]. Furthermore, a rating by proxy, often a parent, is indispensable for this age group. It is difficult to assess the adequacy of such proxy ratings, which may be confounded by various factors [8].

We evaluated the Infant and Toddler Quality of Life Questionnaire (ITQOL), the only available generic ‘profile measure’ (i.e. with 10 multi-item and 2 single-item scales) for health status and health-related quality of life that was designed for children as young as 2 months up to 5 years old [9, 10]. This study is the first methodological evaluation of the ITQOL that is based on evaluations with regard to infants (< 1 year old) as well as toddlers (1–5 years old) in both a random general population sample and a clinical sample. Regarding the clinical sample in this study, children being treated for a respiratory disease were selected since this is the most prevalent chronic condition in the age group of pre-school children, and since a negative impact on health-related quality of life because of respiratory symptoms was expected [11, 12].

The ITQOL, developed by Landgraf, adopts the World Health Organization's definition of health as a state of complete physical, mental and social wellbeing and not merely the absence of disease, and incorporates the results of a review of child health literature and developmental guidelines used by pediatricians, and the feedback of parents during pilot testing [9]. Next to physical and psychosocial aspects of child health it covers the impact of child health problems or handicaps on family life; it is to be completed by the parents [9, 10]. The ITQOL is conceptually similar to and has overlapping items and scales with the Child Health Questionnaire (CHQ), which is among the most widely used pediatric health status measures, and has been cross-culturally validated into 21 languages (32 countries) [13–19].

Other generic measures for pre-school children are: the one-dimensional Functional Status II-Revised (FSIIR; 0–16 years old) [20], a ‘preference-based measure’ (suitable for economic evaluations) called Health Status Classification System for Pre-School Children (HSCS-PS; 2–5 years old) [21, 22], and two other ‘health profile measures’, i.e. the Pediatric Quality of Life Inventory (PedsQL; 2–18 years old) [23], and the TNO-AZL Pre-school Children Quality of Life Questionnaire (TAPQOL; 1–5 years old) [24]. Of these instruments we chose the TAPQOL to evaluate the concurrent validity of the ITQOL, as it is also a ‘health profile measure’, and as the age range that is covered by the TAPQOL (1–5 years) is closest to the one covered by the ITQOL (2 months-5 years) [24].

The study objectives were to assess in a random general population sample and in a clinical sample of children with respiratory disease:

  1. The feasibility of the ITQOL as a proxy measure of child health and health-related quality of life (indicators: response rates, completion times, perceived difficulty by parents, missing and non-unique answers, presence of floor and ceiling effects);

  2. The reliability of the ITQOL-scales (internal consistency and test-retest reliability);

  3. The validity of the ITQOL as judged by comparisons of specific ITQOL scale ratings with specific TAPQOL scale ratings of the child's health (concurrent validity) as well as by the ability to discriminate between subgroups with/without self-reported chronic conditions, with high/low medical consumption and with/without doctor-diagnosed respiratory illness/asthma (discriminative validity).

Methods

Study populations and data collection

General population sample

In 2002, a random sample of 500 out of 9022 children aged 2 months-4 years was drawn, by means of the SPSS random number generator, from the general population of six municipalities allocated to the service area of ‘Carinova Salland’ (single regional provider of Well-Child Care for the 0–4 year olds); their parents were mailed a questionnaire. The parents themselves decided whether the father or the mother should complete the questionnaire. Up to two reminders were sent; no incentives applied. After two weeks, the same questionnaire was mailed again to a random subgroup of 158 parents who had returned the first questionnaire, selected by random numbers generated by SPSS, to assess test-retest reliability.

Respiratory illness sample

From January 2000 to July 2001, at Erasmus University Medical Center Rotterdam and HAGA Hospital, The Hague, the Netherlands, patients were retrieved by the diagnosis asthma or other disease of trachea/bronchus (ICD-9 coding system 493 and 519.1, respectively) or the reason for encounter ‘wheezing/cough’ as registered by the prospective problem oriented patient classification system [25]. Eligible patients were at most 5 years old, visited the pediatric outpatient or emergency department with recurrent lower respiratory complaints during at least 3 months within the past year, and were being treated with bronchodilators or corticosteroids as documented in the patient record [26]. Parents of all eligible patients were asked to participate (n = 230); 217 agreed and were sent the questionnaire. After 10 days and 2 months, reminder letters were sent; the third reminder was by telephone. After 2 weeks, all parents who returned the questionnaire were mailed the same questionnaire again to assess test-retest reliability.

Infant and Toddler Quality of Life Questionnaire

The ITQOL consists of 103 items (10 multi-item scales and 2 single-item scales; see Table 1) that generally refer to the situation during the past 4 weeks. It was translated into Dutch according to international guidelines, including three independent forward and two backward translations [13, 27]. Per scale, the items that have 4, 5 or 6 response options, were summed up with equal weight per item (some recoded and/or recalibrated) and transformed into a 0 (worst possible score) to 100 (best possible score) scale [9, 10, 13, 28]. ITQOL-scales General behavior and Getting along, and Change in health are only relevant for children aged one year and older [9].
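The scoring step described above can be illustrated with a minimal sketch. It assumes that item responses are coded 1 to k (k = 4, 5 or 6) with higher values indicating better health after any recoding, and that a scale is scored only when all of its items are answered; neither assumption is taken from the ITQOL manual, and the function name `transform_scale` is hypothetical.

```python
def transform_scale(responses, n_options):
    """Sum equally weighted item responses and rescale the sum to 0-100.

    responses: item answers for one respondent, coded 1..k (higher = better),
               with None for a missing answer (coding is an assumption).
    n_options: number of response options (4, 5 or 6) for each item.
    """
    if len(responses) != len(n_options):
        raise ValueError("one option count is needed per item")
    if any(r is None for r in responses):
        return None  # assumption: incomplete scales are left unscored
    raw = sum(responses)
    lowest = len(responses)   # every item at its worst code (1)
    highest = sum(n_options)  # every item at its best code (k)
    return 100.0 * (raw - lowest) / (highest - lowest)


# Three five-option items answered with the middle category.
print(transform_scale([3, 3, 3], [5, 5, 5]))
```

With this linear transformation the worst possible raw sum maps to 0 and the best possible raw sum maps to 100, in line with the score interpretation given in the text.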

Table 1 ITQOL-scales, number of items per scale and score interpretation

TNO-AZL Pre-school Children Quality of Life Questionnaire (TAPQOL)

The TAPQOL, which is in Dutch originally, consists of 43 items divided over 12 multi-item scales that cover physical, social, cognitive and emotional functioning domains (see Table 5) [24]. TAPQOL-scales Social functioning, Motor functioning and Communication are only relevant for children aged 1.5 years and older [24].

Table 5 Concurrent validity of the ITQOL assessed by Pearson-r correlation coefficients between ITQOL scale scores and scale scores of related (‘parallel’) vs. unrelated (‘non-parallel’) TAPQOL scales in the general population sample (n = 257–410) and in the clinical sample of children with respiratory illness (n = 106–138)

Other data

In addition, the questionnaires consisted of items on standard socio-demographic variables, the presence of parent-reported current chronic conditions, and presence of wheezing and/or dyspnea and use of asthma medication during the preceding four weeks as defined in the ISAAC epidemiological measurement instrument [29, 30], and number of visits to the family physician during the past 12 months related to health problems of the child. Furthermore the questionnaire consisted of an item on the time needed to complete the ITQOL questionnaire and an item on the perceived difficulty of the ITQOL questionnaire.

Analysis

Only questionnaires concerning children with at least one parent born in a Dutch-speaking country were considered eligible for analysis; in other cases it was questionable whether the respondents had adequate mastery of the Dutch language (questionnaires were in Dutch).

Feasibility

We evaluated the response rates, ITQOL-questionnaire completion times, and perceived difficulty by the parents, and presence of missing and/or non-unique answers. We assessed mean scale scores and score distributions and presence of floor and ceiling effects (> 25% of the respondents have the minimal and/or maximal score). Additionally, mean scores per gender/age subgroup in the general population sample were evaluated.
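The floor and ceiling criterion can be expressed as a short check. The sketch below assumes scale scores on the 0 to 100 metric with None for unscored respondents; the function name `floor_ceiling` and the returned keys are illustrative only.

```python
def floor_ceiling(scores, lowest=0.0, highest=100.0, threshold=0.25):
    """Flag a floor/ceiling effect when more than `threshold` of the
    respondents have the minimal/maximal scale score."""
    valid = [s for s in scores if s is not None]
    if not valid:
        raise ValueError("no valid scores")
    floor_share = sum(1 for s in valid if s == lowest) / len(valid)
    ceiling_share = sum(1 for s in valid if s == highest) / len(valid)
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical scale scores of eight respondents (not study data).
result = floor_ceiling([100, 100, 100, 50, 0, 75, 80, 90])
print(result["ceiling_effect"], result["floor_effect"])
```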

Reliability

In both samples, overall and in gender/age subgroups, Cronbach's α was used to evaluate the internal consistency of scales; ≥0.70 is considered adequate [31]. We assessed whether (on average) Pearson-r correlation coefficients between the items and their own scale score (without the item under consideration) were higher than the correlation coefficients between these items and any other scale, to evaluate whether the ITQOL-multi-item scales represent separate domains; the average Pearson-r correlation coefficients were calculated by applying Fisher's z transformations [32]. Additionally, in both samples, we assessed scaling success in terms of the percentage of (corrected) item-total correlations with the own scale being higher than the corresponding item-other scale correlations (not including the single-item scale Change in health) [13]. In both samples, test-retest reliability of the ITQOL-scales was, at the individual level, assessed by test-retest Intraclass Correlation Coefficients (ICCs) [33]; ≥0.70 is considered adequate [34]. At the group level, test-retest reliability was assessed by two-sided paired-samples t tests, and by effect sizes: $d = (\mathrm{mean}_{t2} - \mathrm{mean}_{t1})/\mathrm{SD}_{t1}$; $0.20 \le d < 0.50$ is considered small, $0.50 \le d < 0.80$ moderate, and $d \ge 0.80$ large [35].
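Two of the reliability computations named above, Cronbach's α and the averaging of correlation coefficients through Fisher's z, can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure used in the study: complete item data are assumed, the z values are averaged without sample-size weights, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of item columns, each holding the scores of the same
           respondents in the same order (complete data assumed).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(row) for row in zip(*items)]  # per-respondent sum score
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def average_correlation(rs):
    """Average Pearson r values via Fisher's z: transform each r with
    atanh, take the unweighted mean, and back-transform with tanh."""
    return math.tanh(mean(math.atanh(r) for r in rs))


# Hypothetical data: two items answered by four respondents.
print(cronbach_alpha([[1, 2, 3, 4], [2, 2, 4, 5]]))
print(average_correlation([0.30, 0.50, 0.70]))
```

Averaging on the z scale rather than on the r scale reduces the bias that arises because the sampling distribution of r is skewed for values far from zero.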

Concurrent validity

In both the general population sample and in the clinical sample, we evaluated whether specific ITQOL-scales correlated better with their assumed ‘parallel’ TAPQOL scales (see below) than with any other scale, as measured by Pearson-r correlation coefficients; scales are assumed to be ‘parallel’ if they pertain to domains that are considered identical. We hypothesized relatively high correlation coefficients between the following (‘parallel’) ITQOL-scale/TAPQOL-scale (in italics) pairs: Physical functioning-Motor functioning; Temperament/moods-Problem behavior/Positive mood/Anxiety; General behavior-Problem behavior; Getting along-Problem behavior/Social functioning.

Discriminative validity

In both the general population sample and in the clinical sample separately, we evaluated the ability of the ITQOL to discriminate between subgroups of children with no parent-reported chronic conditions (excluding asthma in the clinical sample) and subgroups with ≥2 parent-reported chronic conditions. Similarly, in the general population sample (respectively clinical sample), the ITQOL-scores in the subgroup with 0 (respectively ≤3) physician-visits during the past 12 months were compared with those in the subgroup with ≥4 (respectively ≥8) visits (Table 6).

Table 6 Discriminative ability of the ITQOL between subgroups differing in number of reported chronic conditions and physician visits within the general population sample and within the clinical sample; and between a subgroup of the clinical sample (with asthma) and a gender/age matched subgroup of the general population sample (no asthma)

Additionally, we compared the ITQOL-scores in a subgroup of the clinical sample (n = 94; only children of whom the parents confirmed the presence of asthma) with ITQOL-scores in a gender/age-matched subgroup of the general population sample (n = 188; only children of whom the parents denied the presence of asthma); each clinical subgroup-child was matched to two general population-children with the same gender/age (6 months classes).
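The 1:2 matching on gender and 6-month age class can be sketched as below. The text does not state how reference children were chosen when more than two candidates were available, nor where the age-class boundaries lie; random selection within a stratum and classes of 0–5, 6–11, ... months are assumptions of this sketch, and all names are hypothetical.

```python
import random
from collections import defaultdict


def age_class(age_months, width=6):
    """Index of the age class (0-5 months -> 0, 6-11 months -> 1, ...)."""
    return age_months // width


def match_controls(cases, pool, ratio=2, seed=0):
    """Match each case to `ratio` pool children of the same gender and
    age class, without reusing a pool child.

    cases, pool: lists of dicts with keys 'id', 'gender', 'age_months'.
    Returns {case id: [matched pool ids]}; cases without enough
    candidates are left out.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for child in pool:
        strata[(child["gender"], age_class(child["age_months"]))].append(child)
    for members in strata.values():
        rng.shuffle(members)  # assumption: random choice within a stratum
    matches = {}
    for case in cases:
        available = strata[(case["gender"], age_class(case["age_months"]))]
        if len(available) < ratio:
            continue
        matches[case["id"]] = [available.pop()["id"] for _ in range(ratio)]
    return matches


# Hypothetical children (not study data).
cases = [{"id": "c1", "gender": "f", "age_months": 14}]
pool = [
    {"id": "p1", "gender": "f", "age_months": 12},
    {"id": "p2", "gender": "f", "age_months": 17},
    {"id": "p3", "gender": "f", "age_months": 15},
    {"id": "p4", "gender": "m", "age_months": 14},
]
print(match_controls(cases, pool))
```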

If the ITQOL has adequate discriminative validity, we hypothesize that relatively low ITQOL-scores will occur in subgroups with relatively many conditions and/or visits. Differences were evaluated by independent-samples t tests and by effect sizes (d) that were defined as $d = [\mathrm{Mean}(\text{no conditions}) - \mathrm{Mean}(\text{with condition})]/\mathrm{SD}$ in the conditions-subgroup [35].
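As a worked illustration of this effect size, the sketch below divides the difference in subgroup means by the SD of the subgroup with conditions and labels the result with the thresholds cited in the Reliability section (0.20, 0.50, 0.80). The label "negligible" for values below 0.20 and the function names are additions made for this example.

```python
from statistics import mean, stdev


def effect_size(no_conditions, with_conditions):
    """d = [mean(no conditions) - mean(with conditions)] / SD of the
    subgroup with conditions."""
    return (mean(no_conditions) - mean(with_conditions)) / stdev(with_conditions)


def interpret(d):
    """Label the absolute effect size using the cited thresholds."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "negligible"  # label assumed here; not used in the text


# Hypothetical scale scores (not study data).
d = effect_size([90, 95, 100], [70, 80, 90])
print(d, interpret(d))
```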

All analyses were done in SPSS, Version 11.0. The Medical Ethical Review Board of Erasmus MC — University Medical Center Rotterdam approved this study.

Findings

General population sample

In the general population sample, response was 83.0%; five questionnaires (1.2%) were not eligible for analysis (non-Dutch families). Response at the retest was 75.3% (one not eligible); 115 retest-questionnaires could be matched to a test-questionnaire (same child and respondent). Mean respondent age was 33.1 years (SD 7.1); 97% were mothers (Table 2). The children ranged from 3 to 46 months of age (mean 24.6; SD 13.8); 50% were girls; 20% of the children had parent-reported current asthma-like respiratory illness (Table 2).

Table 2 Characteristics of the study groups (general population sample n = 410; clinical sample of children with respiratory disease n = 138)

One hundred and one ITQOL-items had <2.0% missing answers; maximum was 6.2% (scale Getting along, item ‘Appears sorry after having misbehaved’); ITQOL-items had <0.75% non-unique answers. Mean reported ITQOL-completion time was 14 minutes (range 2–60; SD 7.2). Four percent of the respondents considered the ITQOL-questionnaire to be difficult/very difficult; 46% neither difficult nor easy; 50% easy/very easy.

Clinical sample

In the clinical sample, mailed questionnaire response was 79.7%; 35 questionnaires (20.2%) were not eligible for analysis (non-Dutch families); retest-response was 82.6%. We could match 114 retest-questionnaires to a test-questionnaire (same child and respondent). Mean respondent age was 33.9 years (SD 7.3); 88% were mothers (Table 2). The children ranged from 5 to 65 months of age (mean 34.5; SD 16.4); 41% were girls; 92% of parents confirmed the presence of asthma (see Table 2 for more information).

ITQOL score distributions

Floor effects were absent (see Methods). In the general population sample four, and in the clinical sample three ITQOL-scales showed a ceiling effect (see Methods) (Table 3). In the general population sample, two ITQOL-scales (Getting along, Parental-emotional) showed statistically significantly different scores between boys and girls (p < 0.05); six scales differed between age-subgroups (p < 0.05) (see Annex A).

Table 3 Score-distributions and psychometric properties of ITQOL-scales in the general population sample (n = 410) and in the clinical sample of children with respiratory illness (n = 138)

Internal consistency of ITQOL-scales

All ITQOL-multi-item scales showed adequate internal consistency in both samples (αs > 0.70) (Table 3). In gender/age-subgroups of the general population sample and of the clinical sample (Annex A), the internal consistencies of the ITQOL-scales were generally adequate; however, some subgroup-αs were moderate (0.50–0.70), and one α concerning a very small subgroup (n = 13) was only 0.13.

In both samples, all ITQOL-multi-item scales showed on average higher (corrected) item-own scale correlation coefficients than item-other-scale correlation coefficients, and the percentage scaling success was above 90% for all scales in both samples except for one ITQOL-scale (Getting along) in the clinical sample, which supports that the majority of ITQOL multi-item scales represent separate domains (Table 3).

Test-retest reliability

In the general population sample, four ITQOL-scales showed adequate (ICC≥ 0.70; p < 0.01) and six ITQOL-scales showed moderate test-retest reliability (ICC 0.50–0.70; p < 0.01); only one out of twelve ITQOL-scales had a mean retest score that was statistically significantly different from the mean test score (p < 0.05), but the effect size was small (d = 0.20) (Table 4). Almost identical results with regard to test-retest reliability of the ITQOL-scales were found in the clinical population sample (Table 4).

Table 4 Test-retest reliability of the ITQOL in a random subgroup of the general population sample (n = 115) and in the clinical sample of children with respiratory illness (n = 114)

Concurrent validity

Generally, the hypothesized pattern of correlation coefficients between ITQOL- and TAPQOL-scales was present, except for ITQOL-scale Physical functioning that did not correlate well with TAPQOL-Motor functioning (Table 5). In the clinical sample, there were fewer ‘violations’ of the hypothesized pattern of correlation coefficients (7 ‘violations’; see Methods) than in the general population sample (11 ‘violations’) (Table 5).

Discriminative validity

As hypothesized, per comparison between subgroups, five to eight ITQOL-scales showed statistically significantly lower scores in the subgroups with relatively many medical conditions or physician visits than in the subgroups with relatively few conditions and/or visits (p < 0.05); the largest effect sizes of score-differences between contrasted subgroups (d ≥ 0.80) were found for the ITQOL-scales General health perceptions and Bodily pain (Table 6).

Discussion

In this first evaluation of the ITQOL among children as young as 3 months up to 5 ½ years in a random general population sample and a clinical sample of children with respiratory illness, we established the feasibility of this measure in an unsupported setting (mailed questionnaire). Our study supports the internal consistency, the concurrent and discriminative validity of the ITQOL-scales and provides (general population) reference/norm scores for clinical studies. The results give rise to some concerns about ceiling effects and test-retest reliability of scales, requiring further investigation (see below).

Limitations

The ITQOL was designed for children aged 2 months up to 5 years old. The only currently available ITQOL evaluation concerns Canadian 3–4 year old children from the general population and a follow-up of Neonatal Intensive Care [10]. Our general population sample did not include 5 year olds, since 48 months is the maximum age of children attending the Well-Child Care organization that sampled the data. Furthermore, we did invite parents of children aged 2 months, but the youngest children in the study were reported to be 3 months old at the time the questionnaires were completed. The youngest eligible patients in the respiratory disease sample turned out to be 5 months old. We recommend additional ITQOL evaluations, especially in very young children (2 months-1 year). Since the vast majority of respondents in the present samples were mothers, the current results can only be generalized to settings with comparable proportions of mothers as respondents.

Another limitation of the study is that we compared ITQOL scores with TAPQOL scores, although the TAPQOL was developed and validated for children at least 1 year old. However, in the general population sample of this study, the TAPQOL proved to have adequate psychometric properties in the youngest subgroup (3–12 months) as well [36].

In our study, we did not assess whether parents as proxies gave adequate ratings; the child's health-related quality of life scores may be affected by parent-related characteristics next to child-related, especially child-health-related characteristics [6, 8]. We propose evaluating the impact of parent-related characteristics, including ratings of parents’ own health, in proportion to the impact of child and child-health-related characteristics on ITQOL scores in future studies.

Feasibility

Despite its length (103 items), the current ITQOL was well accepted by parents in our study, similar to the Canadian evaluation [10]. However, in order to limit respondent burden when the ITQOL is applied in clinical studies, we strongly recommend developing and evaluating a short ITQOL version.

Score distributions

Five ITQOL-scales showed a ceiling effect to some degree in either the general population sample or the clinical sample, or both. Physical functioning showed the most profound ceiling [10]. Ceiling effects were less manifest (but still present) in the clinical sample than in the general population sample. Ceiling effects are a common phenomenon, but restrict the use of a measure to detect changes and to describe health beyond the average in relatively healthy populations.

In the general population sample, mean ITQOL-scale scores showed some statistically significant differences between gender/age subgroups. We recommend repeated studies, preferably with larger samples, to assess subgroup differences. This will facilitate additional analyses to evaluate to what extent differential item functioning (DIF) explains such gender/age subgroup differences, and/or to what extent those differences reflect ‘reality’ [37]. In any case, we recommend the use of gender/age specific reference values when comparisons are being made between scores in specific clinical subgroups and general population (reference) scores.

Reliability, validity and responsiveness to change

Studies in large, varied samples are needed for additional assessments of the internal consistency of ITQOL-scales in gender/age subgroups, specifically regarding the very young (< 1 year old). Furthermore, we advise future studies with larger sample sizes to conduct confirmatory factor analysis using structural equation modelling to establish factorial validity of the ITQOL scales. ITQOL test-retest reliability was acceptable for the majority of scales in both samples, but given the low test-retest reliability of some scales and some score differences between test and retest scores, we recommend further assessments in varied populations.

This study supported the concurrent and discriminative validity of the ITQOL in a cross-sectional design, but responsiveness to change, or longitudinal construct validity, of the ITQOL has not yet been evaluated. We recommend doing so in future studies, in particular in the framework of clinical trials, in the course of which attention should be given to the optimal choice of the time specification of ITQOL items (‘during the past 4 weeks’, or ‘past week’, etcetera) in applications concerning fluctuating symptoms, as may be the case in respiratory disease.

Conclusions

Until now, the ITQOL is the only available multidimensional quality of life measure developed for children as young as 2 months up to 5 years old. This study supported the evidence that the ITQOL is a feasible instrument with adequate psychometric properties. The study provided reference ITQOL scores for gender/age subgroups. We recommend repeated evaluations of the ITQOL in varied populations, especially among very young children, including repeated assessments of test-retest characteristics and evaluations of responsiveness to change. We recommend developing and evaluating a shortened ITQOL version.