Health and Quality of Life Outcomes BioMed Central

Background: Few measures of health related quality of life exist for use with preschool aged children. The objective of this study was to assess reliability and validity of a new multidimensional generic measure of health-related quality of life developed for use with preschool children. Methods: Cross-sectional survey sent to parents as their child turned 3 1/2 years of age. The setting was the province of British Columbia, Canada. Patients included all babies admitted to tertiary level neonatal intensive care units (NICU) at birth over a 16-month period, and a consecutive sample of healthy babies. The main outcome measure was a new full-length questionnaire consisting of 3 global items and 10 multi-item scales constructed to measure the physical and emotional well-being of toddlers and their families. Results: The response rate was 67.9%. 91% (NICU) and 84% (healthy baby) of items correlated with their own domain above the recommended standard (0.40). 97% (NICU) and 87% (healthy baby) of items correlated more highly (≥ 2 S.E.) with their hypothesized scale than with other scales. Cronbach's alpha coefficients varied between .80 and .96. Intra-class correlation coefficients were above .70. Correlations between scales in the new measure and other instruments were moderate to large, and were stronger than between non-related domains. Statistically significant differences in scale scores were observed between the NICU and healthy baby samples, as well as between those diagnosed with a health problem requiring medical attention in the past year versus those with no health problems. Conclusions: Preliminary results indicate the new measure demonstrates acceptable reliability and construct validity in a sample of children requiring NICU care and a sample of healthy children. However, further development work is warranted. 
Published: 22 December 2003 Health and Quality of Life Outcomes 2003, 1:81 Received: 27 August 2003 Accepted: 22 December 2003 This article is available from: http://www.hqlo.com/content/1/1/81 © 2003 Klassen et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.


Background
There are now a number of validated health-related quality of life (HRQL) instruments available for use with adults, and these are often routinely included in clinical trials. Such measures are based on the view that health is multidimensional, that the concepts forming these dimensions can be assessed only by subjective measures, and that quality of life should be evaluated by asking the patient, or in some cases a proxy. Measurement of HRQL in children is based on these same principles, but is at an earlier stage of development [1].
HRQL assessment in children is complicated by developmental issues and by the need to use proxies in certain circumstances (e.g., preschool aged children). Some developers have addressed these issues by creating separate questionnaires for specific age-groups and for parent and child report. The PedsQL generic measure of HRQL, for example, has 4 parent report measures (ages 2-4, 5-7, 8-12 and 13-18 years old) and 3 child self-report measures (ages 5-7, 8-12 and 13-18 years old) [2].
Developmental issues are most relevant to the preschool aged group, who undergo rapid growth and development [3]. Since preschool aged children are not able to complete a questionnaire for themselves, the use of a proxy is essential. A growing number of studies have looked at the proxy issue in school aged children. Eiser and Morse (2001) performed a systematic review and reported that there was greater agreement for observable functioning (e.g. physical HRQoL), and less for non-observable functioning (e.g. emotional or social HRQoL), and that agreement was better between parents and chronically sick children compared with parents and their healthy children [4]. These authors suggest there remain strong arguments for obtaining information from both parents and children whenever possible.
A recent systematic review [1] and a number of other review articles [5][6][7][8] describe the range of generic HRQL measures for children developed to date. At the time of the present study, generic questionnaires had been developed to measure HRQL for school-aged children only. However, a full-length questionnaire still under development, the Infant/Toddler Quality of Life Questionnaire (ITQOL), was made available for purposes of further evaluation [9]. The ITQOL is conceptually similar to the Child Health Questionnaire (CHQ); there is some overlap of items and scales [10]. Both measures adopt the World Health Organization's definition of health, which is "a state of complete physical, mental and social well-being and not merely the absence of disease" [11]. The ITQOL was developed following a thorough review of the infant health literature and a review of developmental guidelines used by pediatricians [12], which identified core child health concepts and resulted in the development of items and scales to measure physical function, growth and development, bodily pain, temperament and moods, behavior and general health perceptions. Like the CHQ, the ITQOL also includes scales to measure parental impact (time and emotions).
Since the inception of the current project, two new generic measures for preschool aged children have become available [2,13]. In The Netherlands, Fekkes and colleagues [13] developed the TNO-AZL Preschool Quality Of Life (TAPQOL), a 43-item (12-domain) generic pre-school measure of health status, and used this instrument in a study of preterm infants [14]. HRQL in this measure was defined as health status in 12 domains weighted by the impact of health status problems on wellbeing. These 12 domains measure aspects of physical, social, cognitive and emotional function. Varni et al, in the USA [2], developed the generic 23-item Pediatric Quality of Life Inventory (PedsQL), which can be used to measure 3 domains of health (physical, mental and social) in children and adolescents aged 2 to 18.
The aim of the current paper is to present preliminary information about the psychometric properties of the ITQOL questionnaire as applied in two samples of preschool aged children: a population-based follow-up study of children admitted at birth to level III neonatal intensive care units (NICU) (i.e., regional neonatal-perinatal centers that provide care for high risk pregnancies and intensive care for severely ill infants); and a comparison group of healthy full-term births. The overall purpose of our study was to link questionnaire survey data with administrative health data for NICU children and their caregivers to examine relationships between health care utilization, initial NICU birth experience and long-term health outcomes for respondents. Research describes a range of negative health outcomes associated with neonatal intensive care [15][16][17][18][19][20][21][22][23][24][25][26][27]. Commonly reported adverse outcomes include cerebral palsy, mental retardation, deafness, blindness as well as more widespread problems such as learning disabilities and behavioral problems. Results pertaining to HRQL outcomes in our sample of NICU graduates are reported in a separate publication [28].
[...] at the time. Mothers' names and contact details were obtained from each hospital. This population of babies was then matched with provincial mortality records to identify and exclude any babies that had died after discharge from the NICU. To ensure the data were independent, only families with one child in the study sample were included in this paper.

Healthy baby sample
Our comparison sample of healthy term babies was recruited from the two hospitals with an affiliated hospital-based primary care unit (BC Women's and Children's Hospital and the Royal Columbian Hospital). This sample included all babies delivered over 11 months (March 1996 through January 1997 inclusive) by any primary care physician from these two units working within either of these two hospitals. Multiple births, babies with a sibling in the NICU sample, and babies subsequently admitted to a NICU for more than 24 hours were excluded. Contact details for the mother were obtained from the health records department at one hospital and directly from the primary care unit at the other.

Data collection
A questionnaire booklet that included a number of separate instruments was sent to each mother as her child turned 3 1/2 years of age. A consent letter was included to obtain permission to link the questionnaire data with hospital birth records. The caregiver who had, to that point in the child's life, spent the most time with the child was asked to complete the questionnaire. Nonrespondents were sent a reminder letter and up to two more copies of the questionnaire as necessary. Finally, phone calls were made as a last effort to reach families. If the telephone number was not in service or had been reassigned, or a questionnaire was returned to us from the post office as undeliverable, a comprehensive search strategy was implemented. The process involved searching the Internet and/or contacting the mother's primary care physician to obtain an address.

Infant Toddler Quality of Life Questionnaire
The questionnaire booklet included the developmental full-length version of the Infant Toddler Quality of Life Questionnaire (ITQOL) [9,29]. The prototype contains 103 items that measure 8 infant and 5 parental concepts (see Table 1). This instrument was developed for infants as young as 2 months and toddlers up to five years of age using developmental guidelines used by pediatricians and other published literature [12]. More than half the items in each scale must be answered in order to derive a score. Raw scores are calculated for each scale by computing the algebraic mean of the items. Following published convention [30], raw scores are then transformed to a scale from 0 (worst health) to 100 (best health).
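The scoring rule described above can be illustrated with a minimal sketch. It assumes that item responses have already been recoded so that a higher value means better health, and that the 0 to 100 transformation is the usual linear rescaling of the raw mean between the lowest and highest possible item response; the function name and arguments are hypothetical and are not taken from the ITQOL scoring manual.

```python
from typing import Optional, Sequence


def score_scale(responses: Sequence[Optional[float]],
                low: float, high: float) -> Optional[float]:
    """Return a 0-100 scale score, or None if too few items were answered.

    responses: item responses for one respondent (None = missing),
               assumed to be recoded so that higher = better health.
    low, high: lowest and highest possible item response (assumed known).
    """
    answered = [r for r in responses if r is not None]
    # More than half of the items in the scale must be answered.
    if len(answered) * 2 <= len(responses):
        return None
    raw = sum(answered) / len(answered)  # mean of the answered items
    # Assumed linear transformation to 0 (worst) .. 100 (best).
    return 100.0 * (raw - low) / (high - low)


if __name__ == "__main__":
    print(score_scale([3, 3, None, 3], low=1, high=5))        # 3 of 4 answered
    print(score_scale([5, None, None, None], low=1, high=5))  # too few answered
```

Under these assumptions, a respondent who answers exactly half of the items in a scale receives no score for that scale, because the rule requires more than half.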

Item-level analysis
Data completeness was measured by computing the percentage of items completed for each scale and the instrument. Following published conventions [31][32][33][34][35][36], item-to-scale correlations (corrected for overlap) were considered satisfactory for items that correlated .40 or more with their hypothesized scale. Item discriminant validity was considered successful if the correlation between an item and its hypothesized scale was significantly higher (≥ 2 S.E.) than correlations between that item and all other scales. As advised with newly created scales [30], the percentage of correlations that were ≥1 S.E. higher for each item and its hypothesized scale were also examined.
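The two item-level criteria can be sketched as follows. This is an illustration only, under two assumptions that are not stated in the text: the correction for overlap is made by correlating each item with the sum of the remaining items in its scale, and the standard error of a correlation is approximated by 1/sqrt(n). All function names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_r(items: List[Sequence[float]], index: int) -> float:
    """Correlate one item with the sum of the OTHER items in its scale.

    items: one list of respondent scores per item (complete cases only).
    """
    others = [col for i, col in enumerate(items) if i != index]
    rest_total = [sum(vals) for vals in zip(*others)]
    return pearson(items[index], rest_total)


def discriminates(r_own: float, r_other: float, n: int, k: float = 2.0) -> bool:
    """True if r_own exceeds r_other by at least k standard errors.

    Assumes S.E. is approximated by 1 / sqrt(n); k = 2 for the main
    criterion and k = 1 for the more lenient check.
    """
    return (r_own - r_other) >= k / sqrt(n)


if __name__ == "__main__":
    scale = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
    r = corrected_item_scale_r(scale, 0)
    print(round(r, 2), r >= 0.40)
    print(discriminates(0.70, 0.40, n=400))
```

An item would count as a scaling success when its corrected correlation is at least .40 and `discriminates` holds against every other scale.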

Scale-level analysis
For each scale, we determined the percentage of scores that could be computed. The distribution of scores was examined to determine potential floor and ceiling effect (i.e., people scoring at the absolute lowest and highest ends of the continuum for each scale). Scale internal consistency was assessed in terms of Cronbach's α coefficient.
Internal consistency was considered satisfactory if the coefficient was at least .70 [37,38]. To evaluate the degree to which each scale was "unique", correlations among all scales were examined and compared against the respective Cronbach's α reliability coefficient observed for each individual scale. In general, the correlation between scales should be less than the alpha coefficient achieved for an individual scale [37]. To examine test-retest reliability, a random sample of 80 NICU respondents, who indicated they would be willing to participate in further research, was contacted by telephone. Those who agreed to participate were sent a copy of the ITQOL in the mail. A second copy of the questionnaire was mailed out once it was confirmed that the first copy had been completed. Test-retest reliability was assessed through intra-class correlation coefficients (ICCs). ICCs of at least .70 were considered satisfactory [37,38].
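For readers unfamiliar with these scale-level statistics, the following minimal sketch shows how Cronbach's α and floor or ceiling percentages might be computed. It uses the standard formula for α based on sample variances and assumes complete item data; the function names and example data are hypothetical and are not the study's data.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(items: List[Sequence[float]]) -> float:
    """Cronbach's alpha for one scale.

    items: one list of respondent scores per item (complete cases only).
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def percent_at(scores: Sequence[float], value: float) -> float:
    """Percentage of respondents scoring exactly `value` (0 = floor, 100 = ceiling)."""
    return 100.0 * sum(1 for s in scores if s == value) / len(scores)


if __name__ == "__main__":
    scale = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
    alpha = cronbach_alpha(scale)
    print(round(alpha, 2), alpha >= 0.70)      # satisfactory if at least .70
    print(percent_at([100, 100, 50, 75], 100))  # ceiling effect in percent
```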

Concurrent validity
To test concurrent validity, scale scores in the ITQOL were correlated with scores for similar and dissimilar scales in three validated instruments: the Child Behavior Checklist/1.5-5 (CBCL/1.5-5) [39]; the SF-36 [40,41]; and the Family Assessment Device (FAD) [42]. Scales from each instrument that are intended to measure similar constructs should have higher correlations (convergent validity) with each other than with scales that measure unrelated constructs (divergent validity). Correlations of <0.20 were considered negligible; 0.20 to 0.34 weak; 0.35 to 0.50 moderate; and >0.50 strong [43].
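The interpretive bands above can be written as a small helper. This sketch assumes that correlations are rounded to two decimals before classification and that the sign is ignored, since some comparison instruments are scored in the opposite direction to the ITQOL (for example, higher CBCL scores indicate more problems); both choices are assumptions rather than details given in the text.

```python
def correlation_strength(r: float) -> str:
    """Classify a correlation using the bands cited in the text [43]."""
    a = round(abs(r), 2)  # assumed: magnitude only, two-decimal precision
    if a < 0.20:
        return "negligible"
    if a <= 0.34:
        return "weak"
    if a <= 0.50:
        return "moderate"
    return "strong"


if __name__ == "__main__":
    for r in (0.10, 0.25, 0.42, -0.60):
        print(r, correlation_strength(r))
```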

Child Behavior Checklist (CBCL/1.5-5)
Since no validated multidimensional generic measure of HRQL was available for validation purposes, we used a measure of behavior as 55% of items in the ITQOL measure child behavior or temperament. The CBCL/1.5-5 measures behavioral, emotional and social functioning in children 1 1/2 to 5 years of age. This 100-item instrument measures both internalizing and externalizing syndromes and can be summed to produce a total problem score. A higher score reflects greater presence and severity of symptoms.

Short Form 36
The SF-36 [40,41] assesses the following 8 domains of adult health: physical health; physical role limitations; emotional role limitations; mental health; social function; energy; pain; and general health perception, and was used to help validate the ITQOL parent-impact scales.
Since the mental health domain and one item from general health perception are included in the ITQOL, the remaining 6 domains were used in the validation process. Scores on these domains can range from 0 (worst health) to 100 (best health).

Family Assessment Device
The Family Assessment Device (FAD) [42] is a measure of family functioning and was used to help validate the Family Cohesion item. Scores for this 12-item scale can range from 0 to 36 with higher scores indicating greater dysfunction.

Discriminant validity
The ability of the ITQOL to discriminate between groups of children with poorer expected outcomes was determined by comparing ITQOL scale scores for the following two dichotomous variables (using Mann-Whitney U-test for statistical significance): (1) NICU vs. healthy baby sample; and (2) children with one or more health problems (from a list of 16 common childhood conditions) vs. children with no health problems. The NICU sample and the group with one or more health problems were expected to have poorer reported health. Effect size statistics (i.e., mean difference divided by pooled s.d.) were computed to determine the magnitude of the difference in mean scores.
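The effect size statistic described above can be sketched as follows. The text specifies only "mean difference divided by pooled s.d."; the pooling formula used here (sample variances weighted by their degrees of freedom) is a common choice and is an assumption, as are the function name and the example values.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def effect_size(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Mean difference divided by the pooled standard deviation.

    Assumed pooling: sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)).
    """
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical scale scores, not data from the study.
    healthy = [2, 4, 6]
    nicu = [1, 3, 5]
    print(effect_size(healthy, nicu))
```

The sign of the result depends on the order of the groups; only the magnitude is used when judging the size of a difference.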

Results
Questionnaires were sent to mothers of 1,907 NICU babies and 718 healthy babies. Fifty percent of families had moved at least one time since the birth of their baby. Using our search strategy, we were able to locate 81% of families. The overall response rate (after 131 exclusions, e.g. deaths, language issues) was 54.9%, and the response rate for families we successfully located was 67.9%, with completed questionnaires received for 972 NICU families and 393 healthy baby families. The response rate for the NICU sample did not vary from that of the healthy baby sample. Five NICU respondents returned a signed consent form without a completed questionnaire and were dropped from the analysis.
For both samples combined, the mean age of the respondents was 35 years (s.d. 5.7; range 19 to 65). Most respondents (98.1%) were the child's biological parent, most commonly the child's mother (94.6%), and most (85.6%) were married or living in a common-law relationship. No differences were found between the NICU and healthy baby group in terms of parental age, gender, marital status or educational level. The proportion of boys in the sample was 55.1%. The sample was composed of 926 (68%) three-year olds, 413 (30.3%) four-year olds, and 23 (1.7%) five-year olds. The five-year old children have been excluded from the psychometric analysis since this group is unlikely to be representative.

Item-level analysis
Item-level results are presented in Table 2. Sixty-seven percent of respondents in the NICU sample and 74% of respondents in the healthy baby sample answered all 103 items. This was lower than the 83% (NICU sample) and 94% (healthy baby sample) of respondents who completed all items for the similar length CBCL/1.5-5. Of those missing at least one response on the ITQOL, three quarters of respondents in both samples missed 3 items or fewer. The rate of missing data within each scale varied from 2.3% (Impact-emotional) to 8.9% (General Health Perception) for the NICU sample, and from 0.8% (Impact-emotional) to 7.8% (Temperament and Moods) for the healthy baby sample.
For item-scale correlations, 91% (NICU sample) and 84% (healthy baby sample) of items correlated with their own domain above the recommended standard (0.40). Within domains, perfect results were obtained for 7 (NICU sample) and 5 (healthy baby sample) scales. For item-discriminant validity, 97% (NICU sample) and 87% (healthy baby sample) of items correlated more highly (≥ 2 S.E.) with their hypothesized scale than with other scales. Perfect results (100%) were attained for 8 of the 10 scales in the NICU sample, and 6 of the 10 scales in the healthy baby sample. Only 2 items in the NICU sample (in Getting Along) and 5 in the healthy baby sample (in Temperament and Moods, General Behavior and Getting Along) did not correlate ≥ 1 S.E. higher with their hypothesized scale.

Scale-level analysis
Scale-level results are presented in Tables 3 and 4. The proportion of missing values for scored domains was small: 2.9% (Physical Abilities) or less. There were no floor effects, but ceiling effects (scores of 100%) were apparent. The largest ceiling effect (69.3% NICU; 85.8% healthy baby) was in the Physical Abilities scale. The range of scores was particularly skewed for three scales (Physical Abilities, Growth/Development, Bodily Pain) where more than 84% of respondents in both samples reported scores of 75 or higher. Scores for scales that assess aspects of emotional and behavioral function showed more variability.
For both samples, the Cronbach's alpha coefficients were .80 or higher. One scale (Physical Abilities) achieved a coefficient of .96. The correlations between the ITQOL scales were on average moderate (see Tables 5 and 6). All

Concurrent validity
Correlations between related scales in the ITQOL and other standardized instruments were strong (see Tables 7 and 8). Specifically, Getting Along, Temperament, and General Behavior correlated more strongly with CBCL syndrome and total problem scores and less strongly with domains that measure aspects of physical health. Similarly, as anticipated, the parental impact scales (emotional and time) correlated more strongly with SF-36 psychosocial scales than with SF-36 physical scales. The family cohesion item correlated strongly with the Family Function Scale and weakly or moderately with all other scales. Single items are not included in these analyses.

Discriminant validity
Tables 9 and 10 present findings for tests of discriminant validity. Parents of NICU children reported their children as having significantly poorer HRQL than children in the healthy baby group for 5 of the child scales. Scores for the NICU sample were also lower for the 3 parent scales. These differences were all small in size (effect size .44 or smaller).
In the NICU sample, those with children with at least one health problem that required treatment in the past year had poorer reported HRQL in all areas compared with those without health problems. In the healthy baby sample, significant differences were noted for 4 of the child scales. Scores range from 0 to 100, with a higher score indicating more favorable quality of life.

Discussion
Increasingly, valid and reliable instruments are needed by researchers and clinicians to facilitate the collection of HRQL data in children. The preliminary results from this study of children aged 3 and 4 years indicate that the ITQOL has acceptable reliability in a sample of children requiring neonatal intensive care and a sample of healthy peers born during the same time period. The vast majority of items in the ITQOL were substantially linearly related to their hypothesized scale, and correlations were stronger than with other scales. This finding suggests acceptable item discriminant validity. Alpha coefficients for all but one scale (Physical Abilities .96) were between .80 and .90, indicating that each domain was internally reliable. In addition, the ICCs were all satisfactory, indicating that parents were consistent in their ratings of their children's health upon repeated assessments.
The range of scores in three scales for both samples (Physical Abilities; Growth and Development; Bodily Pain/Discomfort) was rather skewed. It is possible that the ceiling effects may be due to the absence of younger children in our sample, or it could be because many of these children, after graduating from the NICU, are healthy. Questionnaires were sent to parents of children as they turned 3 1/2 and were completed at different times (due to the lag time for locating families that moved). Thus, our samples included children ranging in age from 3 to 5 years. The five-year olds were excluded since these data were unlikely to be representative. Future validation research should look at the full age range from two months up to five years, as well as sub-populations (e.g., children with acute and chronic disease). Given the rapidly changing nature of infants and toddlers, it will be important to establish that the same instrument can measure HRQL in a two-month old and a five-year old.
In its present form, the main disadvantage of the ITQOL is its length. Evidence from the item-level analysis (certain items did not satisfy scaling success criteria) suggests there may be scope for reducing the questionnaire's length. In a recent systematic review of methods used to increase response to postal surveys, the use of a short questionnaire made response much more likely [44]. Since many HRQL studies rely on postal surveys, the development of a short-form, which is planned, may prove useful. Future validation research will need to ensure a large enough sample size across age groups to provide the opportunity to determine which items may be deleted and still retain the psychometric properties deemed necessary.
This study has certain limitations. First, we did not explore concurrent validity for all the instrument's domains. There was no suitable validated multidimensional measure of HRQL for preschoolers at the time our study was set up. We therefore chose to include a validated measure of behavior (CBCL/1.5-5), since the developmental version of the ITQOL is heavily weighted towards measuring behavior. Had we included domain-specific measures for all domains in our study, the length of our questionnaire booklet would likely have been unacceptable to subjects. Using the CBCL/1.5-5, SF-36 and FAD, we found expected correlations between similar and dissimilar constructs in the various measures. Future research should explore concurrent and divergent validity for all the ITQOL domains.
Second, although we made every effort to locate the entire cohort, we only found 81%, and only 67.9% of these subjects completed our study questionnaire. This response rate is within the range often obtained in a postal survey [45]. Many of the non-participants indicated (verbally or in writing) they were "too busy" to participate. It is also likely that some questionnaires returned to us blank were from non-English speakers. Elsewhere we report that where we had data and were able to look at response bias (NICU sample only), we found a few differences between the children of non-respondents and those of respondents, which suggested that non-respondents had healthier babies to begin with; this represents a potential source of bias [28].
Third, our group of healthy babies was not randomly selected from all low-risk births in the province. However, they composed a consecutive sample of hospital deliveries by all family physicians working within the primary care units affiliated with 2 of the hospitals (the third hospital did not have such a unit).

Conclusion
The results from this study indicate that the ITQOL has good reliability and construct validity in a sample of children who were healthy and another that had morbid conditions requiring neonatal intensive care. Limitations include its length and possible ceiling effects. Future validation work should include children of different ages and with different clinical problems.

Authors' contributions
Anne Klassen contributed to the study's conception and design; acquisition of data; analysis and interpretation of data; drafting of manuscript; revised the article critically for important intellectual content; and gave final approval of the version to be published.
Jeanne M. Landgraf contributed to analysis and interpretation of data; revised the article critically for important intellectual content; and gave final approval of the version to be published.
Shoo Lee contributed to the study's conception and design, acquisition of data, analysis and interpretation of data; revised the article critically for important intellectual content; and gave final approval of the version to be published.
Morris Barer contributed to the analysis and interpretation of data; revised the article critically for important intellectual content; and gave final approval of the version to be published.
Parminder Raina contributed to the study's conception and design; the analysis and interpretation of data; revised the article critically for important intellectual content; and gave final approval of the version to be published.
Herbert Chan contributed to the acquisition of data; revised the article critically for important intellectual content; and gave final approval of the version to be published.
Derek Matthew contributed to the acquisition of data; revised the article critically for important intellectual content; and gave final approval of the version to be published.
David Brabyn contributed to the acquisition of data; revised the article critically for important intellectual content; and gave final approval of the version to be published.