Validation of an instrument to evaluate quality of life in the aging population: WHOQOL-AGE

Background There is a need for short, specific instruments that assess quality of life (QOL) adequately in the older adult population. The aims of the present study were to obtain evidence on the validity of the inferences that could be drawn from an instrument to measure QOL in the aging population (people 50+ years old), and to test its psychometric properties. Methods The instrument, WHOQOL-AGE, comprised 13 positive items, assessed on a five-point rating scale, and was administered to nationally representative samples (n = 9987) from Finland, Poland, and Spain. Cronbach’s alpha was employed to assess internal consistency reliability, whereas the validity of the questionnaire was assessed by means of factor analysis, graded response model, Pearson’s correlation coefficient and unpaired t-test. Normative values were calculated across countries and for different age groups. Results The satisfactory goodness-of-fit indices confirmed that the factorial structure of WHOQOL-AGE comprises two first-order factors. Cronbach’s alpha was 0.88 for factor 1, and 0.84 for factor 2. Evidence supporting a global score was found with a second-order factor model, according to the goodness-of-fit indices: CFI = 0.93, TLI = 0.91, RMSEA = 0.073. Convergent validity was estimated at r = 0.75 and adequate discriminant validity was also found. Significant differences were found between healthy individuals (74.19 ± 13.21) and individuals with at least one chronic condition (64.29 ± 16.29), supporting adequate known-groups validity. Conclusions WHOQOL-AGE has shown good psychometric properties in Finland, Poland, and Spain. Therefore, considerable support is provided to using the WHOQOL-AGE to measure QOL in older adults in these countries, and to compare the QOL of older and younger adults.


Background
The World Health Organization Quality of Life Assessment (WHOQOL) is an instrument to measure quality of life (QOL). It has been simultaneously developed in different cultures and languages in order to make it applicable across cultures [1]. There are some areas of QOL that may be more relevant for older adults; therefore, specific instruments that assess QOL adequately in the older adult population are needed [2]. The present study aimed to validate an instrument, the WHOQOL-AGE, built upon previous WHOQOL instruments, which is relatively short to use, e.g., in large-scale population studies or in busy clinical settings; use this instrument to measure QOL in an aging population; and test its psychometric properties in terms of its validity and reliability.
Several versions of the WHOQOL instruments have been shown to have good psychometric properties in terms of reliability, validity and sensitivity to change in different population groups. WHOQOL-100 is a reliable and valid measure of QOL for use in a diverse range of cultures [1] which consists of 24 facets grouped into six domains, whereas WHOQOL-BREF is a reduced 26-item version comprising four domains: physical, psychological, social and environment [3]. The EUROHIS-QOL eight-item index [4] is a brief questionnaire based on WHOQOL-100 and WHOQOL-BREF. It has shown good cross-cultural performance in ten European countries, as well as satisfactory convergent and discriminant validity.
In order to understand the QOL of older adults, some instruments to measure QOL in the elderly, such as the Elderly Quality of Life Index (EQOLI) [5] and the Quality of Life Scale for Elderly (QOLS-E) [6], have been developed. EQOLI was developed in Brazil to monitor longitudinal change in QOL, as well as to evaluate the impact on QOL of behavior, intervention, and treatment. The instrument comprises eight domains and 43 items [7]. The QOLS-E was developed and validated in a sample of the institutionalized population in Japan, and showed an adequate factor structure, although its reliability was not very high [6].
The WHOQOL-OLD is a supplementary module for the WHOQOL for use with older adults, developed using the WHOQOL methodology, in which a simultaneous approach to instrument development is employed in different cultures [2]. Recently, short versions of WHOQOL-OLD have also been developed [8]. Since WHOQOL-OLD needs to be administered together with WHOQOL-BREF, its administration, even when using the short versions of WHOQOL-OLD, requires a long time. Consequently, there is still a need to identify a parsimonious set of items to evaluate QOL in older adults in the general population that can be administered when time is at a premium, e.g. in population-based or clinical studies when other additional data need to be collected, depending on the primary purpose of the study. WHOQOL-AGE is attempting to cover this need, since it is a short instrument, designed to be administered in general population studies, which covers the areas of QOL that are specific to older adults.
WHOQOL-AGE has been designed specifically for the aging population, but in order to understand the transition of aging, it is also important to be able to compare the QOL of the aging population with younger people. The validation process of WHOQOL-AGE will, therefore, be carried out in the aging population and in the population aged 18-49 years, in order to make sure that the instrument also allows comparisons with younger populations.

Design and procedure
The "Collaborative Research on Ageing in Europe (COURAGE in Europe)" is an observational, cross-sectional study of the general non-institutionalized adult population reached through household interviews. The sample is representative of three European countries (Finland, Poland, and Spain), which were selected to give a broad representation across different geographical European regions, taking into consideration their population and health characteristics.
Face-to-face interviews using Computer-Assisted Personal Interviewing (CAPI) were carried out at the respondents' homes. All of the interviewers participated in a training course for the administration of the survey. A total of 18 trainers from the three countries (six from Finland, eight from Poland, and four from Spain) attended a central five-day training in English, and they then trained the local interviewers of each country in the local languages (Finnish, Polish, and Spanish). The number of interviewers in the local trainings ranged from 14 in Finland to 55 in Poland. The surveys were conducted in 2011-2012.

Sample
A multi-stage clustered design was used to obtain nationally representative samples. A probability proportional to size design was used to select clusters. In Poland and Spain, an enumeration of existing households was carried out within each cluster to obtain an accurate measurement of size. In Finland, systematic sampling of individuals within each cluster was applied.
Initially, 10 800 respondents were recruited (1976 from Finland, 4071 from Poland, and 4753 from Spain). As in many other aging studies, such as SHARE [9], HRS [10], ELSA [11], TILDA [12], MHAS [13] or SAGE [14], people 50+ years old were evaluated in order to understand the transition of aging. Furthermore, a group of subjects aged 18-49 years was also included in order to make comparisons between younger and older people. A split technique was used to divide the overall sample into two groups: developmental and validation. 70% (n = 7560) was randomly assigned to the developmental sample, and the remaining 30% (n = 3240) to the validation sample, keeping a similar proportion of respondents by country in each sample. The developmental sample was used to analyze the factorial structure of the WHOQOL-AGE by means of exploratory factor analysis, whereas the validation sample was used to assess the reliability and validity of the scale, using confirmatory factor analysis techniques and item response theory methods. The individual response rate was 53.4% for Finland, 66.5% for Poland, and 69.9% for Spain.
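The 70/30 split with similar country proportions can be illustrated with a short sketch. This is only one way to do it, assuming a simple shuffle within each country; the function name, the seed, and the respondent identifiers are hypothetical, and the study does not describe the software used for the split.

```python
import random
from collections import defaultdict

def split_by_country(respondents, dev_fraction=0.7, seed=0):
    """Randomly split (id, country) pairs into developmental and validation
    lists, keeping roughly the same country proportions in each."""
    rng = random.Random(seed)
    by_country = defaultdict(list)
    for respondent_id, country in respondents:
        by_country[country].append(respondent_id)
    developmental, validation = [], []
    for ids in by_country.values():
        rng.shuffle(ids)                      # random order within the country
        cut = round(len(ids) * dev_fraction)  # size of the developmental part
        developmental.extend(ids[:cut])
        validation.extend(ids[cut:])
    return developmental, validation

# Hypothetical identifiers, sized like the recruited sample
counts = {"Finland": 1976, "Poland": 4071, "Spain": 4753}
respondents = [(f"{c}-{i}", c) for c, n in counts.items() for i in range(n)]
dev, val = split_by_country(respondents)
print(len(dev), len(val))  # 7560 3240
```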
If a participant was cognitively impaired and not able to respond to the interview, a proxy was asked some questions about the participant's health. For the purposes of the present analyses, these participants were not included.

Measures
Items from WHOQOL-AGE were derived as an adaptation from the EUROHIS-QOL eight-item index [4] and from the WHOQOL-OLD short form version 1, which comprises six items [8]. A pilot study was carried out in 2010 in the three countries, and based on the feedback from the interviewers and on the preliminary analyses, one question from WHOQOL-OLD was deleted (How concerned are you about how your life will end?) and some wording was changed. Thus, the new instrument, WHOQOL-AGE, comprises 13 positive items (eight derived from EUROHIS-QOL and five from WHOQOL-OLD), assessed on a five-point rating scale.
Furthermore, participants answered questions regarding their overall satisfaction with life, net affect, and presence of chronic conditions. These measures were used, respectively, to evaluate convergent, discriminant, and known-groups validity.
To evaluate overall satisfaction with life (SWL), respondents were asked: Taking all things together, how satisfied are you with your life as a whole these days?, ranking their answer on a scale ranging from 1 = very dissatisfied, to 5 = very satisfied.
Net affect was assessed with an abbreviated version of the Day Reconstruction Method [15], designed to be used in general population surveys. Participants reconstructed a portion of their previous day's activities and responded to questions about each episode, including what they were doing and the extent to which they experienced various feelings on a scale ranging from 0 (not at all) to 6 (very much), with the remaining points unlabelled [16,17]. Individual net affect was calculated by averaging two positive emotions (calm/relaxed and enjoying) minus five negative ones (worried, rushed, irritated/angry, depressed, and tense/stressed), weighting by activity duration. Net affect scores ranged from −6 to 6, with higher scores representing a better affective state.
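The net affect computation described above can be sketched as follows. This is a minimal illustration, assuming each episode carries a duration in minutes and one 0 to 6 rating per emotion; the dictionary keys and the example episodes are hypothetical and do not come from the survey data.

```python
POSITIVE = ("calm", "enjoying")
NEGATIVE = ("worried", "rushed", "irritated", "depressed", "tense")

def net_affect(episodes):
    """Duration-weighted mean of (average positive - average negative).

    episodes: list of dicts with a 'minutes' key and one 0-6 rating per emotion.
    """
    total_minutes = sum(e["minutes"] for e in episodes)
    score = 0.0
    for e in episodes:
        positive = sum(e[k] for k in POSITIVE) / len(POSITIVE)
        negative = sum(e[k] for k in NEGATIVE) / len(NEGATIVE)
        score += (positive - negative) * e["minutes"] / total_minutes
    return score

# Two hypothetical episodes from the previous day
episodes = [
    {"minutes": 60, "calm": 6, "enjoying": 4, "worried": 1, "rushed": 1,
     "irritated": 1, "depressed": 1, "tense": 1},
    {"minutes": 30, "calm": 2, "enjoying": 2, "worried": 3, "rushed": 3,
     "irritated": 3, "depressed": 3, "tense": 3},
]
example_net_affect = net_affect(episodes)
```

Because each rating lies between 0 and 6, the result is bounded by −6 and 6, in line with the range reported above.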
Participants were also asked questions concerning their sociodemographic characteristics, and the presence of five chronic conditions (depression, arthritis, angina, diabetes, and asthma) during the previous 12 months was assessed. Individuals were considered to have the condition when they had been diagnosed with the condition and had been taking medication or other treatment during the previous 12 months, or when they reported the presence of the core symptoms of the condition during the previous 12 months.
The questions that had not been previously translated and validated in the local languages were translated from English into Finnish, Polish, and Spanish, following the World Health Organization translation guidelines for assessment instruments, which included a forward translation, a targeted back-translation, review by a bilingual expert group, and a detailed report on the translation process. The study was approved by the Bioethical Committee, Jagiellonian University, Krakow, Poland; Ethics Review Committee, Parc Sanitari Sant Joan de Déu, Barcelona, Spain; Ethics Review Committee, La Princesa University Hospital, Madrid, Spain; and the Ethics Review Committee, National Public Health Institute, Helsinki, Finland. Written informed consent was also obtained from each participant.

Statistical analysis
Participants who did not complete the interview and did not respond to the QOL section were excluded, as were participants who responded to the QOL section but did not respond to one or more items of WHOQOL-AGE. Frequency analysis and descriptive statistics were used to analyze the demographic characteristics of the developmental and validation samples, after excluding missing values. Differences in proportions and scores between both samples were analyzed using Chi-square tests and unpaired t-tests.

Developmental sample
An Exploratory Factor Analysis (EFA) using a polychoric correlation matrix was conducted on the developmental sample to detect the latent structure among WHOQOL-AGE items. Velicer's Minimum Average Partial (MAP) test [18] was employed to select the number of factors to extract. Geomin rotation for correlated factors was used and each item was associated with the factor in which it had the highest loading. The EFA was carried out separately on people less than and more than 50 years old. In order to assess the factorial equivalence between the two populations, the factor congruence coefficient [19] was calculated, which measures the degree of similarity between factor structures obtained in two independent samples. Interpretation of this coefficient is similar to the Pearson's product moment correlation. A value of 0.90 is typically considered necessary to suggest factor congruence [20].
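For reference, the congruence coefficient between two loading vectors is the sum of their cross-products divided by the square root of the product of their sums of squares. The sketch below illustrates this standard formula; the loading values are hypothetical and are not those of Table 2.

```python
import math

def congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(sum(a * a for a in loadings_a) *
                     sum(b * b for b in loadings_b))
    return cross / norm

# Hypothetical loadings of the same items in two independent samples
younger = [0.71, 0.65, 0.80, 0.58]
older = [0.69, 0.70, 0.77, 0.61]
phi = congruence(younger, older)
congruent = phi >= 0.90  # cut-off used in this study
```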

Validation sample
A Confirmatory Factor Analysis (CFA), using maximum likelihood estimation with robust standard errors (MLR estimation), was used to assess how well the data fit the theoretical model and therefore to confirm the factorial structure suggested by the EFA carried out on the developmental sample. Goodness-of-fit of the model was evaluated according to the standard recommendations [21,22]. Values of the Comparative Fit Index (CFI) and Tucker-Lewis Index (TLI) above 0.90 were considered to represent an adequate fit; values of Root Mean Square Error of Approximation (RMSEA) less than 0.08 indicated a good fit [23]. The χ² test of goodness-of-fit was not reported. Since the χ² statistic is sensitive to sample size [24], the χ² values might be inflated (statistically significant) due to the large size of the sample, which might erroneously imply a poor data-to-model fit [25]. Burnham and Anderson [26] noted that model goodness-of-fit based on statistical tests becomes irrelevant with large sample sizes. One common assumption in these models is that a parameter is equal to a given value, often zero (e.g. saying there is no direct relationship between two variables). Modification indices were employed to evaluate how reasonable these assumptions are, by observing what happens when these assumptions are relaxed.
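To make the cut-offs concrete, the sketch below computes CFI, TLI and RMSEA from model and baseline χ² statistics using their usual textbook definitions, and then applies the thresholds stated above. It is an illustration only: the input values are hypothetical, and the robust (MLR) versions of these indices reported by SEM software involve scaling corrections that are not reproduced here.

```python
import math

def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Return (CFI, TLI, RMSEA) from model and baseline chi-square statistics."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    cfi = 1.0 - d_model / d_base if d_base > 0 else 1.0
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    return cfi, tli, rmsea

def adequate_fit(cfi, tli, rmsea):
    """Apply the cut-offs used in this study."""
    return cfi > 0.90 and tli > 0.90 and rmsea < 0.08

# Hypothetical chi-square values, chosen only to exercise the formulas
cfi, tli, rmsea = fit_indices(100.0, 50, 1000.0, 66, 501)
```

With the indices reported for the second-order model (CFI = 0.93, TLI = 0.91, RMSEA = 0.073), `adequate_fit` returns `True`.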
Moreover, the goodness-of-fit of each model was assessed by means of the Bayesian Information Criterion (BIC) proposed by Schwarz [27], which is asymptotically consistent with large sample sizes. Information criteria are entropy-based measures of the goodness-of-fit of a statistical model. They can be applied to models with parameters estimated using maximum likelihood methods. In the case of factor analysis, the aim is to create a factor model that balances complexity (number of factors) with the amount of variance explained. The definition of the information criteria implies that a smaller value indicates a better model.
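As a reminder of how the criterion trades fit against complexity, the BIC is the number of free parameters times the natural logarithm of the sample size, minus twice the maximized log-likelihood. The sketch below compares two hypothetical models; the log-likelihoods and parameter counts are invented for illustration.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Schwarz's Bayesian Information Criterion (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical one-factor and two-factor models fitted to the same data
bic_one_factor = bic(log_likelihood=-41500.0, n_params=39, n_obs=2994)
bic_two_factor = bic(log_likelihood=-40600.0, n_params=41, n_obs=2994)
preferred = "two-factor" if bic_two_factor < bic_one_factor else "one-factor"
```

In this invented case the two extra parameters cost about 16 BIC points, far less than the 1800-point gain from the higher log-likelihood, so the two-factor model is preferred.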
Since the item responses are polytomous and ordered, a Graded Response Model (GRM) was employed for each of the factors obtained [28]. In the GRM, the values of the discrimination parameter and the item information function were estimated for each item. The discrimination parameter represents the ability of an item to discriminate between people with different levels of an underlying trait. The total information was calculated by adding up the item information values; the greater an item's value, the greater its contribution to the measurement of the factor.
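In the standard formulation of the GRM, the probability of responding in category k or above is a logistic function of the discrimination parameter times the distance between the trait level and the category threshold, and category probabilities are differences between adjacent cumulative probabilities. The sketch below illustrates this for one five-category item; the parameter values are hypothetical, and the parameterization used internally by a given software package may differ.

```python
import math

def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities for one item under a graded response model.

    theta: latent trait level
    discrimination: item discrimination parameter
    thresholds: increasing category thresholds (4 values give 5 categories)
    """
    cumulative = [1.0]  # P(X >= lowest category) is always 1
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)  # nothing lies above the highest category
    return [cumulative[k] - cumulative[k + 1]
            for k in range(len(cumulative) - 1)]

# Hypothetical item with symmetric thresholds, evaluated at the trait mean
probs = grm_category_probs(theta=0.0, discrimination=1.5,
                           thresholds=[-2.0, -1.0, 1.0, 2.0])
```

A larger discrimination value makes the cumulative curves steeper, so the item separates respondents with nearby trait levels more sharply.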
In order to find evidence for the use of a global score on the WHOQOL-AGE, a second-order confirmatory factor analysis was applied to test the accuracy of a model with a second-order factor comprising the first-order factors obtained previously by means of exploratory and confirmatory factor analyses. If there is only one second-order factor, then there must be at least three first-order factors if the model is to be identified [29]. To solve underidentification problems, the first-order factor variance was fixed to 1, and the mean and the variance of the second-order factor were fixed to 0 and 1, respectively.
The internal consistency reliability was assessed by means of Cronbach's alpha. As suggested by Bland & Altman [30], a Cronbach's alpha of 0.70 or higher was considered to indicate adequate reliability. Convergent validity was evaluated by the correlation between the SWL item and the global WHOQOL-AGE score. Discriminant validity was evaluated by means of Pearson's correlation coefficient between the WHOQOL-AGE score and the net affect score. The method described by Raykov [31] was used to test whether the discriminant validity coefficient was sufficiently lower than the convergent validity coefficient, a condition posited by Campbell & Fiske [32] as evidence supporting construct validity. In order to assess the known-groups validity of the questionnaire, the WHOQOL-AGE score was compared for healthy and non-healthy populations. Participants were defined as healthy if they did not present any of the chronic conditions assessed (depression, arthritis, angina, asthma, and diabetes), whereas they were defined as non-healthy if they had at least one of those chronic conditions. Mean scores were compared by means of unpaired t-tests, and the magnitude of the difference was measured by Hedges' g effect size coefficient. The results obtained in terms of reliability and validity for the WHOQOL-AGE were compared with those obtained using the EUROHIS-QOL, and the five items from the WHOQOL-OLD short form version 1 that were included in the WHOQOL-AGE questionnaire.
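The two summary statistics named above can be sketched with their usual formulas: Cronbach's alpha from item and total-score variances, and Hedges' g as the pooled-standard-deviation mean difference multiplied by a small-sample correction. The item responses below are hypothetical, and the approximation used for the correction factor is an assumption, since the text does not say which variant was applied.

```python
import math
from statistics import variance

def cronbach_alpha(items):
    """items: list of per-item score lists, all over the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per person
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference with small-sample bias correction."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                          / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)   # common approximation
    return d * correction

# Hypothetical responses of five people to three five-point items
items = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
alpha = cronbach_alpha(items)
reliable = alpha >= 0.70  # cut-off used in this study
```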
Normative values, including main percentiles of the distribution of WHOQOL-AGE scores, were calculated across countries and for different age groups: 18-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, and 90+ years. The data to obtain the normative values were weighted to account for the sampling design in order to generalize the results to the population in each country. Finally, the cumulative distribution of WHOQOL-AGE scores by country was presented across the population aged 18-49 years and the population aged 50 and over.
Analyses corresponding to GRM were carried out using the ltm package [33] in R [34]. Mplus version 6 [35] was employed for factor analysis modeling. The rest of the analyses were performed using Stata version 11 [36].
The sample was randomly split into a developmental sample (n = 6993) and a validation sample (n = 2994). In order to confirm that both subsamples were representative of the initial sample, the demographic characteristics of both populations were compared. Sociodemographic characteristics of these samples are shown in Table 1. Significant differences were not found between the developmental and the validation samples regarding the main sociodemographic characteristics.

Exploratory factor analysis (EFA)
In the developmental sample, the Velicer MAP criterion achieved the minimum value for the solution comprising two factors. The two-factor solution explained 62.8% of the total variance for people aged 18-49 years old, and 65% of the total variance for those aged 50+. In Table 2, factor loading estimates after Geomin rotation are shown for both age groups. Items Q2-Q8 loaded on the first factor, whereas items Q9-Q13 loaded on the second factor. Item Q1 presented a similar loading on the first and the second factors, and was considered as belonging to both factors. A similar factor structure was found between both samples, with Tucker's congruence coefficient being 0.98 for the first factor and 0.96 for the second factor.

Confirmatory factor analysis (CFA)
A CFA was carried out on the validation sample in order to assess the suitability of the factor model proposed, comprising two correlated factors. Due to the similar factor structure found in the EFA in both age groups, the CFA was conducted over the pooled sample. One of the two factors loaded on items Q1, Q2, Q3, Q4, Q5, Q6, Q7, and Q8, and the other on items Q1 and Q9 to Q13.
Table note: items from WHOQOL-OLD short form version 1 are shown in bold, and items from EUROHIS-QOL in italics. All the response options use a five-point rating scale, ranging from very bad to very good for Q1, from very dissatisfied to very satisfied for Q2 to Q8, from not at all to completely for Q9 to Q12, and from not at all to an extreme amount for Q13.

Graded response model (GRM)
Graded Response Models were carried out on each factor in the validation sample, considering item Q1 as belonging to both factors. Table 3 shows discrimination parameters and the total information explained by the items in each factor. By means of the Test Information Curve, it was observed that the five items added to the eight-item EUROHIS-QOL provided 20.81% of the total information corresponding to factor 1, and 58.72% of the information corresponding to factor 2 (taking into account that items Q2 and Q8 loaded on factor 1; and items Q10, Q11, and Q13, on factor 2). A useful comparison between items can be performed by plotting the Item Characteristic Curves for each category separately. Items Q3 and Q5 were highly related, with a correlation coefficient equal to 0.81. According to the Item Response Category Characteristic Curves (Figure 1), these items had a very similar effect on the construct corresponding to the first factor. Moreover, items Q9, Q10 and Q11 had a similar effect on the construct corresponding to the second factor (Figure 2). Mean polychoric correlation between these three items was 0.76, with correlation coefficients among these three items ranging from 0.72 to 0.78.
These analyses were also carried out separately in both age groups (18-49 and 50+ years), and very similar results were found. For example, in the 50+ age group, the five items added to the eight-item EUROHIS-QOL provided 27.87% of the total information corresponding to factor 1, and 58.55% of the information corresponding to factor 2. These percentages were 24.15% and 60.81%, respectively, in the 18-49 age group.

Scoring WHOQOL-AGE
Considering the solution comprising two factors, a score for each factor was obtained. Items with a similar performance (according to the Item Response Category Characteristic Curves) and a strong relationship between them were combined in the scoring method proposed. Taking into account the similar performance of items Q3 and Q5; of items Q9, Q10 and Q11; and the similar loading of item Q1 on both factors, the following formula was proposed to calculate a score for each factor:
Finally, to find support for obtaining a global score on WHOQOL-AGE based on the scores obtained for factor 1 and factor 2, a second-order confirmatory factor analysis was constructed over the first-order factors F1 and F2. Adequate goodness-of-fit was found for this model: CFI = 0.93, TLI = 0.91, RMSEA = 0.073 [90% CI = (0.069, 0.077)], BIC = 81593.19. Scores obtained in equations (1) and (2) were transformed to the percentile scale, and then the global score on WHOQOL-AGE was defined as the average of these scores.

Reliability and validity
Adequate Cronbach's alpha values were found for each of the two latent factors (α = 0.88 for factor 1, α = 0.84 for factor 2). The convergent validity of the WHOQOL-AGE was estimated at 0.75 [95% CI = (0.73, 0.77)]. Regarding discriminant validity, a moderate correlation was found between WHOQOL-AGE and net affect [r = 0.35; 95% CI = (0.31, 0.38)]. The resulting 95% CI for the difference between these convergent and discriminant validity coefficients was (0.37, 0.44). This result suggests, with high confidence, that the convergent validity coefficient considered was markedly higher in the population than the discriminant validity coefficient. Similar values for correlation coefficients were found across countries (results available from the authors upon request), with the only exception being the correlation between WHOQOL-AGE score and net affect, which was lower in Finland [r = 0.21, 95% CI = (0.17, 0.26)].
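The text does not state how the confidence intervals for the correlation coefficients were obtained. One common approach is the Fisher z transformation, sketched below under that assumption; applied to the convergent validity coefficient with the validation sample size (n = 2994), it gives limits that agree with the interval reported above.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)                       # Fisher transformation
    se = 1.0 / math.sqrt(n - 3)             # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

low, high = correlation_ci(0.75, 2994)
print(round(low, 2), round(high, 2))  # 0.73 0.77
```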
In Table 4, these reliability and validity coefficients are shown separately by age groups: 18-49 and 50+ years. These coefficients are reported for the WHOQOL-AGE, the EUROHIS-QOL, and the five items from the WHOQOL-OLD short form version 1. In the case of the WHOQOL-AGE, the Cronbach's alpha values were α = 0.89 for factor 1 and α = 0.85 for factor 2 in the older population; these values were, respectively, 0.86 and 0.80 in the 18-49 age group. The five items from the WHOQOL-OLD short form version 1 that were included in the WHOQOL-AGE questionnaire showed lower reliability in the 18-49 age group, whereas the Cronbach's alpha values for the WHOQOL-AGE were very similar, although slightly higher than for the EUROHIS-QOL, in the older population. Moreover, significant differences were found between healthy individuals (n = 1795) and individuals having at least one chronic condition (n = 1199), with higher scores on WHOQOL-AGE for healthy people (74.19 ± 13.21 vs. 64.29 ± 16.29, t (2992) = 18.30, p < 0.001). The effect size associated with this difference was considerable (Hedges' g = 0.64). Significant differences were also found in the analysis carried out separately by countries, with effect sizes ranging from 0.54 to 0.79. These results suggested adequate known-groups validity.
In terms of score distributions, the observed range was similar to the theoretical range (from 0 to 100), indicating that the measure covers the full range of the QOL continuum, although the distribution had negative skew (Fisher-Pearson coefficient of skewness = −0.69), indicating that most of the people reported a good QOL, as expected, given that the study sample came from the general population and not from clinical settings. Floor effects were negligible (there was only one person with a score of zero, the worst QOL), and the ceiling effect was also acceptable (1.4%). Scores on WHOQOL-AGE decreased as age increased, as can be seen in the table of normative values (see Table 5). The cumulative distribution of WHOQOL-AGE scores for the 18-49 and 50+ age groups can be observed in Figure 3, supporting results that suggest a lower QOL for the older population, according to their WHOQOL-AGE scores.
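The distribution checks mentioned above can be sketched as follows: the Fisher-Pearson coefficient is the third central moment divided by the second central moment raised to the power 1.5, and floor and ceiling effects are the shares of respondents at the lowest and highest possible scores. The scores used here are hypothetical, and whether a bias-adjusted version of the coefficient was reported is not stated in the text.

```python
def skewness(scores):
    """Fisher-Pearson coefficient of skewness (no small-sample adjustment)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n   # third central moment
    return m3 / m2 ** 1.5

def floor_ceiling(scores, lowest=0, highest=100):
    """Proportion of respondents at the minimum and maximum possible score."""
    n = len(scores)
    at_floor = sum(1 for x in scores if x == lowest) / n
    at_ceiling = sum(1 for x in scores if x == highest) / n
    return at_floor, at_ceiling

# Hypothetical scores with a long left tail, i.e. mostly good QOL
scores = [10, 80, 90, 100]
skew = skewness(scores)
floor_share, ceiling_share = floor_ceiling(scores)
```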

Discussion
The present study aimed to validate an instrument to measure QOL in an aging population. WHOQOL-AGE has shown good psychometric properties in Finland, Poland, and Spain. Adequate goodness-of-fit indices were found according to the standard recommendations of Structural Equation Modeling literature [21][22][23]. These indices confirmed that the factorial structure of WHOQOL-AGE comprises two first-order factors, one loaded by items Q2 to Q8, and the other one loaded by items Q9 to Q13, with item Q1 loading on both factors. However, by means of a second-order confirmatory factor analysis, evidence was found supporting that these two factors belong to a more general construct. The similar factor structure in the population aged 18-49 years and in the population aged 50+, the results obtained in the pooled sample during the validation process, and the analyses carried out separately for both age groups, suggested that this instrument could be employed in the population aged 18-49 in order to compare their QOL with that of older adults. A score for each component and a global score for WHOQOL-AGE are proposed; this method involves recombining some items before converting the score on each factor into a percentage. Since some of the items had a similar performance, and item Q1 loaded equally on both factors, taking this into account in the scoring was considered to provide better precision. The formula proposed is very simple, and the score can be easily calculated. The global score for the WHOQOL-AGE was computed by averaging the scores previously obtained for each factor. This is the preferred scoring method. Nonetheless, if calculating this score is not feasible, as might happen in clinical practice, it is possible to use a simpler score, obtained by adding up the items. All the results and normative values presented in the present paper have been obtained using the first scoring method, and therefore they cannot serve as a guideline if the second option is used.
In terms of reliability, Cronbach's alpha values were higher than the recommended cut-off point of 0.70 [30], indicating adequate internal consistency. Regarding validity, the convergent validity and discriminant validity coefficients were appropriate, and the difference in the magnitude between them was sufficiently high, also supporting construct validity evidence [32]. In line with previous results for EUROHIS-QOL [4], WHOQOL-AGE also discriminates well between healthy individuals and individuals with a chronic condition, showing adequate known-groups validity. Since the results were similar across countries, only the general analyses, pooling the data of the three countries, are shown (country-by-country analyses are available from the authors upon request). The addition of the five items provided additional explanatory variance over and above EUROHIS-QOL. Furthermore, WHOQOL-AGE showed better reliability than WHOQOL-OLD. EUROHIS-QOL does not include specific questions that are relevant for older adults, and WHOQOL-OLD has to be administered together with WHOQOL-100 or WHOQOL-BREF, which implies that none of the questionnaires was a short instrument adequate to evaluate QOL in older adults. WHOQOL-AGE is an instrument that fills this gap. By combining WHOQOL-OLD and EUROHIS-QOL, it has been possible to create WHOQOL-AGE, an instrument that evaluates specific areas of QOL that are relevant for older adults, such as satisfaction with the senses, the use of time, opportunities to achieve, intimate relationships and control, but also makes it possible to compare the QOL of the older and younger populations. Moreover, WHOQOL-AGE is short enough to be used when time is at a premium, so it is especially recommended for population-based studies that are interested in measuring QOL as an adjunct to health and functional status, as was originally considered [1], or even when further measures of wellbeing, social networks, and built environment are included, as in the COURAGE in Europe survey.
One of the strengths of the present study is that it uses data obtained with representative samples from three different European countries. Even though there are no strict standards for determining an acceptable response rate, the response rates found in this study can be considered adequate [37] and similar to the ones found in other general population studies recently conducted in Europe, such as SHARE (with a global response rate for the ten countries of 61.8%, ranging from 37.6% in Switzerland to 73.6% in France) [9], ELSA (individual response rate of 67%) [11] and TILDA (with a household response rate of 62%) [12].
However, there are also limitations in the present paper. Some participants were excluded from these analyses because they were not able to participate in the interview; because they ended the interview before responding to the QOL section; or because they did not respond to some of the items. The excluded sample was therefore older than the included sample. The fact that the excluded sample had also received fewer years of education and was less frequently married or living with a partner could be due to the higher age in this group. Furthermore, although the percentage of participants who needed a proxy was similar in the three countries, there were more people from Spain in the excluded sample. This is due to the fact that more people in Spain (8.3%) did not respond to item Q13, which asks about satisfaction with intimate relationships. Cultural differences that make this a more sensitive question in Spain might account for the higher percentage of missing responses there. Nevertheless, the percentage of missing values on this question is not too high, so there is no need to consider dropping it, since it adds valuable information that is not covered by any other question. Only 4.3% of respondents who answered the QOL section left at least one WHOQOL-AGE question unanswered. If Q13 is not considered, only 0.7% of the sample has any missing value, which suggests that the questions are easy to answer, indicating the instrument's high feasibility. Although for validation purposes the participants who did not respond to one item were excluded, a recommendation for future studies using the WHOQOL-AGE questionnaire is to allow up to one missing value in order to compute the WHOQOL-AGE score. The scores should not be obtained if there are two or more missing values (the syntax to calculate the scores with one missing value is available upon request). Future studies might consider using a proxy instrument to evaluate QOL in older people with cognitive impairment, in order to avoid missing valuable information concerning those people with the worst health state.
Although the samples were representative of the population of the three countries, in order to avoid having small sample sizes for the oldest age groups, the "oldest old" (people 80+) were overrepresented in the sampling. Nevertheless, the normative values for subjects 90+ involved a small sample size. Another limitation is that convergent validity was assessed with a single-item question. Although the use of single-item measures has sometimes been discouraged, if the construct under study is sufficiently unidimensional, single-item measures are not necessarily inferior to multiple-item measures [38]. Moreover, the present study did not address the properties of WHOQOL-AGE in terms of sensitivity to change. Further research should also explore content validity of the WHOQOL-AGE.

Conclusions
The WHOQOL-AGE has been shown to be a short, robust instrument that can be readily implemented in population surveys to track QOL in older adults and assess the relationship between health, QOL and their determinants, as well as to measure the impact of interventions. The instrument can also be used to compare the QOL of older and younger adults.