Inter-rater agreement of the Quality of Life-Alzheimer’s Disease (QoL-AD) self-rating and proxy rating scale: secondary analysis of RightTimePlaceCare data
Health and Quality of Life Outcomes volume 16, Article number: 131 (2018)
To assess the quality of life of people with dementia, measures are required for self-rating by the person with dementia, and for proxy rating by others. The Quality of Life in Alzheimer’s Disease scale (QoL-AD) is available in two versions, QoL-AD-SR (self-rating) and QoL-AD-PR (proxy rating).
The aim of our study was to analyse the inter-rater agreement between self- and proxy ratings, in terms of both the total score and the items, including an analysis specific to care setting, and to identify factors associated with this agreement.
Cross-sectional QoL-AD data from the 7th Framework European RightTimePlaceCare study were analysed. A total of 1330 cases were included: n = 854 receiving home care and n = 476 receiving institutional long-term nursing care. The proxy raters were informal carers (home care) and best-informed professional carers (institutional long-term nursing care).
Inter-rater agreement was investigated using Bland-Altman plots for the QoL-AD total score and by weighted kappa statistics for single items. Associations were investigated by regression analysis.
The overall QoL-AD assessment of those with dementia revealed a mean value of 33.2 points, and the proxy ratings revealed a mean value of 29.8 points.
The Bland-Altman plots revealed a poor agreement between self- and proxy ratings for the overall sample and for both care settings. With one exception (item ‘Marriage’ weighted kappa 0.26), the weighted kappa values for the single QoL-AD items were below 0.20, indicating poor agreement.
Home care setting, dementia-related behavioural and psychological symptoms, and the functional status of the person with dementia, along with the caregiver burden, were associated with the level of agreement. Only the home care setting was associated with an increase larger than the predefined acceptable difference between self- and proxy ratings.
Proxy quality of life ratings from professional and informal carers appear to be lower than the self-ratings of those with dementia.
QoL-AD-SR and QoL-AD-PR are therefore not interchangeable, as the inter-rater agreement differs distinctly. Thus, a proxy rating should be judged as a complementary perspective for a self-assessment of quality of life by those with dementia, rather than as a valid substitute.
Improving quality of life (QoL) is an important focus of various therapeutic interventions for people with dementia. Therefore, QoL is increasingly gaining importance as an outcome measure to evaluate interventions in dementia care. The QoL of those with dementia is considered an individual, subjective, dynamic, multidimensional and complex construct, which includes the assessment of and adaptation to the consequences of dementia [3,4,5,6]. QoL can only be understood within the person-environment system, which Lawton has described in four sectors: behavioural competence (evaluated functioning of a person in relation to inner or outer events), perceived QoL (evaluations about major life domains), psychological well-being, and objective environment [6, 7].
The measurement of dementia-specific QoL involves particular features, due to the particular symptoms of the disease. Functional impairment of memory, cognition, time perception, attention, judgement and communication, along with the level of insight into the illness and the severity of the disease can all influence how questions are understood, and the rating and communication of the subjective condition [8,9,10,11,12]. In addition, the reliability of QoL assessment is affected by limitations of consciousness  and behavioural and non-cognitive symptoms such as depression, restlessness and psychosis . Emotional symptoms such as social deprivation can also influence QoL ratings by people with dementia . The progress of dementia is likely to make self-rating (SR) infeasible at a certain point. For people with severe dementia, proxy rating (PR) by informal or professional carers is indispensable if they are not to be excluded from QoL determination.
A recent review has identified 16 QoL measures for people with dementia and assessed their psychometric properties. Many measures were based on proxy assessment, with questionable validity for people with mild to moderate dementia. The best researched measure was the Quality of Life in Alzheimer’s Disease scale (QoL-AD). It is available as a self-rating version (QoL-AD-SR: Quality of Life in Alzheimer’s Disease Self-Rating scale) and as a proxy rating version (QoL-AD-PR: Quality of Life in Alzheimer’s Disease Proxy Rating scale). The instrument has been translated into various languages and has good psychometric properties overall. Bowling et al. give an overview of the psychometric evidence. However, some discrepancy between the two rating versions has been identified, which indicates a need for further research to clarify the relationship between self-rating by the person with dementia and proxy rating.
We conducted a systematic literature search in Medline via PubMed (April 2016), which was aimed at identifying the body of knowledge on the agreement between the two QoL-AD rating versions. The search strategy used was: ‘QoL-AD OR (quality of life Alzheimer’s disease) OR (quality of life Alzheimer’s disease scale) OR (quality of life Alzheimer’s disease questionnaire) AND ((agreement OR accordance OR consensus OR conformity OR rapport OR congruence OR match) OR (discrepancy OR gap OR mismatch OR difference OR distinction OR disagreement) OR (caregiver bias)) AND (self OR (proxy OR caregiver OR carer))’. Following the four-phase PRISMA process for selecting articles, all studies in the English or German language published in the last 10 years were included. A total of 28 relevant studies remained [16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43]. Across these studies, carers assessed QoL lower than people with dementia did. Fourteen articles [19, 20, 22, 25, 26, 28, 29, 31, 33, 36, 37, 39,40,41] reported the level of agreement between QoL-AD self- and proxy ratings, which ranged from poor [20, 28] to very good. In almost all of the studies, the level of agreement at the QoL-AD total score was estimated using correlation analyses, such as the intraclass correlation coefficient (ICC), or by applying the paired-samples t-test. However, calculation of correlation coefficients is not appropriate for agreement assessment, as correlation coefficients give information about a linear relation of two metric variables and not about the agreement [44, 45]. Two clearly distinct measures that are intended to assess the same construct regularly correlate in a certain way without necessarily reaching good agreement, and good agreement can be reached without good correlation. Interpreting correlation analyses as agreement assessment must therefore be questioned and should generally be avoided.
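The difference between correlation and agreement can be illustrated with a minimal sketch in Python. The scores are invented for illustration and the helper `pearson_r` is hypothetical; none of this is taken from the reviewed studies. Every proxy score is placed exactly six points below the corresponding self-rating, so the two ratings are perfectly correlated although they never agree:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical QoL-AD total scores (possible range 13 to 52).
self_scores = [30, 34, 38, 42, 46]
# Assumption for illustration: each proxy rates exactly 6 points lower.
proxy_scores = [s - 6 for s in self_scores]

r = pearson_r(self_scores, proxy_scores)
# Mean of the paired differences (proxy minus self).
mean_diff = sum(p - s for s, p in zip(self_scores, proxy_scores)) / len(self_scores)

print(f"r = {r:.2f}, mean difference = {mean_diff:.1f}")
```

In this constructed case the correlation is 1 while the systematic difference is six points, which is why a high correlation coefficient on its own says little about whether two ratings can replace each other.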
Only two studies [25, 28] used the Bland-Altman plots as recommended for metric variables [44, 45]; the presented data indicate an unacceptable range of agreement. Factors associated with the level of agreement between self- and proxy ratings were reported in eleven studies [16,17,18,19,20,21,22, 25, 27, 28, 36]. The most frequently mentioned factors for different ratings were behavioural and psychological symptoms of dementia (BPSD) and severity of cognitive impairment of those with dementia, as well as caregiver burden (see Additional file 1).
We found no studies investigating the level of agreement between QoL-AD-SR and QoL-AD-PR or agreement at the item level in a large European sample.
Therefore, our study aims to explore the inter-rater agreement of the measures QoL-AD-SR and QoL-AD-PR with the following objectives:
To investigate inter-rater agreement at the total score level.
To investigate inter-rater agreement on the item level.
To investigate inter-rater agreement by comparison of care settings (institutional long-term nursing care versus formal home care).
In addition, the identified factors (see Additional file 1) were hypothesised to be associated with the level of agreement and explored using regression analysis.
Data for the secondary analysis were obtained from the European RightTimePlaceCare study (RTPC; FP7-Health-F3-2010-242153). Cross-sectional data were used.
The RTPC study
The RTPC study comprised a prospective, multi-centre cohort study in eight European countries: Estonia, Finland, France, Germany, the Netherlands, Spain, Sweden and the UK. The survey (start in 2010) generated primary data for the development of best-practice recommendations on the transition from home care to institutional nursing care for European citizens with dementia.
Two types of dyads of people with dementia and their informal carers were investigated :
People with dementia (and their informal carers) living in an institutional long-term nursing care (ILTC) facility, admitted 1 to 3 months previously;
People with dementia (and their informal carers) living at home receiving professional long-term home nursing care (HC), who were assessed as being at risk of institutionalisation within the next 6 months.
The RTPC inclusion criteria for those with dementia were (1) a formal diagnosis of dementia; (2) aged ≥65 years; (3) a Standardized Mini Mental-State Examination (S-MMSE)  score of ≤24; (4) no primary psychiatric diagnosis or Korsakoff syndrome; and (5) personal contact with their informal carer at least twice a month. Those most involved in caring for the person with dementia were included as informal carers. No restrictions on their relationships with the person with dementia were imposed, but those who provide care as volunteers were excluded . Detailed information about the RTPC study design has been published elsewhere .
The RTPC study sample consisted of 2014 people with dementia (and their informal carers) : 791 dyads in the ILTC sample (from 256 ILTC locations) and 1223 dyads in the HC sample.
Variables from the RTPC data set for the secondary analysis
To analyse the inter-rater agreement, baseline data of the QoL-AD self- and proxy ratings are important. Therefore, only cases with total scores available for both the QoL-AD-SR and the QoL-AD-PR were included in our secondary data analysis. In total, n = 1330 cases were selected (see Additional file 2).
The QoL of people with dementia was assessed by self- and proxy ratings in all eight countries using the 13-item version of the QoL-AD. The self-rating was completed by those with dementia if they had an S-MMSE score of three or higher. The QoL-AD proxy rating was assessed from the proxy’s own perspective (proxy-proxy perspective), in contrast to answering as the patient would (person-proxy perspective). The proxy report was obtained from the best-informed proxy: in HC, QoL was assessed by informal carers, and in ILTC by professional carers (i.e., nursing staff).
Data were collected between November 2010 and April 2012. To standardise the data collection, an instruction manual  was provided and implemented through training. This included instructions for the face-to-face interviews of the QoL-AD according to the detailed instructions for interviewers in the original questionnaire. The countries’ main investigators were responsible for the training of the interviewers; all interviewers received training regarding the content and completion of questionnaires .
The total scores of the self- and proxy ratings were calculated according to the specifications of the original authors [8, 14]: Based on the four response categories (1 = poor, 2 = fair, 3 = good, 4 = excellent) of the 13 items, the total score ranges from 13 to 52, with higher values indicating a higher QoL. If two or fewer responses were missing, they were replaced by the mean of the remaining individual responses. If more than two responses were missing, no total score was calculated.
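The scoring rule described above can be sketched in Python as follows. This is an illustration only, not the code used in the study; the function name `qol_ad_total` and the use of `None` to mark a missing response are assumptions of this sketch.

```python
from typing import Optional, Sequence

N_ITEMS = 13      # items in the QoL-AD version used in this study
MAX_MISSING = 2   # with more missing responses, no total score is calculated


def qol_ad_total(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the QoL-AD total score (13 to 52), or None if it is not computable.

    `responses` holds 13 item scores coded 1 (poor) to 4 (excellent).
    A missing response is represented as None (assumption of this sketch).
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item responses")
    observed = [r for r in responses if r is not None]
    if any(r not in (1, 2, 3, 4) for r in observed):
        raise ValueError("item responses must be 1, 2, 3 or 4")
    n_missing = N_ITEMS - len(observed)
    if n_missing > MAX_MISSING:
        return None
    item_mean = sum(observed) / len(observed)
    # Each missing item is replaced by the mean of the remaining responses.
    return sum(observed) + n_missing * item_mean
```

For example, a questionnaire with eleven items rated 3 (good) and two missing items yields a total of 39, the same as thirteen items rated 3, whereas three missing items yield no total score.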
Variables for the analysis of associated factors on the agreement
To analyse factors associated with the agreement between self- and proxy ratings, relevant clinical variables were selected from the RTPC data set. The selection was guided by the findings of the systematic literature search on factors associated with the level of agreement between self- and proxy ratings. The corresponding variables from the RTPC data set are presented in detail in Additional file 1, including description, value ranges, and interpretation of the measures.
Variables of people with dementia were BPSD measured by the Neuropsychiatric Inventory Questionnaire (NPI-Q) [49, 50], cognitive function/severity of dementia measured by the Standardized Mini Mental-State Examination (S-MMSE) [47, 51] (sometimes used as a surrogate method for staging dementia ), depression measured by the Cornell Scale for Depression in Dementia (CSDD) , functional status measured by the Katz Index of Independence in Activities of Daily Living (KATZ ADL) , educational background, and care setting.
Variables of informal carers were caregiver burden measured by the Zarit Burden Inventory (ZBI) , the subscale ‘Lack of family support’ of the Caregiver Reaction Assessment (CRA) , and the Neuropsychiatric Inventory Questionnaire-Caregiver Distress (NPI-Q-D) [49, 50], QoL measured by the European Quality of Life Scale (EQ-5D-3 L, EQ-5D VAS) [57, 58], the General Health Questionnaire 12-item version (GHQ-12) , and the subscale ‘Impact on health’ of the Caregiver Reaction Assessment (CRA) , kind of relationship to the person with dementia, and gender.
Sociodemographic or clinical variables of the professional carers relevant to the secondary data analysis were not collected in the RTPC study.
Descriptive data from people with dementia, informal carers and professional carers were analysed using the statistical software SPSS Version 24. Frequencies, proportions, means and standard deviations were calculated.
With reference to the descriptive methods recommended for metric variables [44, 60], we took an exploratory approach for the assessment of the agreement between self- and proxy ratings of the QoL-AD total score and used Bland-Altman plots [44, 61] created with the statistical software R Version 3.2.
Bland-Altman plots usually consist of a line representing the mean of all differences of the compared methods and the Limits of Agreement (LoA). We initially decided that an acceptable difference between the self- and proxy rating of the QoL-AD would be within a range of − 3 to + 3 points in the total score. These boundaries correspond to a difference of half a standard deviation , which is generally judged to be of minimum clinical importance for QoL measurements . For a more comprehensive impression of Bland-Altman plots we added lines for the standard deviation of the differences, and the boundaries of an acceptable difference (+/− 3 scale points).
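The quantities behind such a plot can be sketched in Python as follows. This is a minimal illustration and not the R code used for the analysis. The multiplier 1.96 is the conventional choice for 95% limits, the sign convention (proxy minus self) is an assumption of this sketch, and the function names and example scores are hypothetical.

```python
from statistics import mean, stdev

ACCEPTABLE = 3.0  # predefined acceptable difference in QoL-AD total score points


def bland_altman(self_scores, proxy_scores):
    """Return the mean difference and the 95% limits of agreement.

    Differences are computed as proxy minus self (assumption of this sketch).
    """
    diffs = [p - s for s, p in zip(self_scores, proxy_scores)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    return mean_diff, lower, upper


def within_acceptable_range(lower, upper, limit=ACCEPTABLE):
    """True only if both limits of agreement lie inside +/- limit."""
    return -limit <= lower and upper <= limit


# Hypothetical paired total scores, invented for illustration.
self_scores = [35, 30, 38, 33, 29, 36]
proxy_scores = [30, 31, 32, 28, 30, 29]

mean_diff, lower, upper = bland_altman(self_scores, proxy_scores)
print(f"mean difference {mean_diff:.1f}, LoA {lower:.1f} to {upper:.1f}")
print("acceptable:", within_acceptable_range(lower, upper))
```

The judgement of agreement comes from comparing the limits of agreement with the predefined range, not from the limits alone, which is why the acceptable difference has to be fixed before the data are inspected.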
To investigate inter-rater agreement on the item level Cohen’s weighted kappa statistic [63, 64] and corresponding 95% confidence intervals (95% CI) were calculated using the statistical software R Version 3.2. Before calculation, imputed values were removed from the data. The interpretation of kappas was guided by Altman’s recommendation : ≤ 0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.0 very good.
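A weighted kappa for one four-category item can be sketched in Python as follows. The source does not state which weighting scheme was used, so linear weights are an assumption here, as is the treatment of a kappa of exactly 0.20 as poor; the function names are hypothetical and confidence intervals are not computed. Pairs with a missing rating (marked `None`) are dropped, in the spirit of removing imputed values before calculation.

```python
def weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4)):
    """Cohen's weighted kappa with linear disagreement weights (assumed)."""
    pairs = [(a, b) for a, b in zip(rater_a, rater_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no complete rating pairs")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(pairs)
    counts = [[0] * k for _ in range(k)]
    for a, b in pairs:
        counts[index[a]][index[b]] += 1
    row = [sum(counts[i]) / n for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) / n for j in range(k)]
    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at maximal distance
            observed += w * counts[i][j] / n
            expected += w * row[i] * col[j]
    if expected == 0:
        raise ValueError("kappa is undefined: both raters used one identical category")
    return 1.0 - observed / expected


def altman_label(kappa):
    """Interpretation bands following Altman, as quoted in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"
```

With these bands, a weighted kappa of 0.26 is labelled fair and any value up to 0.20 is labelled poor.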
The analysis of covariance was used to analyse associated factors on the agreement of the self- and proxy ratings. In the subpopulation of people with dementia living in ILTC, the variables of informal carers were also assessed, but they did not rate the QoL-AD for those with dementia as proxies. Thus, we assumed that they would have no influence on the level of agreement. Therefore, two predefined models were fitted, including the aforementioned variables (see Subsection ‘Variables for the analysis of associated factors on the agreement’) as independent variables and the difference between self- and proxy ratings as the dependent variable (self-rating minus proxy rating). Model 1 included all people with dementia (ILTC and HC settings), with no further variables on the informal carers (as they did not do the proxy rating in ILTC). Model 2 included people with dementia in the HC setting, with variables on the informal carers (proxy raters). Both models were fitted in two ways: a) using the original values of the scales – the resulting coefficients can be interpreted in comparison with the used scales (partial regression coefficient: B); b) transforming all scales to a standardised version (z-transformation) – the resulting standardised partial regression coefficients can be interpreted in a similar way to a standardised effect size, which makes different scales more easily comparable (standardised partial regression coefficient: β). The analysis of covariance was performed using the statistical software R Version 3.2.
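The relation between the two kinds of coefficient can be illustrated with a deliberately simplified Python sketch. It uses a single predictor, whereas the models described above contained several covariates and were fitted in R; it also assumes that both the predictor and the outcome are z-transformed. All data and names are invented for illustration.

```python
from statistics import mean, stdev


def slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def z_transform(values):
    """Centre on the mean and scale by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical predictor scores and differences (self-rating minus proxy rating).
predictor = [0, 3, 6, 9, 12, 15]
difference = [1, 2, 2, 5, 4, 7]

b = slope(predictor, difference)                               # unstandardised B
beta = slope(z_transform(predictor), z_transform(difference))  # standardised beta

print(f"B = {b:.2f} points of difference per scale point")
print(f"beta = {beta:.2f} standard deviations per standard deviation")
```

In this single-predictor case beta equals B multiplied by the ratio of the standard deviations of predictor and outcome, which is what makes coefficients from scales with different ranges comparable.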
Ethical and legal aspects
Ethical approval was obtained from country-specific legal authorities for research on human beings. Written informed consent was obtained from all participants, from the (legal) representatives and whenever possible from the people with dementia themselves .
The RTPC consortium approved and released the data set.
Study sample description
A total of n = 1330 persons with dementia assessed their own QoL (see Table 1). The majority were female (65.9%) and lived at home (64.2%). Their mean age was 83.0 years. In the home care setting, the majority of those with dementia were married (49.3%), while the majority with dementia living in ILTC were widowed (61.7%).
For those with dementia living in ILTC, n = 476 professional carers assessed their QoL (see Table 2). The average age of the carers was 41.9 years. One third were registered nurses (33.3%) and more than half were trained nursing assistants (55.9%). The proxy ratings for people with dementia living at home were provided by n = 854 informal carers (see Table 2) with an average age of 64.8 years. The informal carers were predominantly female (69.2%), adult children (45.1%) and married (77.3%).
The overall QoL-AD assessment of those with dementia revealed a mean value of 33.2 out of 52 points (see Table 3). The average self-rating in the ILTC setting was 32.5 points and in the HC setting it was 33.5 points. The ratings at the item level were consistently within the medium range of response categories (2.1 to 3.1 points). The items ‘Memory’ (2.1) and ‘Ability to do chores around the house’ (2.2) showed the most negative self-ratings, while the item ‘Marriage’ (3.1) showed the most positive rating. There were only minor differences between self-ratings in ILTC and HC settings (≤ 0.3 points).
The overall proxy rating average was 29.8 points (see Table 3), the average in ILTC by the professional carers was 31.5 points and in the HC by the informal carers it was 28.8 points. The proxy rating on item level ranged from 1.5 to 2.9 points. The items ‘Living situation’ (2.9), ‘Family’ (2.9) and ‘Marriage’ (2.9) achieved the highest ratings, while the item ‘Memory’ (1.5) had the lowest score.
Inter-rater agreement at the total score
The Bland-Altman plot (see Fig. 1) showed a mean difference of the paired observations of − 3.4, i.e., the carers’ QoL ratings of those with dementia were lower than the self-ratings of those with dementia. The upper limit of agreement (LoA) shows a difference of 8.5 scale points and the lower LoA a difference of − 15.7 scale points. Both LoA by far exceed the acceptable deviation of +/− 3 scale points previously mentioned. No clear pattern in the differences of the paired observations can be identified in the plot.
Inter-rater agreement by comparison of the care settings
In ILTC, the mean difference between the paired observations was not relevant: professional carers rated QoL on average one point lower than those with dementia (see Table 3). In HC, a relevant mean difference of 4.7 points between the paired measurements was found, indicating that informal carers rated QoL substantially lower than those with dementia.
The LoA in ILTC (see Fig. 2) ranged between −13.4 and 11.4 scale points, far outside the acceptable deviation of +/− 3. The LoA in HC (see Fig. 3) ranged between −16.2 and 6.7 scale points, again far outside the acceptable deviation. No pattern in the differences of the paired observations could be identified in either plot.
Inter-rater agreement on item level
The weighted kappas for the single QoL-AD items (see Table 4) were below 0.20, which represents poor agreement according to Altman’s classification. Only the item ‘Marriage’ revealed fair agreement (weighted kappa: 0.26).
Regression models of factors associated with the level of agreement
The investigated predictor variables in the two regression models (see Table 5) accounted for 13% and 8%, respectively, of the variance in the differences between QoL-AD self- and proxy ratings (Model 1: adjusted R² = 0.13; Model 2: adjusted R² = 0.08).
Variables of people with dementia in model 1 and model 2
Of the predictor variables tested in Model 1, the care setting showed the strongest influence on the difference between the self-rating and the proxy rating: B = 3.65, 95% CI (2.88; 4.42), corresponding to a standardised partial regression coefficient of β = 0.58, 95% CI (0.46; 0.70). This estimate indicates a clinically relevant difference between ILTC and HC in the gap between QoL-AD self- and proxy ratings, as previously defined (a difference of more than three points in the total score of the QoL-AD, or more than half a standard deviation as expressed in z-scores).
BPSD of the person with dementia measured by the NPI-Q had the second strongest influence on the difference between the self-rating and the proxy rating in Model 1: B = 0.21, 95% CI (0.13; 0.29); β = 0.20, 95% CI (0.12; 0.28), and the strongest influence in Model 2: B = 0.30, 95% CI (0.15; 0.44); β = 0.29, 95% CI (0.15; 0.42). For both models, it was found that the higher the NPI-Q (more BPSD) the lower a proxy rated the QoL-AD, compared to a person with dementia.
The functional status measured by the KATZ ADL was negatively associated with the difference between the self- and the proxy ratings in both models. In Model 1 the influence of the functional status was statistically significant: B = −0.34, 95% CI (−0.55; −0.13); β = −0.10, 95% CI (−0.16; −0.04). In Model 2 the effect was smaller and did not reach statistical significance: B = −0.14, 95% CI (−0.40; 0.11); β = −0.04, 95% CI (−0.11; 0.03). Thus, the higher the KATZ ADL score (i.e., the less dependent the person with dementia), the higher the proxy rated the QoL-AD compared to the person with dementia.
No other tested variables of those with dementia in the two models revealed statistical significance.
Variables of informal carers in model 2
Two variables representing various aspects of caregiver burden revealed statistical significance, with opposite signs of the effect on the difference between self- and proxy ratings: for the ZBI, which represents general caregiver burden, B = 0.10, 95% CI (0.06; 0.14); β = 0.26, 95% CI (0.15; 0.36); and for the NPI-Q-D, which represents caregiver distress due to BPSD of the person with dementia, B = − 0.14, 95% CI (− 0.23; − 0.04); β = − 0.19, 95% CI (− 0.33; − 0.05). Thus, a higher ZBI (higher general burden) was associated with a lower proxy rating compared to the self-rating. A higher NPI-Q-D (higher caregiver distress related to BPSD) was associated with a higher proxy rating compared to the self-rating with the QoL-AD.
The aim of our study was to analyse the inter-rater agreement of the measures QoL-AD-SR and QoL-AD-PR, in terms of both the total score and the items, including a setting-specific analysis, and to identify factors associated with this agreement.
Our analysis suggests that proxies, i.e., informal carers and best-informed professional carers, on average rate the QoL of those with dementia lower than those with dementia do themselves. This is consistent with previous studies that compared the two QoL-AD measures at the group level [16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43].
We found no acceptable inter-rater agreement (predefined range: +/− 3 scale points) of the QoL-AD measures in the whole sample. Comparing both settings and the means of differences, professional carers in ILTC rated the QoL on average one point less than the people with dementia, the informal carers in HC rated on average 4.7 points less. Nonetheless, no acceptable inter-rater agreement could be demonstrated in either HC or in ILTC when considering the Bland-Altman plots.
In almost all former studies [19, 20, 22, 26, 29, 31, 33, 36, 37, 39,40,41] the level of agreement at the QoL-AD total score was estimated using correlation analyses, which are not appropriate for agreement assessment. Bland-Altman plots were used in only two studies [25, 28]. Bosboom et al. concluded that the agreement between self-rating and rating from the proxy-proxy perspective and from the person-proxy perspective is acceptable. However, this conclusion appears to rest on a misunderstanding of the Bland-Altman method, and in particular of the LoA. The authors argued that only 2.5% (self-rating vs. proxy-proxy rating) and 5% (self-rating vs. person-proxy rating) of the observations in their Bland-Altman plots were beyond the LoA. However, LoA are derived from the distribution of the differences between paired observations and, by construction, include approximately 95% of these observations. This holds true for all Bland-Altman plots and LoA, independent of the actual agreement. For a valid conclusion, a predefined acceptable range for the differences would have been necessary. The actual LoA were −8 to 6 points (self-rating vs. proxy-proxy rating) and −9.5 to 15 points (self-rating vs. person-proxy rating). These LoA are far beyond an acceptable range. Zhao et al. considered inter-rater agreement for the total score as fair (ICC = 0.58). They did not discuss their Bland-Altman plot, which did not even show LoA. Using the data given by Zhao et al., the LoA range from −7.3 to 13.9, indicating an unacceptable range of agreement.
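That LoA enclose roughly 95% of the differences whatever the level of agreement can be checked with a small simulation in Python. The sketch below assumes normally distributed differences; the means, standard deviations, sample size and seed are invented for illustration and do not reproduce any of the cited data sets.

```python
import random
from statistics import mean, stdev


def share_within_loa(diffs):
    """Return the share of differences inside the 95% LoA, and the LoA."""
    m, s = mean(diffs), stdev(diffs)
    lower, upper = m - 1.96 * s, m + 1.96 * s
    inside = sum(lower <= d <= upper for d in diffs)
    return inside / len(diffs), lower, upper


rng = random.Random(2018)  # fixed seed so the sketch is reproducible
# Simulated differences: one set with close agreement, one with poor agreement.
close = [rng.gauss(0.0, 1.0) for _ in range(5000)]
poor = [rng.gauss(-3.0, 6.0) for _ in range(5000)]

share_close, lower_close, upper_close = share_within_loa(close)
share_poor, lower_poor, upper_poor = share_within_loa(poor)

print(f"close: {share_close:.1%} inside LoA {lower_close:.1f} to {upper_close:.1f}")
print(f"poor:  {share_poor:.1%} inside LoA {lower_poor:.1f} to {upper_poor:.1f}")
```

Both simulated data sets have about 95% of their differences inside their own LoA, but only the first has LoA that fit inside a range of +/− 3 points. The share of points outside the LoA therefore cannot serve as evidence of agreement.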
Therefore, the QoL-AD proxy rating is not directly interchangeable with and cannot replace the QoL-AD self-rating of a person with dementia.
At the item level, we did not find any agreement better than poor except for the item ‘Marriage’, which achieved a fair agreement. The agreement on item level is comparable to the results of the studies by Conde-Sala et al. , Hoe et al.  and León-Salas et al. , which demonstrate mostly poor agreement on the item level. Chan et al.  and Wolak et al.  had a more positive conclusion, stating that agreement on the item level was largely good. Both studies inappropriately used correlation analysis for agreement rating.
We initially identified several factors that were hypothesised to be associated with the level of agreement and fitted two models. However, no factor except the care setting (HC vs. ILTC) had a clinically relevant influence on the difference between self- and proxy ratings.
The most frequently mentioned associated factor of those with dementia was BPSD [18, 20, 25, 28, 36]. In both fitted models we were able to confirm the strongest influence of all continuous variables for BPSD measured with the NPI-Q. Our results are in line with studies [18, 20, 25, 28, 36] that showed similar effect sizes and direction of effect.
The influence of the functional status or independence in activities of daily living is demonstrated in our analyses of Model 1. The level of agreement changed in such a way that the higher the dependence of the person with dementia (lower KATZ ADL score) the lower the QoL-AD rating of the proxy relative to that of the person with dementia. In Model 2, including only people with dementia living in HC, the effect was in the same direction but had a low magnitude and did not reach statistical significance. This confirms the results of Zhao et al.  and Zucchella et al. . A possible explanation for the varying ratings could be the “disability paradox” introduced by Albrecht and Devlieger . It means that many people with severe disabilities report a good QoL, although for external observers these people do not seem to be in good health. Another phenomenon in this context is the concept of response shift, defined as changes in the meaning of one’s self-evaluation of a target construct resulting from changes in internal standards, values, or conceptualisation . Thus, to underestimate a person’s QoL compared to his or her own rating might also be a result of response shift.
Huang et al. showed that living in HC led to a lower difference of self- and proxy ratings compared to ILTC  (i.e., the proxy rating in HC was not as low as the self-rating in HC compared to ILTC ratings). This is the opposite of our results, but the ILTC proxy raters in the study by Huang et al. were also relatives, i.e., informal carers. In our analysis, we are not able to distinguish setting and proxies (staff ratings in ILTC and informal carer ratings in HC). Therefore, we cannot draw any conclusions as to whether the discrepancy results from differences according to the setting or from the proxy raters.
Unlike Bosboom et al. , Conde-Sala et al. [16, 19], Huang et al.  and Zhao et al. , we did not find any influence of the severity of cognitive impairment or depressive symptoms [18,19,20]. The level of education of the person with dementia, as stated by Huang et al.  and Tay et al. , could not explain the differences between measurements.
Caregiver burden has most often been identified in previous studies as an influencing factor of carers, and our analyses showed similar effects. Both measurements for caregiver burden (ZBI and NPI-Q-D) showed positive gradients on the difference in single regression models and were highly correlated with each other (results not shown). The estimator for NPI-Q-D changed sign when both measurements were combined in our predefined Model 2, most likely because of multicollinearity. The standardised effect sizes of both estimators did not reach a relevant level and are similar to previous results [18, 22, 28, 36]. The results of the studies by Huang et al. and Schulz et al. on the influence of the QoL of carers could not be confirmed in our analysis. No influence of the informal carer’s relationship to the person with dementia, as stated by Huang et al., was shown. Finally, our analysis found no differences in terms of gender of the informal carer, unlike the findings of Conde-Sala et al.
Overall, our fitted regression models covered only 8 and 13% of the observed variance of the difference between self- and proxy ratings. Therefore, it can be assumed that there are further unknown factors influencing the difference between self- and proxy ratings, which were not addressed by the data or our analysis.
Strengths and limitations
The RTPC data for secondary data analysis provided us with access to a large European sample comprising QoL-AD self-ratings and proxy ratings.
However, setting-specific results should be interpreted with caution since ILTC settings might differ across countries. To ensure a widely representative sample 256 ILTC locations were included in the RTPC study; for the secondary analysis 476 professional carers rated the QoL.
Cross-cultural comparisons could not be conducted since the national sample sizes varied. However, we tested the assessment of the agreement between self- and proxy ratings of the QoL-AD total score on the German subsample (ILTC: n = 64; HC: n = 67). Again, no clear pattern of difference of the paired observations could be identified in the Bland-Altman plot.
Conclusions
Our analysis showed that professional and informal carers appear to give lower proxy ratings of QoL than people with dementia give themselves. The assessment of the inter-rater agreement between the two measurement methods, QoL-AD-SR and QoL-AD-PR, revealed pronounced differences. Nevertheless, the QoL-AD benefits from good psychometric properties and can be used for self-rating by people across a wide range of dementia severity.
Our data indicate that the QoL-AD self- and proxy ratings are not directly interchangeable because of the inter-rater gap. Thus, the QoL-AD-PR provides a complementary perspective rather than a substitute for self-rating. In particular, pooling self-ratings and proxy ratings may introduce bias. From a clinical point of view, our study suggests that either only one rating method should be used, or both methods should be applied in parallel and analysed separately for self- and proxy ratings. Self-ratings should be used whenever possible. It is also necessary to report transparently who completed the QoL-AD.
An implication for future research would be to compare the QoL-AD-SR with the corresponding ratings of several proxy groups, such as informal carers and professional carers. This would allow comparison of the levels of inter-rater agreement between the person with dementia and various proxies, as well as the agreement among the proxies. In this way, the impact of the setting on the level of agreement could be taken into account more specifically.
Abbreviations
- BPSD: Behavioural and psychological symptoms of dementia
- CRA: Caregiver Reaction Assessment
- CSDD: Cornell Scale for Depression in Dementia
- EQ-5D VAS: European Quality of Life-Visual Analog Scale
- EQ-5D-3L: European Quality of Life Scale
- GHQ: General Health Questionnaire
- HC: Formal home care
- ICC: Intraclass correlation coefficient
- ILTC: Institutional long-term nursing care
- KATZ ADL: Katz Index of Independence in Activities of Daily Living
- LoA: Limits of agreement
- NPI-Q: Neuropsychiatric Inventory Questionnaire
- NPI-Q-D: Neuropsychiatric Inventory Questionnaire-Caregiver Distress
- QoL: Quality of life
- QoL-AD: Quality of Life in Alzheimer’s Disease scale
- QoL-AD-PR: Quality of Life in Alzheimer’s Disease Proxy Rating scale
- QoL-AD-SR: Quality of Life in Alzheimer’s Disease Self-Rating scale
- RTPC: The European RightTimePlaceCare study
- S-MMSE: Standardized Mini-Mental State Examination
- ZBI: Zarit Burden Index
National Collaborating Centre for Mental Health, National Institute for Health and Clinical Excellence. Dementia: a NICE–SCIE guideline on supporting people with dementia and their Carers in health and social care. National Clinical Practice Guideline Number 42. Leicester: The British Psychological Society & The Royal College of Psychiatrists; 2016.
Moniz-Cook E, Vernooij-Dassen M, Woods R, Verhey F, Chattat R, De Vugt M, Mountain G, O'Connell M, Harrison J, Vasse E, Dröes RM, Orrell M. INTERDEM group. A European consensus on outcome measures for psychosocial intervention research in dementia care. Aging Ment Health. 2008;12:14–29.
Dichter MN, Schwab CG, Meyer G, Bartholomeyczik S, Halek M. Linguistic validation and reliability properties are weakly investigated of most dementia-specific quality of life measurements: a systematic review. J Clin Epidemiol. 2016;70:233–45.
Ettema TP, Dröes R-M, de Lange J, Ooms ME, Mellenbergh GJ, Ribbe MW. The concept of quality of life in dementia in the different stages of the disease. Int Psychogeriatr. 2005;17:353–70.
Lawton MP. Assessing quality of life in Alzheimer disease research. Alzheimer Dis Assoc Disord. 1997;11:91–9.
Lawton MP. A multidimensional view of quality of life in frail elders. In: Birren JE, Lubben JE, Rowe JC, Deutchman DE, editors. The concept and measurement of quality of life in the frail elderly. San Diego: Academic Press; 1991. p. 3–27.
Lawton MP. Environment and other determinants of well-being in older people. Gerontologist. 1983;23:349–57.
Logsdon RG, Gibbons LE, McCurry SM, Teri L. Assessing quality of life in older adults with cognitive impairment. Psychosom Med. 2002;64:510–9.
Whitehouse PJ, Sami SA. Quality of life in dementias. In: Duyckaerts C, Litvan I, editors. Dementias. Amsterdam, New York: Elsevier; 2008. p. 97–100. https://doi.org/10.1016/S0072-9752(07)01208-0.
Banerjee S, Samsi K, Petrie CD, Alvir J, Treglia M, Schwam EM, del Valle M. What do we know about quality of life in dementia? A review of the emerging evidence on the predictive and explanatory value of disease specific measures of health related quality of life in people with dementia. Int J Geriatr Psychiatry. 2009;24:15–24.
Rabins PV, Black BS. Measuring quality of life in dementia: purposes, goals, challenges and progress. Int Psychogeriatr. 2007;19:401–7.
Thorgrimsen L, Selwood A, Spector A, Royan L, de Madariaga LM, Woods RT, Orrell M. Whose quality of life is it anyway? The validity and reliability of the quality of life-Alzheimer's disease (QoL-AD) scale. Alzheimer Dis Assoc Disord. 2003;17:201–8.
Bowling A, Rowe G, Adams S, Sands P, Samsi K, Crane M, et al. Quality of life in dementia: a systematically conducted narrative review of dementia-specific measurement scales. Aging Ment Health. 2015;19:13–31.
Logsdon RG, Gibbons LE, McCurry SM, Teri L. Quality of life in Alzheimer's disease: patient and caregiver reports. J Ment Health Aging. 1999;5:21–32.
Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6:e1000097.
Conde-Sala JL, Turró-Garriga O, Piñán-Hernández S, Portellano-Ortiz C, Viñas-Diez V, Gascón-Bayarri J, Reñé-Ramírez R. Effects of anosognosia and neuropsychiatric symptoms on the quality of life of patients with Alzheimer's disease: a 24-month follow-up study. Int J Geriatr Psychiatry. 2016;31:109–19.
Huang H-L, Weng L-C, Tsai Y-H, Chiu Y-C, Chen K-H, Huang C-C, et al. Predictors of self- and caregiver-rated quality of life for people with dementia living in the community and in nursing homes in northern Taiwan. Int Psychogeriatr. 2015;27:825–36.
Zucchella C, Bartolo M, Bernini S, Picascia M, Sinforiani E. Quality of life in Alzheimer disease: a comparison of patients’ and caregivers’ points of view. Alzheimer Dis Assoc Disord. 2015;29:50–4.
Conde-Sala JL, Reñé-Ramírez R, Turró-Garriga O, Gascón-Bayarri J, Campdelacreu-Fumadó J, Juncadella-Puig M, et al. Severity of dementia, anosognosia, and depression in relation to the quality of life of patients with Alzheimer disease: discrepancies between patients and caregivers. Am J Geriatr Psychiatry. 2014;22:138–47.
Tay L, Chua KC, Chan M, Lim WS, Ang YY, Koh E, Chong MS. Differential perceptions of quality of life (QoL) in community-dwelling persons with mild-to-moderate dementia. Int Psychogeriatr. 2014;26:1273–82.
Conde-Sala JL, Reñé-Ramírez R, Turró-Garriga O, Gascón-Bayarri J, Juncadella-Puig M, Moreno-Cordón L, et al. Clinical differences in patients with Alzheimer's disease according to the presence or absence of anosognosia: implications for perceived quality of life. J Alzheimers Dis. 2013;33:1105–16.
Schulz R, Cook TB, Beach SR, Lingler JH, Martire LM, Monin JK, Czaja SJ. Magnitude and causes of bias among family caregivers rating Alzheimer disease patients. Am J Geriatr Psychiatry. 2013;21:14–25.
Sousa MFB, Santos RL, Arcoverde C, Simoes P, Belfort T, Adler I, et al. Quality of life in dementia: the role of non-cognitive factors in the ratings of people with dementia and family caregivers. Int Psychogeriatr. 2013;25:1097–105.
Yeaman PA, Kim D-Y, Alexander JL, Ewing H, Kim KY. Relationship of physical and functional independence and perceived quality of life of veteran patients with Alzheimer disease. Am J Hosp Palliat Care. 2013;30:462–6.
Bosboom PR, Alfonso H, Eaton J, Almeida OP. Quality of life in Alzheimer's disease: different factors associated with complementary ratings by patients and family carers. Int Psychogeriatr. 2012;24:708–21.
Gómez-Gallego M, Gómez-Amor J, Gómez-García J. Determinants of quality of life in Alzheimer's disease: perspective of patients, informal caregivers, and professional caregivers. Int Psychogeriatr. 2012;24:1805–15.
Gräske J, Fischer T, Kuhlmey A, Wolf-Ostermann K. Quality of life in dementia care - differences in quality of life measurements performed by residents with dementia and by nursing staff. Aging Ment Health. 2012;16:819–27.
Zhao H, Novella J-L, Dramé M, Mahmoudi R, Barbe C, Di Pollina L, et al. Factors associated with caregivers’ underestimation of quality of life in patients with Alzheimer's disease. Dement Geriatr Cogn Disord. 2012;33:11–7.
Chan IW-P, Chu L-W, Lee PWH, Li S-W, Yu K-K. Effects of cognitive function and depressive mood on the quality of life in Chinese Alzheimer's disease patients in Hong Kong. Geriatr Gerontol Int. 2011;11:69–76.
Karttunen K, Karppi P, Hiltunen A, Vanhanen M, Välimäki T, Martikainen J, et al. Neuropsychiatric symptoms and quality of life in patients with very mild and mild Alzheimer's disease. Int J Geriatr Psychiatry. 2011;26:473–82.
León-Salas B, Logsdon RG, Olazarán J, Martinez-Martin P. The MSU-ADRU. Psychometric properties of the Spanish QoL-AD with institutionalized dementia patients and their family caregivers in Spain. Aging Ment Health. 2011;15:775–83.
Nakanishi K, Hanihara T, Mutai H, Nakaaki S. Evaluating the quality of life of people with dementia in residential care facilities. Dement Geriatr Cogn Disord. 2011;32:39–44.
Conde-Sala JL, Garre-Olmo J, Turró-Garriga O, Vilalta-Franch J, López-Pousa S. Quality of life of patients with Alzheimer’s disease: differential perceptions between spouse and adult child caregivers. Dement Geriatr Cogn Disord. 2010;29:97–108.
Novelli MM, Nitrini R, Caramelli P. Validation of the Brazilian version of the quality of life scale for patients with Alzheimer's disease and their caregivers (QOL-AD). Aging Ment Health. 2010;14:624–31.
Conde-Sala JL, Garre-Olmo J, Turró-Garriga O, López-Pousa S, Vilalta-Franch J. Factors related to perceived quality of life in patients with Alzheimer's disease: the patient's perception compared with that of caregivers. Int J Geriatr Psychiatry. 2009;24:585–94.
Huang H-L, Chang MY, Tang JS-H, Chiu Y-C, Weng L-C. Determinants of the discrepancy in patient- and caregiver-rated quality of life for persons with dementia. J Clin Nurs. 2009;18:3107–18.
Wolak A, Novella J-L, Drame M, Guillemin F, Di Pollina L, Ankri J, et al. Transcultural adaptation and psychometric validation of a French-language version of the QoL-AD. Aging Ment Health. 2009;13:593–600.
Hurt C, Bhattacharyya S, Burns A, Camus V, Liperoti R, Marriott A, et al. Patient and caregiver perspectives of quality of life in dementia. An investigation of the relationship to behavioural and psychological symptoms in dementia. Dement Geriatr Cogn Disord. 2008;26:138–46.
Yap PLK, Goh JYN, Henderson LM, Han PM, Ong KS, Kwek SSL, et al. How do Chinese patients with dementia rate their own quality of life? Int Psychogeriatr. 2008;20:482–93.
Hoe J, Hancock G, Livingston G, Orrell M. Quality of life of people with dementia in residential care homes. Br J Psychiatry. 2006;188:460–4.
Spector A, Orrell M. Quality of life (QoL) in dementia: a comparison of the perceptions of people with dementia and care staff in residential homes. Alzheimer Dis Assoc Disord. 2006;20:160–5.
Beerens HC, Sutcliffe C, Renom-Guiteras A, Soto ME, Suhonen R, Zabalegui A, et al. Quality of life and quality of care for people with dementia receiving long term institutional care or professional home care: the European RightTimePlaceCare study. J Am Med Dir Assoc. 2014;15:54–61.
Beerens HC, Zwakhalen SMG, Verbeek H, Ruwaard D, Ambergen AW, Leino-Kilpi H, et al. Change in quality of life of people with dementia recently admitted to long-term care facilities. J Adv Nurs. 2015;71:1435–47.
Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;327:307–10.
Altman DG. Assessing new methods of clinical measurement. Br J Gen Pract. 2009;59:399–400.
Verbeek H, Meyer G, Leino-Kilpi H, Zabalegui A, Hallberg IR, Saks K, et al. A European study investigating patterns of transition from home care towards institutional dementia care: the protocol of a RightTimePlaceCare study. BMC Public Health. 2012;12:68.
Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”: a practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12:189–98.
RightTimePlaceCare. Improving Health Services for European Citizens with Dementia. Manual Data Collection: Version 3.0. Maastricht. 2010. p. 89.
Cummings JL, Mega M, Gray K, Rosenberg-Thompson S, Carusi DA, Gornbein J. The neuropsychiatric inventory: comprehensive assessment of psychopathology in dementia. Neurology. 1994;44:2308–14.
Kaufer DI, Cummings JL, Ketchel P, Smith V, MacMillan A, Shelley T, et al. Validation of the NPI-Q, a brief clinical form of the neuropsychiatric inventory. J Neuropsychiatry Clin Neurosci. 2000;12:233–9.
Molloy DW, Alemayehu E, Roberts R. Reliability of a standardized mini-mental state examination compared with the traditional mini-mental state examination. Am J Psychiatry. 1991;148:102–5.
Perneczky R, Wagenpfeil S, Komossa K, Grimmer T, Diehl J, Kurz A. Mapping scores onto stages: mini-mental state examination and clinical dementia rating. Am J Geriatr Psychiatry. 2006;14:139–44.
Alexopoulos GS, Abrams RC, Young RC, Shamoian CA. Cornell scale for depression in dementia. Biol Psychiatry. 1988;23:271–84.
Katz S. Assessing self-maintenance: activities of daily living, mobility, and instrumental activities of daily living. J Am Geriatr Soc. 1983;31:721–7.
Zarit SH, Reever KE, Bach-Peterson J. Relatives of the impaired elderly: correlates of feelings of burden. Gerontologist. 1980;20:649–55.
Given CW, Given B, Stommel M, Collins C, King S, Franklin S. The caregiver reaction assessment (CRA) for caregivers to persons with chronic physical and mental impairments. Res Nurs Health. 1992;15:271–83.
Brazier J, Jones N, Kind P. Testing the validity of the EuroQol and comparing it with the SF-36 health survey questionnaire. Qual Life Res. 1993;2:169–80.
The EuroQol Group. EuroQol - a new facility for the measurement of health-related quality of life. Health Policy. 1990;16:199–208.
Goldberg DP, Gater R, Sartorius N, Ustun TB, Piccinelli M, Gureje O, Rutter C. The validity of two versions of the GHQ in the WHO study of mental illness in general health care. Psychol Med. 1997;27:191–7.
Kwiecien R, Kopp-Schneider A, Blettner M. Concordance analysis: part 16 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2011;108:515–21.
Bland JM, Altman DG. A note on the use of the intraclass correlation coefficient in the evaluation of agreement between two methods of measurement. Comput Biol Med. 1990;20:337–40.
Norman GR, Sloan JA, Wyrwich KW. Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care. 2003;41:582–92.
Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70:213–20.
Fleiss JL, Cohen J, Everitt BS. Large sample standard errors of kappa and weighted kappa. Psychol Bull. 1969;72:323–7.
Altman DG. Practical statistics for medical research. London: Chapman & Hall; 1991.
Albrecht GL, Devlieger PJ. The disability paradox: high quality of life against all odds. Soc Sci Med. 1999;48:977–88.
Sprangers MA, Schwartz CE. Integrating response shift into health-related quality of life research: a theoretical model. Soc Sci Med. 1999;48:1507–15.
The RightTimePlaceCare (RTPC) study was funded by the European Commission within the Seventh Framework Program (project 242153). The secondary data analysis of QoL-AD data was not funded.
We acknowledge the financial support of the Open Access Publication Fund of the Martin-Luther-University Halle-Wittenberg.
Availability of data and materials
The datasets analysed during the current study are available from the RightTimePlaceCare (RTPC) consortium on reasonable request.
Ethics approval and consent to participate
The RightTimePlaceCare (RTPC) consortium approved and released the data set. Ethical approval was obtained from country-specific legal authorities for research on human beings.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Römhild, J., Fleischer, S., Meyer, G. et al. Inter-rater agreement of the Quality of Life-Alzheimer’s Disease (QoL-AD) self-rating and proxy rating scale: secondary analysis of RightTimePlaceCare data. Health Qual Life Outcomes 16, 131 (2018). https://doi.org/10.1186/s12955-018-0959-y
- Quality of life
- Inter-rater agreement