Convergent validity of EQ-5D with core outcomes in dementia: a systematic review
Health and Quality of Life Outcomes volume 20, Article number: 152 (2022)
To explore, through a systematic review, the convergent validity of EQ-5D (EQ-5D-3L and EQ-5D-5L; total score and dimensions) with core outcomes in dementia, and to investigate how this may be affected by rater type, with the aim of informing researchers choosing measures for use in dementia trials.
To identify articles relevant to the convergent validity of EQ-5D with core dementia outcomes, three databases were electronically searched to September 2022. Studies were considered eligible for inclusion within the review if they included individual level data from people with dementia of any type, collected self and/or proxy reported EQ-5D and collected at least one core dementia outcome measure. Relevant data such as study sample size, stage of dementia and administration of EQ-5D were extracted, and a narrative synthesis was adopted.
The search strategy retrieved 271 unique records, of which 30 met the inclusion criteria for the review. Twelve different core outcome measures were used across the studies to capture the dementia outcomes of cognition, function, and behaviour/mood. Most studies used EQ-5D-3L (n = 27). Evidence related to the relationship between EQ-5D and measures of function and behaviour/mood was the most robust, with unanimous directions of associations, and more statistically significant findings. EQ-5D dimensions exhibited associations with corresponding clinical outcomes, whereby relationships were stronger with proxy-EQ-5D than with self-report.
Measuring health-related quality of life in dementia populations is a complex issue, particularly when balancing the challenges associated with both self and proxy report. Published evidence indicates that EQ-5D shows convergent validity with the key dementia outcomes, and therefore captures these relevant dementia outcomes. The degree of association with clinical measures was stronger when considering proxy-reported EQ-5D and differed by EQ-5D dimension type. This review has revealed that, despite the limited targeted psychometric evidence pool and reliance on clinical and observational studies, EQ-5D exhibits convergent validity with other dementia outcome measures.
Dementia is a neurodegenerative condition which mostly affects older adults and is typically characterised by cognitive symptoms such as memory loss and speech and language impairments, but it also impacts behaviour, function, and general quality of life (QoL). As the number of people living with dementia is increasing, dementia is one of the largest current health and social care challenges facing policymakers, and evidence regarding the cost-effectiveness of different dementia interventions is fundamental in shaping policy decisions. In the UK, the National Institute for Health and Care Excellence (NICE) recommends the use of EQ-5D to measure health benefit in cost-effectiveness analyses. EQ-5D has five dimensions: mobility; self-care; usual activities; pain/discomfort; and anxiety/depression. The EQ-5D-3L has three response categories for each dimension (no problems, some problems, extreme problems), whereas the newer EQ-5D-5L has five response categories (no problems; slight problems; some problems; severe problems; extreme problems/unable to do). NICE guidelines recommend EQ-5D to measure health-related quality of life (HRQoL) as it is a generic measure that enables comparability across disease areas [4, 5].
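The dimension and level structure described above can be made concrete with a minimal sketch. It assumes the common convention of writing an EQ-5D response as a five-digit profile, one digit per dimension in the order listed (for example "11223"); that convention is not stated in this review, and the function names `parse_profile` and `n_health_states` are hypothetical helpers introduced only for illustration.

```python
# Dimension order as listed in the text above.
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)


def parse_profile(profile: str, levels: int) -> dict:
    """Split a five-digit profile string into one level per dimension.

    `levels` is 3 for EQ-5D-3L and 5 for EQ-5D-5L. Level 1 means
    "no problems"; the highest level is the most severe category.
    """
    if levels not in (3, 5):
        raise ValueError("levels must be 3 (EQ-5D-3L) or 5 (EQ-5D-5L)")
    if len(profile) != len(DIMENSIONS) or not profile.isdigit():
        raise ValueError("profile must be exactly five digits")
    values = [int(ch) for ch in profile]
    if any(v < 1 or v > levels for v in values):
        raise ValueError(f"each level must be between 1 and {levels}")
    return dict(zip(DIMENSIONS, values))


def n_health_states(levels: int) -> int:
    """Number of distinct profiles: one of `levels` responses per dimension."""
    return levels ** len(DIMENSIONS)


if __name__ == "__main__":
    print(parse_profile("11223", levels=3))
    print(n_health_states(3), n_health_states(5))
```

Under these assumptions the three-level version describes 3^5 = 243 profiles and the five-level version 5^5 = 3125. Converting a profile to an index score requires a country-specific value set, which is deliberately left out of this sketch.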
For decision-making to be optimal, the outcome measure used to measure HRQoL should be responsive and representative of the population in question, and the greatest comparability is produced when the same measure is used across studies. Previous reviews concluded that EQ-5D demonstrated strong acceptability and feasibility within this population due to its concise nature, and that the three-level version displays good overall psychometric properties in dementia research settings [7, 8]. Given that EQ-5D is widely used, is acceptable in dementia populations and is recommended by NICE guidelines, this systematic review will focus on the EQ-5D measure. There are additional measures that are considered core for collection in dementia studies as they distinctly measure the key dementia outcomes of cognition, function, and behaviour/mood, and therefore together reflect dementia experience (see Table 1). Evidence regarding convergent validity of EQ-5D with these core outcome measures would address the question of how well EQ-5D captures dementia experience and therefore dementia-HRQoL.
Two recent systematic reviews that explored HRQoL in dementia both concluded that EQ-5D was the most appropriate utility instrument for use in dementia populations [7, 8]. Although these reviews highlight the value of EQ-5D in dementia research, both broadly investigated psychometric properties, as opposed to specifically focusing on convergent validity with dementia clinical trial outcomes, which is an important consideration as EQ-5D is increasingly used in dementia populations. A recent review (2022) broadly assessed the psychometric performance of EQ-5D in dementia populations; however, it focused solely on EQ-5D-5L. An earlier review (2011) directly explored EQ-5D-3L as a QoL measure in people with dementia (PwD), investigating various psychometric properties including feasibility, reliability, responsiveness and validity. However, the authors highlighted a key recurring theme in the lack of association between self-rated and proxy-rated EQ-5D, indicating problems with inter-rater reliability.
Proxy-report is when someone is asked to report on behalf of someone else, typically a family member, caregiver, or healthcare professional. Proxy-reports are particularly important in dementia research as there are specific challenges associated with collecting outcomes within a population of deteriorating cognition, including impaired recall and judgement. It is established within the literature that HRQoL reports made by PwD and proxies do not align, with self-reports often reflecting more optimistic responses [2, 10, 12,13,14]. This divergence may be more pronounced for some aspects of HRQoL than others. Certain dimensions of HRQoL, i.e., anxiety/depression and pain/discomfort in EQ-5D, may be “less/non-observable” and therefore more difficult to proxy-report. On the other hand, the mobility, self-care, and usual activities dimensions in EQ-5D are more “observable” and are therefore considered to be less subjective [10, 13]. This issue is particularly important in dementia, since people with more severe cognitive impairment may not be able to reliably self-report their HRQoL. Therefore, whilst patient self-report is usually considered preferable, this may not be feasible for all PwD.
In light of the challenges surrounding the use of self and proxy-rated HRQoL in dementia, there is a need to develop ways of overcoming these issues to ensure accurate and reliable analyses. It is important to retain the patient as the focus; therefore, self-reports are considered the default. However, to understand when it is better to use proxy-reports, dementia severity and dimension-specific data should be explored. There is thus a gap in the research: the convergent validity of EQ-5D against core outcome measures in dementia has not been investigated for both self and proxy-reports. Although there are previous systematic reviews exploring the psychometric properties of EQ-5D [7,8,9,10], these do not specifically focus on convergent validity and hence the level of detail provided on convergent validity is limited. Convergent validity is an important property that, when explored in detail, may benefit researchers choosing an HRQoL measure to use in trials, taking into consideration other instruments used in dementia studies. This review therefore focuses specifically on the convergent validity of EQ-5D. This research adds to the existing psychometric literature on EQ-5D in dementia populations, and helps to address the question of how well EQ-5D captures dementia HRQoL in light of its widespread use. To this end, a systematic literature review of the existing evidence was conducted, which to our knowledge is the first of its kind.
Systematic review aim
The aim of this systematic review was to assess the convergent validity of the five EQ-5D dimensions (both 3L and 5L versions) with pre-defined core outcomes in dementia, taking into consideration the potential impacts of rater type (self vs. proxy EQ-5D reports).
The systematic review adopted the methodology outlined by the Centre for Reviews and Dissemination (CRD). The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidance was followed for reporting the results, and a narrative approach was adopted for reporting the main analysis.
Convergent validity describes the strength of association between the measure of interest (in this case EQ-5D) and other measures, assessed via the statistical significance of regression analyses or via correlation coefficients. If the correlation (kappa) with a measure capturing the same construct is > 0.4, it is considered moderate and convergent validity is established (> 0.2: slight; > 0.6: good; > 0.8: very high).
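The thresholds above can be expressed as a simple lookup. The following is a minimal sketch rather than the procedure used in this review: the function names are hypothetical, taking the absolute value of the coefficient (so that direction and strength are judged separately) is an assumption, and the label "below slight" for coefficients of 0.2 or less is introduced here because the text does not name that band.

```python
def convergent_validity_band(coefficient: float) -> str:
    """Map a correlation coefficient to the qualitative bands quoted above."""
    # Assumption: the sign only conveys direction, so strength uses |r|.
    strength = abs(coefficient)
    if strength > 0.8:
        return "very high"
    if strength > 0.6:
        return "good"
    if strength > 0.4:
        return "moderate"
    if strength > 0.2:
        return "slight"
    # Assumption: the source does not label coefficients of 0.2 or less.
    return "below slight"


def convergent_validity_established(coefficient: float) -> bool:
    """Convergent validity is treated as established above 0.4."""
    return abs(coefficient) > 0.4


if __name__ == "__main__":
    for r in (0.15, 0.35, -0.45, 0.65, 0.85):
        print(r, convergent_validity_band(r), convergent_validity_established(r))
```

Because the cut-offs are strict inequalities, a coefficient of exactly 0.4 falls in the "slight" band under this reading.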
Core dementia outcome measures
Although cognition is the hallmark feature of dementia, the original description and diagnosis also include functional and behavioural deficits, whereby dementia severity and progression are assessed by changes in one or more of these outcome areas. To explore current practice for outcome collection in dementia clinical trials, various resources were searched and appraised via a previous review conducted as part of a wider research project (see Additional file 1 for further details). Table 1 below provides a summary of the key recurring outcome areas and measurement instruments identified as recommended, and considered as core for collection in dementia studies and trials.
The literature search was conducted in three electronic databases (Medline, PsycINFO and CINAHL) by one author (HH), initially in April 2021 from database inception, and was later re-run and updated to September 2022. Search terms included those related to dementia, EQ-5D and core measure names. The full search strategy is provided in Additional file 2. Title and abstract screening were conducted by one author (HH) and verified independently by another author (AK) to check for discrepancies and establish study inclusion. Included studies were then screened in full text by one author (HH) against pre-defined inclusion and exclusion criteria to determine eligibility for the review. If at this point a text did not include extractable data, it was excluded from the review. Any discrepancies were discussed between the authors and resolved prior to data extraction.
Inclusion/exclusion criteria
Studies were considered eligible for inclusion within the review if they included individual level data from people with dementia of any type (as opposed to general ageing or mild cognitive impairment alone), they collected self and/or proxy reported EQ-5D-3L or EQ-5D-5L and they collected at least one of the predefined core dementia outcome measures (see Table 1). Studies reporting only EQ VAS were excluded as the purpose of the review is to focus on measures for use in economic evaluation, and EQ VAS cannot be used for this purpose. Studies collecting outcomes related to caregivers alone were excluded. To allow for diversity of the study types, all study designs were considered eligible for the review (e.g., observational studies and randomized controlled trials (RCT)) but were limited to those published in English (see Additional file 3). Protocol papers, feasibility studies, conference abstracts, grey literature and previous systematic reviews were excluded, but were chain searched for additional eligible references.
Data on the study characteristics, settings, aspects of dementia (i.e., type and stage) and study objectives were synthesised using pre-defined extraction tables. Convergent validity of EQ-5D was captured against any additional reported core dementia outcome measures, as outlined in Table 1. Statistical data on relationships between the outcomes and EQ-5D index scores as well as with the EQ-5D dimensions were extracted. EQ VAS data were not extracted. A narrative synthesis was used to interpret the extracted data.
The standardised GRADE assessment tool was adapted and used to assess the quality of the papers included in the review. Although this method is less formal than using a pre-existing quality appraisal tool, it was deemed the most appropriate as the review allowed for the inclusion of all study types. The quality appraisal included nine items regarding the study’s population, sample, outcome assessment, analysis, and data, resulting in a score of either high, medium, or low quality (see Additional file 4 for full details).
The outcome of the literature search and screening is shown in Fig. 1. The initial search strategy retrieved 282 records. Following the removal of duplicates there were 236 records remaining, for which the titles and abstracts were screened. After re-running and updating the search, 30 articles met the inclusion criteria for the review.
Table 2 provides a summary of the characteristics of the selected studies. The 30 included studies were predominantly from European countries (n = 15) [19,20,21,22,23,24,25,26,27,28,29,30,31,32,33] and the UK (n = 7) [34,35,36,37,38,39,40]. A total of 17 papers had been published since 2012 (the last ten years) [20, 22, 23, 25,26,27,28, 31, 33, 35,36,37,38,39,40,41,42,43], and the most recent study included was from 2022. Less than a fifth of the studies used data from RCTs (n = 5) [20, 21, 29, 36, 37]. Alzheimer’s disease (AD) was the most commonly included type of dementia, and it was the sole subtype evaluated in over a third of the studies (n = 11) [20, 25, 27, 28, 32, 39, 43,44,45,46,47]; four studies considered AD in combination with dementia with Lewy Bodies (DLB) [21, 31], vascular dementia (VD) or mixed dementia. Seven of the studies considered any type of dementia [19, 24, 33, 35, 37, 38, 40], and eight studies did not specify the type of dementia under investigation [22, 23, 26, 29, 36, 41, 42, 48].
The studies included samples across all stages of dementia, from very mild to severe. Most of the studies defined dementia severity via MMSE scores; however, some studies did not collect MMSE and used an alternative outcome, i.e., CDR [22, 23, 26, 34, 37, 38]. The lowest mean MMSE reported was 12.8, and the highest was 24.9 [32, 41]. The study sample sizes ranged from 48 to 1004 participants. EQ-5D-3L was most commonly collected (n = 27), and only four studies collected EQ-5D-5L [33, 41,42,43] (one study considered both EQ-5D-3L and EQ-5D-5L). There was a mix of study settings; eleven studies focused solely on community dwelling PwD [20, 29, 30, 33, 34, 37, 41, 44,45,46,47], while six studies collected data from institutionalised residents alone [22, 23, 26, 36, 42, 43]. In some studies, quality of life data were collected entirely via a proxy (n = 9) [22, 23, 25,26,27, 34, 43, 45, 46] or self-report (n = 3) [31, 44, 47]; however, over half of the studies used both proxy and self-report (n = 17) [19,20,21, 28,29,30, 32, 33, 35,36,37,38,39,40,41,42, 44, 45]. Where self-report was exclusively used, the studies explored people with mild dementia. Of the seventeen studies that included people with severe dementia within their sample, ten studies used both self and proxy reports [19, 21, 28, 30, 35, 36, 38,39,40, 48], six studies used proxy-reports alone [22, 23, 25, 26, 43, 46] and one study did not report the rater type. The proxy was most commonly an informal caregiver (i.e., family member, friend, or neighbour); however, four studies also included formal caregivers [19, 26, 36, 41] and one study additionally used clinicians to proxy report. EQ-5D was mainly administered via an interview (n = 21) [19,20,21,22, 25, 27,28,29,30, 32, 33, 36,37,38,39,40, 42, 44,45,46,47,48]; three studies did not report this detail [23, 24, 31], and one study used interviews for PwD and self-administration for proxies.
Of the four studies that solely used PwD self-reported EQ-5D, two collected this via interview [44, 46], one used self-administration booklets and one did not report this information.
Core dementia outcome measures
Table 3 lists the measures that were used to capture the core dementia outcomes in the studies. Details of the measures are provided in Additional file 5. In total there were 12 distinct measures: two cognition measures (MMSE and ADAS-Cog), six measures of function (Katz ADL, ADCS-ADL, Barthel index, Lawton scale, DAD and BADLS) and four behaviour/mood measures (NPI, CSDD, GDS and CMAI). Where it was reported, the measures were completed either by proxy, researcher observation, or a combination of information such as self or proxy information and recent care records (administration details are provided in Additional file 5). The most commonly used measure was the MMSE (n = 25), followed by the NPI (n = 12).
The MMSE and ADAS-Cog both capture cognitive impairment; however, the latter is administered via direct observation, resulting in a longer administration duration. There are two types of activities of daily living (ADLs): basic ADLs (BADLs) and instrumental ADLs (IADLs). Basic ADLs are fundamental skills, typically related to basic physical needs, e.g., toileting and eating. Instrumental ADLs are not necessary for fundamental functioning; they are generally more complex activities that allow a person to live independently, e.g., managing one’s own finances. Of the six function measures, two captured BADLs alone (Katz ADL and Barthel index), one captured IADLs alone (Lawton scale) and the remaining three included both BADL and IADL items (ADCS-ADL, DAD and BADLS). Of the behavioural measures, two measured depression (CSDD and GDS), while the others captured agitation (CMAI) and general neuropsychiatric symptoms (NPI).
EQ-5D convergent validity with cognition
It was hypothesised that cognition would have a positive correlation with EQ-5D, whereby greater cognitive impairment would be associated with lower EQ-5D index scores (lower MMSE scores indicate greater cognitive impairment). Additional file 6 provides complete details of the empirical relationship between cognition and EQ-5D. In total, eighteen studies assessed the convergent validity between EQ-5D index scores and the cognitive measures (MMSE, n = 17; ADAS-Cog, n = 2; one study collected both measures). Three studies reported a different relationship between cognition and EQ-5D by rater type [28, 35, 41].
A statistically significant relationship (p < 0.05) between cognition and EQ-5D was reported in only seven distinct studies, all of which found positive correlations [28, 30, 35, 40, 41, 43, 46]; three of these seven studies had a sample size greater than 300 [28, 35, 46]. Among the studies that reported statistically significant findings, the rater was predominantly an informal caregiver (5/7 studies), in studies whose participants spanned the entire dementia severity range (mild-to-severe). The one study that reported a statistically significant association between self-reported EQ-5D and cognition used a mild-stage study sample. Figure 2 shows the proportion of studies that demonstrated a relationship between cognition and EQ-5D in both directions (see Additional file 6 for more details).
EQ-5D convergent validity with function
For the convergent validity between EQ-5D and the measures of function, a positive correlation was hypothesised, whereby greater functional independence would be associated with higher EQ-5D index scores (see Additional file 5 for details of function instrument scoring). Twenty distinct studies provided empirical evidence of the convergent validity between EQ-5D and the measure of function within the study (ADCS-ADL, n = 2; BADLS, n = 2; Barthel index, n = 7; DAD, n = 6; Lawton scale, n = 4; one study collected both Barthel index and Lawton scale).
Two studies reported a difference in relationship between function and EQ-5D by rater type, whereby the Lawton scale showed a positive and significant correlation with proxy EQ-5D and a negative and non-significant correlation with self-rated EQ-5D [38, 41]. These two studies were the only reports of a negative association; both had a mainly mild-stage sample of < 200 participants. The remaining studies all reported a positive correlation between function and EQ-5D, of which the majority (15/16 studies) were statistically significant. Two studies explored function as an explanatory variable within regression analyses, both of which found it to be a significant (p < 0.01) determinant of proxy-reported EQ-5D [40, 43], but not of self-reported EQ-5D. One study reported a positive correlation between ADCS-ADL and EQ-5D for both rater types, which was statistically significant only for proxy-report, again within a mild-stage study sample. Figure 3 shows the characteristics of the studies that reported convergent validity evidence between function and EQ-5D (Additional file 7 provides complete details).
EQ-5D convergent validity with behaviour/mood
For the behaviour/mood measures, higher scores indicate greater severity (see Additional file 5). Therefore, it was hypothesised that the measures would have negative correlations with EQ-5D, whereby more behavioural disturbance is associated with lower EQ-5D index scores. Seventeen distinct studies reported empirical evidence of the convergent validity between EQ-5D and the measure of behaviour/mood; four studies collected multiple measures [20, 28, 46, 47] (CSDD, n = 2; CMAI, n = 1; GDS, n = 8; NPI, n = 10).
Only one study captured agitation (via CMAI), reporting a negative correlation with EQ-5D which was only statistically significant for proxy-report.
Ten studies measured depression (via CSDD and GDS). All ten studies reported a negative correlation between the measure of depression and EQ-5D, whereby statistically significant results were found with self-rated EQ-5D only, n = 4 [31, 41, 44, 47]; proxy-EQ-5D only, n = 2 [37, 46]; and both rater types, n = 2 [20, 28]. One study did not report statistical significance, but rather the strength of correlation coefficients, indicating moderate convergent validity between EQ-5D index scores and GDS.
The NPI captures 12 broad neuropsychiatric symptoms and was administered in ten of the reviewed studies. Two of these studies reported a difference in relationship between NPI and EQ-5D by rater type; one study found a negative correlation with self-rated EQ-5D, but a positive correlation with proxy-EQ-5D, while the other study found the inverse. However, neither of these findings was statistically significant. The remaining studies all reported negative correlations, whereby statistical significance was found only with proxy-EQ-5D [20, 21, 25, 27, 28, 35, 46]. Figure 4 shows the characteristics of the studies that reported convergent validity evidence between the behaviour/mood measure and EQ-5D (Additional file 8 provides complete details).
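The three directional hypotheses used in this review (positive for cognition and function, negative for behaviour/mood) can be summarised in a small sketch that checks whether an observed coefficient has the hypothesised sign. The names `HYPOTHESISED_SIGN` and `matches_hypothesis` are hypothetical and the coefficients in the usage lines are invented for illustration, not extracted from any included study. The signs assume each clinical measure is scored as the text describes (higher MMSE or function scores are better, higher behaviour/mood scores are worse); an instrument scored in the opposite direction would need its sign reversed first.

```python
# Hypothesised sign of the correlation with EQ-5D index scores, by domain.
HYPOTHESISED_SIGN = {
    "cognition": +1,       # less impairment -> higher EQ-5D
    "function": +1,        # more independence -> higher EQ-5D
    "behaviour/mood": -1,  # more disturbance -> lower EQ-5D
}


def matches_hypothesis(domain: str, coefficient: float) -> bool:
    """Return True when the coefficient's sign matches the a priori hypothesis."""
    expected = HYPOTHESISED_SIGN[domain]
    if coefficient == 0:
        return False  # no association in either direction
    return (coefficient > 0) == (expected > 0)


if __name__ == "__main__":
    # Illustrative values only.
    print(matches_hypothesis("function", 0.52))
    print(matches_hypothesis("behaviour/mood", 0.10))
```

Direction is checked separately from statistical significance, mirroring how the results below report the sign of each association alongside whether it reached significance.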
Convergent validity evidence by EQ-5D dimension
A total of seven distinct studies reported empirical evidence of convergent validity of the pre-defined core dementia outcome measures with EQ-5D dimensions, summarised in Table 4. Cognition (via MMSE) was associated with self-rated anxiety/depression, and people with more cognitive impairment self-reported fewer problems across all EQ-5D dimensions.
Function via the Katz index was associated with self-rated mobility, self-care, usual activities, and pain/discomfort; no relation was found with anxiety/depression. Function via BADLS was correlated with proxy rated mobility, self-care and usual activities. Stronger correlations were observed for informal carer reports of self-care and usual activities, while the clinician rated mobility correlation was stronger. Function via Barthel index was significantly correlated with self and proxy mobility, self-care and usual activities, and was associated with reporting problems in all EQ-5D dimensions except anxiety/depression.
Depression (via CSDD) was associated with reporting problems in anxiety/depression, and depression (via GDS) showed evidence of moderate convergent validity with mobility, self-care, usual activities and anxiety/depression. NPI summary scores were associated with proxy rated anxiety/depression [34, 42] and mobility.
To further understand the potential impacts of rater-type upon EQ-5D assessment, information related to the inter-rater agreement was extracted and is summarised in Table 5.
Nine studies, representing samples across the entire dementia-severity range, commented on the inter-rater agreement between self and proxy rated EQ-5D index scores [19, 28, 29, 32, 35,36,37,38, 41]. Proxy-EQ-5D index scores were found to be significantly lower than self-report [28, 29, 37, 38, 41] and had stronger correlations with clinical variables [20, 35, 36].
Of the EQ-5D dimensions, two distinct studies reported that the mobility dimension had the strongest inter-rater agreement relative to the other dimensions, at an acceptable level (kappa > 0.4) [19, 37]. One of the studies reported that agreement between formal and informal proxies was also highest for mobility (kappa = 0.61), and that all other dimensions remained below the usually accepted level.
Agreement between reports of the usual activities dimension was the lowest, with self-report reflecting more optimistic reports [29, 32, 37, 41]. Agreement in the pain/discomfort dimension was low, whereby proxies rated more problems than PwD themselves [32, 37, 41]. Evidence of agreement in the anxiety/depression dimension was mixed; one study found that PwD self-rated this dimension more optimistically than proxies, while another study reported that this was the only dimension in which PwD had self-rated more problems than proxies.
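To illustrate how a dimension-level agreement statistic of this kind is obtained, the sketch below computes unweighted Cohen's kappa for paired self and proxy ratings of a single EQ-5D dimension. The ratings are invented for illustration and are not taken from any included study; the included studies may have used a different variant (for example a weighted kappa), so this is an assumption rather than a reproduction of their methods.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of subjects given identical ratings.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions,
    # summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical EQ-5D-3L levels (1 = no problems, 3 = extreme problems).
    self_report = [1, 1, 2, 2, 3, 1, 2, 1, 1, 2]
    proxy_report = [1, 2, 2, 2, 3, 2, 2, 1, 2, 3]
    kappa = cohens_kappa(self_report, proxy_report)
    print(round(kappa, 3), "acceptable" if kappa > 0.4 else "below 0.4")
```

In this invented example the raters agree on 6 of 10 ratings, chance agreement is 0.36, and kappa is therefore (0.60 − 0.36) / (1 − 0.36) = 0.375, just below the 0.4 level treated as acceptable above, even though every disagreement is a proxy rating one level worse than the self-rating.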
Of the 30 papers included within the review, 18 were of high quality, and the remaining 12 were considered to be of medium quality (see Additional file 4 for full quality appraisal).
There is a growing recognition of the wide use and acceptability of EQ-5D within dementia populations, thereby capturing generic HRQoL that can be converted to utilities for use in cost-effectiveness analyses. An important factor in exploring the use of such a measure is its psychometric properties. This targeted literature review identified 30 studies which contained empirical evidence related to the convergent validity of EQ-5D with at least one pre-defined core measure of: cognition (n = 18), function (n = 20), or behaviour/mood (n = 17), the main clinical outcomes in studies of people with dementia. The findings indicate that EQ-5D convergent validity with clinical measures of function, behaviour/mood and cognition were in the expected direction, whereby increased clinical impairment was associated with lower EQ-5D index score. There is clear evidence on the absence of inter-rater agreement between self and proxy report. Evidence at the dimension-level was limited, however there were some data to support convergent validity between specific EQ-5D dimensions and clinical outcomes, and differences in inter-rater agreement by dimension type.
It was hypothesised that as cognition deteriorates, EQ-5D would also deteriorate – indicating a positive correlation. If a switch occurs from self to proxy report when a PwD is no longer able to accurately self-report, this relationship would still be sustained. Only seven studies reported evidence of statistically significant associations with cognition, and they were all with a positive correlation, therefore agreeing with the a priori hypothesis. As observed and expected, proxy-reports showed stronger associations with the clinical measure. Where a negative but non-significant correlation with self-rated EQ-5D (within the same dyad) was reported [35, 41]; this finding could indicate that self-rated EQ-5D was collected until a certain stage of dementia severity before a switch to proxy-EQ-5D was initiated. However, papers did not tend to report this substitution. A negative relationship between self-rated EQ-5D and cognition would only be anticipated at the more severe stages of dementia progression if self-report were to still be used (where the PwD’s self-awareness has deteriorated) [2, 35]. In addition, a relationship between cognition and EQ-5D dimension: anxiety/depression was observed – whereby reporting more problems in this dimension corresponded with greater MMSE scores (less cognitive impairment) . These findings potentially highlight the greater self-awareness at earlier cognitive stages, whereby the person can recognise their own deterioration, as well as newly identifying with the label of the dementia diagnosis (thereby inducing anxiety/depression).
The evidence on the convergent validity between EQ-5D and function was more robust. Fifteen studies reported a statistically significant positive association between the outcomes, whereby greater functional impairment was correlated with lower EQ-5D as hypothesised. This finding is echoed within the wider literature whereby using ADL as a marker of disease progression within economic models has been suggested, due to its importance within dementia disease experience and its alignment with HRQoL . Only two studies reported a negative association with self-rated EQ-5D, however both findings were from mild-stage study samples, where it would be expected that functional impairment would be relatively low. Overall, the studies that reported non-significant results were mainly (75%) from sample sizes of < 200, further highlighting the challenges associated with considering smaller studies for psychometric appraisals. As outlined earlier, there are two fundamental types of ADL: BADLs and IADLs. Where there were mixed reports such that proxy-EQ-5D correlated with the measure of function, but self-rated EQ-5D did not, the instruments were measures of IADL (ADCS-ADL [20, 28] and Lawton Scale [38, 41]). As IADLs are daily tasks that are not necessary for functional living, e.g., handling finances, it may be that PwD do not recognise their impairments in conducting these activities as they are now being performed by a proxy, and that the proxy is more aware of these impairments and their impact. All the studies that measured BADLs showed evidence of convergent validity with EQ-5D. In dementia progression, IADL impairment is experienced sooner and BADLs are not impacted until the more severe stages of disease [51, 52]. Therefore, the relationship between EQ-5D with BADLs is not unanticipated, with EQ-5D being a measure of health status, it has been found to be more responsive to changes in severe stages of disease .
It was hypothesised that lower EQ-5D scores would be associated with greater behavioural/mood disturbances, as demonstrated by a negative correlation. The included measures captured different aspects of behaviour/mood. The only study where agitation (via CMAI) was measured reported a statistically significant relationship with EQ-5D-proxy . Similarly, the evidence of convergent validity between neuropsychiatric disturbance (via NPI) and EQ-5D index scores was only statistically significant for proxy-report. Whereas evidence of EQ-5D convergent validity with measures of depression was observed to a greater extent with self-report. As depression is a more personal experience, and the NPI additionally captures broader neuropsychiatric symptoms, this pattern of association is predictable. However, this finding should be interpreted with caution as it is important to consider the impact of the administration of these clinical measures. The measures of depression are self-rated by the PwD themselves, while the broader clinical measures such as the NPI are informant-based, thereby completed by a proxy. Therefore, the relationships observed may be impacted by who is completing both measures, as opposed to the content of the measures themselves.
A key objective of this review was to explore the evidence surrounding convergent validity of EQ-5D dimensions with the core dementia outcome measures. The findings show that while the “observable” dimensions of mobility, self-care and usual activities were associated with functional measures, the “non-observable” dimension of anxiety/depression was associated with the measures of cognition and behaviour/mood. Although these findings indicate relationships between the clinical measures and the appropriate corresponding EQ-5D dimensions (thus demonstrating convergent validity of EQ-5D at the dimension level), they should be interpreted with caution. Firstly, one paper reported that GDS had moderate convergent validity with all EQ-5D dimensions except pain/discomfort. Secondly, as highlighted by Karlawish et al., 41% of EQ-5D scores were at 1.0 (perfect health), and people did not self-report impairment in dimensions such as usual activities, where one would typically expect to see disability in this population. Lastly, the number of papers reporting evidence at the EQ-5D dimension level was low (n = 7) and, as indicated by Michalowsky et al., there are additional factors to consider when observing dimensions on both EQ-5D-3L and EQ-5D-5L.
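The ceiling effect noted above can be quantified very simply. The sketch below is illustrative only: the scores are hypothetical (chosen so that 7 of 17 values, about 41%, sit at 1.0), and the 15% threshold is a commonly cited rule of thumb that is assumed here, not a criterion stated in this review.

```python
def ceiling_proportion(index_scores, ceiling=1.0, tol=1e-9):
    """Return the share of index scores at the best possible value."""
    if not index_scores:
        raise ValueError("index_scores must not be empty")
    at_ceiling = sum(1 for s in index_scores if abs(s - ceiling) <= tol)
    return at_ceiling / len(index_scores)


# Hypothetical self-rated EQ-5D index scores (illustrative only).
hypothetical_self_scores = [
    1.0, 1.0, 0.85, 1.0, 0.76, 0.69, 1.0, 0.80, 1.0,
    0.62, 1.0, 0.73, 0.52, 1.0, 0.88, 0.71, 0.59,
]

# Assumed rule-of-thumb threshold for flagging a ceiling effect.
ASSUMED_THRESHOLD = 0.15

proportion = ceiling_proportion(hypothetical_self_scores)
has_ceiling_effect = proportion > ASSUMED_THRESHOLD
print(f"{proportion:.0%} of scores at 1.0; flagged: {has_ceiling_effect}")
```

When a large share of respondents report perfect health, the index has little room to vary, which can attenuate correlations with clinical measures regardless of the true underlying relationship.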
Evidence on inter-rater agreement between self- and proxy-reported EQ-5D index scores and, where available, EQ-5D dimensions mirrors the existing evidence: inter-rater agreement was generally poor, particularly for EQ-5D index scores. People with dementia self-rated fewer problems than their proxies. This finding is established within the literature [10, 14, 54] and is theorised to be the result of various factors such as response-shift [55, 56] and/or proxy burden resulting in “projection bias” [54, 57, 58].
When analysing inter-rater agreement at the dimension level, the findings were mixed. For the “non-observable” dimensions of pain/discomfort and anxiety/depression, agreement was low. This was also the case for the self-care dimension. The “observable” dimension of mobility was reported to have the strongest inter-rater agreement. Bonfiglio et al. commented on this phenomenon, highlighting the difficulty in establishing agreement between people with dementia and proxies for non-observable factors such as pain/discomfort and anxiety/depression. Overall, proxy assessments tend to show a higher degree of association with the clinical measures, an observation that has been recurrently noted in the wider literature [13, 14, 59, 60] and may be influenced by disease severity. Garre-Olmo et al. reported that the strength of the relationship between function (via DAD) and EQ-5D increased with increasing disease severity. However, exploring the impact of disease severity was beyond the scope of this review.
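To show how dimension-level agreement of this kind might be quantified, the following is a minimal sketch that assumes unweighted Cohen's kappa on EQ-5D-3L levels (1 = no problems, 2 = some problems, 3 = extreme problems). The ratings are hypothetical, and the included studies may have used other agreement statistics, such as weighted kappa or intraclass correlation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of pairs with identical levels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the two raters' marginal distributions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical self and proxy ratings for ten dyads (illustrative only).
mobility_self = [1, 1, 2, 2, 2, 3, 1, 2, 3, 1]
mobility_proxy = [1, 1, 2, 2, 3, 3, 1, 2, 3, 2]
anxiety_self = [1, 1, 1, 2, 1, 1, 2, 1, 1, 1]
anxiety_proxy = [2, 1, 2, 2, 3, 2, 1, 2, 1, 3]

kappa_mobility = cohens_kappa(mobility_self, mobility_proxy)
kappa_anxiety = cohens_kappa(anxiety_self, anxiety_proxy)
print(f"mobility kappa:           {kappa_mobility:.2f}")
print(f"anxiety/depression kappa: {kappa_anxiety:.2f}")
```

In this illustrative data the observable dimension shows substantial agreement, while agreement on anxiety/depression is no better than chance, with the self-ratings indicating fewer problems than the proxy ratings.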
Strengths and limitations
A key strength of this review is that it addresses a research area that has not been fully explored and is, to our knowledge, the first review to specifically synthesise the relevant evidence of EQ-5D convergent validity with dementia outcomes. However, the search strategy was applied to a limited number of electronic databases, so relevant studies may not have been identified (additional chain-searching of references was performed to minimise this). Only journal articles published in the English language were included, resulting in a total of only 30 papers, and 25 papers were excluded solely because of the absence of extractable data. This highlights the lack of research focusing exclusively on psychometric properties: this was a specified objective of only 16 papers, with the review evidence originating mainly from trials and observational studies. Nevertheless, despite the lack of studies with a specific psychometric focus, over half of the included studies were deemed high quality for evaluating convergent validity. In addition, as the core dementia outcome measures considered within this review were pre-determined (as described in Additional file 1), it is possible that not all measures were covered in the evaluation. To mitigate this potential bias, a methodological approach was adopted in selecting the core outcome measures for consideration.
The inclusion criteria were broad, a range of study types were included, and no date restriction was applied to the online databases. Another strength was the exploration of inter-rater reliability and the extraction of this information at the EQ-5D dimension level where available. Lastly, several factors were not explored as they were beyond the scope of this review. The characteristics of the proxy (i.e., sociodemographic factors) and pragmatic aspects (i.e., instrument administration method) were not explored, and these could potentially impact the findings. A previously published study reported that where carers themselves were in pain, they reported more problems with pain in the PwD, suggesting projection bias. Additionally, one of the studies commented on the difference in validity by proxy type, reporting that the data provided by clinician-proxies, when compared with those of informal caregivers, had higher construct validity for the more “observable” dimensions of EQ-5D (i.e., mobility and self-care). On the other hand, data from informal caregivers had higher construct validity for the less observable dimensions (i.e., anxiety/depression).
This systematic literature review concludes that there is published evidence to indicate convergent validity between EQ-5D and the core dementia outcome measures of function, behaviour/mood and cognition in the expected directions. Additionally, at the dimension-level, EQ-5D dimensions show associations with specific clinical measures, whereby the degree of association differed by rater-type (self vs. proxy raters). Overall inter-rater agreement was poor, particularly for the “non-observable” dimensions of EQ-5D (i.e., pain/discomfort, anxiety/depression). It should, however, be noted that these conclusions are based on very limited published data, and this review highlights the lack of studies investigating the psychometric properties of measures for use in dementia populations.
Measuring HRQoL in dementia populations is a complex issue, particularly when considering the challenges with both self and proxy reporting. This review revealed that, despite the limited evidence pool, EQ-5D exhibits convergent validity with other dementia outcomes, and that for “observable” dimensions (i.e., mobility, self-care, usual activities), the associations are stronger when using proxy-EQ-5D. There is currently no guidance on which report to use in evaluations or on how to combine scores. Future empirical studies could investigate the severity range over which EQ-5D self-report can be reliably used, and the severity stage at which proxy-report becomes more appropriate.
Activities of daily living
The Alzheimer's Disease Assessment Scale-Cognitive Subscale
Alzheimer’s Disease Cooperative Study Activities of Daily Living Scale
Basic activity of daily living
Bristol Activities of Daily Living Scale
Cumulative Index to Nursing and Allied Health Literature
Cohen-Mansfield Agitation Inventory
Cornell Scale for Depression in Dementia
Disability Assessment for Dementia
Dementia with Lewy bodies
Geriatric Depression Scale
Centre for Reviews and Dissemination
Health-related quality of life
Instrumental activity of daily living
Mini-Mental State Examination
Montreal Cognitive Assessment
National Institute for Health and Care Excellence
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidance
Person with dementia
Quality of life
Quality of Life in Alzheimer's disease scale
Quality of Life in Late-Stage Dementia Scale
Quality of Life in Late-Stage Dementia Scale proxy
Randomised controlled trial
Severe Impairment Battery
Green C, Zhang S. Predicting the progression of Alzheimer’s disease dementia: a multidomain health policy model. Alzheimers Dement. 2016;12(7):776–85. https://doi.org/10.1016/j.jalz.2016.01.011.
Banerjee S, Smith S, Lamping D, et al. Quality of life in dementia: more than just cognition. An analysis of associations with quality of life in dementia. J Neurol Neurosurg Psychiatry. 2006;77(2):146–8.
Wu Y-T, Beiser AS, Breteler MM, et al. The changing prevalence and incidence of dementia over time—current evidence. Nat Rev Neurol. 2017;13(6):327–39.
NICE. NICE health technology evaluations: the manual. Published 31 January 2022. Available from: https://www.nice.org.uk/process/pmg36/chapter/introduction-to-health-technology-evaluation.
The EuroQol Group. EuroQol - a new facility for the measurement of health-related quality of life. Health Policy. 1990;16(3):199–208.
Herdman M, Gudex C, Lloyd A, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res. 2011;20(10):1727–36.
Landeiro F, Mughal S, Walsh K, et al. Health-related quality of life in people with predementia Alzheimer’s disease, mild cognitive impairment or dementia measured with preference-based instruments: a systematic literature review. Alzheimer’s Res Ther. 2020;12(1):1–14. https://doi.org/10.1186/s13195-020-00723-1.
Li L, Nguyen K-H, Comans T, et al. Utility-based instruments for people with dementia: a systematic review and meta-regression analysis. Value in Health. 2018;21(4):471–81.
Keetharuth A, Hussain H, Rowen D, et al. Assessing the psychometric performance of EQ-5D-5L in dementia: a systematic review. Health Qual Life Outcomes. 2022;20:139.
Hounsome N, Orrell M, Edwards RT. EQ-5D as a quality of life measure in people with dementia and their carers: evidence and key issues. Value in Health. 2011;14(2):390–9.
O’Shea E, Hopper L, Marques M, et al. A comparison of self and proxy quality of life ratings for people with dementia and their carers: a European prospective cohort study. Aging Ment Health. 2020;24(1):162–70.
Smith S, Hendriks A, Cano S, et al. Proxy reporting of health-related quality of life for people with dementia: a psychometric solution. Health Qual Life Outcomes. 2020;18:1–10.
Pickard AS, Knight SJ. Proxy evaluation of health-related quality of life: a conceptual framework for understanding multiple proxy perspectives. Med Care. 2005;43(5):493. https://doi.org/10.1097/01.mlr.0000160419.27642.a8.
Rand S, Caiels J. Using proxies to assess quality of life: a review of the issues and challenges. 2015.
Centre for Reviews and Dissemination. CRD's guidance for undertaking reviews in healthcare. York: York Publ. Services; 2009.
Moher D, Liberati A, Tetzlaff J, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.
Fayers PM, Machin D. Quality of life: the assessment, analysis and interpretation of patient-reported outcomes. New York: Wiley; 2013.
Meader N, King K, Llewellyn A, et al. A checklist designed to aid consistency and reproducibility of GRADE assessments: development and pilot validation. Syst Rev. 2014;3(1):1–9.
Ankri J, Beaufils B, Novella JL, et al. Use of the EQ-5D among patients suffering from dementia. J Clin Epidemiol. 2003;56(11):1055–63.
Bhattacharya S, Vogel A, Hansen MLH, et al. Generic and disease-specific measures of quality of life in patients with mild Alzheimer’s disease. Dement Geriatr Cogn Disord. 2010;30(4):327–33.
Boström F, Jönsson L, Minthon L, et al. Patients with dementia with Lewy bodies have more impaired quality of life than patients with Alzheimer disease. Alzheimer Dis Assoc Disord. 2007;21(2):150–4.
Castro-Monteiro E, Forjaz MJ, Ayala A, et al. Change and predictors of quality of life in institutionalized older adults with dementia. Qual Life Res. 2014;23(9):2595–601.
Diaz-Redondo A, Rodriguez-Blazquez C, Ayala A, et al. EQ-5D rated by proxy in institutionalized older adults with dementia: Psychometric pros and cons. Geriatr Gerontol Int. 2014;14(2):346–53.
Érsek K, Kovács T, Wimo A, et al. Costs of dementia in Hungary. J Nutr Health Aging. 2010;14(8):633–9.
Garre-Olmo J, Vilalta-Franch J, Calvó-Perxas L, et al. A path analysis of dependence and quality of life in Alzheimer’s disease. Am J Alzheimers Dis Other Demen. 2017;32(2):108–15.
González-Vélez AE, Forjaz MJ, Giraldez-García C, et al. Quality of life by proxy and mortality in institutionalized older adults with dementia. Geriatr Gerontol Int. 2015;15(1):38–44.
Haaksma ML, Leoutsakos JMS, Bremer JA, et al. The clinical course and interrelations of dementia related symptoms. Int Psychogeriatr. 2018;30(6):859–66.
Heßmann P, Seeberg G, Reese JP, et al. Health-related quality of life in patients with Alzheimer’s disease in different German health care settings. J Alzheimers Dis. 2016;51(2):545–61.
Kunz S. Psychometric properties of the EQ-5D in a study of people with mild to moderate dementia. Qual Life Res. 2010;19(3):425–34.
Schiffczyk C, Romero B, Jonas C, et al. Generic quality of life assessment in dementia patients: a prospective cohort study. BMC Neurol. 2010;10(1):1–8.
Van De Beek M, Van Steenoven I, Ramakers IH, et al. Trajectories and determinants of quality of life in dementia with Lewy bodies and Alzheimer’s disease. J Alzheimers Dis. 2019;70(2):389–97.
Vogel A, Mortensen EL, Hasselbalch SG, et al. Patient versus informant reported quality of life in the earliest phases of Alzheimer’s disease. Int J Geriatr Psychiatry. 2006;21(12):1132–8.
Michalowsky B, Hoffmann W, Xie F. Psychometric Properties of EQ-5D-3L and EQ-5D-5L in Cognitively Impaired Patients Living with Dementia. J Alzheimers Dis. 2021;83(1):77–87.
Bryan S, Hardyman W, Bentham P, et al. Proxy completion of EQ-5D in patients with dementia. Qual Life Res. 2005;14(1):107–18.
Farina N, King D, Burgon C, et al. Disease severity accounts for minimal variance of quality of life in people with dementia and their carers: analyses of cross-sectional data from the MODEM study. BMC Geriatr. 2020;20(1):1–13.
Martin A, Meads D, Griffiths AW, et al. How should we capture health state utility in dementia? Comparisons of DEMQOL-Proxy-U and of self-and proxy-completed EQ-5D-5L. Value in Health. 2019;22(12):1417–26.
Orgeta V, Edwards RT, Hounsome B, et al. The use of the EQ-5D as a measure of health-related quality of life in people with dementia and their carers. Qual Life Res. 2015;24(2):315–24.
Sheehan BD, Lall R, Stinton C, et al. Patient and proxy measurement of quality of life among general hospital in-patients with dementia. Aging Ment Health. 2012;16(5):603–7.
Trigg, Jones, Knapp, et al. The relationship between changes in quality of life outcomes and progression of Alzheimer’s disease (AD): results from the Dependence in AD in England 2 (DADE2) longitudinal study. Running Head: Changes in quality of life outcomes and progression of AD: the DADE-2 Investigator Groups, 2015.
King D, Farina N, Burgon C, et al. Factors associated with change over time in quality of life of people with dementia: longitudinal analyses from the MODEM cohort study. BMC Geriatr. 2022;22(1):1–13.
Bonfiglio V, Umegaki H, Kuzuya M. Quality of life in cognitively impaired older adults. Geriatr Gerontol Int. 2019;19(10):999–1005.
Easton T, Milte R, Crotty M, et al. An empirical comparison of the measurement properties of the EQ-5D-5L, DEMQOL-U and DEMQOL-Proxy-U for older people in residential care. Qual Life Res. 2018;27(5):1283–94.
Ashizawa T, Igarashi A, Sakata Y, et al. Impact of the Severity of Alzheimer’s Disease on the Quality of Life, Activities of Daily Living, and Caregiving Costs for Institutionalized Patients on Anti-Alzheimer Medications in Japan. J Alzheimers Dis. 2021;81(1):367–74.
Karlawish JH, Zbrozek A, Kinosian B, et al. Preference-based quality of life in patients with Alzheimer’s disease. Alzheimers Dement. 2008;4(3):193–202.
Karlawish JH, Zbrozek A, Kinosian B, et al. Caregivers’ assessments of preference-based quality of life in Alzheimer’s disease. Alzheimers Dement. 2008;4(3):203–11.
Naglie G, Hogan DB, Krahn M, et al. Predictors of family caregiver ratings of patient quality of life in Alzheimer disease: cross-sectional results from the Canadian Alzheimer’s Disease Quality of Life Study. Am J Geriatr Psychiatry. 2011;19(10):891–901.
Naglie G, Hogan DB, Krahn M, et al. Predictors of patient self-ratings of quality of life in Alzheimer disease: cross-sectional results from the Canadian Alzheimer’s Disease Quality of Life Study. Am J Geriatr Psychiatry. 2011;19(10):881–90.
Kuo YC, Lan CF, Chen LK, et al. Dementia care costs and the patient’s quality of life (QoL) in Taiwan: home versus institutional care services. Arch Gerontol Geriatr. 2010;51(2):159–63.
Mlinac ME, Feng MC. Assessment of activities of daily living, self-care, and independence. Arch Clin Neuropsychol. 2016;31(6):506–16.
Sopina E, Sørensen J. Decision modelling of non-pharmacological interventions for individuals with dementia: a systematic review of methodologies. Health Econ Rev. 2018;8(1):1–12.
McLaughlin T, Feldman H, Fillit H, et al. Dependence as a unifying construct in defining Alzheimer’s disease severity. Alzheimers Dement. 2010;6(6):482–93.
Desai AK, Grossberg GT, Sheth DN. Activities of daily living in patients with dementia. CNS Drugs. 2004;18(13):853–75.
Payakachat N, Ali MM, Tilford JM. Can the EQ-5D detect meaningful change? A systematic review. Pharmacoeconomics. 2015;33(11):1137–54.
Shearer J, Green C, Ritchie CW, et al. Health state values for use in the economic evaluation of treatments for Alzheimer’s disease. Drugs Aging. 2012;29(1):31–43.
Römhild J, Fleischer S, Meyer G, et al. Inter-rater agreement of the Quality of Life-Alzheimer’s Disease (QoL-AD) self-rating and proxy rating scale: Secondary analysis of RightTimePlaceCare data. Health Qual Life Outcomes. 2018;16(1):1–13.
Sprangers MA, Schwartz CE. Integrating response shift into health-related quality of life research: a theoretical model. Soc Sci Med. 1999;48(11):1507–15.
Gräske J, Meyer S, Wolf-Ostermann K. Quality of life ratings in dementia care–a cross-sectional study to identify factors associated with proxy-ratings. Health Qual Life Outcomes. 2014;12(1):1–11. https://doi.org/10.1186/s12955-014-0177-1.
Gómez-Gallego M, Gómez-Amor J, Gómez-García J. Determinants of quality of life in Alzheimer’s disease: perspective of patients, informal caregivers, and professional caregivers. Int Psychogeriatr. 2012;24(11):1805. https://doi.org/10.1017/S1041610212001081.
Muus I, Petzold M, Ringsberg KC. Health-related quality of life after stroke: reliability of proxy responses. Clin Nurs Res. 2009;18(2):103–18. https://doi.org/10.1177/1054773809334912.
Elliott D, Lazarus R, Leeder SR. Proxy respondents reliably assessed the quality of life of elective cardiac surgery patients. J Clin Epidemiol. 2006;59(2):153–9. https://doi.org/10.1016/j.jclinepi.2005.06.010.
Orgeta V, Orrell M, Hounsome B, et al. Self and carer perspectives of quality of life in dementia using the QoL-AD. Int J Geriatr Psychiatry. 2015;30(1):97–104.
The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care or its arm's length bodies, or other UK government departments. Any errors are the responsibility of the authors.
This research is funded by the National Institute for Health Research (NIHR) Policy Research Programme, conducted through the Policy Research Unit in Economic Methods of Evaluation in Health and Social Care Interventions, PR-PRU-1217–20401.
Core outcome measures in dementia studies and trials.
Complete search strategy.
Quality assessment of included papers adapted from the GRADE assessment tool.
Summary of core outcome measures.
Evidence of EQ-5D convergent validity with cognitive measures.
Evidence of EQ-5D convergent validity with function measures.
Evidence of EQ-5D convergent validity with behaviour measures.
Cite this article
Hussain, H., Keetharuth, A., Rowen, D. et al. Convergent validity of EQ-5D with core outcomes in dementia: a systematic review. Health Qual Life Outcomes 20, 152 (2022). https://doi.org/10.1186/s12955-022-02062-1
- Quality of life
- Systematic literature review