The key to selecting appropriate outcome measures is defining what an intervention targets, and therefore what a measure has to be able to capture. As can be seen in Table 2, the composition of measures included in dementia carer research has changed over time. In earlier years, mood measures were the most prevalent. While this is still true of current research, the gap between use of mood and burden measures has narrowed. Measures capturing social support and relationships are now more commonly used.
Whichever instrument is used, NICE prefers results to be converted into QALYs to allow comparisons across different illnesses and interventions. To satisfy QALY methodology, quality weights must be based on preferences and anchored on an interval scale that includes full health and death. Preference-based generic instruments, such as the EQ-5D, are preferred; however, ‘when EQ-5D utility data are not available, direct valuations of descriptions of health states based on standardised and validated HRQL measures included in the relevant clinical trial(s) may be submitted. In these cases, the valuation of descriptions should use the time trade-off method in a representative sample of the UK population, with ‘full health’ as the upper anchor, to retain methodological consistency with the methods used to value the EQ-5D’. The validity of the instrument selected is important if results are to be meaningful. The most popular measures in the QoL category have been validated with members of the general population.
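As an illustration of how preference-based utility scores are turned into QALYs, the following minimal sketch weights time by utility using the area under the utility curve. The trapezoidal rule, the function name and the example values are assumptions made for illustration only; they are not taken from the studies reviewed or from NICE guidance. By convention, a utility of 1 represents full health and 0 represents death.

```python
def qalys_from_utilities(times_years, utilities):
    """Return QALYs as the area under the utility curve (trapezoidal rule).

    times_years: assessment times in years, strictly increasing.
    utilities:   preference-based utility at each assessment
                 (1 = full health, 0 = death).
    """
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need at least two matched time/utility points")
    total = 0.0
    for i in range(1, len(times_years)):
        interval = times_years[i] - times_years[i - 1]
        if interval <= 0:
            raise ValueError("assessment times must be strictly increasing")
        # Mean utility over the interval, weighted by the interval length.
        mean_utility = (utilities[i] + utilities[i - 1]) / 2
        total += interval * mean_utility
    return total


# Hypothetical carer assessed at baseline, 6 months and 12 months.
# These utility values are invented for illustration only.
example_times = [0.0, 0.5, 1.0]
example_utilities = [0.70, 0.80, 0.75]
example_qalys = qalys_from_utilities(example_times, example_utilities)
print(f"QALYs accrued over one year: {example_qalys:.4f}")
```

A year spent in full health yields one QALY under this calculation; any utility below 1 yields proportionally less.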
The aggregation of carer and patient QALYs is rarely undertaken; however, one trial of befriending for carers of people with dementia presented the incremental cost-effectiveness ratio (ICER), as calculated with the EQ-5D for the QALY component, for both the carer alone and the carer and person with dementia combined. The intervention was not cost-effective when the ICER was calculated for the carer alone, but became cost-effective when the spill-over effects on the person with dementia were incorporated. Aggregation of QALYs needs to be undertaken cautiously, with the information used to calculate resulting ICERs explicitly stated to allow for comparisons with interventions where QALYs have not been aggregated.
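The effect of aggregation on the ICER can be sketched with a small worked example. All figures below, including the incremental cost, the QALY gains and the £20,000 per QALY threshold, are hypothetical values chosen for illustration; they are not the results of the befriending trial. The sketch also assumes the simplest case, in which the intervention is both more costly and more effective than the comparator.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY.

    Assumes the intervention is more effective than the comparator;
    other cases need separate interpretation and are rejected here.
    """
    if incremental_qalys <= 0:
        raise ValueError("sketch only handles positive incremental QALYs")
    return incremental_cost / incremental_qalys


# Hypothetical inputs, invented for illustration only.
INCREMENTAL_COST = 500.0    # extra cost of the intervention (GBP)
CARER_QALY_GAIN = 0.01      # incremental QALYs for the carer
PATIENT_QALY_GAIN = 0.02    # spill-over QALYs for the person with dementia
THRESHOLD = 20_000.0        # illustrative willingness-to-pay per QALY (GBP)

icer_carer_only = icer(INCREMENTAL_COST, CARER_QALY_GAIN)
icer_combined = icer(INCREMENTAL_COST, CARER_QALY_GAIN + PATIENT_QALY_GAIN)

for label, value in (("carer only", icer_carer_only),
                     ("carer + person with dementia", icer_combined)):
    verdict = "below" if value <= THRESHOLD else "above"
    print(f"{label}: ICER of about GBP {value:,.0f} per QALY ({verdict} threshold)")
```

Because the same incremental cost is divided by a larger QALY gain, the aggregated ICER is always lower than the carer-only ICER when the spill-over effect is positive. This is why the QALY components used should be reported explicitly.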
Of the most popular instruments in the QoL category, only the weights for the EQ-5D were derived using the time trade-off method. The SF-6D and HUI3 were valued using a visual analogue scale and standard gamble; the WHOQOL-BREF does not have preference-based scoring. Three possible explanations for differences in health state valuations between measures have been put forward: coverage of descriptive systems, sensitivity of dimensions and valuation methods. Instruments which describe more health states will pick up smaller changes in health status and are more appropriate for research where smaller health gains are expected, such as research involving carers. The HUI3 can describe 972,000 health states and the SF-6D either 7,500 or 18,000 depending on the version, while the EQ-5D describes only 243 health states. A ‘ceiling effect’, where higher health states are chosen more frequently, is known to be a feature of the EQ-5D. In contrast, the SF-6D appears to have a ‘floor effect’, with responses clustered at the lower end of the scale. The floor effect is amplified in population groups with more physical health problems, so it may not be an issue when conducting research with carers of people with dementia. Although many carers do have health issues, one may assume that they already have reasonable physical health to be able to cope with the physical aspects of caring.
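The health state counts quoted above follow from multiplying the number of response levels across an instrument's dimensions. The sketch below reproduces them. The level counts per dimension are taken from commonly cited descriptions of each instrument's classification system, not from this review, and should be treated as assumptions to be checked against the instrument documentation.

```python
from math import prod

# Number of response levels per dimension (assumed from commonly cited
# instrument descriptions; verify against each instrument's manual).
LEVELS_PER_DIMENSION = {
    # Five dimensions, three levels each.
    "EQ-5D (3-level)": [3, 3, 3, 3, 3],
    # Six dimensions, SF-12-derived version.
    "SF-6D (SF-12 version)": [3, 4, 5, 5, 5, 5],
    # Six dimensions, SF-36-derived version.
    "SF-6D (SF-36 version)": [6, 4, 5, 6, 5, 5],
    # Eight attributes.
    "HUI3": [6, 6, 5, 6, 6, 5, 6, 5],
}

# A health state is one level chosen on every dimension, so the number of
# describable states is the product of the level counts.
state_counts = {name: prod(levels)
                for name, levels in LEVELS_PER_DIMENSION.items()}

for name, count in state_counts.items():
    print(f"{name}: {count:,} health states")
```

Under these assumed level counts the products match the figures given in the text, which illustrates why adding dimensions or levels rapidly increases the descriptive coverage of an instrument.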
The World Health Organisation defines health as ‘a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity’, a definition unchanged since 1948. Furthermore, seven determinants of health have been suggested: income and social status, education, physical environment, social support networks, genetics, health services and gender. This reinforces the idea that we need to go beyond physical health measurement and consider other attributes affecting QoL. This is particularly relevant for dementia carers, as research is primarily aimed at relieving burden rather than improving physical health.
While the EQ-5D covers physical domains well, there is only one question on mental well-being. Due to the dominance of physical domains, it is not particularly sensitive to changes in carers of people with dementia, who might not see changes in their physical health over time even though their QoL is still affected. This issue was raised by Al-Janabi et al., who proposed that measuring health-related outcomes for carers places a ‘patient’ identity on carers. In a cross-sectional study involving carers of people with dementia completing the HUI2, Neumann et al. found that the stage of Alzheimer’s disease was a negative predictor of patient utility (as reported by carers completing the HUI2 as a proxy); however, the utility that carers reported for themselves was insensitive to the stage of the care recipient’s dementia. For research involving carers of people with dementia, it may be necessary to include additional outcome measures alongside a generic primary outcome measure for cost-effectiveness analysis.
Disease-specific instruments have been found to be better at detecting QoL changes than generic instruments. The main advantage of disease-specific instruments is that they are sensitive to changes associated with the disease in question; therefore studies do not need a large sample size. A disadvantage is that co-morbidities may be overlooked: by focusing on QoL changes associated with one particular illness, separate health issues are ignored. As people with dementia and their carers tend to be older, co-morbidities and side effects are particularly relevant. Disease-specific instruments are typically focused on the person with the illness; therefore a population-specific measure may be more appropriate for carers. Population-specific measures cover a broader range of domains than disease-specific instruments, with the additional benefit of being more sensitive than a generic instrument. This review found that the most popular instruments in the burden category were developed specifically to measure burden in dementia carers, combining disease-specific with population-specific domains.
This review found 29 studies which included details of costs; however, most of these were only partial economic evaluations which provided cost-outcome descriptions. Where cost-effectiveness analyses had been performed, the unit of effect was typically time, e.g. cost per additional year that the person with dementia lived at home, or cost per reduction in hours spent on care tasks per day. Cost-utility analysis was included in 3 studies [35, 42, 43]; the outcome measures used were the EQ-5D, HUI2 and the Caregiver Quality of Life Instrument. All three measures are suitable for QALY calculations. The study that included the cost-utility analysis using the HUI2 aggregated carer and patient QALYs, which, as mentioned above, is not consistent with traditional QALY methodology. Nine of the studies listing costs were protocols; 7 planned to conduct cost-utility analysis using the EQ-5D and 2 planned to conduct cost-utility analysis using the SF-12 or SF-36.
Overall, burden and mood measures were the most frequently used. The earliest article retrieved from the searches was published in 1987 and included 4 mood measures and 1 QoL measure. Outcome measures in the mood category covered a broad range of symptoms, including overall mental well-being, anxiety, depression and sleep quality. A variety of social support measures were used; the two most frequently used measures were not specific to dementia carers. Social support measures have grown in popularity but are still not as frequently used as burden, mastery, mood or QoL measures. The least frequently used category of measure was the staff competency and morale category. A large number of unspecified measures were found, mainly because poor reporting of study methods prevented the authors of this review from identifying the measure used. The increased use of guidelines such as CONSORT has improved the quality of reporting of trials in recent years.