DEMQOL and DEMQOL-Proxy: a Rasch analysis among those diagnosed with dementia
Health and Quality of Life Outcomes volume 17, Article number: 161 (2019)
In previous work we concluded that DEMQOL and DEMQOL-Proxy can provide robust measurement of HRQL in dementia when scores are derived from analysis using the Rasch model. As the study sample included people with mild cognitive impairment, we undertook a replication study in the subsample with a diagnosis of dementia (PWD). PWD constitute the population for whom DEMQOL and DEMQOL-Proxy were originally developed.
We conducted a Rasch model analysis using the RUMM2030 software to re-evaluate DEMQOL (441 PWD) and DEMQOL-Proxy (342 family carers). We evaluated scale to sample targeting, ordering of item thresholds, item fit to the model, and differential item functioning (sex, age, severity, relationship), local independence, unidimensionality and reliability.
For both DEMQOL and DEMQOL-Proxy, results were highly similar to the results in the original sample. We found the same problems with content and response options.
DEMQOL and DEMQOL-Proxy can provide robust measurement of HRQL in people with a diagnosis of dementia when scores are derived from analysis using the Rasch model. As in the wider sample, the problems identified with content and response options require qualitative investigation in order to improve the scoring of DEMQOL and DEMQOL-Proxy.
DEMQOL and DEMQOL-Proxy [1,2,3] are disease-specific patient reported outcome measures (PROMs) for measuring health-related quality of life (HRQL) in people with dementia (PWD). Total scores on DEMQOL and DEMQOL-Proxy are typically used as outcomes in intervention and other evaluative studies [4, 5] or, as a measure of disease-specific utility, in cost-effectiveness studies [7, 8]. In addition, there is growing interest in using PROMs for routine monitoring of the quality of health and social care [9,10,11,12,13], including dementia care [12, 13]. All these purposes require measurements that use an interval scale (i.e. with equal distances between scale points) and, if comparisons use data for individuals (patients), then individual-level standard errors are also required.
Measurements from conventionally developed questionnaires, using the methodology and psychometric principles of classical test theory, do not fulfil these requirements. Though usually treated as interval scores, such scores are de facto ordinal and, in addition, their standard errors are established at the group level, assuming that they are the same for everyone.
In our recent work with people attending a first appointment at Memory Assessment Services [14, 15], we have shown that the scoring for DEMQOL and DEMQOL-Proxy can meet these requirements using modern psychometric methods based on Rasch Measurement Theory [16, 17]. However, the sample in that work was somewhat heterogeneous and included all those referred for suspected dementia irrespective of eventual diagnosis (as that information is not usually available until some time afterwards). It is possible that the heterogeneous nature of the sample introduced noise to that analysis and that the scores generated from that model may not be appropriate for people with a specific diagnosis of dementia. At 6 months follow up, about half of the participants had a confirmed diagnosis of dementia. As DEMQOL/DEMQOL-Proxy were originally designed and validated for use with people with a diagnosis of dementia [1,2,3], our aim in this paper was to use Rasch Measurement Theory to undertake a diagnostic analysis of the items within DEMQOL and DEMQOL-Proxy to determine whether our improved scoring of DEMQOL/DEMQOL-Proxy is replicated in a sample with a confirmed diagnosis of dementia. Because item characteristics can vary from one model to another, we wanted to establish whether they differed substantially in a model fitted to a sample with a dementia diagnosis. Together with our original analysis this gives us a more complete diagnostic picture with which to understand how the DEMQOL and DEMQOL-Proxy scales are working and how they can be improved. In particular we investigated whether, in this sub-sample, the items of DEMQOL and DEMQOL-Proxy work together as a scale, whether the scale works in the same way for different groups of people, such as men vs women (differential item functioning or DIF), and to what extent PWD are reliably distinguished in terms of their HRQL scores.
In addition, the analysis aimed to identify whether anomalies identified in the original analyses such as response options not working as intended and item response dependencies were also found in this sub-sample.
From the original sample of 1434 people with cognitive impairment and 1030 informal family carers who were attending one of 78 Memory Assessment Services (MAS) for a first referral (either at the clinic or at a home visit) we selected those first attenders who were available at 6 months follow up and had a diagnosis of dementia, and their family carers (if present). For pragmatic reasons, participants who were diagnosed after 6 months were not included.
DEMQOL consists of 28 questions and DEMQOL-Proxy consists of 31 questions, each assessed on a 4-point Likert-type response scale: a lot, quite a bit, a little, not at all. The questions were derived from five conceptual domains: health and well-being, cognitive functioning, daily activities, social relationships and self-concept. Separate sub-scales are not supported, so both instruments are scored as a single overall score. Emotion items have the stem “Have you felt…”; all other items have the stem “How worried have you been about…”. There is also an additional overall quality of life question, answered on a 4-point scale: very good, good, fair, poor. The items are scored according to a standard scoring algorithm to produce an overall score where higher scores represent better HRQL. See Smith et al. [1,2,3] for details on the development and validation of DEMQOL and DEMQOL-Proxy based on classical test theory. DEMQOL is self-reported by the PWD (though interviewer-administered) and is appropriate for use in mild to moderate dementia. DEMQOL-Proxy is proxy-reported by a family carer on behalf of the PWD, either self-administered or interviewer-administered, and can be used at all stages of dementia. The two instruments are intended to be used together. DEMQOL has been shown to have reliability (internal consistency and test-retest) and validity (convergent and discriminant) in mild/moderate dementia. DEMQOL-Proxy has been shown to have reliability (internal consistency and test-retest) and validity (convergent and discriminant) in mild/moderate and severe dementia [1, 3]. Disease-specific utility scores are also available for both instruments. The robustness of both instruments has also been shown to be improved by using a scoring algorithm based on Rasch Measurement Theory.
We conducted psychometric analyses using the Rasch model (in RUMM2030 software), separately for DEMQOL and DEMQOL-Proxy. For all analyses we used the partial credit model, even though all the items have the same 4-point Likert-type scale. We did so because of the diagnostic nature of the analyses, which included an evaluation of whether each item's response scale was actually used in a similar way.
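For reference, the partial credit model can be written in its standard textbook form as below. This is not reproduced from the source; the symbols are notation assumed here: \theta_n is the location (HRQL) of person n, \delta_{ik} is the k-th threshold of item i, and m_i is the highest category score of item i (3 for a four-category item). The empty sum for x = 0 is taken to be zero.

```latex
P(X_{ni} = x) =
  \frac{\exp\left( \sum_{k=1}^{x} (\theta_n - \delta_{ik}) \right)}
       {\sum_{j=0}^{m_i} \exp\left( \sum_{k=1}^{j} (\theta_n - \delta_{ik}) \right)},
  \qquad x = 0, 1, \dots, m_i
```

Because every item has its own set of thresholds \delta_{ik}, the model does not force the four response options to behave identically across items, which is what makes it usable for the diagnostic purpose described above.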
As in our original study, we investigated scale to sample targeting; how well the items work together as a measuring instrument (ordering of item response thresholds, item fit, item dependency, and differential item functioning by sex, age group, severity or relationship); and how well the instrument measures the people in the sample (person separation index, PSI). We examined DIF for these factors because DEMQOL/DEMQOL-Proxy include a range of items about different aspects of daily life, which arguably could also be affected by the ageing process itself, by gender roles and expectations, and by the deteriorating nature of dementia, in which patients eventually lose insight into their condition. See the original study for details on the analyses. The positive emotion items were excluded from the analysis because, in both this data set and our previous datasets [14, 15], they appear to be trait-like rather than state-like items and are thus qualitatively different from the rest of the instrument. We therefore focussed our analyses on the smaller remaining set of 23 items for DEMQOL and 26 items for DEMQOL-Proxy. Family-wise p values were set at 0.01 for item fit and the more conservative value of 0.05 for DIF (to accommodate main effect class interval, main effect person factor and their interaction). For individual tests at the item level these were Bonferroni corrected within the RUMM2030 software. Therefore, at the item level p values for item fit were p = 0.000435 (DEMQOL, 23 items) and p = 0.000385 (DEMQOL-Proxy, 26 items), and for DIF p = 0.000725 (DEMQOL, 69 comparisons) and p = 0.000641 (DEMQOL-Proxy, 78 comparisons).
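The item-level p values quoted above follow from dividing each family-wise level by the number of tests. The short sketch below reproduces that arithmetic; the function name `bonferroni_alpha` is introduced here for illustration and is not part of RUMM2030. The comparison counts of 69 and 78 are consistent with three DIF tests per item (23 × 3 and 26 × 3).

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return family_alpha / n_tests


# Item fit: family-wise level 0.01, one test per item.
item_fit_alpha = {
    "DEMQOL": bonferroni_alpha(0.01, 23),
    "DEMQOL-Proxy": bonferroni_alpha(0.01, 26),
}

# DIF: family-wise level 0.05, number of comparisons as stated in the text.
dif_alpha = {
    "DEMQOL": bonferroni_alpha(0.05, 69),
    "DEMQOL-Proxy": bonferroni_alpha(0.05, 78),
}

for label, alphas in (("item fit", item_fit_alpha), ("DIF", dif_alpha)):
    for instrument, alpha in alphas.items():
        print(f"{label:8s} {instrument:13s} p = {alpha:.6f}")
```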
Descriptive characteristics of the sample
The sample consisted of 441 PWD, 204 males and 237 (53.7%) females with a diagnosis of dementia and a completed questionnaire. Their age ranged from 58 to 96 years (mean age = 79.6, SD = 6.8). In addition, we had data for 342 family carers, 110 males and 232 (67.8%) females. Carers’ age ranged from 31 to 91 years (mean age = 67.5, SD = 12.7). They were mostly the spouse (63.1%), or son or daughter (27.7%) of the PWD. Table 1 shows further details of the sample. The sample is demographically very similar to the original sample with a few slight differences; participants are slightly more likely to be female, older and less deprived. Also, their carers tend to be slightly older and are slightly more likely to be living with the person with dementia.
For both DEMQOL (23 items) (Fig. 1) and DEMQOL-Proxy (26 items) (Fig. 2) the targeting was very similar to the targeting in the original, full, sample of first attenders to MAS. In this subsample, DEMQOL item threshold locations ranged from roughly − 1.4 to + 2.0 logits and person locations from roughly − 1.8 to + 4.4 logits, compared with − 1.2 to + 1.8 logits and − 1.8 to + 4.6 logits, respectively, in the full sample. As before, there was a lack of item thresholds at the high end of the continuum. In this subsample, DEMQOL-Proxy item threshold locations ranged from roughly − 2.0 to + 2.8 logits and person locations from roughly − 2.6 to + 5.4 logits, compared with − 1.6 to + 3.0 logits and − 2.6 to + 5.4 logits, respectively, in the full sample. As in the full sample, DEMQOL-Proxy showed less of a gap in item thresholds at the high end of the continuum than DEMQOL because in contrast to DEMQOL it is not just positive emotion items having the highest located item thresholds.
Ordering of item thresholds
Seven of the 23 DEMQOL items and four of the 26 DEMQOL-Proxy items showed disordered thresholds, compared with five for DEMQOL and three for DEMQOL-Proxy in the previous full sample. In both cases, we found the same items disordered as in the full sample. The two additional items for DEMQOL were “having felt lonely” and “having been worried about forgetting what day it is”. The one additional item for DEMQOL-Proxy was “having been worried about forgetting where he/she is”. As in the full sample, all disordered thresholds showed that the middle two categories (“quite a bit” and “a little”) were not used as intended.
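A threshold is the point on the logit scale at which two adjacent response categories are equally likely, and thresholds are ordered when they increase with the category number. The sketch below shows that check in its simplest form. The item names and threshold values are made up for illustration and are not estimates from this study.

```python
def is_disordered(thresholds):
    """Return True if the thresholds are not strictly increasing."""
    return any(later <= earlier
               for earlier, later in zip(thresholds, thresholds[1:]))


# Hypothetical threshold estimates (logits) for two four-category items.
items = {
    "ordered item (hypothetical)": [-1.2, 0.1, 1.5],
    # Second threshold lies above the third, so the middle categories
    # are never the most likely response anywhere on the continuum.
    "disordered item (hypothetical)": [-0.8, 0.9, 0.4],
}

flagged = [name for name, taus in items.items() if is_disordered(taus)]
print(flagged)
```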
As in the full sample, none of the 23 DEMQOL items (Table 2) or 26 DEMQOL-Proxy items (Table 3) showed misfit to the model, considering the fit residual, chi square value and ICC together. More specifically, as in the full sample, none of the 23 DEMQOL items and 26 DEMQOL-Proxy items showed statistically significant misfit to the model. Only two of the 23 DEMQOL items (compared with nine in the full sample) and one of the 26 DEMQOL-Proxy items (compared with six in the full sample) showed large fit residuals (> +/− 2.5).
Differential item functioning
None of the 23 DEMQOL items showed DIF for PWD age group or severity, which is in agreement with the findings in the full sample. However, one of the 23 DEMQOL items showed uniform DIF for PWD sex: given the same amount of HRQL, females scored higher than males on “worried about making yourself understood” (Table 2). This item showed no DIF in the full sample. Three of the 26 DEMQOL-Proxy items showed uniform DIF; two of them were the same as in the full sample (Table 3). As in the full sample, “worried about not having enough company” showed uniform DIF for multiple sources. It showed DIF for PWD sex (carers of female PWD reporting more worry about not having enough company), PWD age group (no clear pattern), carer age group (no clear pattern) and relationship to the PWD (carers who are a spouse reporting less worry about not having enough company than child/other carers). “Felt irritable” showed fewer sources of uniform DIF than in the full sample. Its only source was PWD sex (carers of male PWD reporting more irritability), not PWD age group or relationship. Unlike in the full sample, “worried about thoughts being muddled” showed uniform DIF for carer age group (older carers reporting less worry for the PWD) and relationship (spouse carers reporting less worry for the PWD than child/other carers). However, “worried about forgetting what day it is” showed no DIF in the subsample of PWD, whereas it showed DIF for severity in the full sample. None of the DEMQOL and DEMQOL-Proxy items showed non-uniform DIF. This is in agreement with the findings in the full sample.
We found one residual correlation > 0.3 for DEMQOL (felt lonely/worried about not having enough company: 0.33), one fewer than in the full sample. We found 11 residual correlations > 0.3 for DEMQOL-Proxy, of which nine pairs were identical to those (also 11) in the full sample. As in the full sample, item dependency occurred mainly among the negative emotion items, among the cognition items and among the daily activities items of DEMQOL-Proxy. Table 4 (DEMQOL) and Table 5 (DEMQOL-Proxy) show all residual correlations larger than zero, with those > 0.3 highlighted. For both DEMQOL and DEMQOL-Proxy, the pattern and strength of the residual correlations strongly resembled those in the full sample.
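The dependency check described above amounts to correlating the item residuals (observed minus model-expected responses) for every pair of items and flagging pairs above 0.3. The sketch below illustrates that step on simulated residuals. The data, the item labels and the helper names `pearson` and `dependent_pairs` are assumptions for illustration only; RUMM2030 reports these correlations directly.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def dependent_pairs(residuals, cutoff=0.3):
    """Return (item, item, r) for every item pair with residual r > cutoff."""
    pairs = []
    for a, b in combinations(residuals, 2):
        r = pearson(residuals[a], residuals[b])
        if r > cutoff:
            pairs.append((a, b, round(r, 2)))
    return pairs


# Simulated residuals for 500 people: items A and B share extra variance
# beyond the measured construct, item C does not.
random.seed(0)
n_people = 500
shared = [random.gauss(0, 1) for _ in range(n_people)]
residuals = {
    "item A": [s + random.gauss(0, 1) for s in shared],
    "item B": [s + random.gauss(0, 1) for s in shared],
    "item C": [random.gauss(0, 1) for _ in range(n_people)],
}

flagged_pairs = dependent_pairs(residuals)
print(flagged_pairs)
```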
The 23 DEMQOL items formed an acceptably unidimensional scale though the 26 items in DEMQOL-Proxy were not unidimensional. This is in accordance with our findings in the previous full sample. For DEMQOL the two subsets of measurements based on the four highest and four lowest loading items on the Rasch factor differed significantly for 7.4% [5.2; 10.3] of the cases at the 5% level and for 1.2% [0.4; 3.5] of the cases at the 1% level. These percentages are marginally more than in the full sample (7.1 and 1.1% respectively). For DEMQOL-Proxy, the two subsets of measurements differed significantly for 12.5% [9.4; 16.5] of the cases at the 5% level and for 4.2% [2.1; 8.0] at the 1% level, slightly more than in the full sample (11.9 and 3.0% respectively).
For the 23 DEMQOL items PSI = 0.86 (compared with 0.87 in the full sample), and for the 26 DEMQOL-Proxy items PSI = 0.90 (compared with 0.91 in the full sample). Both these are similar to the findings in the original full sample.
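As commonly defined, the PSI is the share of the variance in person location estimates that is not attributable to measurement error. The sketch below computes it that way, on the assumption that this definition matches the one implemented in RUMM2030; the locations and standard errors are made-up values, not data from this study.

```python
from statistics import mean, variance


def person_separation_index(locations, standard_errors):
    """(observed variance - mean error variance) / observed variance."""
    observed_var = variance(locations)                   # sample variance of estimates
    error_var = mean(se ** 2 for se in standard_errors)  # mean squared standard error
    return (observed_var - error_var) / observed_var


# Hypothetical person estimates (logits) with their standard errors.
locations = [-1.0, 0.0, 1.0, 2.0]
standard_errors = [0.4, 0.4, 0.4, 0.4]

psi = person_separation_index(locations, standard_errors)
print(round(psi, 3))
```

Values close to 1 mean that differences between people are large relative to the uncertainty in each person's estimate.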
Overall fit to the model
For both DEMQOL (23 items) and DEMQOL-Proxy (26 items) the overall chi square statistic was significant (both: p < 0.001) suggesting that the data did not fit the model. However, for DEMQOL (but not DEMQOL-Proxy, p = 0.003) the data did fit the model after rescoring the items with disordered thresholds (DEMQOL: p = 0.13).
Rasch model based (logit) scores and their benefit
In Fig. 3 we show the relationship between raw scores (simple sums of item scores) and measurements based on the Rasch model (logits) for DEMQOL and DEMQOL-Proxy. The S-shaped curve clearly indicates that at the extremes of the distribution there is benefit from deriving the Rasch model based scores. For both DEMQOL (23 items) and DEMQOL-Proxy (26 items), a 10-point increase at one of the extremes of the raw score scale corresponds to a much larger increase in logits than a 10-point increase in the middle of the raw score scale. This strongly resembles what we found in the full sample.
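The S-shape can be illustrated with a small calculation under the partial credit model: for a complete set of responses, the maximum likelihood person location for a given raw score is the logit value at which the expected raw score equals that raw score. The sketch below assumes 23 items that all share the same hypothetical thresholds (-1, 0, 1), which is not the case for the real instruments, and it is not necessarily the estimator used by RUMM2030. All function names are introduced here for illustration.

```python
import math


def expected_item_score(theta, thresholds):
    """Expected score on one item under the partial credit model."""
    cumulative = [0.0]  # sum of (theta - delta_k) up to each category
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    return sum(x * w for x, w in enumerate(weights)) / sum(weights)


def expected_raw_score(theta, items):
    """Expected total score across all items at location theta."""
    return sum(expected_item_score(theta, t) for t in items)


def logit_for_raw_score(raw, items, lo=-10.0, hi=10.0):
    """Solve expected_raw_score(theta) = raw by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if expected_raw_score(mid, items) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# 23 hypothetical items scored 0-3, so raw scores run from 0 to 69.
items = [[-1.0, 0.0, 1.0]] * 23

middle_step = logit_for_raw_score(40, items) - logit_for_raw_score(30, items)
top_step = logit_for_raw_score(68, items) - logit_for_raw_score(58, items)
print(f"10 raw points in the middle: {middle_step:.2f} logits")
print(f"10 raw points near the top:  {top_step:.2f} logits")
```

Under these assumptions the same 10-point raw difference spans several times more logits near the top of the scale than in the middle, which is the pattern the figure describes.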
The improved scoring of DEMQOL and DEMQOL-Proxy previously developed in a heterogeneous sample of people with cognitive impairment using Rasch Measurement Theory also holds for the specific subset of people with a diagnosis of dementia, for whom DEMQOL and DEMQOL-Proxy were originally developed. The improved Rasch-model based scores for DEMQOL and DEMQOL-Proxy can provide more robust and meaningful estimates of change than their original scores based on classical test theory [1, 3]. Rasch-model based scores are truly interval measurements and invariant (i.e. independent of the sampling distributions of persons and items in which they were established). As such they are appropriate for use with individual people, such as in decision making about their clinical management. Our previous recommendation that DEMQOL and DEMQOL-Proxy should continue to be administered in their original format (28 and 31 questions respectively), and that the more robust scoring derived from our Rasch based analyses should be used, is also appropriate for the specific sub-sample of people with a dementia diagnosis.
This study identified the same anomalies as the full sample analysis and these need to be addressed. Disordered thresholds indicate that response options are not working as intended. In completing these items, PWD and their family carers make less fine distinctions than the four-category response scale offers. As previously recommended, future qualitative work should investigate why this is the case and how the response scale may be improved.
Other anomalies replicated in the present study are item response dependencies and DIF. Item pairs that are dependent share additional variance over and above the variance they share because of measuring the same underlying HRQL construct. Again, in future qualitative work we need to investigate if perhaps these items are not optimally phrased or are redundant. Furthermore, we need to investigate why some of the items show DIF and what we can do about it. Although uniform DIF can be resolved by splitting the affected items (e.g. separate items for male and female PWD), items showing no DIF are to be preferred.
This replication study is limited in much the same ways as our previous analyses. Our data did not allow us to investigate whether the scales are similar across ethnic groups, nor was it possible to investigate any differences across different levels of severity. This analysis has also not addressed any of the issues relating to the relationship between self-reports from DEMQOL and proxy-reports from DEMQOL-Proxy.
In previous work we concluded that DEMQOL and DEMQOL-Proxy can provide robust measurement of HRQL in dementia when scores are derived from analysis using the Rasch model. The results reported here are similar enough to our previous findings to indicate that the improved scoring is appropriate for the specific sub-sample with a diagnosis of dementia. Future work should focus on improving content (e.g. the positive emotion items and investigating DIF) and response scales.
Availability of data and materials
The datasets generated and analysed during the current study are not publicly available because the study is still ongoing, but after the end of the study can be requested from the second author, Dr. Sarah C. Smith, email@example.com.
Abbreviations
DIF: Differential item functioning
HRQL: Health-related quality of life
MAS: Memory Assessment Services
PROMs: Patient reported outcome measures
PSI: Person Separation Index
PWD: People with dementia
1. Smith SC, Lamping DL, Banerjee S, Harwood R, Foley B, Smith P, et al. Measurement of health-related quality of life for people with dementia: development of a new instrument (DEMQOL) and an evaluation of current methodology. Health Technol Assess. 2005;9(10):1–93. https://doi.org/10.3310/hta9100.
2. Smith SC, Murray J, Banerjee S, Foley B, Cook JC, Lamping DL, et al. What constitutes health-related quality of life in dementia? Development of a conceptual framework for people with dementia and their carers. Int J Geriatr Psych. 2005;20:889–95.
3. Smith SC, Lamping DL, Banerjee S, Harwood RH, Foley B, Smith P, et al. Development of a new measure of health-related quality of life for people with dementia: DEMQOL. Psychol Med. 2007;37:737–46.
4. Last J, Perrech M, Denizci C, Dorn F, Kessler J, Seibl-Leven M, et al. Long-term functional recovery and quality of life after surgical treatment of putaminal hemorrhages. J Stroke Cerebrovasc Dis. 2015;24:925–9.
5. Orrell M, Aguirre E, Spector A, Hoare Z, Woods RT, Streater A, et al. Maintenance cognitive stimulation therapy for dementia: single-blind, multicentre, pragmatic randomised controlled trial. Br J Psychiatry. 2014;204:454–61.
6. Mulhern B, Rowen D, Brazier J, Smith S, Romeo R, Tait R, et al. Development of DEMQOL-U and DEMQOL-Proxy U: generation of preference-based indices from DEMQOL and DEMQOL-Proxy for use in economic evaluation. Health Technol Assess. 2013;17(5):v–xv, 1–140. https://doi.org/10.3310/hta17050.
7. Livingston G, Kelly L, Lewis-Holmes E, Baio G, Morris S, Patel N, et al. A systematic review of the clinical effectiveness and cost-effectiveness of sensory, psychological and behavioural interventions for managing agitation in older adults with dementia. Health Technol Assess. 2014;18(39):1–226.
8. Gomes M, Pennington M, Wittenburg R, Knapp M, Black N, Smith S. Cost-effectiveness of memory assessment services for the diagnosis and early support of patients with dementia. J Health Services Res Policy. 2017;22:226–35.
9. Department of Health. Equity and excellence: liberating the NHS. London: Department of Health; 2010.
10. Black N. Patient reported outcome measures could help transform healthcare. BMJ. 2013;346:19–21.
11. Department of Health. Guidance on the routine collection of patient reported outcome measures (PROMs). London: Department of Health; 2008.
12. Department of Health. The adult social care outcomes framework 2015/16. London: Department of Health; 2014. p. 37.
13. Department of Health. Prime Minister’s challenge on dementia 2020. Implementation plan. London: Department of Health; 2016. p. 13.
14. Hendriks AAJ, Smith SC, Chrysanthaki T, Cano SJ, Black N. DEMQOL and DEMQOL-Proxy: a Rasch analysis. Health Qual Life Outcomes. 2017;15:164.
15. Smith SC, Hendriks AAJ, Chrysanthaki T, Cano SJ, Black N. How can we interpret proxy reports of HRQL when it is no longer possible to obtain a self-report? ISOQOL 22nd Annual Conference. Vancouver; 2015.
16. Andrich D. A rating formulation for ordered response categories. Psychometrika. 1978;43:561–73.
17. Rasch G. Probabilistic models for some intelligence and attainment tests. Copenhagen, Danish Institute for Educational Research. 1960. Expanded edition with foreword and afterword by BD Wright. Chicago: University of Chicago Press; 1980.
18. Park MH, Smith SC, Chrysanthaki T, Neuburger J, Ritchie CW, Hendriks AAJ, Black N. Change in health-related quality of life after referral to memory assessment services. Alzheimer Dis Assoc Disord. 2017;31:192–9.
19. Banerjee S. DEMQOL: Dementia Quality of Life measure. Brighton and Sussex Medical School. http://www.bsms.ac.uk/research/our-researchers/sube-banerjee/demqol/. Accessed 1 Dec 2016.
20. Hendriks AAJ, Smith SC, Chrysanthaki T, Black N. Reliability and validity of a self-administration version of DEMQOL-Proxy. Int J Geriatr Psychiatry. 2017;32:734–41.
21. Andrich D, Sheridan B. RUMM 2030. Perth: RUMM Laboratory Pty Ltd; 1997–2016.
We thank all people with dementia and their family carers who participated in this study.
The report is based on independent research commissioned and funded by the NIHR Policy Research Programme (Using Patient Reported Outcome Measures to Assess Quality of Life in Dementia). The views expressed in the publication are those of the author(s) and not necessarily those of the NHS, the NIHR, the Department of Health, ‘arms’ length bodies or other government departments.
Ethics approval and consent to participate
Patients and carers provided written consent to take part. The study protocol was approved by the National Research Ethics Service Committee London (reference: 14/LO/1146) and the London School of Hygiene and Tropical Medicine (reference: 8418).
Consent for publication
Dr. Sarah Smith is the first author of the original development of DEMQOL and DEMQOL-Proxy (Smith et al. Development of a new measure of health-related quality of life for people with dementia: DEMQOL. Psychol Med. 2007;37:737–46). The instrument is publicly available at no charge. All other authors declared no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Hendriks, A.A.J., Smith, S.C. & Black, N. DEMQOL and DEMQOL-Proxy: a Rasch analysis among those diagnosed with dementia. Health Qual Life Outcomes 17, 161 (2019) doi:10.1186/s12955-019-1216-8
- Item analysis
- Rasch measurement theory