Scalability and internal consistency of the German version of the dementia-specific quality of life instrument QUALIDEM in nursing homes – a secondary data analysis

Background Quality of life (Qol) is a widely selected outcome in intervention studies. The QUALIDEM is a dementia-specific Qol-instrument from The Netherlands. The aim of this study is to evaluate the scalability and internal consistency of the German version of the QUALIDEM. Methods This secondary data analysis is based on a total sample of 634 residents with dementia from 43 nursing homes. The QUALIDEM consists of nine subscales that were applied to a subsample of 378 people with mild to severe dementia and six consecutive subscales that were applied to a subsample of 256 people with very severe dementia. Scalability, internal consistency and distribution scores were calculated for each predefined subscale using the Mokken scale analysis. Results In people with mild to severe dementia, seven subscales, care relationship, positive affect, negative affect, restless tense behavior, positive self-image, social relations and feeling at home, were scalable (0.31 ≤ H ≤ 0.65) and internally consistent (Rho ≥ 0.62). The subscales social isolation (H = 0.28) and having something to do (H = 0.18) were not scalable and exhibited insufficient reliability scores (Rho ≤ 0.53). For people with very severe dementia, five subscales, care relationship, positive affect, restless tense behavior, negative affect and social relations, were scalable (0.33 ≤ H ≤ 0.65), but only the first three of these subscales showed acceptable internal consistency (Rho 0.59 – 0.86). The subscale social isolation was not scalable (H = 0.20) and exhibited poor internal consistency (Rho = 0.42). Conclusions The results show an acceptable scalability and internal consistency for seven QUALIDEM subscales for people with mild to severe dementia and three subscales for people with very severe dementia. 
The subscales having something to do (mild to severe dementia), negative affect (very severe dementia), social relations (very severe dementia) and social isolation (both versions) produced unsatisfactory results and require revision.


Background
The main goal of caring for people with dementia is the maintenance and promotion of their quality of life (Qol) [1]. Qol has become an important outcome in intervention studies, particularly of psychosocial interventions, as well as an indicator of the quality of care for people with dementia [2][3][4][5]. The World Health Organization defines Qol as "individuals' perceptions of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns" [6]. One early and highly recognized model describes dementia-specific Qol as consisting of objective (e.g., behavioral competence and environment) and subjective (e.g., perceived Qol and psychological well-being) components (called 'sectors' by Lawton) [7,8]. Based on this theoretical approach, Jonker et al. developed a hierarchical model that defines psychological well-being as the starting point and central indicator for dementia-specific Qol [9]. Those authors argue for the consideration of non-dementia-related domains of Qol, such as personal factors (e.g., religion, income, age), alongside environmental characteristics and dementia-related domains.
In 2005, Ettema et al. defined the Qol of people with dementia based on a literature review that specified dementia-specific Qol as 'the multidimensional evaluation of the person-environment system of the individual, in terms of adaptation to the perceived consequences of the dementia' [10]. Based on this definition, the seven adaptive tasks of the adaptation-coping model [11] were interpreted as dementia-specific Qol domains: dealing with own disability, developing an adequate care relationship with the staff, preserving an emotional balance, preserving a positive self-image, preparing for an uncertain future, developing and maintaining social relationships and dealing with the nursing home environment [12]. This model highlights the importance of psychosocial domains, which is supported by a recent review that identified 10 psychosocial domains (e.g., attachment, social contact, spirituality) and 3 physical and practical domains (e.g., physical health, financial situation) of Qol judged important by people with dementia [13]. In the course of these theoretical developments, several dementia-specific Qol instruments have been developed, using self-ratings, proxy-ratings or direct observations as the data sources [14,15].
The majority of these instruments have been developed in English-speaking countries (particularly the USA and the UK). In Germany, Qol has recently been characterized as a nursing outcome by the medical service of the statutory long-term care insurance program [16]. Only one German Qol instrument has been developed to date: the Heidelberg instrument for the assessment of quality of life in dementia (H.I.L.D.E.) [17]. This instrument is not typically applicable in research studies because it is moderately time-consuming (> 30 min per resident). A recent review did not identify any Qol instrument for people with dementia that has been validated in Germany [18]. To the best of our knowledge, only a limited number of Qol instruments have been translated into German, including the Qol-AD [19], D-Qol [20,21] and QUALIDEM [22]. These instruments have not been fully psychometrically tested. With the exception of the QUALIDEM, these instruments do not sufficiently focus on the Qol domains that are judged important by people with dementia [23].
The QUALIDEM has been evaluated in terms of psychometric properties with a focus on the psychosocial domains of dementia-specific Qol [12]. The instrument is simple to administer [24] and was developed for proxy-rating of Qol throughout the entire course of dementia in nursing home residents [25]. Consequently, the use of the QUALIDEM is recommended for Qol assessment in the late stage of disease [5] and for longitudinal ratings [26]. Its focus on psychosocial domains allows the instrument to assess several important Qol domains (affect, attachment, self-image, being useful, social contact, sense of aesthetics in the living environment, security and privacy, self-determination and freedom) that were described in an earlier review [13] and judged as important by people with dementia.
The QUALIDEM was developed and validated between 2005 and 2007 in the Netherlands. It consists of two consecutive versions for people with mild to severe and very severe dementia. The stages of dementia severity are classified according to the Reisberg scale, the Global Deterioration Scale (GDS) and Functional Assessment Staging (FAST), the last of which ranges from 1 (no cognitive impairment) to 7 points (very severe dementia) [27].
a) The Qol of people with mild to severe dementia (FAST = 2-6) is assessed by the 37-item version covering nine domains: care relationship, positive affect, negative affect, restless tense behavior, positive self-image, social relations, social isolation, feeling at home and having something to do.
b) The domains positive self-image, feeling at home and having something to do cannot be assessed in people with very severe dementia (FAST = 7). The second version of the QUALIDEM therefore comprises 18 items covering six domains of Qol: care relationship, positive affect, negative affect, restless tense behavior, social relations and social isolation.
The response options for all items are "never", "rarely", "sometimes" and "frequently". In 2008, the QUALIDEM was translated into German by a certified agency using forward-backward translation. The back-translated version was verified by the questionnaire's first author, whose comments were taken into account for the adaptation of the German version. In an exploratory investigation, the German QUALIDEM showed evidence of construct validity, as measured by factor analysis, and moderate to high internal consistency [22]. This paper outlines the evaluation of the scalability (construct validity) and internal consistency of the German QUALIDEM, based on a large sample. The study followed a confirmatory methodological approach that has been used successfully by other studies in The Netherlands [12,26]. Additionally, the distribution of the subscale scores, differentiated by the subgroups of dementia severity, age and gender, is presented.

Methods
A secondary data analysis of three German studies was performed. The data were collected from a pre/post-test evaluation of quality instruments in nursing homes (InDemA: Interdisciplinary Implementation of Quality Instruments for the Care of residents with Dementia in Nursing Homes) [28,29], a cluster-randomized controlled trial evaluating the Serial Trial Intervention (STI-D: Serial Trial Intervention-Germany) [30] and a cross-sectional study on Dementia Care Mapping utilization (Leben-QD I: Strengthening Qol for people with dementia) [31].
The ethical committee of the German Society of Nursing Science approved the study protocol of the Qol-Dem project. Guidelines for the good practice of secondary data analysis AGENS [32] were applied for the quality assurance of data. Prior to data pooling, all three primary data sets were tested with respect to structure, completeness and plausibility. We also tested the comparability of the designs, measurements and samples of the three primary studies based on a systematic approach to pooled datasets of observational studies [33]. We judged differences between the three datasets as not likely to be relevant for our analysis.

Setting and participants
Data collection took place between October and December 2008 for the InDemA study, between January and March 2009 for the STI-D study and between September and November 2010 for the Leben-QD I study. The total sample comprises 634 residents with dementia from 43 nursing homes located in the area of Frankfurt/Main (STI-D study, n = 19) and in North-Rhine Westphalia (InDemA study, n = 15; Leben-QD I study, n = 9).
The inclusion criteria for the residents were the following: Mini Mental Status Examination (MMSE) [34] score ≤ 24 (InDemA and STI-D) or a FAST [27] score ≥ 2 (Leben-QD I); living in the nursing home for at least 2 weeks (Leben-QD I and InDemA) or 4 weeks (STI-D). The exclusion criterion was a documented diagnosis of schizophrenia or other psychotic disorders (InDemA and STI-D).

Procedures
Caregivers with different formal qualifications (registered nurses and nursing assistants) retrospectively filled in the questionnaires, with the answers referring to the last two weeks of observation. The caregivers were highly involved in the care of the people with dementia. To ensure standardization, the data collection was initiated by trained external research assistants (registered nurses and students in health care study programs).

Measurements
In the InDemA and STI-D studies, the MMSE was used for the assessment of dementia severity. The MMSE ranges from 0 to 30 points (≤ 24 points: mild dementia, ≤ 10 points: severe dementia) [27]. The MMSE was administered during an interview with the nursing home residents. Because this was associated with stress for many residents, the test was concluded early for ethical reasons if it was obvious that the resident would only reach an MMSE value < 10. Because there were no FAST scores available for the participants of the InDemA and STI-D studies (in contrast to Leben-QD I), the FAST scores of these participants were determined on the basis of their MMSE scores to form the two subsamples of interest (mild to severe dementia and very severe dementia). This approach followed a recommendation by Reisberg [27]. In all studies, the activities of daily living were assessed with the Physical Self Maintenance Scale (PSMS) [35]. This instrument consists of six items (toileting, feeding, dressing, grooming, physical ambulation and bathing). The response options range from 1 = no impairment to 5 = severe impairment, resulting in a total range of 6 to 30 points.
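The formation of the two subsamples can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the MMSE cut-off of < 10 for very severe dementia that is reported in the Limitations section and FAST = 7 for very severe dementia; the function name and the error handling are hypothetical.

```python
from typing import Optional


def assign_subsample(mmse: Optional[int] = None, fast: Optional[int] = None) -> str:
    """Return the QUALIDEM subsample for one resident.

    FAST is used when available (as in Leben-QD I); otherwise the MMSE
    serves as a stand-in (as in InDemA and STI-D).
    """
    if fast is not None:
        if not 2 <= fast <= 7:
            raise ValueError("FAST score outside the range covered by the QUALIDEM (2-7)")
        return "very severe" if fast == 7 else "mild to severe"
    if mmse is not None:
        if not 0 <= mmse <= 24:
            raise ValueError("MMSE score outside the inclusion range (0-24)")
        # Assumed cut-off: MMSE < 10 is treated as very severe dementia.
        return "very severe" if mmse < 10 else "mild to severe"
    raise ValueError("either an MMSE or a FAST score is required")
```

Residents assigned to "mild to severe" would be rated with the 37-item version, residents assigned to "very severe" with the 18-item version.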
For the description of residents' care dependency, levels defined by the German statutory Long-term Care Insurance were used (ranging from 1 = low to 3 = high). The sample characteristics of age and gender were assessed with single items.

Statistical analysis
The Mokken scale analysis is a non-parametric iterative method for identifying unidimensional sets of polytomous items from a multidimensional item bank. It provides additional information about the relationship between items. The Mokken scale analysis is a method of item response theory (IRT), which is based on the following assumptions: unidimensionality, local independence and monotonicity. The method evolved from the Guttman analysis for investigating hierarchies of items within scales [36]. In contrast to the parametric Rasch analysis, the Mokken scale analysis does not assume a logistic item response function [37]. Because the 'true' course of an item traceline is only visible after repeated investigation in several studies [38], we selected the Mokken scale analysis for the investigation of the ordinal QUALIDEM data. The Mokken scale analysis is established in the context of scale development and has been widely used in nursing [39,40], psychology [41,42] and quality of life research [12,26,43,44]. The evaluation of a scale using Mokken scaling results in Loevinger's coefficient H as an indicator for the scalability of each subscale. The scalability of a single item in relation to the other items in the scale or item set is expressed by the value H_i. According to Watson et al. [36], the following interpretation of H scores is applied: 0.30-0.39 = weak scale, 0.40-0.50 = medium scale, and > 0.50 = strong scale. The H_i score for each item should be non-negative for the Mokken model to hold. Additionally, the H_i values should be > 0.30. Items that fail to reach this H_i level have weak discrimination power and are not scalable [45].
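For orientation, the scalability coefficients can be written in their usual covariance form. The notation is an assumption introduced here and does not come from the paper: X_i denotes the score on item i, and Cov_max is the largest covariance that two items can attain given their observed marginal score distributions.

```latex
\[
H_{ij} = \frac{\operatorname{Cov}(X_i, X_j)}{\operatorname{Cov}_{\max}(X_i, X_j)}, \qquad
H_i = \frac{\sum_{j \neq i} \operatorname{Cov}(X_i, X_j)}{\sum_{j \neq i} \operatorname{Cov}_{\max}(X_i, X_j)}, \qquad
H = \frac{\sum_{i < j} \operatorname{Cov}(X_i, X_j)}{\sum_{i < j} \operatorname{Cov}_{\max}(X_i, X_j)}
\]
```

In this form, a value of 1 corresponds to a perfect Guttman pattern, and a negative H_i indicates an item that covaries negatively with the rest of the item set.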
Mokken's confirmatory scalability analysis was used to investigate whether the subscales of the original Dutch version of the QUALIDEM are represented in the German data. Based on the existing subscales, Loevinger's H_i was calculated for each item of an item set, and H was calculated for each subscale. These calculations were performed for each of the three primary studies and for the total sample using the Mokken package for the software R [46,47]. In this way, the robustness of the scales was tested in different subpopulations [37].
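The calculation of H_i and H for one predefined subscale can be sketched in a few lines. This is an illustrative re-implementation under stated assumptions, not the R package used in the study: it uses the covariance form of Loevinger's coefficient, in which the maximum covariance given the marginals is obtained by pairing the sorted scores of both items. The function names are hypothetical, and an item without variance would cause a division by zero.

```python
from itertools import combinations


def _cov(x, y):
    """Population covariance of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def _cov_max(x, y):
    """Largest covariance attainable with the observed marginals (sorted pairing)."""
    return _cov(sorted(x), sorted(y))


def scalability(items):
    """items: one list of scores per item, persons in the same order.

    Returns (list of item coefficients H_i, scale coefficient H).
    """
    k = len(items)
    cov = [[0.0] * k for _ in range(k)]
    cov_max = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        cov[i][j] = cov[j][i] = _cov(items[i], items[j])
        cov_max[i][j] = cov_max[j][i] = _cov_max(items[i], items[j])
    h_items = [
        sum(cov[i][j] for j in range(k) if j != i)
        / sum(cov_max[i][j] for j in range(k) if j != i)
        for i in range(k)
    ]
    pairs = list(combinations(range(k), 2))
    h_scale = sum(cov[i][j] for i, j in pairs) / sum(cov_max[i][j] for i, j in pairs)
    return h_items, h_scale


def interpret(h):
    """Interpretation of H following the thresholds cited in the text."""
    if h < 0.30:
        return "not scalable"
    if h < 0.40:
        return "weak"
    if h <= 0.50:
        return "medium"
    return "strong"
```

Applied to the reported values, `interpret(0.28)` classifies social isolation as not scalable and `interpret(0.65)` classifies the upper end of the reported range as a strong scale.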
The internal consistency of the QUALIDEM was assessed with the coefficient Rho. This coefficient is not as prone to bias as Cronbach's alpha [48]. A Rho score > 0.60 indicates a reliable scale [49]. For comparison purposes only, we also calculated Cronbach's alpha.
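Cronbach's alpha, which was calculated for comparison, can be sketched as follows. The study computed it in SPSS, so this is only an illustration of the standard formula with a hypothetical function name. The coefficient Rho is not reproduced here because its estimation is more involved.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, persons in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score per person across all items of the subscale.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

With only two or three items, as in some subscales of the 18-item version, alpha tends to be low even when the items are moderately correlated, which is one reason to read it alongside Rho.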
To investigate the distribution of the QUALIDEM scores, the 10th, 25th, 50th (median), 75th and 90th percentiles were calculated for each subscale for people with mild to severe dementia and for those with very severe dementia, grouped by gender and age. The subscale scores were transformed to values between 0 and 100. Cronbach's alpha, percentiles, means and standard deviations were calculated, and the missing value analysis was performed, using SPSS version 18.
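The transformation and the percentile calculation can be sketched as follows. The sketch rests on assumptions that are not stated in the text: items are scored 0 to 3 with negatively worded items already reversed, the transformation is a linear rescaling of the raw sum, and percentiles are obtained by linear interpolation, which may differ from the definition used by SPSS. Both function names are hypothetical.

```python
def rescale_0_100(raw_sum, n_items, max_item_score=3):
    """Linearly rescale a raw subscale sum to the range 0-100."""
    return 100.0 * raw_sum / (n_items * max_item_score)


def percentile(values, p):
    """p-th percentile (0-100) by linear interpolation between ordered values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    position = (len(ordered) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)
```

Rescaling makes subscales with different numbers of items comparable, so that a median of 50 or higher has the same meaning for every subscale.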

Study population
The sample consisted of 149 participants from the InDemA study, 338 participants from the STI-D study and 147 participants from the Leben-QD I study. For the investigation of both QUALIDEM versions, the sample was divided in two subsamples of 378 people with mild to severe dementia and 256 people with very severe dementia (Tables 1 and 2).

Missing value analysis
For the 37 QUALIDEM items for people with mild to severe dementia, 19 responses (0.1%) of a maximum of 13986 possible responses (378 residents × 37 items) were missing in the total sample (Table 1). Approximately 0.3% of responses were missing in the total sample of people with very severe dementia (Table 2). The missing values were imputed for the next steps of the analysis using the two-way imputation method suggested by van der Ark and Sijtsma [50].
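The idea behind two-way imputation can be sketched as follows: a missing score is replaced by the person mean plus the item mean minus the overall mean of all observed scores. This sketch uses the deterministic form with rounding to the nearest admissible score. Whether the study added a random error term is not stated, and the 0 to 3 scoring as well as the function name are assumptions.

```python
import math


def two_way_impute(data, min_score=0, max_score=3):
    """data: one row per person, None marks a missing response.

    Returns a new matrix in which each missing value is replaced by
    round(person mean + item mean - overall mean), clipped to the score range.
    Assumes every person and every item has at least one observed score.
    """
    observed = [v for row in data for v in row if v is not None]
    overall_mean = sum(observed) / len(observed)
    n_items = len(data[0])
    item_means = []
    for j in range(n_items):
        column = [row[j] for row in data if row[j] is not None]
        item_means.append(sum(column) / len(column))
    completed = []
    for row in data:
        person_scores = [v for v in row if v is not None]
        person_mean = sum(person_scores) / len(person_scores)
        new_row = []
        for j, value in enumerate(row):
            if value is None:
                estimate = person_mean + item_means[j] - overall_mean
                value = min(max_score, max(min_score, math.floor(estimate + 0.5)))
            new_row.append(value)
        completed.append(new_row)
    return completed
```

For example, `two_way_impute([[3, 3, None], [1, 1, 1], [2, 2, 2]])` fills the gap with 3, because the person mean (3) plus the item mean (1.5) minus the overall mean (1.875) is 2.625.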

Scalability
The results are presented in Table 3. In brief, the results for the total sample of people with mild to severe dementia exhibited scalability of the adopted subscales with the exception of social isolation (H = 0.28) and having something to do (H = 0.18). Overall, the H values were stable in relation to the scalability level between the three primary studies. With the exception of items 13 (Indicates that he or she is bored), 20 (Openly rejects contact with others), 26 (Finds things to do without help from others), 32 (Calls out) and 38 (Enjoys helping with chores on the ward), all items in the adopted subscales were scalable.
The analysis for the people with very severe dementia demonstrated scalability for all predefined subscales excluding social isolation (H = 0.20). Between the primary studies, the difference in the H values varied from 0.11 for restless tense behavior and social isolation to 0.36 for negative affect. Items 16 (Is rejected by other residents), 20 (Openly rejects contact with others), 22 (Has tense body language) and 32 (Calls out) were not scalable (Table 4).

Internal consistency
The scalable subscales for mild to severe dementia exhibited a Rho coefficient between 0.62 and 0.91, which indicates internally consistent scales (Table 3). For people with very severe dementia, the scalable subscales of care relationship, positive affect and restless tense behavior were internally consistent (Rho ≥ 0.59) (Table 4).

Percentile scores
The results in Table 5 demonstrate the percentage of people who have a Qol score above or below a certain score on a respective subscale. A higher score indicates a higher Qol.
For all subscales, the median for participants was ≥ 50 regardless of the dementia severity, which reflects relatively high scores in general. The medians in the subscales care relationship (81 vs. 78), positive affect (75 vs. 67), negative affect (78 vs. 67) and restless tense behavior (67 vs. 56) were higher for people with mild to severe dementia than for people with very severe dementia. The Qol domains social relations and social isolation exhibited no difference in medians. The analysis of the three age groups demonstrated a stable distribution of the Qol scores for all subscales. With respect to gender, the median scores of men were higher than those of women in the subscales positive affect, negative affect, restless tense behavior and social isolation. In contrast, the subscales care relationship, social relations, feeling at home and having something to do exhibited higher median scores for women.

Discussion
The results for the 37-item version for mild to severe dementia revealed that 7 of 9 subscales were scalable and internally consistent. The H values were stable with respect to the scalability level, independent of the nursing institution or primary study. The subscales social isolation and having something to do were not scalable. These results are largely comparable to a previous Dutch study in which the subscale social isolation was not scalable and the subscale having something to do was weakly scalable [26]. Consistent with these results, the subscale social isolation could not be identified in the German version of the QUALIDEM through an explorative factor analysis [22]. The results for the subscale social isolation might be explained by item 32, Calls out. Calling out is not necessarily an expression of social isolation, as it could be an
The analysis for the 18-item version of the QUALIDEM revealed that only the three subscales care relationship, positive affect and restless tense behavior were scalable and internally consistent. The Rho scores for these subscales varied between 0.59 (just acceptable) and 0.86 (good). This result is consistent with previous Dutch studies [12,26] and the results for alternative Qol instruments [15]. The remaining subscales were not reliable (negative affect and social relations) or not scalable (social isolation). The H values varied between the different primary studies, depending on the subscale, and no pattern was discernible. The insufficient internal consistency for these three subscales [26] and the unsatisfactory scalability of the subscale social isolation are consistent with previous results [22,26]. The subscale negative affect consists of two items, and the subscale social relations consists of three items. This small number of items could be a reason for the weak internal consistency. With respect to the H_i values for each item, item 22, Has tense body language, was not scalable. This finding is consistent with the results of Bouman et al. [26], who found an H_i value of 0.29 for this item.
The investigation of distributional scores identified relatively high Qol scores in general. For all subscales, 50% of the participants reached a score of 50 or higher, regardless of dementia severity. This result raises the question of the QUALIDEM's sensitivity for change, which has not been assessed. Information on responsiveness is scarce in general [51], which highlights the need for research on this topic. To use Qol as an outcome in intervention studies, evidence of the QUALIDEM's sensitivity for change is required [52].
At the 50th percentile, people with mild to severe dementia exhibited higher Qol scores in the subscales care relationship, positive affect, negative affect and restless tense behavior than did people with very severe dementia. In the latest review investigating the influence of cognition on health-related quality of life (HRQL) in dementia, no convincing evidence was found that lower cognitive abilities are associated with a lower HRQL [53]. In a recent study, however, cognition was identified as a predictor of Qol [54]. Depending on the subscale, either men or women exhibited higher values at the 50th percentile. In the mentioned review, gender did not appear to have any effect on HRQL [53]. The differences in the Qol distribution between age groups, cognition groups and genders require verification in further investigations. In accordance with previous findings [54,55], our descriptive statistical analyses did not show substantial differences between age groups regarding Qol values. The skewed distribution towards high Qol values raises the question of the validity of the data. The Qol assessment using the QUALIDEM is based on a proxy-assessment, which is preferred in advanced dementia and for longitudinal Qol evaluation [10]. Proxy-rated Qol instruments have several methodological difficulties. Proxy-rated Qol values of people with dementia are influenced by the burden [56] and attitudes of proxy-raters [57]. In several studies, the proxy-rated scores are systematically lower than self-rated Qol values [19,58]. Proxy-raters are professional caregivers who are responsible for the well-being of the residents and might underlie social
The evidence of proxy-ratings appears to contradict the possible assumption that the reported high Qol values are affected by the perspective of the Qol assessment.

Limitations
Our analysis is based on a large sample of people with dementia at all stages of the disease. However, some sub-analyses are based on a small sample size of n = 50. The close agreement with previous findings with respect to the scalability and internal consistency of the adopted subscales supports the validity of the study results [12,26]. These results must be interpreted in light of the existing reliability data from the Netherlands. A first Dutch investigation of interrater and intrarater reliability showed moderate to strong agreement, depending on the QUALIDEM subscale [12]. There are preliminary results for the subscales of the German QUALIDEM that are comparable to the mentioned investigation [60]. Because of the available data, the FAST scores of the participants of two primary studies (InDemA and STI-D) were determined based on their MMSE scores. Contrary to a recommendation by Reisberg [27], patients with an MMSE score < 10 instead of an MMSE score < 6 were assigned to the very severe dementia group. As a result, a few more participants may have been assigned to the group of people with very severe dementia. This difference should have no effect on the results of the scalability and internal consistency because the 18 items for very severe dementia were also assessable for this group. However, this different classification might be a potential source of bias for the distribution scores of the subscales. Other influences on these scores were not investigated (e.g., effect of the type of dementia).

Conclusions
Our study demonstrates acceptable scalability and internal consistency for seven subscales for people with mild to severe dementia and three subscales for people with very severe dementia. These subscales were robust across different subpopulations. The results are largely consistent with previous findings. Concerning the subscales having something to do (mild to severe dementia), negative affect and social relations (both very severe dementia), the extent to which rewording of the items produces better scalability and internal consistency should be explored. The clarity and selectivity of the items of these subscales in particular must be taken into account, as the interpretation of the current items can differ. For example, item 12, Responds positively when approached, is not a precise item of the subscale social relations, as the wording of this item is similar to items from the subscale positive affect. This finding is underlined by a high factor loading on positive affect in a previous exploratory study [22]. The subscale social isolation should be omitted in the future because it was neither scalable nor internally consistent and duplicates content of the subscale social relations. A revision is recommended for item 13, Indicates that he or she is bored, which is not scalable in its present format. The presented percentile scores provide orientation values for comparisons with future studies. The relatively high Qol values for all subscales underline the need for further investigation of the QUALIDEM. Such investigations must be theory-based validity tests, which will be part of the Qol-Dem study. Methodological questions such as the possible influence of proxy characteristics (e.g., burden and attitudes towards dementia) and the decision-making process of proxies during the Qol assessment of people with dementia in general should be taken into account in future research.
In the next steps of the Qol-Dem project, the inter-rater and intra-rater reliability and validity (criterion and construct) of the German version of the QUALIDEM will be comprehensively investigated. These studies will result in a broader understanding of the quality of the QUALIDEM, which could be used for the development of a more advanced version of the instrument.