Cross-cultural validity of four quality of life scales in persons with spinal cord injury
Health and Quality of Life Outcomes, volume 8, Article number: 94 (2010)
Quality of life (QoL) in persons with spinal cord injury (SCI) has been found to differ across countries. However, comparability of measurement results between countries depends on the cross-cultural validity of the applied instruments. The study examined the metric quality and cross-cultural validity of the Satisfaction with Life Scale (SWLS), the Life Satisfaction Questionnaire (LISAT-9), the Personal Well-Being Index (PWI) and the 5-item World Health Organization Quality of Life Assessment (WHOQoL-5) across six countries in a sample of persons with spinal cord injury (SCI).
A cross-sectional multi-centre study was conducted and the data of 243 out-patients with SCI from study centers in Australia, Brazil, Canada, Israel, South Africa, and the United States were analyzed using Rasch-based methods.
The analyses showed high reliability for all 4 instruments (person reliability index .78-.92). Unidimensionality of measurement was supported for the WHOQoL-5 (Chi2 = 16.43, df = 10, p = .088), partially supported for the PWI (Chi2 = 15.62, df = 16, p = .480), but rejected for the LISAT-9 (Chi2 = 50.60, df = 18, p < .001) and the SWLS (Chi2 = 78.54, df = 10, p < .001) based on overall and item-wise Chi2 tests, principal components analyses and independent t-tests. The response scales showed the expected ordering for the WHOQoL-5 and the PWI, but not for the other two instruments. Using differential item functioning (DIF) analyses, potential cross-country bias was found in two items of the SWLS and the WHOQoL-5, three items of the LISAT-9 and four items of the PWI. However, by applying Rasch-based statistical methods, especially subtest analyses, it was possible to identify optimal strategies to enhance the metric properties and the cross-country equivalence of the instruments post-hoc. Following the post-hoc procedures, the WHOQOL-5 and the PWI worked in a consistent and expected way in all countries.
QoL assessment using the summary scores of the WHOQOL-5 and the PWI appeared cross-culturally valid in persons with SCI. In contrast, summary scores of the LISAT-9 and the SWLS have to be interpreted with caution. The findings of the current study can be especially helpful to select instruments for international research projects in SCI.
In the general population, quality of life (QoL) is measured across countries to indicate the state and development of societies, for example in the annual Eurobarometer of the European Commission or the World Values Survey. National levels of QoL have been found to be related to wealth, human rights, individualism, and the fulfillment of basic biological needs in a given society [3, 4]. Measuring QoL of individuals with certain health conditions provides information about health states beyond diagnosis, about the impact of a disease and its treatment on different domains of daily life, and about the health experience from the "insider" perspective of the affected persons themselves [5, 6]. In relation to health, QoL is measured across countries to compare the burden of disease and disability in different populations. However, QoL is not restricted to health-related issues.
The notion of QoL in general covers various concepts, including health-related quality of life (HRQoL) but also subjective well-being (SWB). HRQoL, on the one hand, describes difficulties caused by poor health on mental and physical functioning, task performance, participation in life areas, or "health status" [8, 9]. SWB, on the other hand, includes overall life satisfaction, satisfaction with life domains, as well as positive and negative affect. Life satisfaction is traditionally viewed as a cognitive, needs-based approach towards QoL. It refers to the individual's personal evaluation of the gap between his or her aspirations and achievements. More recently, a cognitive-affective conceptualization of satisfaction has also been discussed [10, 11].
Essentially, life satisfaction is related to the subjective "insider" perspective and is increasingly considered as a meaningful and efficient way to collect information about QoL [12, 13]. Assessing QoL of individuals in health services provision and research complements measurement that is based on performance, and adds relevant information for treatment decision-making and outcome evaluation [6, 14].
QoL of persons who sustained spinal cord injury (SCI) seems to be diminished compared to the general population [15, 16]. QoL appears not to be directly related to the severity of SCI [16, 17], but it is related to perceived health, participation and integration, to social support and relationships as well as to living circumstances, e.g. accessibility or income [15, 17].
Several reviews summarized the application and metric properties of QoL measures in SCI [16, 18–20]. Among the various instruments with promising properties were also short scales, such as the Satisfaction with Life Scale (SWLS), which is part of the United States SCI Model Systems, the Life Satisfaction Questionnaire (LISAT), or the World Health Organization Quality of Life Assessment (WHOQOL-BREF).
QoL in persons with SCI has been found to differ across countries [25, 26]. Such differences may be related to country level factors (e.g. culture and values), to internal and external individual level factors (e.g. personality, self-esteem or social support), as well as their interactions (e.g. social desirability). However, differences found in these studies may also reflect the properties of the measurement instruments used.
The comparability of measurement results between countries depends on the cross-cultural validity of the applied instruments. Common steps in various guidelines for cross-cultural adaptation of QoL instruments include systematic translation procedures and cross-cultural testing of psychometric properties. There have been efforts to develop and/or validate QoL instruments cross-culturally (e.g. the WHOQoL-development or the International Quality of Life Assessment project) [30, 31]. However, the cross-cultural validity and international comparability of QoL measurement is not well established in SCI.
Psychometric properties, such as reliability and validity, can be examined using different techniques. Currently, Rasch-based methods are becoming increasingly popular in the context of rehabilitation outcome measurement. They are used to create interval scale measurement and can reveal metric difficulties of the measures, but they also provide techniques to account for these difficulties at a statistical level in certain circumstances, for example by item reduction, collapsing response scale options, or splitting items. Thus, Rasch-based methods have also been used to examine and account for cross-cultural bias in outcome measures [33, 34].
The objective of this study is to examine the cross-cultural validity of selected QoL scales across countries in a sample of persons with SCI using Rasch analysis. The specific aims are (1) to examine and compare measurement properties of the instruments, namely, dimensionality, response scale structure, and reliability; (2) to examine the validity of the instruments across countries; and (3) to examine possibilities to enhance the measurement properties and the cross-cultural validity of the instruments.
Design and setting
This cross-sectional multi-centre study was conducted as a nested project within the international collaborative development of the "ICF Core Sets for Spinal Cord Injury" [35, 36]. For the current analyses, data from participating study centers in Australia, Brazil, Canada, Israel, South Africa, and the United States are used.
Participants and data collection
Subjects were recruited through the six participating rehabilitation facilities. Patients were recruited who had sustained a SCI with an acute onset and who were at least 18 years old. Acute onset was defined as a trauma or non-traumatic event resulting in spinal cord dysfunction within 14 days of onset. Subjects with significant traumatic brain injury or diagnosed mental disorders prior to SCI were excluded. Prior to data collection participants were informed about the purpose and reason of the study and signed an informed consent.
For the purpose of the analyses presented in this paper data from outpatients were selected. In four of the participating centers data were also collected for inpatients. Overall, 109 inpatient data sets were available; however, 76% of these were from one country only (Israel). Thus, to avoid confounding of country with care setting, and to obtain a more homogeneous data set for the cross-country comparisons, the inpatient data were omitted.
The data collection included, besides socio-demographic and injury-related variables, four QoL measures: the Satisfaction with Life Scale (SWLS), the Life Satisfaction Questionnaire-9 (LISAT-9), the Personal Well-Being Index (PWI) and five satisfaction items from the World Health Organization Quality of Life Assessment (WHOQOL-5) [24, 38]. For the data collection, instruments were selected that include fewer than 10 items, focus on the concepts of life and domain satisfaction, and contain items that are applicable and not offensive to people with SCI (do not contain items on walking, kneeling, bending, etc.). In addition, psychometric properties and the availability of different language versions were considered. Short questionnaires are more feasible, acceptable, and impose less burden on the patients compared to longer instruments. They can be more easily embedded into routine clinical assessments or larger scale data collection schemes. Instruments were chosen with a focus on the aspect of satisfaction within the broader notion of QoL, as satisfaction is not only conceptually well-defined, but has also been traditionally considered as a clinically relevant person-centered outcome in rehabilitation.
In Australia, Canada, South Africa, and the United States the English versions of the instruments were used. For the SWLS and the WHOQOL also the Portuguese (Brazil) and the Hebrew (Israel) versions exist. However, for the LISAT and the PWI translations were not available in Brazil and Israel. In these cases, translations of the English version were prepared at the participating facilities.
Satisfaction with Life Scale
The SWLS was designed to assess global life satisfaction. It addresses the cognitive evaluation of one's own life in terms of ideal life, wish for change, and current and past satisfaction. The SWLS consists of five items with a 7-point Likert scale from "strongly disagree" to "strongly agree". Reliability and validity of the scale have been examined in several studies [21, 40, 41], also for various translations and in different countries [42, 43]. The SWLS has been used in cross-country studies in the general and student populations and is also widely used in SCI research, especially in the United States [22, 44–49]. Internal consistency coefficients range between .79 and .89, and several studies confirmed the single factor structure of the SWLS [21, 41–43, 50]. However, studies in SCI have scarcely reported on the psychometric properties of the instrument. Two studies comparing general population samples in the United States and Russia, and in Norway and Greenland, respectively, hinted at potential cross-cultural biases affecting the interpretation of the SWLS.
Life Satisfaction Questionnaire
The LISAT-9 is a measure of domain-specific life satisfaction. It consists of nine items including one on general life satisfaction and eight domain-specific items (self-care, vocational, financial, leisure situation, sexual life, partner relationship, family life, social contacts). Responses are rated along a 6-point scale from "very dissatisfying" to "very satisfying". Among the psychometric properties of the LISAT, internal consistency and factorial structure are reported in the literature [23, 53, 54]. A 3-factor structure has been shown for the LISAT-9 and a 4-factor structure for the LISAT-11, with internal consistency reliability of the factors between .57 and .79 (overall .85) [23, 53]. Thus, analyses using the LISAT are frequently done item-wise, but also using the mean or median of the scores. The instrument has been widely used in SCI research, mainly in Europe [25, 54–59]. Little is known about the measurement properties of the LISAT in non-European countries, and only a few studies have addressed the psychometric properties of the LISAT in the SCI population. The LISAT has also been used to compare SCI samples across countries (Sweden and Japan; China and UK; UK, Germany, Austria, and Switzerland), however, without considering potential cross-cultural validity issues [25, 26, 58].
Personal Well-Being Index
The PWI consists of 7 items about satisfaction with specific life domains (living standard, health, achievement, relationships, safety, community, future security) and one optional item about overall life satisfaction. Responses are provided on a 0-10 numeric rating scale with the end points "completely dissatisfied" and "completely satisfied". The PWI has been developed in Australia for use in national surveys and has been adapted for international use. Validity and reliability of the PWI have been demonstrated in general population samples from different countries [37, 60–62]. The PWI has been designed as a unidimensional tool with internal consistencies between .70 and .85. Although already used in various countries (Australia, Hong Kong/China, Algeria), a rigorous examination of cross-cultural validity has not yet been conducted. The PWI has not been used with persons with SCI so far.
World Health Organization Quality of Life Assessment-5
The WHOQOL-5 is a selection of five satisfaction items out of the World Health Organization's short health-related quality of life measure, the WHOQOL-BREF. The 5 items cover overall quality of life, satisfaction with health, daily activities, relationships, and living conditions. The WHOQOL and WHOQOL-BREF were specifically developed for cross-cultural use and are currently available in 36 languages. Psychometric properties have been examined in 23 countries with samples of sick and healthy persons [24, 38, 63], with internal consistency coefficients lying between .75 and .87. The WHOQOL-BREF has also been applied in people with SCI [64, 65]. A selection of 8 items out of the WHOQOL-BREF (including the 5 items in this study) was used in the EUROHIS project across 10 European countries and showed satisfactory psychometric properties, unidimensionality and cross-cultural validity [66, 67]. The 5-item version has been used in different international WHO collaboration projects since 2002 [35, 68, 69], but has not been psychometrically tested previously in this format.
Ethics committee approval
The study was carried out in compliance with the Helsinki Declaration. The design and materials were approved by the Ethics Committee of the Ludwig-Maximilian University Munich, as well as by the respective Ethics Committees for the study centers in each world region.
Rasch analyses were carried out using the RUMM software and applying the partial credit Rasch model. This model is a special case of the one-parameter Rasch model. In the field of Rasch-based or item response modeling, further types of models exist, e.g. two- or three-parameter item response models, nonparametric Mokken analyses, or mixed Rasch models. The use of these models might result in better fit of the data, as they consider varying item difficulty curves, varying homogeneity or monotonicity of the data, or multiple latent classes within the sample populations. However, the one-parameter Rasch model is especially helpful for developing precise and accurate measurement instruments, as it imposes strict requirements on the items and is not data-driven. Through its mathematical formulation, it can ensure fundamental measurement in the tradition of Guttman's work within a probabilistic framework [72, 73].
Applying this type of Rasch analysis, three types of parameters are estimated: the person parameters (for the patients), the item parameters, and the parameters of the thresholds of the response scale (e.g. four threshold parameters for a 5-point Likert scale). These parameters describe the position of the persons, items and thresholds on the unidimensional continuum of the measured latent trait (e.g., low to high quality of life).
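To illustrate how person and threshold parameters jointly determine the response probabilities, the following is a minimal sketch of the category probabilities of one item under the partial credit model. It is not the RUMM implementation; the function name pcm_probabilities and the threshold values are hypothetical, and the thresholds are assumed to be given as absolute locations in logits.

```python
import math

def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the partial credit model.

    theta: person location on the latent trait (logits).
    thresholds: threshold locations of the item (logits); m thresholds
    give m + 1 response categories scored 0..m.
    """
    # Cumulative sums of (theta - threshold); category 0 has an empty sum.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - tau))
    weights = [math.exp(e) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical 5-point item (four thresholds), person located at 0 logits.
probs = pcm_probabilities(0.0, [-2.0, -0.5, 0.5, 2.0])
```

With these symmetric, ordered thresholds, the middle category is the most probable response for a person located at 0 logits.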
First, the unidimensionality of each instrument was examined. Unidimensionality describes the idea that items should contribute to the measurement of only one attribute at a time and should not be confounded by other attributes or dimensions. This ensures the interpretability of the summary scores of the instrument. Unidimensionality can be checked for by comparing the observed responses in a set of items to the expected values predicted by the unidimensional Rasch model. The fit of each item is indicated by standardized residuals (z values) and Chi2 test results. Z values exceeding +/-2.5 are considered to indicate misfit to the Rasch model. For the Chi2 significance tests, a Bonferroni-corrected critical p-value at the 5% level was applied.
To further examine unidimensionality, principal components analyses (PCA) of the residuals not explained by the Rasch model were performed. The residuals should show a random pattern to indicate unidimensionality. Given the sample size in this study, eigenvalues below 1.9 in the PCA results are indicative of random residual variation, whereas eigenvalues above 1.9 indicate some structure in the residuals. In addition, the Rasch person parameters of each patient were estimated separately for the items with positive versus negative loadings on the first PCA factor, and then compared using independent t-tests. The percentage of significant t-tests (α = 0.05) should not exceed 5% [78, 79].
The structure of the response scale for each instrument was studied based on the ordering of the threshold parameters. The threshold parameters should take increasing values, as they represent the successive transition points along the response scale from low to high quality of life. Reversed thresholds show that the scores do not differentiate as intended.
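The item-level fit criteria described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys (name, z, p) and all values are hypothetical, and the Bonferroni correction is assumed to divide the 5% level by the number of items tested.

```python
def bonferroni_alpha(alpha, n_tests):
    """Bonferroni-corrected critical p-value for n_tests comparisons."""
    return alpha / n_tests

def flag_misfit(items, alpha=0.05, z_limit=2.5):
    """Return names of items whose residual or Chi2 p-value signals misfit.

    items: list of dicts with hypothetical keys "name", "z" (standardized
    fit residual) and "p" (p-value of the item Chi2 test).
    """
    critical_p = bonferroni_alpha(alpha, len(items))
    return [it["name"] for it in items
            if abs(it["z"]) > z_limit or it["p"] < critical_p]

# Hypothetical fit statistics for a 5-item scale (critical p = 0.05 / 5).
items = [
    {"name": "item 1", "z": 0.4, "p": 0.62},
    {"name": "item 2", "z": 3.1, "p": 0.20},
    {"name": "item 3", "z": -1.0, "p": 0.004},
    {"name": "item 4", "z": 1.2, "p": 0.03},
    {"name": "item 5", "z": -0.7, "p": 0.45},
]
flagged = flag_misfit(items)
```

In this made-up example, item 2 is flagged by its residual and item 3 by its Chi2 test, while item 4 (p = 0.03) passes because the corrected critical value is 0.01.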
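The t-test step can be sketched as below, assuming that each patient has two person estimates (one per item subset) with their standard errors, and that a normal approximation with a critical value of 1.96 is an acceptable stand-in for the two-sided 5% test. The function name and all numbers are hypothetical.

```python
import math

def share_significant(est_a, se_a, est_b, se_b, critical=1.96):
    """Share of persons whose two subset-based estimates differ significantly.

    est_a, est_b: person estimates from the two item subsets (logits).
    se_a, se_b: the corresponding standard errors.
    critical: assumed two-sided 5% critical value (normal approximation).
    """
    flags = []
    for a, sa, b, sb in zip(est_a, se_a, est_b, se_b):
        t = (a - b) / math.sqrt(sa ** 2 + sb ** 2)
        flags.append(abs(t) > critical)
    return sum(flags) / len(flags)

# Hypothetical estimates for four persons; only the last pair differs clearly.
share = share_significant(
    [0.5, 1.2, -0.3, 2.0], [0.5, 0.5, 0.5, 0.5],
    [0.4, 1.0, -0.2, 0.1], [0.5, 0.5, 0.5, 0.5],
)
```

A share above 0.05 would question unidimensionality under the criterion stated above; here the toy sample is far too small for such a conclusion and only demonstrates the calculation.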
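Checking the threshold ordering amounts to verifying that each item's threshold estimates increase strictly. A minimal sketch, with hypothetical item names and threshold values:

```python
def disordered_items(thresholds_by_item):
    """Return the names of items whose thresholds are not strictly increasing."""
    return [
        name
        for name, taus in thresholds_by_item.items()
        if any(later <= earlier for earlier, later in zip(taus, taus[1:]))
    ]

# Hypothetical threshold estimates (logits) for two 5-point items.
thresholds = {
    "item A": [-1.5, -0.2, 0.6, 1.8],   # ordered
    "item B": [-1.0, 0.4, 0.1, 1.5],    # second and third thresholds reversed
}
reversed_items = disordered_items(thresholds)
```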
Reliability is indicated by the person reliability index, which is the Rasch-based correspondent to Cronbach's alpha [71, 81]. The person reliability index is constructed using the person parameter estimates and the standard errors of measurement to calculate the ratio of true person ability variance to the observed variance [74, 82]. It ranges between 0 and 1, where the value of 1 indicates perfect reproducibility of person placements on the latent continuum.
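The ratio described above can be sketched as follows, assuming that the true variance is approximated by the observed variance of the person estimates minus the mean squared standard error. The function name and the numbers are hypothetical, and the software used in the study may differ in detail.

```python
import statistics

def person_reliability(estimates, standard_errors):
    """Ratio of estimated true person variance to observed person variance."""
    observed_var = statistics.variance(estimates)
    error_var = statistics.fmean([se ** 2 for se in standard_errors])
    return (observed_var - error_var) / observed_var

# Hypothetical person estimates (logits) with equal standard errors.
reliability = person_reliability([-1.0, 0.0, 1.0, 2.0], [0.5, 0.5, 0.5, 0.5])
```

In this toy example, the observed variance is 5/3 and the error variance is 0.25, which gives a reliability of 0.85.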
To examine the cross-cultural validity of the four instruments across countries, differential item functioning (DIF) analyses were conducted. Potential DIF is ascertained for each item by comparing the standardized residuals between the countries and across the latent trait continuum of QoL using a two-way analysis of variance (ANOVA). A significant main effect of the country (uniform DIF) or a significant interaction effect in the ANOVA results (e.g. Country × QoL, non-uniform DIF) indicates problems with the cross-country comparability of the responses. If no DIF is apparent, the scores are comparable across countries. A respective Bonferroni-corrected type I error level was applied. Tukey-Kramer post-hoc tests allowed identifying the countries that contribute to DIF in the data.
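The logic of the two-way ANOVA on residuals can be sketched for the simplified case of a balanced design. The assumptions here are that the latent trait is grouped into class intervals, that every country by class-interval cell holds the same number of residuals, and that only F statistics are computed (p-values would require an F distribution from a statistics library). Real data are rarely balanced, so this is an illustration rather than the procedure used in the study; all names and values are hypothetical.

```python
from statistics import fmean

def two_way_anova_f(cells):
    """F statistics for country and country x trait effects (balanced design).

    cells[i][j] holds the standardized residuals of one item for country i
    and trait class interval j; every cell needs the same size (> 1).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[fmean(cells[i][j]) for j in range(b)] for i in range(a)]
    row_mean = [fmean(cell_mean[i]) for i in range(a)]
    col_mean = [fmean([cell_mean[i][j] for i in range(a)]) for j in range(b)]
    grand = fmean(row_mean)
    ss_country = b * n * sum((m - grand) ** 2 for m in row_mean)
    ss_inter = n * sum(
        (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    ms_error = ss_error / (a * b * (n - 1))
    return {
        "uniform_dif_F": (ss_country / (a - 1)) / ms_error,
        "nonuniform_dif_F": (ss_inter / ((a - 1) * (b - 1))) / ms_error,
    }

# Hypothetical residuals: 2 countries x 2 class intervals x 2 persons.
cells = [
    [[0.9, 1.1], [0.8, 1.2]],      # country 1: residuals consistently positive
    [[-1.1, -0.9], [-1.2, -0.8]],  # country 2: residuals consistently negative
]
result = two_way_anova_f(cells)
```

The large country F value together with an interaction F value of zero corresponds to the pattern described above as uniform DIF.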
Based on the results of Rasch analyses different approaches can be taken to account for weaknesses in the metric properties of the instruments post-hoc. To come up with suggestions to enhance the measurement properties and cross-cultural validity of the instruments across countries, four alternative strategies of handling the data set were tested and compared. As a result, for each instrument an optimal solution for handling the data could be identified, which allows for acceptable measurement properties with as little change to the instrument as possible. Figure 1 gives an overview of the four strategies implemented in the post-hoc analyses.
In the first strategy, response scale disorder was addressed first. Disordered response categories were collapsed, i.e. adjacent response options were merged and the scores recoded for all items of the instrument if more than half of the items showed disorder. In addition, items that still misfitted after the collapsing were deleted using a step-wise top-down deletion strategy until the remaining items fit the model.
In the second strategy, item misfit was attended to first by using the step-wise top-down deletion strategy, and the remaining fitting items were then checked again for response scale disorder.
The third strategy focused on accounting for DIF. So-called subtest analyses were conducted, which were used to merge the scores of those items that display DIF for country. Thereby, if two items of an instrument show DIF but in opposite directions, they can be combined into one score, which adjusts for the lack of invariance across countries. The advantage of this strategy, if it is successful in ameliorating DIF, is that no changes to the items are necessary and the summary score of the instrument can be interpreted as comparable across countries.
The fourth strategy also addressed DIF, but applied the subtest analyses to the result of either strategy one or strategy two, depending on which of the two had been more effective for the instrument so far (i.e. enhanced statistics with less change).
Strategies one to three were applied to all four instruments (according to the properties in the basic analyses), and after each step, the overall and item fit, DIF, response scale ordering, and reliability were documented. The fourth strategy was only applied if the first three did not result in acceptable metric properties.
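Collapsing response categories is, in practice, a recoding of the raw scores. The sketch below shows the decision rule (more than half of the items disordered) and one recoding; the mapping from a 6-point to a 4-point scale is purely illustrative, and the function names are hypothetical.

```python
def should_collapse(n_disordered_items, n_items):
    """Collapse for the whole instrument if more than half of the items are disordered."""
    return n_disordered_items > n_items / 2

def collapse_categories(scores, mapping):
    """Recode raw scores with a mapping that merges adjacent categories.

    Missing responses (None) are passed through unchanged.
    """
    return [None if s is None else mapping[s] for s in scores]

# Illustrative mapping: a 6-point scale (1..6) recoded to 4 categories (0..3)
# by merging the three middle options.
mapping = {1: 0, 2: 1, 3: 1, 4: 1, 5: 2, 6: 3}
recoded = collapse_categories([1, 2, 3, 4, 5, 6, None], mapping)
```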
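At the data level, forming a subtest means replacing the merged items by a single "super-item" whose score is the sum of their scores. A minimal sketch, assuming responses are stored as one dictionary per person and that a missing response on any merged item makes the subtest score missing (both are assumptions, as are all names):

```python
def merge_into_subtest(responses, item_names, subtest_name):
    """Replace the given items by one subtest holding the sum of their scores.

    responses: list of dicts (one per person) mapping item name -> score.
    """
    merged = []
    for person in responses:
        rest = {k: v for k, v in person.items() if k not in item_names}
        parts = [person[k] for k in item_names]
        rest[subtest_name] = None if any(p is None for p in parts) else sum(parts)
        merged.append(rest)
    return merged

# Hypothetical data: items "a" and "b" show DIF in opposite directions.
responses = [{"a": 3, "b": 1, "c": 2}, {"a": None, "b": 4, "c": 0}]
merged = merge_into_subtest(responses, ["a", "b"], "a+b")
```

The instrument's total score is unchanged by this step, which is why the strategy leaves the questionnaire itself untouched.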
The efficiency of the different strategies was determined by the metric properties on the one hand and the modifications to the instrument on the other hand. The metric properties were considered hierarchical in terms of desirability: item and overall fit were considered the most important criteria to be fulfilled first, DIF the second, and response scale ordering the third criterion. Regarding the modifications to the instruments, the merging strategy was considered the least invasive strategy, as it does not require changes to the items or the response scale. Collapsing of response options was considered the second least invasive strategy, as it requires the recoding of responses, but no changes to the items. Deletion of items was considered an invasive strategy, as it alters the instrument from its original version.
Thus, if for example the strategies one to three all resulted in acceptable metric properties in terms of fit, DIF, and response scale ordering, then the merging strategy three would be preferred as optimum solution, for being least invasive.
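One possible reading of this decision rule is a lexicographic ranking: fit first, DIF second, response scale ordering third, and the invasiveness of the modifications as the final criterion. The sketch below encodes that reading; the numeric invasiveness weights, the summation of weights, and the result records are assumptions made for illustration, not part of the study.

```python
# Assumed weights: merging < collapsing < deleting, as ranked in the text.
INVASIVENESS = {"merge": 1, "collapse": 2, "delete": 3}

def rank_key(result):
    """Sort key: failed criteria first in order of importance, then burden."""
    burden = sum(INVASIVENESS[m] for m in result["modifications"])
    return (not result["fit"], not result["no_dif"], not result["ordered"], burden)

def choose_strategy(results):
    """Return the name of the best-ranked strategy."""
    return min(results, key=rank_key)["name"]

# Hypothetical outcome in which strategies one to three are all acceptable.
results = [
    {"name": "strategy 1", "modifications": ["collapse", "delete"],
     "fit": True, "no_dif": True, "ordered": True},
    {"name": "strategy 2", "modifications": ["delete"],
     "fit": True, "no_dif": True, "ordered": True},
    {"name": "strategy 3", "modifications": ["merge"],
     "fit": True, "no_dif": True, "ordered": True},
]
best = choose_strategy(results)
```

With all three strategies acceptable, the merging strategy is selected, which matches the example given above.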
From six countries and four different world regions, overall, 243 out-patients with SCI were included in the study. Table 1 shows the socio-demographic and SCI-related characteristics of the study sample. Table 2 shows the mean raw scores, respective standard deviations, and the number of valid responses in the four instruments overall, per item, and per country.
Statistics for the examined measurement properties of the 4 instruments are documented in Table 3. The SWLS showed overall misfit to the Rasch model according to the significant Chi2 test and the PCA eigenvalue. At the item level, 3 out of 5 items showed misfit to the model. In terms of response scale structure, 3 out of 5 items had disordered thresholds. Reliability was high with a value of 0.88.
For the LISAT-9, the overall fit statistics (i.e. Chi2 test, PCA eigenvalue, and independent t-test approach) consistently contradict the assumption of unidimensionality. At item level, 3 items out of 9 showed misfit to the Rasch model. In 5 items the response scale thresholds were disordered. The person reliability index was high with a value of 0.86.
For the PWI, the Chi2 statistics suggested unidimensionality overall as well as for the individual items. However, the eigenvalue and the t-test approach questioned the assumption of unidimensionality of the instrument. The response scale thresholds were all ordered with the exception of 1 item out of the 8. Reliability was high with a value of 0.92.
For the WHOQoL-5 all overall statistics confirmed unidimensionality, but one of the items misfitted the model according to the significant Chi2 test result. All response scale thresholds were ordered and reliability was within an acceptable range with a value of 0.78.
The results of the DIF analyses to examine the cross-cultural validity of the 4 instruments are displayed in Table 4. Uniform DIF across countries was found in two items of the SWLS and the WHOQoL-5, three items of the LISAT-9 and four items of the PWI. Non-uniform DIF was found only in the item "Leisure situation" of the LISAT-9 (data not shown). For the SWLS and the LISAT-9 the data from Israel showed most frequently significant differences from the other countries. For the PWI, the data from Australia and Canada showed most frequently significant differences to other countries. For the WHOQoL-5 this was the case for the data from Canada (data for post-hoc tests not shown).
Table 5 shows the statistics about instrument and item fit, response scale structure, and reliability for the 4 different strategies applied to enhance the measurement properties and the cross-cultural validity of the 4 instruments. Also, Table 4 contains the results of the final check for DIF after having identified the optimal option for handling the data.
Strategy 2 was regarded as the optimum choice for the SWLS. Two misfitting items were deleted using the step-wise data purification procedure. With this handling of the data, item fit and response scale order were achieved, and no DIF was apparent.
Strategy 4 was regarded the optimum choice for handling the data for the LISAT-9. Only after collapsing the response options, deleting two misfitting items and merging another two items with DIF were all the remaining items fitting, the response scale thresholds ordered (with one exception), and DIF not present.
Strategy 3 appeared to be the optimum choice for the PWI. The scores of the four items that displayed DIF prior to applying any post-hoc strategies were merged into two items, which led to no item misfit and no response scale disorder. However, one of the merged items remained inconsistent across countries and displayed DIF.
Strategy 3 was also the optimum choice for the WHOQoL-5. After merging the scores of those two items which initially indicated DIF, all items fitted the Rasch model, the response scale thresholds were ordered, and no DIF was found.
The study examined the metric properties of the Satisfaction with Life Scale (SWLS), the Life Satisfaction Questionnaire (LISAT), the Personal Well-Being Index (PWI) and the 5-item World Health Organization Quality of Life Assessment (WHOQoL-5) in a cross-country sample of persons with SCI based on Rasch analysis. Although all instruments displayed metric problems in the analyses and showed cross-country bias at first, it was possible to identify post-hoc strategies to ameliorate those problems. Such strategies can also be used in further studies to enhance the metric comparability of data across countries. The two instruments which performed best overall in this comparison in terms of reliability, dimensionality, response scale structure, and cross-cultural validity were the WHOQoL-5 and the PWI, both before and after applying the post-hoc strategies.
In the current study, high values of the person reliability index were found for all four instruments. For the WHOQoL-5 and the SWLS, the person reliability index in our study was similar to alpha coefficients reported in the literature in different samples and countries, including persons with spinal cord injuries [40, 41, 43, 63, 65–67]. However, for the PWI and the LISAT-9, the reliability index was higher than reliability measures reported earlier [37, 53, 54, 61, 62]. The person reliability index is the Rasch-based counterpart of Cronbach's alpha. In this study, alpha coefficients could not be calculated because of missing data. Rasch analysis, however, not only deals readily with missing data, but in general the person reliability index can also have the advantage of being a more conservative estimate of reliability under certain circumstances, e.g. when alpha may be inflated due to the number of items or the sample variance.
In line with an earlier study using structural equation modeling, unidimensionality can be assumed for the WHOQoL-5. For the PWI, previous studies indicated unidimensionality, which is partially supported by the statistics in this analysis [60, 62]. Unlike previous authors, we included the first, overall item in the analyses; in the item-wise examination, this overall item fitted the model along with the domain-specific items.
The assumption of unidimensionality was rejected for the LISAT-9 and the SWLS. Earlier studies, as well as the findings presented here, suggest that more than one dimension is assessed by the LISAT [23, 53]. In this study, unidimensionality of the remaining items was established after deleting the two items "partner relations" and "family life". The item "partner relations" had far more missing data than any of the other items (see Table 2), which might have caused metric irregularities. However, the standard error of the estimates was not larger compared to the other items, indicating acceptable precision of estimation. A potential explanation of how these two items differ from the others could lie in the specific meaning of the items in the context of SCI and in the specific experiences of the affected persons. While the other items may be related to the experienced difficulties and problems in body functions, activities and participation imposed by SCI (e.g. difficulties in sexuality, less contact with friends), the partner and family life items may be related to the more stable, positive, and support-providing relationships. Thus, the difference between the separate dimensions identified in the statistical analyses might be interpreted conceptually as negative versus positive experience, or as problems in own functioning versus support by others.
The results regarding the unidimensionality of the SWLS contradict the findings of several earlier studies, which demonstrated a single underlying dimension [21, 40–43, 47, 50]. In this study the last two items ("If I could live my life over, I would change almost nothing" and "So far I have gotten the important things I want in life") had to be removed before unidimensionality was achieved for the remaining three. A study from France using structural equation modeling found no support for the unidimensionality of the SWLS in a general population sample, and the authors proposed to take the last two items separately. They suggest that the semantic structure of those two items, which relate to the past, may explain the inconsistency among the items. In the current study the sample consisted of persons who have met with a major life event in the past, namely SCI. One thing that persons with SCI might want to change in the past and might be strongly dissatisfied with is the SCI itself. In the context of SCI, it could be hypothesized that the first items of the SWLS (related to present life satisfaction) might be connected to acceptance, and the last two items to grief and regret. These different connotations might explain, in line with the suggestion of Vautier et al. (2004), the observed inconsistency and disjunction among the items within the instrument.
Response scale structure
Considering the response scale structure of the instruments, the results suggest that the 5-step scale of the WHOQOL-5 ("very dissatisfied", "dissatisfied", "neither satisfied nor dissatisfied", "satisfied", "very satisfied") and the 11-step numeric rating scale of the PWI ("completely dissatisfied" to "neutral" to "completely satisfied") have the expected ordering, and that persons with SCI could differentiate between the steps consistently when responding to the items.
For the SWLS and the LISAT the response scale structure showed disordering in several items. For the SWLS, after removing the last two items for misfit, only one disordered item ("ideal life") remained. For the LISAT-9 the original 6-step rating scale was reduced to a 4-step solution in this study. The optimal solution in the post-hoc analyses appeared to be the merging of the response options "dissatisfying", "rather dissatisfying" and "rather satisfying". This merging of the response options parallels the cut-off used by Fugl-Meyer to dichotomize item scores (1-4 = unsatisfied/5-6 = satisfied), which places the "rather satisfying" option in the unsatisfied category. Accordingly, future studies could test the metric properties and usefulness of a modified 4-step scale for the LISAT with a suggested structure of "very dissatisfying", "dissatisfying", "satisfying", and "very satisfying".
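The collapsing of response options described above can be illustrated with a minimal sketch. It assumes that the six LISAT-9 options are coded 1 ("very dissatisfying") to 6 ("very satisfying"); this coding and the names `COLLAPSE_MAP` and `collapse_lisat` are illustrative assumptions and are not taken from the study. The sketch only recodes raw responses and does not re-estimate any Rasch parameters.

```python
# Assumed numeric coding of the original 6-step LISAT-9 response scale
# (1 = "very dissatisfying" ... 6 = "very satisfying"); hypothetical, not from the paper.
COLLAPSE_MAP = {
    1: 0,  # very dissatisfying
    2: 1,  # dissatisfying         (merged)
    3: 1,  # rather dissatisfying  (merged)
    4: 1,  # rather satisfying     (merged)
    5: 2,  # satisfying
    6: 3,  # very satisfying
}


def collapse_lisat(responses):
    """Recode 6-step responses to the 4-step solution; missing values (None) are kept."""
    return [None if r is None else COLLAPSE_MAP[r] for r in responses]


if __name__ == "__main__":
    # One hypothetical respondent with a missing answer on the last item.
    print(collapse_lisat([1, 2, 3, 4, 5, 6, None]))
```

The recoded data would then have to be refitted to the Rasch model to check whether the thresholds of the 4-step scale are ordered.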
The current findings hint at potential cross-country bias in all four examined instruments, largely in line with existing research. In the case of the SWLS, two earlier studies using different methodologies found indications that the comparability and interpretability of the scores across countries are not consistent [51, 52], which is now supported in an SCI sample.
Lau et al. (2005) found cross-cultural differences in the performance of the PWI between an Australian and a Hong Kong Chinese population and suggested that cultural response bias would be a plausible explanation for the differences. Our results in SCI showed DIF for 4 of the PWI items across the 6 countries, and Australia, besides Canada, was among the countries that showed the strongest deviation from the others. However, when the scores of the items with DIF were merged, the deviations proved to balance each other out. Thus, at the level of the summary score, cross-country comparability may be possible.
Schmidt et al. (2006) examined DIF for the Eurohis-QoL-8 instrument, which is a selection of 8 items from the WHOQOL-BREF and which includes the 5 items used in this study. They found acceptable cross-cultural properties for their instrument, which is in line with the findings here for the reduced 5-item version. Again, the minor deviation in the first DIF analyses could be alleviated by merging the two items "health" and "quality of life" to establish cross-country comparability of the summary score.
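The merging of item scores mentioned above can be sketched as follows. The idea is that items showing DIF are summed into a single subtest, so that opposing deviations can cancel at the subtest level while the total raw score stays unchanged. The function name, the item keys `item_3` to `item_5` and the response values are hypothetical placeholders; the refitting of the Rasch model to the merged data is not shown.

```python
def merge_into_subtest(person_scores, items_to_merge, subtest_name):
    """Return a copy of one person's item scores with the listed items summed into one subtest."""
    merged = {k: v for k, v in person_scores.items() if k not in items_to_merge}
    merged[subtest_name] = sum(person_scores[k] for k in items_to_merge)
    return merged


# Hypothetical responses of one person to a 5-item scale (values are made up).
person = {"quality_of_life": 3, "health": 2, "item_3": 4, "item_4": 3, "item_5": 1}

# Combine the two items for which DIF was reported into one subtest.
combined = merge_into_subtest(person, ["quality_of_life", "health"], "qol_health")

if __name__ == "__main__":
    print(combined)
    # The summary score is unaffected by the merging.
    print(sum(person.values()) == sum(combined.values()))
```

In this sketch the scale is reduced from five items to four scoring units, one of which has a wider score range than the original items.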
Although the LISAT has been used in cross-country studies [25, 26], those did not examine potential bias between the different language versions of the instrument. In this study, the post-hoc analyses showed that acceptable metric properties could only be achieved for the LISAT by applying the whole range of modification strategies, including the collapsing of response options, the deletion of items and the merging of item scores.
The study is subject to several methodological limitations. The major drawback of the study is the small sample size in the individual countries. For this reason certain statistical techniques for assessing psychometric characteristics and handling DIF could not be applied, e.g. the item-splitting method suggested by Tennant et al. However, the overall sample size was sufficient to reliably sustain the performed analyses. According to Linacre (1994), a sample size of n = 250 is sufficient to achieve stable item parameters. In the current analyses the stability of the parameters was high, as shown by the small standard errors of the item parameters (SE = 0.04-0.09, see Table 3). Second, as the study included a convenience sample of persons with SCI, selection bias cannot be ruled out and the generalizability of the results may be compromised. Third, the quality of the Portuguese and Hebrew language versions of the questionnaires was not tested prior to their use in these data collections. Fourth, as a more recent development, the PWI includes a further item on spirituality, which had not yet been included in the data collections for this study. Fifth, in these analyses only basic psychometric characteristics (i.e. reliability, unidimensionality) were considered, while features like stability or sensitivity to change were not examined. Sixth, the DIF analyses focused only on potential cross-country biases and were not extended to other factors that might influence the participants' responses, e.g. sociodemographic factors or depression. Finally, the post-hoc solutions shown in this study can be considered "optimum" only in the current sample, and in other studies the results may look different. However, we have shown that with these strategies data can be handled in a way that increases confidence in the metric quality and interpretability of the data.
The Rasch analyses of the four quality of life instruments showed that, in an international SCI sample, the raw scores were initially not consistently comparable across countries. However, by accounting for DIF across countries in a way that the requirements of the Rasch model are met, the scores can become comparable. Following the post-hoc procedures the items of the WHOQOL-5 and the PWI worked in a consistent and expected way in all countries. Thus, the differences between countries assessed by these instruments could potentially reflect cross-culturally valid differences in the responses of the persons. In contrast, summary scores of the LISAT-9 and the SWLS have to be interpreted with caution. The findings of the current study can be especially helpful for selecting instruments for international research projects in spinal cord injury.
Eurobarometer: Eurobarometer 71. Public opinion in the European Union. Brussels: European Commission; 2009.
Inglehart R, Foa R, Peterson C, Welzel C: Development, freedom, and rising happiness. A global perspective (1981–2007). Perspectives on Psychological Science 2008, 3: 264–285. 10.1111/j.1745-6924.2008.00078.x
Argyle M: The psychology of happiness. 2nd edition. Hove: Routledge; 2001.
Diener E, Suh EM: National differences in subjective well-being. In Well-being: The foundations of hedonic psychology. Edited by: Kahneman D, Diener E, Schwarz N. New York: Russell Sage Foundation; 1999.
Acquadro C, Berzon R, Dubois D, Leidy NK, Marquis P, Revicki D, Rothman M: Incorporating the patient's perspective into drug development and communication: an ad hoc task force report of the Patient-Reported Outcomes (PRO) Harmonization Group meeting at the Food and Drug Administration, February 16, 2001. Value Health 2003, 6: 522–531. 10.1046/j.1524-4733.2003.65309.x
Higginson IJ, Carr AJ: Measuring quality of life: Using quality of life measures in the clinical setting. BMJ (Clinical research ed) 2001, 322: 1297–1300. 10.1136/bmj.322.7297.1297
McDowell I: Measuring health: a guide to rating scales and questionnaires. 3rd edition. New York: Oxford University Press; 2006.
McHorney CA: Health status assessment methods for adults: past accomplishments and future challenges. Annu Rev Public Health 1999, 20: 309–335. 10.1146/annurev.publhealth.20.1.309
Ware JE Jr: Conceptualization and measurement of health-related quality of life: comments on an evolving field. Archives of physical medicine and rehabilitation 2003, 84: S43–51. 10.1053/apmr.2003.50246
Diener E, Suh EM, Lucas RE, Smith HL: Subjective well-being: three decades of progress. Psychological Bulletin 1999, 125: 276–302. 10.1037/0033-2909.125.2.276
Davern M, Cummins RA, Stokes MA: Subjective wellbeing as an affective-cognitive construct. Journal of Happiness Studies 2007, 8: 429–449. 10.1007/s10902-007-9066-1
Dijkers M: "What's in a name?" The indiscriminate use of the "Quality of life" label, and the need to bring about clarity in conceptualizations. Int J Nurs Stud 2007, 44: 153–155. 10.1016/j.ijnurstu.2006.07.016
Moons P, Budts W, De Geest S: Critique on the conceptualisation of quality of life: a review and evaluation of different conceptual approaches. Int J Nurs Stud 2006, 43: 891–901. 10.1016/j.ijnurstu.2006.03.015
Naughton MJ, Shumaker SA: The case for domains of function in quality of life assessment. Qual Life Res 2003, 12(Suppl 1): 73–80. 10.1023/A:1023585707046
Dijkers MP: Quality of life of individuals with spinal cord injury: a review of conceptualization, measurement, and research findings. J Rehabil Res Dev 2005, 42: 87–110. 10.1682/JRRD.2004.08.0100
Post M, Noreau L: Quality of life after spinal cord injury. J Neurol Phys Ther 2005, 29: 139–146.
Hammell KW: Exploring quality of life following high spinal cord injury: a review and critique. Spinal Cord 2004, 42: 491–502. 10.1038/sj.sc.3101636
Wood-Dauphinee S, Exner G, Bostanci B, Glass C, Jochheim KA, Kluger P, Koller M, Krishnan KR, Post MW, Ragnarsson KT, et al.: Quality of life in patients with spinal cord injury--basic issues, assessment, and recommendations. Restor Neurol Neurosci 2002, 20: 135–149.
Hallin P, Sullivan M, Kreuter M: Spinal cord injury and quality of life measures: a review of instrument psychometric quality. Spinal Cord 2000, 38: 509–523. 10.1038/sj.sc.3101054
Spinal Cord Injury Rehabilitation Evidence. Version 3.0. Outcome Measures [http://www.scireproject.com/outcome-measures]
Diener E, Emmons RA, Larsen RJ, Griffin S: The Satisfaction With Life Scale. J Pers Assess 1985, 49: 71–75. 10.1207/s15327752jpa4901_13
National Spinal Cord Injury Statistical Center: The Spinal Cord Injury Model Systems' data collection syllabus for the National Spinal Cord Injury Database. Birmingham, Alabama: National Spinal Cord Injury Statistical Center; 2006.
Fugl-Meyer AR, Bränholm I-B, Fugl-Meyer KS: Happiness and domain-specific life satisfaction in adult northern Swedes. Clinical rehabilitation 1991, 5: 25–33. 10.1177/026921559100500105
WHOQOL Group: Development of the World Health Organization WHOQOL-BREF quality of life assessment. The WHOQOL Group. Psychol Med 1998, 28: 551–558. 10.1017/S0033291798006667
Songhuai L, Olver L, Jianjun L, Kennedy P, Genlin L, Duff J, Scott-Wilson U: A comparative review of life satisfaction, quality of life and mood between Chinese and British people with tetraplegia. Spinal Cord 2009, 47: 82–86. 10.1038/sc.2008.83
Ide M, Fugl-Meyer AR: Life satisfaction in persons with spinal cord injury: a comparative investigation between Sweden and Japan. Spinal Cord 2001, 39: 387–393. 10.1038/sj.sc.3101171
Diener E, Suh EM, Smith H, Shao L: National differences in reported subjective well-being: Why do they occur? Social Indicators Research 1995, 34: 7–32. 10.1007/BF01078966
Herdman M, Fox-Rushby J, Badia X: A model of equivalence in the cultural adaptation of HRQoL instruments: the universalist approach. Qual Life Res 1998, 7: 323–335. 10.1023/A:1008846618880
Acquadro C, Conway K, Hareendran A, Aaronson N: Literature review of methods to translate health-related quality of life questionnaires for use in multinational clinical trials. Value Health 2008, 11: 509–521. 10.1111/j.1524-4733.2007.00292.x
Skevington SM, Sartorius N, Amir M: Developing methods for assessing quality of life in different cultural settings. The history of the WHOQOL instruments. Soc Psychiatry Psychiatr Epidemiol 2004, 39: 1–8. 10.1007/s00127-004-0700-5
Ware JE, Gandek B: Overview of the SF-36 Health Survey and the International Quality of Life Assessment (IQOLA) Project. Journal of clinical epidemiology 1998, 51: 903–912. 10.1016/S0895-4356(98)00081-X
Granger CV: Rehabilitation and outcome measurement: where is Rasch analysis going? Eura Medicophys 2007, 43: 559–560.
Tennant A, Penta M, Tesio L, Grimby G, Thonnard JL, Slade A, Lawton G, Simone A, Carter J, Lundgren-Nilsson A, et al.: Assessing and adjusting for cross-cultural validity of impairment and activity limitation scales through differential item functioning within the framework of the Rasch model: the PRO-ESOR project. Medical care 2004, 42: I37–48. 10.1097/01.mlr.0000103529.63132.77
Catz A, Itzkovich M, Tesio L, Biering-Sorensen F, Weeks C, Laramee MT, Craven BC, Tonack M, Hitzig SL, Glaser E, et al.: A multicenter international study on the Spinal Cord Independence Measure, version III: Rasch psychometric validation. Spinal Cord 2007, 45: 275–291. 10.1038/sj.sc.3101939
Biering-Sorensen F, Scheuringer M, Baumberger M, Charlifue SW, Post MW, Montero F, Kostanjsek N, Stucki G: Developing core sets for persons with spinal cord injuries based on the International Classification of Functioning, Disability and Health as a way to specify functioning. Spinal Cord 2006, 44: 541–546. 10.1038/sj.sc.3101918
Cieza A, Kirchberger I, Biering-Sorensen F, Baumberger M, Charlifue S, Post MW, Campbell R, Kovindha A, Ring H, Sinnott A, et al.: ICF Core Sets for individuals with spinal cord injury in the long-term context. Spinal Cord 2010, 48: 305–312. 10.1038/sc.2009.183
International Wellbeing Group: Personal Wellbeing Index. 4th edition. Melbourne: Australian Centre on Quality of Life, Deakin University; 2006.
WHOQOL Group: The World Health Organization Quality of Life Assessment (WHOQOL): development and general psychometric properties. Soc Sci Med 1998, 46: 1569–1585. 10.1016/S0277-9536(98)00009-4
Keith RA: Patient satisfaction and rehabilitation services. Archives of physical medicine and rehabilitation 1998, 79: 1122–1128. 10.1016/S0003-9993(98)90182-4
Pavot W, Diener E: Review of the Satisfaction With Life Scale. Psychological Assessment 1993, 5: 164–172. 10.1037/1040-3590.5.2.164
Pavot W, Diener E, Colvin CR, Sandvik E: Further validation of the Satisfaction With Life Scale: evidence for the cross-method convergence of well-being measures. J Pers Assess 1991, 57: 149–161. 10.1207/s15327752jpa5701_17
Arrindell WA, Meeuwesen L, Huyse FJ: The Satisfaction With Life Scale (SWLS): psychometric properties in a non-psychiatric medical outpatients sample. Pers Individ Diff 1991, 12: 117–123. 10.1016/0191-8869(91)90094-R
Laranjeira CA: Preliminary validation study of the Portuguese version of the satisfaction with life scale. Psychol Health Med 2009, 14: 220–226. 10.1080/13548500802459900
Chwalisz K, Diener E, Gallagher D: Autonomic arousal feedback and emotional experience: evidence from the spinal cord injured. J Pers Soc Psychol 1988, 54: 820–828. 10.1037/0022-3514.54.5.820
Tate DG, Forchheimer M: Health-related quality of life and life satisfaction for women with spinal cord injury. Topics in Spinal Cord Injury Rehabilitation 2001, 7: 1–15.
Rintala DH, Robinson-Whelen S, Matamoros R: Subjective stress in male veterans with spinal cord injury. J Rehabil Res Dev 2005, 42: 291–304. 10.1682/JRRD.2005.10.0155
Dijkers MP: Correlates of life satisfaction among persons with spinal cord injury. Archives of physical medicine and rehabilitation 1999, 80: 867–876. 10.1016/S0003-9993(99)90076-X
Charlifue S, Lammertse DP, Adkins RH: Aging with spinal cord injury: changes in selected health indices and life satisfaction. Archives of physical medicine and rehabilitation 2004, 85: 1848–1853. 10.1016/j.apmr.2004.03.017
Putzke JD, Richards JS, Hicken BL, DeVivo MJ: Predictors of life satisfaction: a spinal cord injury cohort study. Archives of physical medicine and rehabilitation 2002, 83: 555–561. 10.1053/apmr.2002.31173
Shevlin M, Brunsden V, Miles JNV: Satisfaction With Life Scale: analysis of factorial invariance, mean structures and reliability. Person Indiv Diff 1998, 25: 911–916. 10.1016/S0191-8869(98)00088-9
Tucker KL, Ozer DJ, Lyubomirsky S, Boehm JK: Testing for measurement invariance in the Satisfaction With Life Scale: A comparison of Russians and North Americans. Social Indicators Research 2006, 78: 341–360. 10.1007/s11205-005-1037-5
Vitterso J, Biswas-Diener R, Diener E: The divergent meanings of life satisfaction: Item response modeling of the Satisfaction With Life Scale in Greenland and Norway. Social Indicators Research 2005, 74: 327–348. 10.1007/s11205-004-4644-7
Fugl-Meyer AR, Melin R, Fugl-Meyer KS: Life satisfaction in 18- to 64-year-old Swedes: in relation to gender, age, partner and immigrant status. J Rehabil Med 2002, 34: 239–246. 10.1080/165019702760279242
Post MW, de Witte LP, van Asbeck FW, van Dijk AJ, Schrijvers AJ: Predictors of health status and life satisfaction in spinal cord injury. Archives of physical medicine and rehabilitation 1998, 79: 395–401. 10.1016/S0003-9993(98)90139-3
Kennedy P, Smithson E, McClelland M, Short D, Royle J, Wilson C: Life satisfaction, appraisals and functional outcomes in spinal cord-injured people living in the community. Spinal Cord 2010, 48: 144–148. 10.1038/sc.2009.90
Woolrich RA, Kennedy P, Tasiemski T: A preliminary psychometric evaluation of the Hospital Anxiety and Depression Scale (HADS) in 963 people living with a spinal cord injury. Psychol Health Med 2006, 11: 80–90. 10.1080/13548500500294211
Kennedy P, Taylor N, Hindson L: A pilot investigation of a psychosocial activity course for people with spinal cord injuries. Psychol Health Med 2006, 11: 91–99. 10.1080/13548500500330494
Kennedy P, Lude P, Taylor N: Quality of life, social participation, appraisals and coping post spinal cord injury: a review of four community samples. Spinal Cord 2006, 44: 95–105. 10.1038/sj.sc.3101787
Norrbrink Budh C, Kowalski J, Lundeberg T: A comprehensive pain management programme comprising educational, cognitive and behavioural interventions for neuropathic pain following spinal cord injury. J Rehabil Med 2006, 38: 172–180. 10.1080/16501970500476258
Cummins RA, Eckersley R, Pallant J, van Vugt J, Misajon R: Developing a national index of subjective wellbeing: the Australian unity wellbeing index. Social Indicators Research 2003, 64: 159–190. 10.1023/A:1024704320683
Lau ALD, Cummins RA, McPherson W: An investigation into the cross-cultural equivalence of the Personal Wellbeing Index. Social Indicators Research 2005, 72: 403–430. 10.1007/s11205-004-0561-z
Tiliouine H, Cummins RA, Davern M: Measuring wellbeing in developing countries: the case of Algeria. Social Indicators Research 2006, 75: 1–30. 10.1007/s11205-004-2012-2
Skevington SM, Lotfy M, O'Connell KA: The World Health Organization's WHOQOL-BREF quality of life assessment: psychometric properties and results of the international field trial. A report from the WHOQOL group. Qual Life Res 2004, 13: 299–310. 10.1023/B:QURE.0000018486.91360.00
Chapin MH, Miller SM, Ferrin JM, Chan F, Rubin SE: Psychometric validation of a subjective well-being measure for people with spinal cord injuries. Disabil Rehabil 2004, 26: 1135–1142. 10.1080/09638280410001714772
Lin MR, Hwang HF, Chen CY, Chiu WT: Comparisons of the brief form of the World Health Organization Quality of Life and Short Form-36 for persons with spinal cord injuries. Am J Phys Med Rehabil 2007, 86: 104–113. 10.1097/01.phm.0000247780.64373.0e
Schmidt S, Power M: Cross-cultural analyses of determinants of quality of life and mental health: results from the EUROHIS study. Social Indicators Research 2006, 77: 95–138. 10.1007/s11205-005-5555-y
Schmidt S, Muhlan H, Power M: The EUROHIS-QOL 8-item index: psychometric results of a cross-cultural field study. Eur J Public Health 2006, 16: 420–428. 10.1093/eurpub/cki155
Cieza A, Ewert T, Ustun TB, Chatterji S, Kostanjsek N, Stucki G: Development of ICF Core Sets for patients with chronic conditions. J Rehabil Med 2004, 9–11.
Grill E, Ewert T, Chatterji S, Kostanjsek N, Stucki G: ICF Core Sets development for the acute hospital and early post-acute rehabilitation facilities. Disabil Rehabil 2005, 27: 361–366. 10.1080/09638280400013974
Andrich D, Sheridan B, Luo G: RUMM 2030 (Beta Version for Windows). Perth, Western Australia: RUMM Laboratory Pty Ltd; 2009.
Wright BD, Masters GN: Rating Scale Analysis. Chicago, IL: MESA; 1982.
Andrich D: Controversy and the Rasch model: a characteristic of incompatible paradigms? Medical care 2004, 42: I7–16. 10.1097/01.mlr.0000103528.48582.7c
Bond TG, Fox CM: Applying the Rasch model: Fundamental measurement in the human sciences. Mahwah, NJ: Lawrence Erlbaum Associates; 2001.
Andrich D: Rasch Models for Measurement. Newbury Park, CA: Sage; 1988.
Bland JM, Altman DG: Multiple significance tests: the Bonferroni method. BMJ (Clinical research ed) 1995, 310: 170.
Smith RM, Miao CY: Assessing unidimensionality for Rasch measurement. In Objective Measurement: Theory into Practice. Volume 2. Edited by: Wilson M. Norwood: Ablex; 1994:316–327.
Raîche G: Critical eigenvalue sizes in standardized residual principal components analysis. Rasch Measurement Transactions 2005, 19: 1012.
Smith EV Jr: Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. Journal of applied measurement 2002, 3: 205–231.
Tennant A, Pallant JF: Unidimensionality Matters! (A Tale of Two Smiths?). Rasch Measurement Transactions 2006, 20: 1048–1051.
Linacre JM: Optimizing rating scale category effectiveness. Journal of applied measurement 2002, 3: 85–106.
Fisher WP: Reliability statistics. Rasch Measurement Transactions 1992, 6: 238.
Andrich D: An index of person separation in latent trait theory, the traditional KR.20 index, and the Guttman scale response pattern. Education Research and Perspectives 1982, 9: 95–104.
Lange R, Irwin HJ, Houran J: Top-down purification of Tobacyk's Revised Paranormal Belief Scale. Personality and individual differences 2000, 29: 131–156. 10.1016/S0191-8869(99)00183-X
Andrich D, Luo G: Conditional pairwise estimation in the Rasch model for ordered response categories using principal components. Journal of applied measurement 2003, 4: 205–221.
Fisher WP: The cash value of reliability. Rasch Measurement Transactions 2008, 22: 1158–1161.
Post MWM, Ros WJG, Schrijvers AJP: Impact of social support on health status and life satisfaction in people with a spinal cord injury. Psychology and Health 1999, 14: 679–695. 10.1080/08870449908410757
Vautier S, Mullet E, Jmel S: Assessing the structural robustness of self-rated satisfaction with life: a SEM analysis. Social Indicators Research 2004, 68: 235–249. 10.1023/B:SOCI.0000025595.92546.eb
Linacre JM: Sample Size and Item Calibration Stability. Rasch Measurement Transactions 1994, 7: 328.
The project was funded by Swiss Paraplegic Research, Nottwil, Switzerland.
The authors would like to extend special thanks to the Regional Project Coordinators Michael Baumberger (European Region), Robert Campbell (African Region), Susan Charlifue (Region of the Americas), Apichana Kovindha (South-East Asian Region), Haim Ring† (Eastern Mediterranean Region), Anne Sinnott (Western Pacific Region), to all health professionals who were involved in the local study organization or data collection, and to the staff from the ICF Research Branch in Munich for their contribution regarding data management. The authors are indebted to all persons with spinal cord injury who participated in the study.
The authors declare that they have no competing interests.
SG contributed to the conception and design of the study, the conception and interpretation of the statistical analyses, and drafted the manuscript. BF conducted the statistical analyses, contributed to the interpretation of data, the drafting and revision of the manuscript. IK contributed to the acquisition and management of the data and revised the manuscript. MP contributed to the conception and design of the study, the acquisition of data, the interpretation of the statistical analyses, and revised the manuscript. All authors read and approved the final manuscript.