Does the relative importance of the OxCAP-MH’s capability items differ according to mental ill-health experience?



Some capability dimensions may be more important than others in determining someone’s well-being, and these preferences might be dependent on ill-health experience. This study aimed to explore the relative preference weights of the 16 items of the German language version of the OxCAP-MH (Oxford Capability questionnaire-Mental Health) capability instrument and their differences across cohorts with alternative levels of mental ill-health experience.


A Best–Worst-Scaling (BWS) survey was conducted in Austria among 1) psychiatric patients (direct mental ill-health experience), 2) (mental) healthcare experts (indirect mental ill-health experience), and 3) primary care patients with no mental ill-health experience. Relative importance scores for each item of the German OxCAP-MH instrument were calculated using Hierarchical Bayes estimation. Rank analysis and multivariable linear regression analysis with robust standard errors were used to explore the relative importance of the OxCAP-MH items across the three cohorts.


The study included 158 participants with complete cases and an acceptable fit statistic. The relative importance scores for the full cohort ranged from 0.76 to 15.72. Findings of the BWS experiment indicated that the items Self-determination and Limitation in daily activities were regarded as the most important by all three cohorts. Freedom of expression was rated significantly less important by psychiatric patients than by the other two cohorts, while Having suitable accommodation appeared significantly less important to the expert cohort. There were no further significant differences in the relative preference weights of OxCAP-MH items between the cohorts or according to gender.


Our study indicates significant between-item but limited mental ill-health related heterogeneity in the relative preference weights of the different capability items within the OxCAP-MH. The findings support the future development of preference-based value sets elicited from the general population for comparative economic evaluation purposes.


The assessment of outcomes in health intervention and care-related studies requires information not only about the identification and measurement of impacts, but also the consideration of how valuable these impacts are to patients or the general public [1]. Outcomes in the area of mental health often go beyond health and incorporate wider impacts [2]. The capability framework, introduced by Amartya Sen in the early 1980s as an alternative to standard utilitarian welfare economics, provides a richer evaluative space beyond health [3]. The core focus of the capability approach is on what individuals are able to be and do in their lives, in other words, what they are capable of [4]. One of the outcome assessment instruments specifically designed to capture different dimensions of well-being within the capability framework in the area of mental health is the self-reported Oxford Capability questionnaire-Mental Health (OxCAP-MH) instrument [2]. The development of the OxCAP-MH was based on the list of ten central capabilities endorsed by Martha Nussbaum in 2003 [5]. This list included life; bodily health; bodily integrity; senses, imagination and thought; emotions; practical reason; affiliation; other species; play; and control over one’s environment. The scoring of the OxCAP-MH currently relies on equal weights across items, and the instrument has so far also been used as a non-preference-based outcome measure in economic evaluations [6]. There are, however, suggestions in the literature that some capability items may be more important than others in determining someone’s well-being [5, 7]. The weighting of preferences may vary not only between different cultural settings (i.e. regions/countries) and main sociodemographic characteristics (i.e. age, gender), but also across different cohorts influenced by specific insight into or adaptation to an illness [8, 9, 10].
It has long been acknowledged in the preference elicitation and value weighting literature that the time spent in a health state may also influence the way that state is perceived [11]. Besides the methods used for valuation, controversies exist concerning who should provide valuations. In principle, preferences or values could be provided by members of the general public, who most often pay for interventions via taxation or social insurance; or by patients, who have lived experience with the health state in question; or by experts, who have relevant scientific or clinical insights [12].

The idea of developing value sets for capability instruments first appeared in 2008, during the development of the OCAP-18 [13]. However, there is still no consensus on the best method to elicit values and whether capability instruments should be anchored similarly to preference-based health-related quality of life (HRQoL) measures. A recent literature review identified that all studies aiming to develop value sets for capability instruments used the Best–Worst Scaling (BWS) method [14]. These include the UK value sets for the Adult Social Care Outcomes Toolkit (ASCOT), Investigating Choice Experiments Capability Measure for Adults (ICECAP-A), for older people (ICECAP-O) and Supportive Care Measure (ICECAP-SCM) instruments [7, 15,16,17]. Moreover, a recently published study that aimed to establish Austrian preference weights for the ASCOT instrument also used the BWS method [18]. The main reason for this is that BWS elicits values rather than preferences because individuals are not asked to trade one thing for another, but rather only state which option they find most and least important within hypothetical scenarios [15]. Thus, of all the methods that have been used to elicit value sets, the BWS approach may come closest to satisfying Sen's interpretation of the capability approach [3]. Moreover, alternative methods, such as discrete choice experiments (DCEs), could not handle the number of attributes and levels in the OxCAP-MH and may pose problems in disentangling independent effects because of inter-item correlations [19].

This study aimed to explore the relative preference weights of the 16 items of the German language version of the OxCAP-MH (Oxford Capability questionnaire-Mental Health) capability instrument and their differences across cohorts with alternative levels of mental ill-health experience.



The study was conducted prospectively in Austria using the German language version of the OxCAP-MH capability wellbeing questionnaire. The OxCAP-MH is a 16-item instrument originally developed for capability wellbeing measurement in the mental health context. Items are rated on a 1–5 Likert scale. The total sum score assumes equal weights between the different items and is converted using the formula 100 × (OxCAP-MH total score – minimum score)/range to arrive at a standardised score between 0 and 100. The maximum total raw score is obtained by multiplying the number of items by the number of levels (16 × 5 = 80), the minimum score equals 16 (16 × 1), and the range is therefore 64 (80 – 16). Higher scores indicate better capabilities [20]. The OxCAP-MH has so far been validated in English [20], German [21, 22], Luganda [23], Hungarian [24] and Chinese [25] and has been used in multiple studies [26,27,28]. A sample copy of the OxCAP-MH is available at, and the full list of items is included in the Additional file 1.
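The standardisation described above can be illustrated with a minimal sketch. The function name is hypothetical, and the sketch assumes that the 16 responses have already been coded so that 1 is the worst and 5 the best capability level for every item (any reverse-coded items would need to be recoded beforehand).

```python
N_ITEMS = 16   # number of OxCAP-MH items
N_LEVELS = 5   # response levels per item (1-5)


def oxcap_mh_standardised_score(item_responses):
    """Convert 16 item responses (each 1-5) into a 0-100 standardised score.

    Assumes equal item weights and responses coded so that higher = better.
    """
    if len(item_responses) != N_ITEMS:
        raise ValueError("expected exactly 16 item responses")
    if any(r not in range(1, N_LEVELS + 1) for r in item_responses):
        raise ValueError("each response must be an integer from 1 to 5")

    minimum = N_ITEMS * 1              # 16
    maximum = N_ITEMS * N_LEVELS       # 80
    score_range = maximum - minimum    # 64
    total = sum(item_responses)
    return 100 * (total - minimum) / score_range
```

Under these assumptions, a respondent choosing the middle level on every item has a raw total of 48 and a standardised score of 50.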

Best–worst scaling (BWS) experiment

Relative preference weights of the OxCAP-MH capability items were elicited by a Best–Worst Scaling (BWS) experiment where an individual's decision regarding the most and least preferred options in a set of attributes is elicited across repeated hypothetical scenarios that are systematically varied via an experimental design.

The survey questionnaire consisted of the BWS instrument and several sociodemographic questions, including age, gender, education, living environment, marital status and indirect experience with mental disorders. The attributes of the hypothetical scenarios for the BWS were developed from the 16 OxCAP-MH items, in a set of qualitative interviews following pilot testing with seven persons [29, 30]. Item wording adaptation was necessary so that the attribute wordings reflected the correct individual choices and was carried out similarly to the attribute development process for DCE studies [29].

The valuation of item weights was based on the object case of the BWS method. The object case is designed to determine the relative importance of the listed attributes (adapted OxCAP-MH items), which have no levels, and choice scenarios differ merely in the particular subset of attributes shown [31]. Each participant was presented with 16 hypothetical scenarios, each containing six attributes. This was considered the optimal and sufficient design based on recommendations from the literature [32] and was established in the pilot phase. Respondents were asked to select the most and least important attributes for them as an individual. An example of the choice task is shown in Table 1, and an exemplary questionnaire can be seen in Additional file 1.

Table 1 Exemplary BWS task

The 16 items of the German OxCAP-MH, each of them with 5 levels, would result in 5^16 = 152,587,890,625 potential combinations of items and levels, i.e. capability states. Currently available BWS methods for the weighting of instruments are not able to handle this many hypothetical states because respondents would be presented with a number of hypothetical scenarios that they are not able to cope with. One potential option to solve this problem would be to divide the potential options into blocks, and only present respondents with a manageable number of scenarios. However, this would have required a study participant sample size that was not feasible in practice. We, therefore, concentrated on the preference weights of the 16 items without including levels. Hypothetical scenarios were reduced through Balanced Incomplete Block Design (BIBD) in Sawtooth Software [30, 33]. BIBD means that each item appears the same number of times and that every pair of attributes appears together the same number of times [34]. Part of the analysis focused on three cohorts of the overall sample, which reduced the sample size available per comparison. Response efficiency was therefore increased by developing three blocks of the BWS questionnaire with the Paper-and-Pencil module of Sawtooth Software’s Lighthouse Studio 9.13.1, based on 1000 iterations. All three versions were equally distributed within each cohort of respondents and randomly assigned to them.
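The counting behind this design can be checked with a short sketch. It first reproduces the number of capability states and then builds one textbook example of a balanced design with 16 items in 16 sets of six (a 4 × 4 grid construction). This construction is an illustrative assumption only and is not the design generated by Sawtooth Software for the study.

```python
from collections import Counter
from itertools import combinations

N_ITEMS, N_LEVELS = 16, 5
N_SETS, SET_SIZE = 16, 6

# Number of possible capability states if all levels were included.
n_states = N_LEVELS ** N_ITEMS

# Counting identities that any balanced design of this size must satisfy.
appearances_per_item = N_SETS * SET_SIZE // N_ITEMS                       # r
pairs_together = appearances_per_item * (SET_SIZE - 1) // (N_ITEMS - 1)   # lambda


def grid_design():
    """Illustrative balanced design: place the 16 items on a 4 x 4 grid.

    The set belonging to a cell contains the other three items in its row
    and the other three items in its column (six items in total).
    """
    cells = [(row, col) for row in range(4) for col in range(4)]
    design = []
    for row, col in cells:
        # Keep cells sharing exactly one coordinate with (row, col).
        choice_set = [4 * r + c for r, c in cells if (r == row) != (c == col)]
        design.append(choice_set)
    return design


design = grid_design()
item_counts = Counter(item for s in design for item in s)
pair_counts = Counter(pair for s in design for pair in combinations(sorted(s), 2))
```

In this example every item appears in six sets and every pair of items appears together in exactly two sets, which is the balance property described above.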

Participants and data collection

The study included three cohorts of participants from Austria: (1) psychiatric patients (direct mental ill-health experience), (2) (mental) healthcare experts (indirect mental ill-health experience), and (3) primary care patients with no mental ill-health experience. All participants had to be between 18 and 80 years of age and able and willing to complete the questionnaire. Sufficient intellectual capacities and German language skills were judged by the recruiting team. The minimum sample size estimate of 50 in each cohort, i.e. 150 respondents in total, allowed differences between means of a magnitude of at least 0.57 within-group standard deviations to be detected with 80% power at a two-sided significance level of 5% and was similar to previous relevant studies that had aimed to develop weights or value sets for capability wellbeing questionnaires [7, 15, 17, 35].
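The order of magnitude of this detectable difference can be reproduced with a minimal sketch. It assumes a two-group comparison of means with equal group sizes and uses the normal approximation, so it is not necessarily the calculation performed by the authors; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect_size(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardised mean difference detectable between two groups.

    Normal approximation: d = (z_{1-alpha/2} + z_{power}) * sqrt(2 / n).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


d = detectable_effect_size(50)   # roughly 0.56 within-group standard deviations
```

With 50 participants per cohort the approximation gives about 0.56. A calculation based on the t-distribution would be expected to give a slightly larger value, which is consistent with the 0.57 stated above.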

Similar to some other elicitation studies (e.g. [16, 36]), convenience sampling was used. Mental health patients were approached in the relevant health care facilities by their treating health care professionals in the Department of Psychiatry and Psychotherapy, Clinical Division of Social Psychiatry at the Medical University of Vienna. Persons attending primary care appointments were approached in the waiting areas of the collaborating Primary Care Centres in Vienna. (Mental) health experts were recruited through collaborating professional contacts at the Medical University of Vienna. Participants were recruited between January and May 2020. The study received approval from the Medical University of Vienna (EK Medizinische Universität Wien: 1779/2019). People were first given a participant information sheet and received the BWS questionnaire after they provided consent to participation in the study. The survey was paper-based and self-completed. Data were pseudo-anonymised for further analysis.

Statistical analysis

The statistical analysis aimed to estimate weights for the OxCAP-MH items across different cohorts. Sociodemographic characteristics of the respondents were summarised as mean values by cohorts. The weights for the items were calculated by Hierarchical Bayes (HB) estimation using Sawtooth Software’s analysis tool. HB estimation was employed because its calculation is based on individual respondents and consequently enables segmentation by cohorts. Instead of estimating each respondent’s utilities individually, the algorithm estimates how different the respondent’s utilities are from those of the other respondents in the study [37]. Hence, the HB estimation is based on the probability that a respondent selects a specific concept in a choice task given a specific set of preferences and the probability that the respondent’s preferences are consistent with the patterns of the preferences observed in all other respondents [37]. HB estimation can yield reliable individual best–worst values even when the number of responses per participant is small [38]. The mean relative importance score (RIS) was calculated for each item based on HB estimation [39]. An individual fit statistic (root likelihood) per respondent below 0.2 was used to identify inconsistent responders following guidance from Orme [40]. A lower value would be an indication of purely random responses to the choice task. Differences between the RIS scores calculated by HB estimation were explored in a graphical presentation and tables including rank orders of the items. The rank order analysis based on HB estimation was repeated for cohorts, testing the differences across groups by Kruskal–Wallis equality of populations rank tests. Furthermore, to quantify linear association between variables, Pearson correlation coefficients between the RIS scores of the items for the full cohort were calculated and visualised by means of a heatmap.
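The root likelihood fit statistic can be sketched as the geometric mean of the probabilities that the fitted model assigns to a respondent's observed choices. The sketch below illustrates this general definition only: the probabilities are hypothetical inputs, the function names are not taken from Sawtooth Software, and the exact computation in that software may differ.

```python
from math import exp, log

RLH_THRESHOLD = 0.2  # cut-off used in the study to flag inconsistent responders


def root_likelihood(choice_probabilities):
    """Geometric mean of the model-predicted probabilities of observed choices."""
    n_choices = len(choice_probabilities)
    return exp(sum(log(p) for p in choice_probabilities) / n_choices)


def is_inconsistent(choice_probabilities, threshold=RLH_THRESHOLD):
    """Flag a respondent whose fit statistic falls below the threshold."""
    return root_likelihood(choice_probabilities) < threshold


# Hypothetical respondent: 16 scenarios with a best and a worst choice each,
# every choice predicted no better than picking one of six attributes at random.
random_like = [1 / 6] * 32
well_predicted = [0.5] * 32
```

If every choice were predicted with probability 1/6, i.e. one of six equally likely attributes, the statistic would equal about 0.17 and fall below the 0.2 threshold, whereas a respondent whose choices are predicted with probability 0.5 would be retained.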

Finally, multivariable linear regression analyses were conducted to explore the relative adjusted importance of the items across cohorts. The continuous RIS of each item was regressed on binary cohort indicators and gender, so that cohort differences were adjusted for gender and gender differences for cohort. Other demographic variables (e.g., age, profession) were not considered as adjustments as they were main attributes of the cohorts' characteristics. Robust standard errors to account for violations of model assumptions and the implicit correlation of the outcome variables were obtained using the Jackknife method. That is, each participant was omitted in turn, the HB scores were re-estimated and the regressions of the HB scores on the covariates were re-fitted. From the regression coefficients of these 16 (items) \(\times\) 158 (participants) analyses, robust variances of regression coefficients were obtained by the following formula:

$$\widehat{v}_{kj}=\frac{n-1}{n}\sum_{i=1}^{n}\left(\tilde{\beta}_{kj}^{(-i)}-\widehat{\beta}_{kj}\right)^{2},$$

where \(\widehat{\beta}_{kj}\) denotes the regression coefficient of covariate \(j\) for item \(k\) (where \(j\) and \(k\) are indices for covariate and item) from the full sample; and \(\tilde{\beta}_{kj}^{(-i)}\) denotes the corresponding estimated regression coefficient obtained from re-estimating the HB scores after omitting participant \(i\).

The standard errors were calculated as \(\widehat{s}_{kj}=\sqrt{\widehat{v}_{kj}}\), and p-values for testing the hypothesis \(\beta_{kj}=0\) were obtained by comparing \(\widehat{\beta}_{kj}/\widehat{s}_{kj}\) to a t-distribution with \(n-J-1\) degrees of freedom, \(J\) being the number of regression coefficients, i.e. four.
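The variance formula and the resulting test statistic can be sketched for a single item and covariate as follows. The coefficients used here are made-up numbers, and the sketch takes the leave-one-out coefficients as given. It does not re-estimate HB scores or fit regressions, which in the study were carried out in Sawtooth Software and R.

```python
from math import sqrt


def jackknife_variance(beta_full, betas_leave_one_out):
    """(n - 1) / n times the sum of squared deviations of the leave-one-out
    coefficients from the full-sample coefficient."""
    n = len(betas_leave_one_out)
    squared_deviations = sum((b - beta_full) ** 2 for b in betas_leave_one_out)
    return (n - 1) / n * squared_deviations


def t_statistic_and_df(beta_full, betas_leave_one_out, n_coefficients=4):
    """Return the ratio beta / s and the degrees of freedom n - J - 1."""
    n = len(betas_leave_one_out)
    standard_error = sqrt(jackknife_variance(beta_full, betas_leave_one_out))
    return beta_full / standard_error, n - n_coefficients - 1


# Hypothetical example: 158 leave-one-out estimates scattered around 2.0.
beta_hat = 2.0
beta_loo = [beta_hat + 0.01 * (-1) ** i for i in range(158)]
t_value, degrees_of_freedom = t_statistic_and_df(beta_hat, beta_loo)
```

With 158 participants and four regression coefficients the reference t-distribution has 153 degrees of freedom. The p-value would then be read from that distribution, for example with a statistics package.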

The HB estimation and count analysis were performed in Sawtooth Software, the regression analyses were calculated using R version 4.0.2 [41] and all other analyses were conducted with STATA Version 16 [42]. A two-sided significance level of \(\alpha =0.05\) was considered to indicate statistical significance in all analyses, unless stated otherwise. P-values were not corrected for multiple testing, but all performed comparisons were reported.

Indicative preference weight set development

An indicative preference weight set for the German OxCAP-MH was developed for Austria by a linear transformation of values based on HB estimation across all three cohorts. Each of the 16 OxCAP-MH items has five levels, and these were assumed to carry proportionately constant weights, i.e. their contribution to the overall preference weight was assumed to be equally distributed. Accordingly, worst capability levels carried the multiplication factor of 0, best capability levels the multiplication factor of 1, and those in-between the multiplication factors of 0.25, 0.5 and 0.75, respectively. Similar to other capability instruments, the proposed indicative preference weight set of the OxCAP-MH instrument is anchored on a scale of 0 (no capability) to 1 (full capability).
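How such a weight set would translate a capability state into an index value can be sketched as follows. The item weights in the example are hypothetical placeholders and not the values reported in Table 6. The sketch also assumes that levels are coded from 1 (worst) to 5 (best) for every item and that weights are rescaled to sum to 1 so that the index is anchored at 0 and 1.

```python
# Multiplication factors for the five levels under the proportionality assumption.
LEVEL_MULTIPLIERS = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}


def capability_index(levels, item_weights):
    """Weighted capability index between 0 (no capability) and 1 (full capability).

    levels: one level (1-5) per item, 1 = worst, 5 = best.
    item_weights: one non-negative weight per item, e.g. relative importance scores.
    """
    if len(levels) != len(item_weights):
        raise ValueError("levels and item_weights must have the same length")
    total_weight = sum(item_weights)
    weighted_sum = sum(w * LEVEL_MULTIPLIERS[lvl] for lvl, w in zip(levels, item_weights))
    return weighted_sum / total_weight


# Hypothetical weights for 16 items: one item weighted 10, the others 6 each.
example_weights = [10] + [6] * 15
```

With these placeholder weights, a state with the best level on the first item and the worst level on all others would score 0.10, whereas equal weights would give the same state a score of 1/16.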


Participant characteristics

Initially, 235 people were approached, with response rates of ca. 90% for psychiatric patients, nearly 100% for (mental) healthcare experts, and ca. 70% for primary care attendees. In total, 195 persons agreed to participate, of whom 159 participants fully completed the BWS questionnaire and were initially included in the analysis, as shown in Fig. 1.

Fig. 1 Overview of recruitment strategy and inclusion of participants

In a subsequent step, one psychiatric patient had to be excluded because the calculated HB fit statistic was under 0.2, and it was therefore assumed that the responses of this participant were provided by chance. Table 2 provides further details on the characteristics of the 158 participants included in the final analysis.

Table 2 Characteristics of participants

Relative importance of items

Figure 2 illustrates, by means of box plots, the RIS of the items calculated by HB estimation. The mean RIS scores for the full cohort ranged from 0.76 to 15.72. The RIS based on HB estimation indicate very strong relative preference weights for the items Self-determination and Limitation in daily activities. The RIS also suggest that Influencing local decisions and Losing sleep over worry are regarded as the least important items.

Fig. 2 Relative Importance Scores of OxCAP-MH items calculated by Hierarchical Bayes estimation (n = 158). White vertical line = median; box = interquartile range; ‘whiskers’ extending to max 1.5 interquartile ranges outside the box; any values outside the whisker are depicted individually

More detailed information about the Mean Relative Importance Scores and Standard Deviations of OxCAP-MH items calculated by Hierarchical Bayes estimation can be found in the Additional file 1.

Table 3 provides a rank order of mean RIS scores attributed to the individual OxCAP-MH items calculated by both HB estimation and count analysis, with links to Nussbaum’s ten central human capabilities.

Table 3 Rank orders per item using different analysis methods

Figure 3 provides a heatmap of the correlation between the items, indicating that some items are strongly correlated including: Social networks with Enjoying friendship and support; Likelihood of assault with Likelihood of discrimination; and Likelihood of discrimination with Freedom of expression.

Fig. 3 Heatmap of pairwise Pearson correlation coefficients between OxCAP-MH items

Analysis by cohorts with different mental ill-health experience and gender

The rank orders and observed mean RIS scores for each of the three cohorts revealed that the mean RIS per cohort differed significantly in eight out of the 16 OxCAP-MH items (Table 4). This is also visible in the graphical presentation of mean RIS estimates by cohort (see Additional file 1).

Table 4 Rank order (1 = most important, 16 = least important) and observed mean Relative Importance Scores for each cohort (n = 158)

The regression coefficients with robust standard errors and p-values presented in Table 5 describe the gender-adjusted difference in expected RIS between the cohorts and the cohort-adjusted difference between males and females. This analysis revealed that Freedom of expression was rated significantly less important by psychiatric patients than by the other two cohorts, given that the adjusted difference in RIS was 3.25 units, which is approximately half of the overall mean RIS for this item. Moreover, Having suitable accommodation appeared significantly less important for the expert cohort than for primary care patients with no mental ill-health experience. The difference for this item between psychiatric patients and experts was even larger in terms of the regression coefficient, but with a larger standard error. There were no further significant differences in the relative preference weights of OxCAP-MH items between the cohorts or according to gender.

Table 5 Multivariate regression analysis per item with robust standard errors (n = 158)

Preference weight set

Table 6 provides an indicative preference weight set that could be used as current best proxy to calculate individual scores for different capability states in economic evaluations [43].

Table 6 Preference weights of the OxCAP-MH items and their levels (n = 158)


This study is the first to explore the relative preference weights of the items of the OxCAP-MH, a capability well-being instrument that was originally developed for outcome measurement in the mental health context, and the impact of mental ill-health experience on these. As opposed to other capability instruments with existing preference weight or value sets such as the ASCOT [17], ICECAP-A [7], ICECAP-O [15] and ICECAP-SCM [35], the high number of items posed a major challenge in the case of the OxCAP-MH. Hence, this study only focused on the items without incorporating preferences related to levels. The overall study sample was balanced for gender and for differing mental ill-health experiences between the cohorts.

The results confirmed significantly different preference weights attributed to the 16 items of the OxCAP-MH. The RIS scores for the full cohort ranged from 0.76 to 15.72 and indicated very strong relative preference weights for Self-determination and Limitation in daily activities, suggesting that these concepts have large spans within the ‘capability space’ covered by the OxCAP-MH. The relatively high importance of these items also demonstrates that respondents placed a much higher emphasis on individual aspects of capabilities than on community-based aspects, including Influencing local decisions, in the Austrian context [18]. The most important items of the OxCAP-MH are those related to Nussbaum’s practical reasoning and overall health capabilities, whilst aspects related to emotions and control over one’s environment are regarded as significantly less important. These results are in line with the findings of the BWS experiment eliciting preference weights for the ASCOT instrument for service users in Austria. Hajji et al. found that the most important choices were related to being meaningfully occupied during the day and having control over daily life, whilst the least important choices were associated with dignity and social participation [18]. These findings are also in line with the results of the UK preference weighting study of the ASCOT [17]. Conversely, the development of the initial UK value set of ICECAP-A in 2013 found that attachment and stability made a slightly higher contribution to the capability space than the items of autonomy, achievement and enjoyment. This suggests that the different capability instruments either elicit different aspects of capabilities, or the actual wording of questions and the cultural context play a more significant role than previously thought.

With regard to differences across the cohorts, we could not find evidence that mental ill-health experience or gender were significantly associated with the relative importance of the OxCAP-MH items. While rank analysis revealed some differences across the cohorts, this was not to a significant extent. The magnitudes of the regression coefficients with respect to the mean values of the OxCAP-MH item scores were relatively small across cohorts, apart from the Having suitable accommodation and Freedom of expression items. These findings suggest that potential future value sets for the OxCAP-MH elicited from the general population are also likely to represent well the direct preferences of people with lived experience of mental ill-health. These results somewhat contradict the findings that patient preferences differ from preferences derived from the general population for the EQ-5D-5L items [10]. In the study by Ludwig et al. (2021), patients gave more importance to mobility, self-care, or usual activities and less importance to pain/discomfort and anxiety/depression compared to the general population [10]. The differential findings between the two studies could be explained by the specific perception of mental health itself and the difference between health-related quality of life and capabilities as evaluative spaces.

Our study is unique in explicitly exploring the impact of ill-health experience on preferences in the context of a capability wellbeing instrument. Moreover, the study is based on a thoroughly designed BWS experiment and sophisticated statistical analysis. The analysis takes into account the multivariate distribution with implicit correlation of outcomes and is robust against model misspecification and violation of the assumptions of linear regression, which probably implied larger estimated standard errors.

The interpretation of the study results, however, requires the consideration of its limitations. One of the main limitations is the relatively small sample size. A larger number of observations would have strengthened the robustness of the statistical analysis, and it would also have enabled the formation of bigger groups for subgroup analyses, including latent class estimation. Moreover, we adjusted for gender, but not for other demographic variables because they were intrinsic to the definition of the population, for instance, in terms of experts’ age. This meant that the analysis compared the items between population cohorts in a simple form and did not artificially equalise their characteristics, which would have caused over-adjustment and could have induced bias. A further limitation is that the sample is not representative of the Austrian population because it is generally younger and all respondents live in or around Vienna. In the future, the current methods could be expanded and developed into a full preference weight set using a representative Austrian general population sample. Finally, the research focused only on the capability items of the OxCAP-MH themselves and not their levels due to the focus of our research and the high number of possible item-level combinations, which would have proven challenging to incorporate. The assumption that levels carry proportionately constant weights may prove problematic for an actual value set development. Similar BWS studies developing weights for capability instruments, including the ICECAP-A and the Carer Experience Scale, found that greater value was placed on differences between the bottom and middle levels of items than between the middle and top levels [7, 44]. Future preference weight set developments should address the issue of preferences related to levels.
The indicative preference weight set could be used as the current best proxy to calculate individual scores for different capability states in economic evaluations [43].


This study found no evidence that preference weights for the items of the mental health-specific OxCAP-MH capability well-being instrument could not be elicited from alternative population cohorts with differing mental ill-health experience. In the interim, the current preference weight set can be seen as the best proxy for calculating individual scores for different capability states as defined by the OxCAP-MH in economic evaluations. Our approach can also serve as an example for other BWS preference elicitation studies for more complex patient-reported outcome measures (PROMs) in order to be used for cost-effectiveness analyses.

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.


  1. Fox-Rushby J, Cairns J., Approaches to measuring health and life, in Economic evaluation, Cairns J, Fox-Rushby J, Editors. 2008, Open University Press: London School of Hygiene and Tropical Medicine. p. 85-100.

  2. Simon, J., Health economic analysis of service provision (Chapter 23/136), in New Oxford Textbook of Psychiatry, A.N. Geddes JR, Goodwin GM, Editor. 2020, Oxford University Press: Oxford.

  3. Sen, A., Commodities and capabilities. 1985, Amsterdam; New York: North-Holland; sole distributors for the U.S.A. and Canada: Elsevier Science Pub. Co.

  4. Sen, A., Rationality and freedom. 2002: Harvard University Press.

  5. Nussbaum M. Capabilities as fundamental entitlements: Sen and social justice. Fem Econ. 2003;9(2–3):33–59.

  6. Simon J, et al. Cost and quality-of-life impacts of community treatment orders (CTOs) for patients with psychosis: economic evaluation of the OCTET trial. Soc Psychiatry Psychiatr Epidemiol. 2021;56(1):85–95.

  7. Flynn TN, et al. Scoring the ICECAP-A capability instrument. Estimation of a UK general population tariff. Health Econ. 2015;24(3):258–69.

  8. Ubel PA, et al. Misimagining the unimaginable: the disability paradox and health care decision making. Health Psychol. 2005;24(4S):S57-62.

  9. Froberg DG, Kane RL. Methodology for measuring health-state preferences—III: population and context effects. J Clin Epidemiol. 1989;42(6):585–92.

  10. Ludwig K, et al. To what extent do patient preferences differ from general population preferences? Value Health. 2021;24(9):1343–9.

  11. Dolan P, Gudex C. Time preference, duration and health state valuations. Health Econ. 1995;4(4):289–99.

  12. Gray, A.M., et al., Applied methods of cost-effectiveness analysis in healthcare. Vol. 3. 2011: Oxford University Press.

  13. Lorgelly PK, Lorimer K, Fenwick E, Briggs AH, The capability approach: developing an instrument for evaluating public health interventions: final report. 2008: Glasgow Centre for Population Health.

  14. Helter TM, C.J., Laszewska A, Stamm T, Simon J., Capability instruments in economic evaluations of health-related interventions – a comparative review of the literature. Quality of Life Research, 2020. 29: 1433–1464.

  15. Coast J, et al. Valuing the ICECAP capability index for older people. Soc Sci Med. 2008;67(5):874–82.

  16. Coast J, et al. Complex valuation: applying ideas from the complex intervention framework to valuation of a new measure for end-of-life care. Pharmacoeconomics. 2016;34(5):499–508.

  17. Netten A, Burge P, Malley J, Potoglou D, Towers AM, Brazier J, Flynn T, Forder J, Wall B. Outcomes of social care for adults: developing a preference-weighted measure. Health Technol Assess. 2012;16(16):1–166.

  18. Hajji A, et al. Population-based preference weights for the Adult Social Care Outcomes Toolkit (ASCOT) for service users for Austria: findings from a best-worst experiment. Soc Sci Med. 2020;250: 112792.

  19. Bahrampour M, et al. Discrete choice experiments to generate utility values for multi-attribute utility instruments: a systematic review of methods. Eur J Health Econ. 2020;21(7):983–92.

  20. Vergunst F, et al. Psychometric validation of a multi-dimensional capability instrument for outcome measurement in mental health research (OxCAP-MH). Health Qual Life Outcomes. 2017;15(1).

  21. Łaszewska A, Schwab M, Leutner E, et al. Measuring broader wellbeing in mental health services: validity of the German language OxCAP-MH capability instrument. Qual Life Res. 2019.

  22. Simon J, et al. Cultural and linguistic transferability of the multi-dimensional OxCAP-MH capability instrument for outcome measurement in mental health: the German language version. BMC Psychiatry. 2018;18(1).

  23. Katumba KR, Laurence YV, Tenywa P, Ssebunnya J, Laszewska A, Simon J, Greco G. Cultural and linguistic adaptation of the multi-dimensional OXCAP-MH for outcome measurement in mental health among people living with HIV/AIDS in Uganda: the Luganda version. J Patient Rep Outcomes. 2021.

  24. Helter TM, Kovacs I, Kanka A, Varga O, Kalman J, Simon J. Internal and external aspects of freedom in the application of the capability approach – the case study of developing a linguistically and culturally valid Hungarian version of the OxCAP-MH well-being questionnaire. BMC Psychol. 2021 (in press).

  25. Department of Health Economics, Center for Public Health, Medical University of Vienna. Development of the Chinese (simplified Chinese) version of the OxCAP-MH. 2022.

  26. Au-Yeung SK, et al. PAX-D: study protocol for a randomised placebo-controlled trial evaluating the efficacy and mechanism of pramipexole as add-on treatment for people with treatment resistant depression. Evid Based Ment Health. 2021.

  27. Steel C, et al. The IBER study: study protocol for a feasibility randomised controlled trial of Imagery Based Emotion Regulation for the treatment of anxiety in bipolar disorder. Pilot Feasibil Stud. 2020;6(1):83.

  28. Kingslake J, et al. The effects of using the PReDicT Test to guide the antidepressant treatment of depressed patients: study protocol for a randomised controlled trial. Trials. 2017;18(1).

  29. Helter TM, Boehler CE. Developing attributes for discrete choice experiments in health: a systematic literature review and case study of alcohol misuse interventions. J Subst Use. 2016;21(6):662–8.

  30. Chrzan K, Orme BK. Applied MaxDiff: a practitioner's guide to best-worst scaling. Sawtooth Software; 2019.

  31. Louviere JJ, Flynn TN. Using best-worst scaling choice experiments to measure public perceptions and preferences for healthcare reform in Australia. Patient Patient Cent Outcomes Res. 2010;3(4):275–83.

  32. Cheung KL, et al. Using best-worst scaling to investigate preferences in health care. Pharmacoeconomics. 2016;34(12):1195–209.

  33. Street AP, Street DJ. Combinatorics of experimental design. Oxford: Oxford University Press; 1986.

  34. Lagerkvist CJ, Okello J, Karanja N. Anchored vs. relative best–worst scaling and latent class vs. hierarchical Bayesian analysis of best–worst choice data: investigating the importance of food quality attributes in a developing country. Food Qual Preference. 2012;25(1):29–40.

  35. Huynh E, et al. Values for the ICECAP-Supportive Care Measure (ICECAP-SCM) for use in economic evaluation at end of life. Soc Sci Med. 2017;189:114–28.

  36. Yusof FA, Goh A, Azmi S. Estimating an EQ-5D value set for Malaysia using time trade-off and visual analogue scale methods. Value Health. 2012;15(1 Suppl):S85-90.

  37. Howell J. CBC/HB for beginners. 2009.

  38. Mühlbacher AC, et al. Experimental measurement of preferences in health and healthcare using best-worst scaling: an overview. Health Econ Rev. 2016;6(1):1–14.

  39. Orme B. Hierarchical Bayes: Why all the attention? 2000.

  40. Sawtooth Software. Identifying "Bad" Respondents. [cited 2020].

  41. R Core Team. R: A language and environment for statistical computing. 2020.

  42. StataCorp. Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC; 2019.

  43. Proud L, McLoughlin C, Kinghorn P. ICECAP-O, the current state of play: a systematic review of studies reporting the psychometric properties and use of the instrument over the decade since its publication. Qual Life Res. 2019.


Acknowledgements

We thank all participants of the study.


Funding

The study received no funding.

Author information

Authors and Affiliations



Contributions

JS conceived the original study idea. TH and JS developed the conceptual framework and methods. JS provided the resources for this study. TH, AK, JB, JW and AS contributed to the data collection. TH conducted the analysis, with input on different aspects of the study from GH and JS. TH took the lead in writing the manuscript in close consultation with JS. All authors provided critical feedback and helped shape the research, analysis and manuscript. All authors approved the final manuscript.

Corresponding author

Correspondence to Judit Simon.

Ethics declarations

Ethics approval and consent to participate

The study received approval from the Medical University of Vienna (EK Medizinische Universität Wien: 1779/2019). All participants received oral and written information on the study and were asked to give informed written consent prior to participation in the study.

Consent for publication

All authors consent.

Competing interests

JS has led the development of the OxCAP-MH measure. The remaining authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Appendix 1: List of OxCAP-MH items, questions and attributes included in the BWS task. Appendix 2: Mean Relative Importance Scores and Standard Deviations of OxCAP-MH domains calculated by hierarchical Bayes estimation. Appendix 3: Sample BWS questionnaire (translated from German). Appendix 4: Mean Relative Importance Scores (Hierarchical Bayes estimates) by cohort (n = 158).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Helter, T.M., Kaltenboeck, A., Baumgartner, J. et al. Does the relative importance of the OxCAP-MH’s capability items differ according to mental ill-health experience? Health Qual Life Outcomes 20, 99 (2022).


Keywords

  • Capabilities
  • OxCAP-MH
  • Mental health
  • Economic evaluation
  • Value set
  • Preference
  • Best–worst-scaling
  • Hierarchical Bayes estimation