Valuation of preference-based measures: can existing preference data be used to generate better estimates?

Background: Experimental studies to develop valuations of health state descriptive systems like EQ-5D or SF-6D need to be conducted in different countries, because social and cultural differences are likely to lead to systematically different valuations. There is scope to use the evidence from one country to help with the design and analysis of a study in another, so that utility estimates for the second country can be generated much more precisely than would be possible by collecting and analyzing that country's data alone.
Methods: We analyze SF-6D valuation data elicited from representative samples of the Hong Kong (HK) and United Kingdom (UK) general adult populations, who used the standard gamble technique to value 197 and 249 health states respectively. We apply a nonparametric Bayesian model to estimate a HK value set, using the UK dataset as an informative prior to improve its estimation. The estimates are compared with a HK value set estimated from the HK values alone, using mean predictions and root mean square error.
Results: The novel method of modelling utility functions permitted the UK valuations to contribute significant prior information to the Hong Kong analysis. The results suggest that using HK data alongside the existing UK data produces better HK utility estimates than using the HK study data by itself.
Conclusion: The promising results suggest that existing preference data could be combined with a valuation study in a new country to generate preference weights, making own-country value sets more achievable for low- and middle-income countries. Further research is encouraged.


Background
Health resource allocation is becoming increasingly important in an economic climate of increasing demands on healthcare systems with constrained budgets. Economic evaluation using cost-utility analysis has become a widely used technique internationally to inform resource allocation decisions. Cost-utility analysis measures benefits using Quality Adjusted Life Years (QALYs), a measure that multiplies a quality adjustment for health by the duration of that state of health [1]. The quality adjustment weight is a utility value on a scale where 1 denotes full health and 0 denotes dead, and it is most often generated using an existing preference-based measure. Such a measure consists of a classification system used to describe health (patients report their own health, which is assigned to a health state using the classification system) and a value set that generates a utility value for every health state defined by the classification system.
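As a minimal illustration of the QALY definition above, the sketch below multiplies a utility weight by the time spent in each state and sums over states. The function name `qalys` and the example health profile are hypothetical and are not taken from the paper.

```python
def qalys(profile):
    """Sum utility x duration over a sequence of (utility, years) pairs."""
    return sum(utility * years for utility, years in profile)


# Hypothetical profile: 2 years at utility 0.8, then 3 years at utility 0.5.
example_profile = [(0.8, 2.0), (0.5, 3.0)]
total_qalys = qalys(example_profile)
```

A year in full health (utility 1) contributes one QALY, and time spent in a state valued at 0 contributes nothing, however long it lasts.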
Among the large number of currently available preference-based measures of health-related quality of life (HRQoL) are the generic EuroQol five-dimensional (EQ-5D) questionnaire [2], the Health Utilities Index Mark 2 (HUI2) and Mark 3 (HUI3) [3,4], the Assessment of Quality of Life (AQoL) [5], the Quality of Well-being scale (QWB) [6], and the six-dimensional health state short form (SF-6D), derived from the Short Form 36 health survey [7]. There is also an increasing number of condition-specific measures [8].
There is now an increasing number of preference datasets in which preferences have been elicited for the same measure in different countries. Kharroubi et al. [9] used a novel nonparametric Bayesian approach to model the disparities between the United States (US) and the UK; this approach is simpler, better fitting and more appropriate for the data than the conventional parametric model previously adopted by Johnson et al. [10]. The approach has also been applied to the joint UK-Hong Kong and UK-Japan SF-6D data sets [11,12]. The nonparametric Bayesian model offers a major added advantage: it permits the findings for country 1 to be used to improve those for country 2, so that the utility estimates generated for the second country are more precise than they would have been if that country's data had been collected and analyzed on their own.
There are two distinct situations to consider. When a large quantity of data is available for both countries, good estimates of the population utility function for each country can be generated by analyzing each country's data on its own (using the model of [13]), and this is the best option. However, when a substantial quantity of data is available for one country but only limited data for another, there is scope to borrow strength from country 1 in order to obtain better population utility estimates for the second country than those generated by analyzing that country's data on its own.
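The idea of borrowing strength can be illustrated with a deliberately simple conjugate normal update. This is a toy stand-in, not the nonparametric model used in the paper: a single utility is given a normal prior (playing the role of country 1's results) and is updated with a small sample from country 2 whose sampling variance is assumed known. All names and numbers are hypothetical.

```python
def normal_posterior(prior_mean, prior_var, sample_mean, sample_var, n):
    """Posterior mean and variance for a normal mean with known sampling variance."""
    prior_precision = 1.0 / prior_var
    data_precision = n / sample_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return post_mean, post_var


# Informative prior (hypothetical country 1 result) versus a very vague prior,
# both combined with the same small hypothetical sample from country 2.
informative = normal_posterior(0.6, 0.01, 0.5, 0.04, 4)
vague = normal_posterior(0.6, 1e6, 0.5, 0.04, 4)
```

With the informative prior the posterior variance is roughly half that obtained under the vague prior, which is the sense in which a small second-country sample can yield more precise estimates when prior evidence is available.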
Recently, Kharroubi [14,15] developed a modified nonparametric Bayesian statistical method that permits evidence from one country to be used as substantial prior information for a study in another, and employed this method in the analysis of a valuation study for EQ-5D in the US using the already existing UK data. A crucial assumption underlying this analysis was that the preferences of the UK population are in essence the same as those of the US population; in addition, both countries had plenty of data. However, different countries have different population compositions, work, cultures and languages. These can all affect the relative values given to different dimensions of health (for example, self-care and anxiety/depression) as well as where on the 1-0 full health-dead scale each health state lies.
The present paper seeks to explore the use of such a model in the context of smaller countries with different cultures. This is explored using a case study of SF-6D HK and UK data, in which the health states valued in the HK valuation study are modelled using the already existing UK dataset, and the resulting estimates are compared with those generated by modelling the HK data alone. It should be noted that this method was previously used to model the US/UK data (the Kharroubi et al. [14,15] articles describe this at length), so the method given in this article is a replication of that method. Hence, although the article does not present new methodological developments, it reinforces the key point made in the Kharroubi et al. [14,15] articles, namely the good performance of the new modelling approach.
First, the SF-6D valuation surveys and the UK and HK data used are summarized. Second, the Bayesian nonparametric model is described, and third, the results are presented. Finally, the results are discussed, including limitations and suggestions for future research.

Methods
The SF-6D
The SF-6D includes six health dimensions: physical functioning, role limitation, social functioning, bodily pain, mental health and vitality, each with between four and six levels [7]. An SF-6D health state is defined by selecting one level from each dimension, with physical functioning first and vitality last. The different combinations result in 18,000 possible health states, each associated with a six-digit descriptor ranging from 111111, representing full health, to 645655, representing the worst possible state, called "the pits".
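The count of 18,000 states follows directly from the number of levels per dimension, which can be read off the worst state 645655. The short sketch below enumerates the descriptive system under that reading; the names `MAX_LEVELS`, `all_states` and `descriptor` are illustrative only.

```python
from itertools import product
from math import prod

# Number of levels per dimension, in descriptor order (physical functioning,
# role limitation, social functioning, pain, mental health, vitality),
# taken from the digits of the worst state 645655.
MAX_LEVELS = (6, 4, 5, 6, 5, 5)


def all_states():
    """Yield every SF-6D state as a tuple of six levels."""
    return product(*(range(1, m + 1) for m in MAX_LEVELS))


def descriptor(state):
    """Six-digit descriptor for a state, e.g. (1, 1, 1, 1, 1, 1) -> '111111'."""
    return "".join(str(level) for level in state)


n_states = prod(MAX_LEVELS)  # 6 * 4 * 5 * 6 * 5 * 5
```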

The valuation survey and data set

UK
A sample of 249 health states described by the SF-6D was valued by a representative sample of the UK population (n = 836). The methods used to select respondents and health states are described elsewhere [7]. Each respondent was asked to rank and value six health states using the McMaster 'ping pong' variant of the standard gamble (SG) technique. In the first five questions, each of five SF-6D health states was valued against perfect health and against the "pits". In the sixth question, respondents valued the "pits" state itself: they indicated whether they perceived it as better or worse than death and then considered one of the following choices: (i) the certain prospect of being in the "pits" state versus the uncertain prospect of full health or immediate death; or (ii) the certain prospect of death versus the uncertain prospect of full health or the "pits" state [16]. Negative values, which indicate that a state is valued as worse than death, were bounded at −1 [17]. The other five health states were then chained onto the zero to one scale, where 0 corresponds to being dead and 1 corresponds to perfect health [7]. The dependent variable (y) in the models below is therefore the adjusted SG value.
Of the original 836 respondents, 225 had to be excluded for several reasons. For instance, 130 respondents failed to value the "pits" state, so their data could not be processed any further [10]. Among the 611 included respondents, 148 values were missing from 117 respondents, resulting in a total of 3518 observed SG valuations across the 249 health states. Details of the valuation of the 249 SF-6D UK health states can be found in [7].

Hong Kong
The HK study comprised a sample of 197 health states (selected according to the UK procedures), which were valued using the same valuation procedures as in the UK study [18]. Each respondent was asked to rank and value eight health states, and the interview procedure was modelled on that of the UK study.
Of the original 641 respondents, 59 were excluded from the analysis according to the same exclusion conditions as in the UK study [6], leaving data from 582 respondents for analysis. Each of the 582 respondents was asked to make 8 SG valuations, giving 4656 possible valuations. Of these, 60 health state values were missing, so 4596 observed SG valuations across 197 health states were finally included in the analysis. Details of the valuation of the 197 SF-6D HK health states can be found in [18].
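The sample accounting for the two surveys can be checked with simple arithmetic: included respondents times valuations per respondent, minus missing values. The helper name below is hypothetical; the figures are those reported in the text.

```python
def observed_valuations(respondents, states_per_respondent, missing):
    """Observed valuations = respondents x states valued each - missing values."""
    return respondents * states_per_respondent - missing


# UK: 836 interviewed, 225 excluded, 6 states each, 148 missing values.
uk_observed = observed_valuations(836 - 225, 6, 148)

# HK: 641 interviewed, 59 excluded, 8 states each, 60 missing values.
hk_observed = observed_valuations(641 - 59, 8, 60)
```

Both results agree with the observed totals reported for the two studies (3518 for the UK and 4596 for HK).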

Modelling
The modelling approach is described in Kharroubi [14], where a nonparametric Bayesian model was used to model the US EQ-5D dataset with the already existing UK dataset as an informative prior. In this article, we build on that work to examine whether analyzing the HK health state valuations while drawing extra information from the UK data generates better estimates than analyzing the HK sample by itself. The estimates are compared using several prediction criteria, including predicted versus actual mean health state valuations, mean prediction error and root mean square error.
Kharroubi [14] proposes the following model:
$$y_{ij} = 1 - \alpha_j \{1 - u(x_{ij})\} + \varepsilon_{ij}, \qquad (1)$$
where, for $i = 1, 2, \ldots, I_j$ and $j = 1, 2, \ldots, J$, $x_{ij}$ is the $i$-th health state valued by respondent $j$ in the HK experiment, $y_{ij}$ is respondent $j$'s standard gamble (SG) valuation of that health state, $\alpha_j$ is a term that allows for the individual characteristics of respondent $j$ and $\varepsilon_{ij}$ is a random error term. Letting $t_j$ be a vector of covariates representing the individual characteristics of respondent $j$, Kharroubi [14] proposes the following distributions:
$$\alpha_j \sim \mathrm{LN}(t_j^{T}\gamma, \tau^2) \quad \text{and} \quad \varepsilon_{ij} \sim N(0, \upsilon^2),$$
where $\gamma$ is the vector of coefficients for the covariates and $\tau^2$ and $\upsilon^2$ are further parameters to be estimated.
We next let $u(x)$ and $u_{UK}(x)$ be the utility functions for health state $x$ valued in the HK and UK experiments respectively. Kharroubi [14] then models the prior distribution for $u(x)$ as multivariate normal with mean defined as
$$E(u(x)) = \gamma + \beta^{T} x + E(u_{UK}(x)) \qquad (2)$$
and variance-covariance matrix
$$\mathrm{cov}(u(x), u(x')) = \sigma^2 c(x, x') + \mathrm{cov}(u_{UK}(x), u_{UK}(x')), \qquad (3)$$
where $E(u_{UK}(x))$ is the expected value of the utility of health state $x$ and $\mathrm{cov}(u_{UK}(x), u_{UK}(x'))$ is the variance-covariance matrix between $u_{UK}(x)$ and $u_{UK}(x')$ for two different states $x$ and $x'$ in the UK experiment, both of which are readily available from the analysis of the UK study. In Eqs. 2 and 3, $x$ represents a vector consisting of discrete levels on each of the six health dimensions, and $\gamma$, $\beta$ and $\sigma^2$ are unknown parameters. It follows from Kharroubi [14] that the mean function of $u(x)$ represents a prior expectation that the utility will be approximately a simple additive linear function of the dimension levels in $x$. Additionally, the true function is allowed to deviate around this mean according to its multivariate normal distribution, and so it can assume any form. It is in this sense that the Bayesian model is described as nonparametric. Furthermore, the correlation $c(x, x')$ between $u(x)$ and $u(x')$ is expected to be high when $x$ and $x'$ are close enough, and is given by
$$c(x, x') = \exp\Big\{-\sum_{d} b_d (x_d - x'_d)^2\Big\},$$
where $b_d$ is a roughness parameter in dimension $d$ that controls the extent to which the true utility function is expected to adhere to a linear form in that dimension. Many other choices have been made for this covariance function (see, for example, [19] or [20]), but the resulting estimates are generally not sensitive to the choice, and the proposed form is appropriate here [13]. See Kharroubi et al. [14] for more details. Finally, the novel method of modelling the utility function $u(x)$, represented by adding the two terms $E(u_{UK}(x))$ and $\mathrm{cov}(u_{UK}(x), u_{UK}(x'))$ in Eqs. 2 and 3, allows the already existing UK evidence to contribute significant prior knowledge to the HK study. In other words, the posterior density of the UK utility function is treated as a prior density for the analysis of the new study in HK.
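The structure of this prior can be sketched in a few lines. This is a hedged illustration rather than the author's Matlab implementation: it assumes the forms used in the cited Kharroubi papers, namely an observation model y = 1 - alpha * (1 - u(x)) + error, a linear prior mean shifted by the UK posterior mean, and a squared-exponential correlation scaled by sigma^2 and added to the UK posterior covariance. All function names and numeric values below are hypothetical.

```python
import math


def correlation(x, x_prime, roughness):
    """Assumed form c(x, x') = exp(-sum_d b_d * (x_d - x'_d)^2)."""
    return math.exp(-sum(b * (a - c) ** 2
                         for b, a, c in zip(roughness, x, x_prime)))


def prior_mean(x, gamma, beta, uk_mean):
    """Linear term gamma + beta'x plus the UK posterior mean for state x."""
    return gamma + sum(b * level for b, level in zip(beta, x)) + uk_mean


def prior_cov(x, x_prime, sigma2, roughness, uk_cov):
    """sigma^2 * c(x, x') plus the UK posterior covariance for the pair."""
    return sigma2 * correlation(x, x_prime, roughness) + uk_cov


def expected_valuation(alpha, utility):
    """Noise-free part of the assumed observation model: 1 - alpha * (1 - u)."""
    return 1.0 - alpha * (1.0 - utility)


# Hypothetical roughness values and states, for illustration only.
roughness = [0.1] * 6
state_a = (1, 1, 1, 1, 1, 1)
state_b = (2, 1, 1, 1, 1, 1)   # one level away from state_a
state_c = (6, 4, 5, 6, 5, 5)   # the "pits"
near = correlation(state_a, state_b, roughness)
far = correlation(state_a, state_c, roughness)
```

Under this form, states that differ by a single level are highly correlated while distant states are nearly uncorrelated, so the data on one state inform mainly its neighbours. Setting the UK terms to zero recovers a prior that uses no UK information.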
The full theory of the Bayesian approach used here is discussed in Kharroubi [14]. Programs to undertake the Bayesian analysis were written in Matlab and are available on request.

Results
The new modelling approach is now applied to the analysis of the SF-6D HK study using the previously existing UK study (referred to as the HK/UK model hereinafter). From a Bayesian perspective, the old posterior contains all that we know before seeing the new data, and so it becomes the new prior distribution. Thus, for our analysis, the posterior of the UK utility function becomes the prior for the analysis of the HK study. The estimates are compared with those obtained from the HK data alone, excluding the UK data (referred to as the HK model hereinafter), using several prediction criteria, including predicted versus actual mean health state valuations, mean prediction error and root mean square error, along with Bland-Altman agreement plots [21].
Figure 1 shows the HK predicted and observed mean valuations for the 197 health states evaluated in the sample, along with perfect health, sorted by the predicted valuations. Figure 1a shows the predicted (line marked with squares) and actual (line marked with diamonds) mean valuations using the HK model. The line marked with triangles denotes the errors, computed as the difference between the two valuations. Figure 1b shows the corresponding results obtained using the HK/UK model. The plots show that the HK/UK estimates of the utilities for the various SF-6D health states are more precise than the HK-only estimates. The plots also reveal that the HK model tends to underpredict at low health state values (that is, for poor health states), whereas this is not the case for the HK/UK model. Additionally, the error line for the HK model fluctuates more widely and shows a non-steady trend, which suggests that the HK/UK model is less susceptible to systematic bias.
Fig. 1 Sample mean and predicted health state valuations for (a) the HK model and (b) the HK/UK model
Figure 2a and b depict the Bland-Altman agreement plots for the HK and HK/UK models. In these plots, the difference between the observed and predicted mean valuations is plotted against the mean of the two valuations. The solid line corresponds to the mean difference (the average bias), whereas the dotted lines depict the 95% limits of agreement, drawn to help judge visually how well the two valuations agree. The narrower the range between these two limits, the better the agreement. Comparing the two figures, the HK/UK model shows better agreement: the width of its 95% limits of agreement is 0.163, narrower than the 0.197 of the HK model. The mean bias also differs between the two models, with values of 0.0116 for the HK/UK model and 0.0175 for the HK model. Moreover, the standard deviation of the differences is smaller for the HK/UK model (0.0416) than for the HK model (0.0503), which is consistent with the wider spread of the differences in Fig. 2a compared with Fig. 2b.
Table 1 provides the inferences for the utilities of the 197 states evaluated in the study, along with perfect health. It displays the actual mean and the standard error corresponding to each health state for both models. The results for the population utilities from the UK, which were treated as prior information in the HK/UK model, are also provided. Across the 197 health states presented in Table 1 (excluding the perfect health state), the HK/UK model has better predictive performance overall than the HK model: it has a root mean square error (RMSE) of 0.045, whereas the HK model has an RMSE of 0.051.
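The agreement statistics discussed above can be computed from paired observed and predicted mean valuations. The sketch below assumes the standard Bland-Altman definitions (bias as the mean difference, limits of agreement at the bias plus or minus 1.96 sample standard deviations) and the usual RMSE; the function names and the three-point example data are hypothetical and are not values from Table 1.

```python
import math
from statistics import mean, stdev


def bland_altman(observed, predicted):
    """Return (bias, sd, lower limit, upper limit) of the paired differences."""
    differences = [o - p for o, p in zip(observed, predicted)]
    bias = mean(differences)
    sd = stdev(differences)  # sample standard deviation (n - 1 denominator)
    return bias, sd, bias - 1.96 * sd, bias + 1.96 * sd


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


# Hypothetical toy data, for illustration only.
observed = [0.5, 0.6, 0.7]
predicted = [0.4, 0.6, 0.8]
bias, sd, lower, upper = bland_altman(observed, predicted)
error = rmse(observed, predicted)
```

The width of the limits of agreement is 2 × 1.96 × sd, so the reported standard deviations of 0.0416 and 0.0503 correspond to widths of about 0.163 and 0.197, matching the figures quoted in the text.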
Additionally, Table 1 indicates other noteworthy differences between the HK and HK/UK models. For the pits state, for instance, the HK model predicts a value of 0.0983 although the actual average for this state is 0.067, whereas the HK/UK model predicts a value of 0.0708. Furthermore, the standard deviations corresponding to the HK/UK model are smaller as a result of using the UK results as priors, indicating more precise estimates. Differences in performance based on monotonicity are also apparent. Of the total 18,000 health states defined by the SF-6D descriptive system, 10,000 were sampled at random without replacement. Each of these 10,000 health states has between 6 and 12 adjacent health states. Selecting one adjacent state at random for each sampled state gave 10,000 adjacent pairs. Of these 10,000 adjacent pairs, 20% display non-monotonicity in the HK model compared with 10% in the HK/UK model.
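The monotonicity check can be sketched as follows. The sketch assumes that two states are adjacent when they differ by exactly one level on a single dimension (which gives the 6 to 12 neighbours mentioned above) and that a higher level is always worse; both are assumptions made for illustration. The utility functions passed in are hypothetical toys, not the fitted HK or HK/UK models.

```python
import random
from itertools import product

MAX_LEVELS = (6, 4, 5, 6, 5, 5)  # levels per SF-6D dimension
ALL_STATES = list(product(*(range(1, m + 1) for m in MAX_LEVELS)))


def adjacent_states(state):
    """States that differ from `state` by one level on exactly one dimension."""
    neighbours = []
    for d, level in enumerate(state):
        for step in (-1, 1):
            new_level = level + step
            if 1 <= new_level <= MAX_LEVELS[d]:
                neighbours.append(state[:d] + (new_level,) + state[d + 1:])
    return neighbours


def non_monotonic_share(utility, n_pairs, seed=0):
    """Share of sampled adjacent pairs where the worse state gets higher utility."""
    rng = random.Random(seed)
    sampled = rng.sample(ALL_STATES, n_pairs)  # without replacement
    violations = 0
    for state in sampled:
        neighbour = rng.choice(adjacent_states(state))
        # The state with the smaller level total is the better one of the pair.
        better, worse = sorted((state, neighbour), key=sum)
        if utility(worse) > utility(better):
            violations += 1
    return violations / n_pairs


def toy_additive_utility(state):
    """Hypothetical utility that falls by 0.03 for every level above 1."""
    return 1.0 - 0.03 * sum(level - 1 for level in state)


share = non_monotonic_share(toy_additive_utility, 1000)
```

A strictly additive, decreasing utility such as the toy above produces no violations; a fitted nonparametric surface can produce some, and a lower share indicates a surface that better respects the ordering of the descriptive system.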
A clearer presentation of the differences between the HK and HK/UK models is given in Fig. 3, which plots the fitted values from the HK model (Fig. 3a) and the HK/UK model (Fig. 3b) against the observed values for the 198 health states, together with the line of perfect prediction, a 45° unity line (solid line). Ideally, the fitted values from the two models should lie close to the unity line. Comparing the two plots, it is clear that the estimates in Fig. 3b lie closer to the unity line than those in Fig. 3a, which shows a larger scatter, with valuations that deviate substantially from the 45° line. This again indicates that the HK/UK model provides more precise predictions than the HK model.

Discussion
In this paper, we have applied a nonparametric Bayesian model to estimate the utility values of health states based on the SF-6D descriptive system. This model was undertaken in an effort to use the already existing information from one country to serve as an informative prior for a study in another. The methodology was applied to the HK SF-6D data set using the already available UK valuation, whereby the posterior of the UK utility function was used as a substantial prior to evaluate the new HK study. The method given here is a replication of that used in modelling the US/UK data (the Kharroubi et al. [14,15] articles describe this fully). Hence, though it does not present new methodological developments, it further accentuates the key point made in the Kharroubi et al. [14,15] articles, i.e. the good performance of the new modelling approach.
A crucial assumption underlying the US/UK analyses (Kharroubi et al. [14,15]) was that the preferences of the UK population are in essence the same as those of the US population; in addition, both countries had plenty of data. The novelty of the analysis presented here is that it explores the use of the new modelling approach in the context of smaller countries with different population compositions, work, cultures and languages, all of which can affect the relative values given to different dimensions of health (for example, self-care and anxiety/depression) as well as where on the 1-0 full health-dead scale each health state lies. This is explored using a case study of SF-6D HK and UK data, in which the HK valuations are modelled using the already existing UK dataset and the resulting estimates are compared with those generated by modelling the HK data alone. The new modelling of the utility function permitted the already existing UK dataset to contribute significant prior information to the HK analysis, which supports the generalisability of the approach: experience in a European country was used to aid the analysis of a study in an Asian country. Consequently, much more precise estimates of the HK utilities for the various SF-6D health states were obtained using the HK/UK model than would have been the case if the data from the HK study had been used on their own, and these estimates respect the inherent monotonicity of the underlying utility measure more closely. Careful model diagnostics confirm that the HK/UK model performs well, and better than the HK model.
Fig. 3 Sample mean and predicted health state valuations for (a) the HK model and (b) the HK/UK model
The nonparametric Bayesian model offers a major added advantage: when plenty of data are available for one country and only limited data for another, it permits the results for country 1 to be used to improve those for country 2, so that the utility estimates generated for the second country are much more precise than they would have been if that country's data had been collected and analyzed on their own. This in turn reduces the need to undertake large surveys in every country using costly and often time-consuming face-to-face interviews with techniques such as SG and TTO. To our knowledge, this concept has not yet been investigated properly, but it clearly has a lot of potential value. Further research is underway to assess this.
Experimental studies to develop valuations of health state descriptive systems like EQ-5D, HUI or SF-6D need to be conducted in different countries, and such work is costly and potentially wasteful. The work presented here suggests that using already existing data as substantial prior information improves the accuracy of prediction, thereby reducing the number of states that need to be valued, which in turn reduces the cost of cross-country valuation. Work to demonstrate this idea in a smaller country setting is still in progress.
One limitation of this study is that, because many international agencies recommend the use of own-country value sets to generate QALYs, it is unclear whether a value set generated from own-country data modelled alongside another country's dataset would be acceptable. However, this may not be a concern if the estimates are accurate and the ordering of health states and their location on the 1-0 full health-dead scale are similar to those achieved using a large-scale valuation study.
Our basic model (Eq. 1) has the potential to allow for more than two countries to be analysed. Additionally, it would be possible to generalize Eqs. 2 and 3 to handle more than two countries. Indeed, we can generalize further to a generic form in which $\sum_{k=1}^{n} E(u_k(x))$ is the total mean utility of health state $x$ and $\sum_{k=1}^{n} \mathrm{cov}(u_k(x), u_k(x'))$ is the total variance-covariance matrix between $u_k(x)$ and $u_k(x')$ for two different states $x$ and $x'$, all of which are readily available from the analysis of the data from the $n$ available countries.
A final note concerns the potential impact of our study in terms of health and quality of life gains. Table 1 shows that health state 635651, for instance, has an estimated utility value of 0.3799 from the HK model and 0.4841 from the HK/UK model. Thus, the difference in utility estimates is nearly 0.11. This could shift the QALYs gained from a treatment that prolongs life by 1 year from 0.5 to 0.61. If such a treatment costs £12,000, for example, the cost per QALY would decrease from £24,000 to £19,672, bringing it below the cost-effectiveness threshold used by the National Institute for Health and Clinical Excellence. In other words, it could influence whether or not a treatment is funded. Heijink et al. [22] found an analogous impact of different valuation functions on QALYs.
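The cost-per-QALY figures in this example can be reproduced with a one-line calculation. The function name is hypothetical; the cost and QALY values are those used in the text, with the cost taken to be in pounds.

```python
def cost_per_qaly(cost, qalys_gained):
    """Cost divided by QALYs gained; QALYs gained must be positive."""
    if qalys_gained <= 0:
        raise ValueError("QALYs gained must be positive")
    return cost / qalys_gained


# Treatment cost of 12,000 with 0.5 versus 0.61 QALYs gained.
before = cost_per_qaly(12_000, 0.50)
after = cost_per_qaly(12_000, 0.61)
```

Rounded to the nearest pound, these give the £24,000 and £19,672 quoted above, which shows how a change of about a tenth in the utility estimate can move a treatment across a funding threshold.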

Conclusion
In conclusion, this novel method of modelling utility functions permitted the UK data to contribute considerable prior information to the HK analysis. Consequently, estimates of the HK utilities for the various SF-6D health states could be generated much more precisely than would have been the case if the data from the HK study had been used alone. This is likely to allow much smaller studies than those employed to date when developing valuations for new countries. The promising results suggest that existing preference data could be combined with a valuation study in a new country to generate preference weights, making own-country value sets more achievable for low- and middle-income countries.

Availability of data and materials
Publicly available datasets have been used for this study.

Author's contributions
SAK solely carried out the data analysis, and wrote and approved the manuscript.

Authors' information
SAK is an associate professor in Biostatistics based in the Department of Nutrition and Food Sciences at the American University of Beirut. SAK's research is in the theory and applications of Bayesian statistics. His main area of research and consulting activity is in the theory of inference, computational aspects of Bayesian statistics and in Bayesian modelling generally. He has been involved in many application areas, particularly in medicine and Health Economics.

Ethics approval and consent to participate
Secondary publicly available data were used in this study.

Competing interests
The author declares that he has no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.