A head-to-head comparison of the EQ-5D-5L and 15D descriptive systems and index values in a general population sample
Health and Quality of Life Outcomes volume 21, Article number: 17 (2023)
The EQ-5D-5L and 15D are generic preference-accompanied health status measures with similar dimensions. In this study, we aim to compare the measurement properties of the EQ-5D-5L and 15D descriptive systems and index values in a general population sample.
In August 2021, an online cross-sectional survey was conducted in a representative adult general population sample (n = 1887). The EQ-5D-5L and 15D descriptive systems and index values were compared in terms of ceiling and floor, informativity (Shannon’s Evenness index), agreement, convergent and known-groups validity for 41 chronic physical and mental health conditions. Danish value sets were used to compute index values for both instruments. As a sensitivity analysis, index values were also estimated using the Hungarian EQ-5D-5L and Norwegian 15D value sets.
Overall, 270 (8.6%) and 1030 (3.4 × 10⁻⁶%) unique profiles occurred on the EQ-5D-5L and 15D, respectively. The EQ-5D-5L dimensions (0.51–0.70) demonstrated better informativity than those of 15D (0.44–0.69). EQ-5D-5L and 15D dimensions capturing similar areas of health showed moderate or strong correlations (0.558–0.690). The vision, hearing, eating, speech, excretion and mental function 15D dimensions demonstrated very weak or weak correlations with all EQ-5D-5L dimensions, which may indicate potential room for EQ-5D-5L bolt-ons. The 15D index values showed a lower ceiling than the EQ-5D-5L (21% vs. 36%). The mean index values were 0.86 for the Danish EQ-5D-5L, 0.87 for the Hungarian EQ-5D-5L, 0.91 for the Danish 15D and 0.81 for the Norwegian 15D. Strong correlations were found between the index values (Danish EQ-5D-5L vs. Danish 15D 0.671, Hungarian EQ-5D-5L vs. Norwegian 15D 0.638). Both instruments were able to discriminate between all chronic condition groups with moderate or large effect sizes (Danish EQ-5D-5L 0.688–3.810, Hungarian EQ-5D-5L 1.233–4.360, Danish 15D 0.623–3.018 and Norwegian 15D 1.064–3.816). Compared to the 15D, effect sizes were larger for the EQ-5D-5L in 88–93% of chronic condition groups.
This is the first study to compare the measurement properties of the EQ-5D-5L and 15D in a general population sample. Despite having 10 fewer dimensions, the EQ-5D-5L performed better than the 15D in many aspects. Our findings help to understand the differences between generic preference-accompanied measures and support resource allocation decisions.
Generic preference-accompanied measures (PAMs) are health status measures that consist of two parts: the first is a descriptive system, and the second is a value set that allows assigning utilities to health profiles defined by the descriptive system. Over the past decades, an increasing number of generic PAMs have been developed, such as the EQ-5D, Short-Form 6-Dimension (SF-6D), Assessment of Quality of Life (AQoL) and Health Utilities Index (HUI). Despite the abundance of PAMs, the most commonly used one on an international level is the EQ-5D [2, 3]. In over 20 countries, national health technology assessment guidelines recommend the use of this instrument in cost-utility analyses of health interventions. It was developed in the late 1980s by an international organization, the EuroQol Group, and currently, it has two versions for adults, the EQ-5D-3L and the newer EQ-5D-5L [5, 6]. Both versions showed good validity in several countries, languages, and patient populations [7, 8].
The 15D is a 15-dimensional generic PAM, which was developed in Finland starting in the 1970s. The instrument has been validated in numerous patient populations and translated into multiple languages; however, its use is predominantly concentrated in the Nordic countries. Country-specific 15D value sets have been developed in Finland, Denmark, and Norway [13, 14]. Two countries (Norway and Chile) mention the 15D as an acceptable instrument in their health technology assessment guidelines [15, 16]. It has also been used in several cost-effectiveness and cost-utility analyses in different countries and as part of health technology assessment dossiers, in conditions such as hip and knee arthrosis, Parkinson’s disease, cataract, acute and chronic liver failure [20, 21] and anorexia nervosa in Finland, multiple myeloma in three Nordic countries (Denmark, Norway, Sweden), stress urinary incontinence in Canada, and breast cancer in Iran.
Compared to the EQ-5D, the descriptive system of the 15D is considerably longer, more detailed, and more comprehensive. Nevertheless, the 15D and EQ-5D-5L instruments are similar in many aspects, which offers a strong basis for comparison. Firstly, a range of corresponding dimensions can be found between the two measures with similar wording, such as mobility, usual activities, pain/discomfort, and anxiety/depression/distress. Secondly, on both instruments each dimension of health has one item with five response levels measured on a severity or capability scale. Finally, both instruments investigate the current health status of the respondent. A few studies compared the measurement properties of the EQ-5D-3L and 15D in different patient populations, such as epilepsy, HIV/AIDS and stroke in Norway, and multiple chronic conditions, chronic pain, a critical care setting and patients after cardiac surgery in Finland. However, to date, only one study has examined the psychometric properties of the EQ-5D-5L and 15D, in a sample of patients with Parkinson’s disease in Spain. Furthermore, no studies have provided a comparison of the measurement properties of either the EQ-5D-3L or EQ-5D-5L and 15D in a general population sample.
Comparing PAMs in different populations is important to inform researchers, analysts and health policy decision-makers about their performance and support the choice of instrument for cost-utility analysis. Although the EQ-5D-5L has proved to be a valid instrument in a multitude of health conditions, it might not capture all important aspects of health, especially in sensory disorders and mental health conditions. Furthermore, a clear need emerged to include extra dimensions in the EQ-5D-5L, so-called “bolt-ons”. In that respect, the 15D with its broader descriptive system may offer advantages over the EQ-5D-5L. On the other hand, the 15D needs to fit many more dimensions into the utility space, allowing on average less space for each separate dimension. Given the abovementioned similarities between the EQ-5D-5L and 15D, one may almost consider the 15D a variant of the EQ-5D-5L expanded with bolt-ons. Interestingly, five of the 15 dimensions of 15D have a corresponding EQ-5D dimension and further eight have earlier been proposed as possible bolt-ons for the EQ-5D (vision, sleeping, hearing, vitality, speech, breathing, mental function, and sexual activities). A comparative analysis between the two instruments may also provide new evidence that can later support the development of candidate bolt-on dimensions.
Therefore, in this study, we aim to conduct an exploratory analysis that compares the measurement properties of the EQ-5D-5L and 15D in a large general population sample in Hungary. We compare measurement properties of both the descriptive systems and the index values focusing on ceiling and floor effects, informativity, agreement, redistribution properties, convergent and known-groups validity.
A cross-sectional survey was conducted with a targeted sample size of 2000 members of the Hungarian adult general population (response rate 77.8%). The broader aim of the survey was to assess the mental health of the population. Permission for conducting the study was granted by the Research Ethics Committee of the Corvinus University of Budapest (no. KRH/166/2021). Participants were recruited in August 2021 from one of the largest available online panels in Hungary by a third-party survey company. Respondents registered voluntarily to complete surveys in return for points, which could be redeemed for rewards. Respondents were included if they were at least 18 years old at the time of completion, gave informed consent, and confirmed that they had understood the terms and were willing to participate. ‘Soft’ quotas were applied to ensure the representativeness of the sample for the general population by age, gender, highest level of education, geographical region, and settlement type.
A self-administered survey was designed for the study that asked questions about health-related quality of life, well-being, presence of physical and mental health conditions, resource utilization related to mental health care, and sociodemographic characteristics. The list of the physical health conditions was selected according to the 2019 Hungarian results of the European Health Interview Survey (EHIS) complemented by some common chronic diseases. Similarly, the list of mental health conditions was chosen according to the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). We asked respondents to report any physical and mental health conditions experienced in the past 12 months in two questions. Firstly, they had to state whether they had any self-reported physical or mental health conditions. Secondly, they had to mark those that were also diagnosed by a physician. All participants completed a set of standardized questionnaires, including the validated Hungarian versions of EQ-5D-5L and 15D. The participants answered the questions in a fixed order, starting with the EQ-5D-5L, and multiple questions were included between the EQ-5D-5L and 15D. As a base case, we used the Danish value sets for both the EQ-5D-5L and 15D, because currently, Denmark is the only country with national value sets for both measures. However, using these value sets may have limitations. They were developed in different decades, using different preference elicitation methods, and thus have largely different value set ranges. Furthermore, using Danish value sets for Hungary may also pose additional problems given the differences in sociodemographic and economic characteristics and cultural values between the two countries. Therefore, to test the robustness of our results, we repeated all analyses using the Hungarian EQ-5D-5L and Norwegian 15D value sets. The former was selected because of the study country, while the latter was considered as the most recently developed 15D value set with a similar value set range to the Hungarian EQ-5D-5L value set.
The EQ-5D-5L is a generic PAM that consists of two parts: a five-item descriptive system and a 0–100 visual analogue scale (EQ VAS) [5, 6]. The descriptive system contains five dimensions of health: mobility, self-care, usual activities, pain/discomfort and anxiety/depression, each with five response levels (no problems = 1, slight problems = 2, moderate problems = 3, severe problems = 4 and unable to/extreme problems = 5), allowing 5⁵ = 3125 unique health states in total. Respondents are asked to recall their current health state (i.e., ‘your health today’). The Danish value set used as a base case in this study is based on a heteroscedastic censored hybrid model using both composite time trade-off (cTTO) and discrete choice experiment (DCE) data from a representative adult general population sample in Denmark (data collection period 2018–19). The lowest possible value with this value set is −0.758, where negative values describe health states considered to be worse than dead and 1 refers to full health. The Hungarian value set that was used for the sensitivity analysis had been estimated by a pooled heteroscedastic Tobit model using cTTO data from a representative sample of the Hungarian adult general population (data collection period 2018–19). Index values range from −0.848 to 1 with this value set.
The 15D is another generic PAM that covers 15 dimensions of health-related quality of life: mobility, vision, hearing, breathing, sleeping, eating, speech, excretion, usual activities, mental function, discomfort and symptoms, depression, distress, vitality, and sexual activities. Each of these dimensions has five response levels (1 being the best possible health state of an individual, while 5 being the worst), yielding 5¹⁵ (more than 30 billion) possible distinct health states. The 15D asks respondents to recall their current health (i.e., ‘present health status’). The Danish value set was selected in this study as a base case. This was developed using an additive model of the multi-attribute utility theory based on valuations on a 0–100 visual analogue scale (VAS). Firstly, respondents were asked to weigh the top and bottom levels of each dimension individually on a VAS, then they were asked to assign a score to each level of each dimension on VAS (‘within dimension tasks’). Data were collected in 2001 and preferences of the non-institutionalized general population of Denmark aged 18–75 were assessed. The index values of the final value set range from 0.160 to 1. The Norwegian value set, used for the sensitivity analysis, also relies on an additive model. However, it only kept the ‘within dimension tasks’ from the original valuation that was supplemented by a pits-task, whereby respondents were asked to rate the worst possible health state on a VAS together with ‘being dead’. Data were collected in 2010 and 2015–16 from a representative sample of the Norwegian general population. The index values range from −0.516 to 1.
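The size of each descriptive system follows directly from the number of levels raised to the number of dimensions. The short Python sketch below is purely illustrative (the study's own analyses were run in R); the function names are hypothetical. It reproduces the state-space sizes and the share of theoretically possible profiles covered by the 270 and 1030 unique profiles reported in the Results.

```python
def n_health_states(levels: int, dimensions: int) -> int:
    """Number of distinct profiles a descriptive system can describe."""
    return levels ** dimensions


def share_observed(n_unique: int, levels: int, dimensions: int) -> float:
    """Percentage of all possible profiles that were actually observed."""
    return 100.0 * n_unique / n_health_states(levels, dimensions)


eq5d_states = n_health_states(5, 5)    # 5^5  = 3125
d15_states = n_health_states(5, 15)    # 5^15 = 30517578125

print(eq5d_states, d15_states)
print(round(share_observed(270, 5, 5), 1))     # 8.6 (% of EQ-5D-5L profiles)
print(f"{share_observed(1030, 5, 15):.1e}")    # 3.4e-06 (% of 15D profiles)
```

The contrast between the two percentages illustrates how sparsely any realistic sample can populate the 15D state space.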
Our analytical framework builds on previous studies that compared the measurement properties of other generic PAMs [43,44,45,46]. As a result of a technical problem in the online survey interface, a few respondents’ EQ-5D-5L responses may have been inadvertently recorded as level 5 responses. Therefore, the research team examined all level 5 responses attentively in the EQ-5D-5L and compared them with other information (i.e. self-reported health status on other measures, physician-diagnosed physical and mental health conditions) provided by the respondents. As a result, 113 participants were excluded from the sample before the statistical analysis. To compare the two instruments, corresponding dimensions of EQ-5D-5L and 15D were matched, e.g. EQ-5D-5L mobility and 15D mobility. All analyses were performed on the total sample, and also for two subsets of respondents: (1) respondents with physical health conditions, and (2) respondents with mental health conditions. Statistical analyses were performed using R Statistical Software (version 4.1.1; R Foundation for Statistical Computing, Vienna, Austria). All the statistical tests were two-sided, and p < 0.05 was considered statistically significant.
Ceiling and floor
The proportion of participants reporting ‘no problems’ (ceiling) and ‘extreme problems’ (floor) was computed for each dimension of the descriptive systems. In addition, we calculated the ceiling and floor for the EQ-5D-5L and 15D health profiles, i.e. ‘no problems’ and ‘extreme problems’ in all dimensions, respectively. We expected a higher overall ceiling in the EQ-5D-5L than the 15D at an instrument level since the descriptive system of the latter is more detailed.
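A minimal Python sketch of the ceiling and floor calculations is given below for illustration. It assumes responses are coded 1 (best) to 5 (worst) and that each respondent's profile is stored as a tuple of levels; the function names and the toy data are hypothetical and are not taken from the study.

```python
def dimension_ceiling_floor(responses, best=1, worst=5):
    """Percentage of respondents at the best (ceiling) and worst (floor) level."""
    n = len(responses)
    ceiling = 100.0 * sum(r == best for r in responses) / n
    floor = 100.0 * sum(r == worst for r in responses) / n
    return ceiling, floor


def profile_ceiling_floor(profiles, best=1, worst=5):
    """Percentage of respondents at the best/worst level on ALL dimensions."""
    n = len(profiles)
    ceiling = 100.0 * sum(all(r == best for r in p) for p in profiles) / n
    floor = 100.0 * sum(all(r == worst for r in p) for p in profiles) / n
    return ceiling, floor


# Toy EQ-5D-5L-style profiles (five dimensions each), invented for the example.
profiles = [(1, 1, 1, 1, 1), (1, 1, 1, 2, 1), (5, 5, 5, 5, 5), (1, 1, 1, 1, 1)]
first_dimension = [p[0] for p in profiles]

print(dimension_ceiling_floor(first_dimension))  # (75.0, 25.0)
print(profile_ceiling_floor(profiles))           # (50.0, 25.0)
```

Because the profile-level ceiling requires the best level on every dimension, it can only fall as dimensions are added, which is the reasoning behind the expectation stated above.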
The informativity of EQ-5D-5L and 15D dimensions, index values, and health state profiles was examined by Shannon’s (absolute informativity, H′) and Shannon’s Evenness (relative informativity, J′) indices [47, 48]. The Shannon index (H′) can be defined as

$$H' = -\sum_{i=1}^{L} p_i \log_2 p_i$$

where p_i is the proportion of observations in the i-th level (where i = 1, …, L), and L is the number of levels in a dimension of the descriptive system. The greatest amount of information can be gathered if the responses are equally used across the levels. The Shannon Evenness index (J′) measures the evenness of distribution and was calculated as

$$J' = \frac{H'}{H'_{\max}} = \frac{H'}{\log_2 L}$$

Thus, H′ ranges from 0 to log₂ L, and J′ ranges from 0 to 1, where a higher value indicates better informativity.
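The two indices can be sketched in a few lines of Python. This is an illustration under the standard definitions (H′ as the base-2 Shannon entropy of the level distribution, J′ as H′ divided by its maximum, log₂ L), not the study's R code; the function name and the example responses are hypothetical.

```python
import math
from collections import Counter


def shannon_indices(responses, n_levels=5):
    """Return (H', J') for one dimension with n_levels response levels."""
    n = len(responses)
    proportions = [count / n for count in Counter(responses).values()]
    # Levels that were never used contribute nothing to the sum.
    h = sum(-p * math.log2(p) for p in proportions)
    j = h / math.log2(n_levels)
    return h, j


# Evenly used levels give the maximum H' = log2(5) and J' = 1.
print(shannon_indices([1, 2, 3, 4, 5]))
# A single used level carries no information: H' = 0 and J' = 0.
print(shannon_indices([1, 1, 1, 1]))
```

A dimension with a high ceiling concentrates responses on level 1 and therefore yields low H′ and J′ values.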
Inconsistencies and agreement
We performed cross-tabulations of the corresponding EQ-5D-5L and 15D dimensions to explore how consistent the responses were. We considered an EQ-5D-5L and 15D response pair inconsistent if the 15D response was at least two levels away from the EQ-5D-5L response. The average size of inconsistencies was assessed according to the following weights: 0 if EQ-5D-5L and 15D responses did not differ by more than 1 level, 1 if responses differed by 2 levels, and so forth.
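The weighting scheme can be illustrated with the Python sketch below. It makes one assumption that is not stated explicitly in the text: the average size of inconsistency is taken over the inconsistent pairs only (weights of 1 or more), which would explain why the reported averages exceed 1. The function name and data are hypothetical.

```python
def inconsistency_summary(eq5d_levels, d15_levels):
    """Return (rate of inconsistent pairs in %, average inconsistency size).

    Weight = 0 if the two responses differ by at most one level,
    1 if they differ by two levels, 2 if by three levels, and so on.
    ASSUMPTION: the average is computed over inconsistent pairs only.
    """
    weights = [max(abs(a - b) - 1, 0) for a, b in zip(eq5d_levels, d15_levels)]
    inconsistent = [w for w in weights if w > 0]
    rate = 100.0 * len(inconsistent) / len(weights)
    average_size = sum(inconsistent) / len(inconsistent) if inconsistent else 0.0
    return rate, average_size


# Toy paired responses on one matched dimension (e.g. mobility vs. mobility).
eq5d = [1, 2, 3, 1, 5]
d15 = [1, 3, 1, 4, 5]
print(inconsistency_summary(eq5d, d15))  # (40.0, 1.5)
```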
The agreement between the EQ-5D-5L and 15D index values was examined using the intraclass correlation coefficient (ICC) and a Bland–Altman plot. A two-way random model with absolute agreement was applied to obtain an ICC value. Agreement was considered poor if 0 ≤ ICC < 0.4, fair if 0.4 ≤ ICC < 0.6, good if 0.6 ≤ ICC < 0.75, and excellent if 0.75 ≤ ICC < 1.
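For illustration, a two-way random-effects, absolute-agreement ICC can be computed from the ANOVA mean squares as sketched below. The sketch assumes the single-measures form, often written ICC(A,1); the text does not state whether single or average measures were used, so this choice is an assumption, as are the function names.

```python
def icc_absolute_agreement(x, y):
    """Single-measures, two-way random, absolute-agreement ICC for two raters.

    Here the 'raters' are the two instruments and the 'subjects' are respondents.
    """
    rows = list(zip(x, y))
    n, k = len(rows), 2
    grand = sum(v for row in rows for v in row) / (n * k)
    row_means = [sum(row) / k for row in rows]
    col_means = [sum(x) / n, sum(y) / n]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in rows for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-respondent mean square
    msc = ss_cols / (k - 1)                 # between-instrument mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def agreement_label(icc):
    """Interpretation thresholds as given in the text."""
    if icc < 0.4:
        return "poor"
    if icc < 0.6:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Toy index values: the second instrument is shifted upwards by a constant.
a = [0.2, 0.5, 0.9, 1.0]
b = [v + 0.3 for v in a]
print(agreement_label(icc_absolute_agreement(a, b)))
```

In the toy data the two series are perfectly correlated, yet the systematic shift lowers the absolute-agreement ICC below 1. This is why two index values can correlate strongly while agreeing poorly when their value set ranges differ.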
We examined the convergent validity between the EQ-5D-5L and 15D dimensions (Spearman’s correlation) and index values (Pearson’s correlation). The absolute value of the correlation coefficient (r) was interpreted as follows: very weak correlation |r| < 0.2, weak correlation 0.2 ≤ |r| < 0.4, moderate correlation 0.4 ≤ |r| < 0.6 and strong correlation 0.6 ≤ |r| ≤ 1. We expected higher correlations among the corresponding dimensions covering similar aspects of health.
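The following Python sketch illustrates the two correlation coefficients and the interpretation bands. Spearman's coefficient is computed as Pearson's correlation on average ranks, which handles the many ties that five-level responses produce. All function names and the example data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def strength(r):
    """Interpretation bands as given in the text."""
    r = abs(r)
    if r < 0.2:
        return "very weak"
    if r < 0.4:
        return "weak"
    if r < 0.6:
        return "moderate"
    return "strong"


# Toy dimension-level responses with ties.
rho = spearman([1, 2, 3, 4, 5], [1, 1, 2, 3, 5])
print(round(rho, 3), strength(rho))
```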
Known-groups validity was evaluated for self-reported physician-diagnosed health condition groups in contrast to being healthy. We hypothesized that respondents with a diagnosed physical or mental condition had significantly lower EQ-5D-5L and 15D index values. Student’s t test was used to compare the healthy and non-healthy groups. Effect size (ES, Cohen’s d) and relative efficiency (RE) were calculated. ES values were interpreted as negligible (d < 0.2), small (0.2 ≤ d < 0.5), medium (0.5 ≤ d < 0.8), and large (d ≥ 0.8). The RE was calculated as the ratio of the ESs of the two indices, with the 15D used as the reference; thus, an RE > 1 indicated that the EQ-5D-5L was more efficient in discriminating between two subgroups. To test whether the RE differs statistically from 1, 95% confidence intervals were calculated using 2000 bootstrap samples with bias-corrected and accelerated (BCa) intervals.
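The effect size and relative efficiency calculations can be sketched as follows. Two simplifications are assumptions of this illustration and differ from the study: Cohen's d uses the pooled standard deviation, and the confidence interval is a plain percentile bootstrap rather than the accelerated bias-corrected interval reported above. Each respondent is represented as a hypothetical (EQ-5D-5L index, 15D index) pair so that both instruments are resampled together.

```python
import math
import random
import statistics


def cohens_d(healthy, condition):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    n1, n2 = len(healthy), len(condition)
    v1, v2 = statistics.variance(healthy), statistics.variance(condition)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(healthy) - statistics.mean(condition)) / pooled_sd


def relative_efficiency(healthy_pairs, condition_pairs):
    """RE = d(EQ-5D-5L) / d(15D); each pair is (eq5d_index, d15_index)."""
    d_eq = cohens_d([p[0] for p in healthy_pairs], [p[0] for p in condition_pairs])
    d_15 = cohens_d([p[1] for p in healthy_pairs], [p[1] for p in condition_pairs])
    return d_eq / d_15


def bootstrap_re_ci(healthy_pairs, condition_pairs, n_boot=2000, seed=0):
    """Percentile 95% CI for RE (a simpler stand-in for the BCa interval)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        h = rng.choices(healthy_pairs, k=len(healthy_pairs))
        c = rng.choices(condition_pairs, k=len(condition_pairs))
        estimates.append(relative_efficiency(h, c))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Invented (eq5d_index, d15_index) pairs for a healthy and a condition group.
healthy = [(1.00, 0.99), (0.95, 0.97), (0.92, 0.95), (1.00, 0.96),
           (0.90, 0.93), (0.97, 0.98), (0.94, 0.94), (0.99, 1.00)]
condition = [(0.60, 0.80), (0.45, 0.78), (0.70, 0.85), (0.30, 0.70),
             (0.55, 0.82), (0.65, 0.76), (0.50, 0.79), (0.40, 0.74)]

print(relative_efficiency(healthy, condition))
print(bootstrap_re_ci(healthy, condition, n_boot=500))
```

An interval that excludes 1 would indicate a statistically significant difference in discriminatory ability between the two instruments for that condition group.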
Characteristics of the study population
The distribution of the sample (n = 1887) reasonably approximated that of the general population in terms of sociodemographics (Additional file 1: Supplementary material 1). Altogether 63.4% of the sample responded that they had one or more physical conditions and 35.2% reported at least one mental health condition diagnosed by a physician.
As for the EQ-5D-5L dimensions, the floor varied between 0.2% (usual activities) and 1.2% (anxiety/depression), while the ceiling ranged from 50.8% (pain/discomfort) to 87.7% (self-care) (Table 1). Regarding the 15D dimensions, the floor reached its lowest at 0.2% (eating) and its highest at 3.9% (sexual activities), while for the ceiling, the values varied between 48.4% (sleeping) and 94.4% (eating). The EQ-5D-5L had lower ceiling in all corresponding dimension pairs, except for the EQ-5D-5L anxiety/depression vs. 15D distress pair. The highest difference in ceiling was found between EQ-5D-5L pain/discomfort (50.8%) and 15D discomfort and symptoms (68.2%). Similarly, the floor was equal or lower in the EQ-5D-5L for all pairs but EQ-5D-5L anxiety/depression vs. 15D depression. The largest difference in floor was seen between EQ-5D-5L anxiety/depression (1.2%) and 15D distress (1.7%).
EQ-5D-5L outperformed 15D regarding relative informativity (J′) for all dimensions (ranging from 0.51 to 0.70 for the EQ-5D-5L and from 0.44 to 0.69 for the 15D), except for the EQ-5D-5L anxiety/depression (0.65) vs. 15D distress (0.69) (Table 2). Considering all dimensions of each instrument, the average J′ values showed better results for the EQ-5D-5L (0.56) than for the 15D (0.49).
Responses covered all levels in both the EQ-5D-5L and 15D among the corresponding dimensions (Additional file 1: Supplementary materials 2–5). The rate of inconsistent response pairs ranged from 4.6% (EQ-5D-5L anxiety/depression and 15D depression) to 7.9% (EQ-5D-5L mobility and 15D mobility). The average size of inconsistency was relatively low, ranging from 1.20 to 1.24.
As for the corresponding dimensions, we observed a strong correlation between the EQ-5D-5L and 15D usual activities dimensions (0.619) (Table 3). The EQ-5D-5L anxiety/depression correlated more strongly with 15D depression (0.690) than with 15D distress (0.642). Moderate correlation was found between the two mobility dimensions (0.558), as well as between the EQ-5D-5L dimension pain/discomfort and the 15D dimension discomfort and symptoms (0.583). The non-corresponding dimension pairs were correlated very weakly to moderately, ranging from 0.115 (EQ-5D-5L mobility and 15D eating) to 0.541 (EQ-5D-5L pain/discomfort and 15D vitality). We observed moderate correlation between the EQ VAS and all EQ-5D-5L domains (except for self-care, where correlation was weak), while its correlations with the 15D dimensions were mostly weak or moderate.
Analysis of the index values
The distributions of the EQ-5D-5L and 15D index values are presented in Fig. 1, while the main characteristics of the indices can be found in Table 4. Overall, 270 unique health states were observed for the EQ-5D-5L and 1030 for the 15D. The most common health state profile for both instruments was full health, accounting for 36.0% of the EQ-5D-5L answers and 21.0% of the 15D answers. As for the EQ-5D-5L, the second most common profile was slight pain or discomfort with no problems on the other dimensions (6.4%), while for the 15D, slight problems with sleeping and no other problems (3.2%).
In the total sample, the mean index value was the highest using the Danish 15D (0.91, SD = 0.11), followed by the Hungarian EQ-5D-5L (0.87, SD = 0.21), the Danish EQ-5D-5L (0.86, SD = 0.22), and the Norwegian 15D value set (0.81, SD = 0.22). The floor was negligible for 15D and not present for the EQ-5D-5L. For the Danish EQ-5D-5L, 1.4% of the index values were in the negative range, while for the Danish 15D, the theoretical minimum is higher than 0. However, 1.2% of the Hungarian EQ-5D-5L and 0.9% of the Norwegian 15D index values were negative. When the index value range was split with a bin width of 0.05, the Norwegian 15D showed the best relative informativity (J′) (0.63), followed by the Danish EQ-5D-5L (0.53), the Hungarian EQ-5D-5L (0.49), while the lowest J′ was demonstrated by the Danish 15D (0.44) (Table 4).
Poor agreement was found between the Danish EQ-5D-5L and 15D index values with an ICC of 0.363 (95% confidence interval: 0.342 to 0.385, p < 0.001) but a good agreement was found between the Hungarian EQ-5D-5L and Norwegian 15D index values with an ICC of 0.607 (95%CI 0.516–0.677, p < 0.001). The Bland–Altman plot indicated that 93.3% of the points lay within the 95% limits of agreement between the Danish EQ-5D-5L and 15D (94.2% between the Hungarian EQ-5D-5L and Norwegian 15D). Differences between the EQ-5D-5L and 15D index values increased at lower mean values for both value set pairs (Fig. 2).
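The Bland–Altman quantities referred to above (mean difference, 95% limits of agreement, and the share of points inside the limits) can be sketched as follows. The limits are taken as the mean difference ± 1.96 standard deviations of the differences, which is the conventional definition and is assumed here; the function name and data are hypothetical.

```python
import statistics


def bland_altman(index_a, index_b):
    """Return (bias, lower limit, upper limit, % of points within limits)."""
    diffs = [a - b for a, b in zip(index_a, index_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    within = 100.0 * sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, within


# Invented index values; differences grow as health gets worse.
eq5d_index = [1.0, 0.9, 0.8, 0.7, 0.2]
d15_index = [1.0, 0.95, 0.9, 0.85, 0.6]
bias, lower, upper, within = bland_altman(eq5d_index, d15_index)
print(round(bias, 2), round(lower, 3), round(upper, 3), within)
```

In a plot, each difference would be drawn against the pair's mean index value, which makes a widening gap at lower mean values visible.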
Using the Danish value sets, a strong correlation was found between the EQ-5D-5L and 15D index values (0.671), and between the EQ-5D-5L index value and the EQ VAS value (0.604), while a moderate correlation was found between the 15D index value and the EQ VAS (0.534). The EQ-5D-5L index value demonstrated a strong correlation with its dimensions, except for self-care, where the correlation was moderate (−0.482). By contrast, correlation coefficients between 15D dimensions and the EQ-5D-5L index value ranged from −0.596 (vitality) to −0.176 (eating). The 15D index value correlated moderately or strongly with most of its dimensions, but only weakly with the eating dimension (−0.346). Considering the EQ-5D-5L dimensions with the 15D index value, the strongest correlation was observed for the pain/discomfort dimension (−0.629), and the weakest for self-care (−0.369). The convergent validity results were confirmed by the sensitivity analysis (Table 3).
Both the Danish EQ-5D-5L and 15D index values were able to discriminate between all chronic condition groups with moderate or large effect sizes (ranging from 0.688 to 3.810 for the EQ-5D-5L and from 0.623 to 3.018 for the 15D) (Table 5). Overall, the EQ-5D-5L was able to discriminate more effectively between 38/41 (93%) known-groups (RE > 1). Nevertheless, the bootstrap analysis suggested that results were significant in only five condition groups, dementia (RE = 1.465), other physical health conditions (RE = 1.448), bipolar depression (RE = 1.385), thyroid diseases (RE = 1.269), and gastroesophageal reflux disease (RE = 1.251). Using the Hungarian EQ-5D-5L and the Norwegian 15D value sets, effect sizes were large in all condition groups, and RE was > 1 in 36/41 (88%) known-groups. However, according to the results of the bootstrap analysis, the difference was only significant in four condition groups: dementia (RE = 1.672), chronic kidney disease (RE = 1.456), other physical health conditions (RE = 1.454), and urinary incontinence (RE = 1.302) (Table 6).
The subgroup analysis for the physical and mental health condition subgroups yielded similar results to those of the total sample. A lower ceiling was observed in both the mental (18.7%) and physical health condition subgroups (25.5%) compared to the total sample (36.0%) for the EQ-5D-5L, while the floor was 0% in both subgroups. Similarly, for the 15D, the ceiling was reduced to a greater extent in the mental health condition subgroup (10.1%) than in the physical health condition subgroup (12.3%) against the total sample (21.0%) (Table 1). In line with previous results, J′ was greater for the EQ-5D-5L than for the 15D in both subgroups (Table 2). The average size of inconsistency was similar for physical and mental health conditions (Additional file 1: Supplementary materials 10–11). The correlation between the Danish EQ-5D-5L and Danish 15D index values was higher in both the physical and mental health condition subgroups (0.736 and 0.702) than in the total sample (0.671). The ICC stood at 0.311 (95% CI 0.285–0.338, p < 0.001) for the physical health condition subgroup, while it reached 0.336 (95% CI 0.302–0.371, p < 0.001) for the mental health condition subgroup. As for the corresponding dimensions, correlations between dimensions were, in general, higher in both subgroups than in the total sample (Additional file 1: Supplementary materials 12–13). The sensitivity analyses (Additional file 1: Supplementary materials 6–9, 12–13) with the Hungarian EQ-5D-5L and Norwegian 15D value sets mostly supported these results; however, the agreement was good in both the physical (ICC = 0.653, 95% CI 0.561–0.722, p < 0.001) and mental (ICC = 0.632, 95% CI 0.495–0.725, p < 0.001) health condition subgroups.
To our knowledge, this is the first study to compare the measurement properties of the EQ-5D-5L and 15D instruments in a general population sample. The sample showed good representativeness across demographic characteristics and allowed conducting subgroup analyses for physical and mental health conditions. EQ-5D-5L dimensions showed a substantially lower ceiling than those of the 15D in all but one corresponding dimension pair. We identified a considerably larger ceiling in the EQ-5D-5L index value than the 15D index value, which corroborates earlier findings in various patient populations [26, 28, 29, 31, 32]. The ceiling decreased notably in both the physical and mental conditions subgroups compared to the total sample for both indices. The EQ-5D-5L demonstrated better overall relative informativity. Strong correlations were seen between the index values, which is consistent with previous research [30, 31]. Contrary to our expectations, the anxiety/depression composite dimension correlated more strongly with 15D depression than with 15D distress. Both the EQ-5D-5L and 15D were able to discriminate effectively between the healthy and non-healthy respondents with moderate or large effect sizes; however, the EQ-5D-5L produced larger effect sizes in most groups regardless of the value set used.
Both instruments were able to effectively discriminate between the healthy and non-healthy groups of respondents. However, it is worth mentioning that although the index values in the healthy subgroup were reasonably similar for both the EQ-5D-5L and 15D using the Danish value sets, the mean index values of the EQ-5D-5L were substantially lower than those for the 15D in respondents with health conditions. By contrast, the sensitivity analysis suggests that the Norwegian 15D index values were sizeably lower in 15/41 health conditions than the Hungarian EQ-5D-5L index values, while the difference was negligible in the rest. This is mainly attributable to the different value sets of the 15D. The range of the Danish value set is considerably narrower than that of the Norwegian, which has a utility of −0.516 for the worst possible state and is thus more comparable to either of the EQ-5D-5L value sets used in this study. Therefore, there is less space for potential improvement using the Danish 15D value set and, for this reason, the index values of more severe health states are already relatively high. The difference between the value sets is also well indicated by the fact that the ICC is poor between the Danish EQ-5D-5L and 15D index values, but good between the Hungarian EQ-5D-5L and Norwegian 15D.
A few 15D dimensions demonstrated (very) weak correlations with all EQ-5D-5L dimensions, such as vision, hearing, eating, speech, excretion, and mental function, which may indicate potential room for EQ-5D-5L bolt-ons. This is in line with earlier research that acknowledged these health areas as potentially not captured by the EQ-5D and proposed bolt-ons for these, including vision, hearing, speech, and cognition [34, 37, 57,58,59]. As bolt-on identification, development and testing are recommended to be based on mixed-methods evidence from multiple investigations and populations, our results support these efforts by informing future EQ-5D bolt-on development studies.
The following limitations should be considered. Firstly, due to the cross-sectional design of our study, we could not test the responsiveness or the test–retest reliability of the instruments. Secondly, according to census data, 48.0% of the Hungarian general population reported having a chronic illness, whereas in our sample this proportion reached 71.6%. This difference is likely due to the fact that our questionnaire was rather detailed regarding questions about different health conditions and considered addictions (e.g. smoking) as well. Thirdly, clinical data including information on disease severity were not available from our survey; such data would have allowed more comprehensive known-groups validity testing. Finally, we have to acknowledge some linguistic specificities of our findings. For instance, in English, the mobility dimensions of both instruments use the phrase ‘walking’, while the Hungarian version of the 15D uses a different translation with a meaning of ‘moving about’ (‘közlekedés’) that could be responsible for the relatively high proportion of inconsistent response pairs between these two dimensions (7.9%).
In conclusion, our findings may contribute to the discussion of which generic PAM to use in decision-making and provide useful and broad information for health economic evaluations. Despite having 10 fewer dimensions, the EQ-5D-5L performed better than the 15D in many aspects. However, certain 15D dimensions (e.g. vision, hearing, mental function) showed a relatively weak relationship with the dimensions of EQ-5D-5L, which signals room for potential EQ-5D-5L bolt-on dimensions. Future research is recommended to assess the added value of such bolt-on dimensions and compare their measurement properties to other PAMs that include these health areas among their dimensions (e.g. 15D, AQoL). Additionally, longitudinal studies are needed to test the responsiveness of these instruments in relevant patient populations.
Availability of data and materials
All data of this study are available from the corresponding author upon reasonable request.
Abbreviations
AQoL: Assessment of Quality of Life
cTTO: Composite time trade-off
DCE: Discrete choice experiment
DSM-5: Fifth edition of the Diagnostic and Statistical Manual of Mental Disorders
EHIS: European Health Interview Survey
HUI: Health Utilities Index
ICC: Intraclass correlation coefficient
J': Shannon’s Evenness index
QALY: Quality Adjusted Life Years
VAS: Visual analogue scale
References
Finch AP, Brazier JE, Mukuria C. What is the evidence for the performance of generic preference-based measures? A systematic overview of reviews. Eur J Health Econ. 2018;19(4):557–70.
Kennedy-Martin M, Slaap B, Herdman M, van Reenen M, Kennedy-Martin T, Greiner W, et al. Which multi-attribute utility instruments are recommended for use in cost-utility analysis? A review of national health technology assessment (HTA) guidelines. Eur J Health Econ. 2020;21(8):1245–57.
Rencz F, Gulácsi L, Drummond M, Golicki D, Prevolnik Rupel V, Simon J, et al. EQ-5D in central and eastern Europe: 2000–2015. Qual Life Res. 2016;25(11):2693–710.
Richardson J, Iezzi A, Khan MA. Why do multi-attribute utility instruments produce different utilities: the relative importance of the descriptive systems, scale and “micro-utility” effects. Qual Life Res. 2015;24(8):2045–53.
The EuroQol Group. EuroQol—a new facility for the measurement of health-related quality of life. Health Policy. 1990;16(3):199–208.
Herdman M, Gudex C, Lloyd A, Janssen M, Kind P, Parkin D, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res. 2011;20(10):1727–36.
Feng YS, Kohlmann T, Janssen MF, Buchholz I. Psychometric properties of the EQ-5D-5L: a systematic review of the literature. Qual Life Res. 2021;30(3):647–73.
Longworth L, Singh J, Brazier J. An evaluation of the performance of EQ-5D: a review of reviews of psychometric properties. Value Health. 2014;17(7):A570.
Sintonen H. The 15D instrument of health-related quality of life: properties and applications. Ann Med. 2001;33(5):328–36.
Sintonen H. 15D instrument. http://www.15d-instrument.net/15d/. Accessed 24 May 2022.
Sintonen H. The 15D-measure of health-related quality of life. II. Feasibility, reliability and validity of its valuation system. National Centre for Health Program Evaluation, Working Paper 42, Melbourne. 1995.
Wittrup-Jensen KU, Pedersen KM. Modelling Danish weights for the 15D quality of life questionnaire by applying multi-attribute utility theory (MAUT). Odense: Syddansk Universitet; 2008.
Michel YA, Augestad LA, Rand K. Comparing 15D valuation studies in Norway and Finland-challenges when combining information from several valuation tasks. Value Health. 2018;21(4):462–70.
Michel YA, Augestad LA, Barra M, Rand K. A Norwegian 15D value algorithm: proposing a new procedure to estimate 15D value algorithms. Qual Life Res. 2019;28(5):1129–43.
Norwegian Medicines Agency. Guidelines for the submission of documentation for single technology assessment (STA) of pharmaceuticals. https://legemiddelverket.no/Documents/English/Public%20funding%20and%20pricing/Documentation%20for%20STA/Guidelines%20151018.pdf. Published 2018. Accessed 6 Dec 2022.
Ministerio de Salud de Chile. Guía Metodológica para la Evaluación Económica de Intervenciones en Salud en Chile [Methodological Guide for the Economic Evaluation of Health Interventions in Chile]. https://www.orasconhu.org/case/sites/default/files/files/EE_FINAL_web.pdf. Published 2013. Accessed 6 Dec 2022.
Rissanen P, Aro S, Sintonen H, Asikainen K, Slätis P, Paavolainen P. Costs and cost-effectiveness in hip and knee replacements. A prospective study. Int J Technol Assess Health Care. 1997;13(4):575–88.
Linna M, Taimela E, Apajasalo M, Marttila RJ. Probabilistic sensitivity analysis for evaluating cost-utility of entacapone for Parkinson’s disease. Expert Rev Pharmacoecon Outcomes Res. 2002;2(2):91–7.
Räsänen P, Krootila K, Sintonen H, Leivo T, Koivisto AM, Ryynänen OP, et al. Cost-utility of routine cataract surgery. Health Qual Life Outcomes. 2006;4:74.
Kantola T, Mäklin S, Koivusalo AM, Räsänen P, Rissanen A, Roine R, et al. Cost-utility of molecular adsorbent recirculating system treatment in acute liver failure. World J Gastroenterol. 2010;16(18):2227–34.
Åberg F, Mäklin S, Räsänen P, Roine RP, Sintonen H, Koivusalo AM, et al. Cost of a quality-adjusted life year in liver transplantation: the influence of the indication and the model for end-stage liver disease score. Liver Transpl. 2011;17(11):1333–43.
Pohjolainen V, Räsänen P, Roine RP, Sintonen H, Koponen S, Karlsson H. Cost-effectiveness of anorexia nervosa in QALYs. Nord J Psychiatry. 2017;71(1):67–71.
Gulbrandsen N, Wisløff F, Nord E, Lenhoff S, Hjorth M, Westin J. Cost-utility analysis of high-dose melphalan with autologous blood stem cell support vs. melphalan plus prednisone in patients younger than 60 years with multiple myeloma. Eur J Haematol. 2001;66(5):328–36.
Ross S, Robert M, Lier D, Eliasziw M, Jacobs P. Surgical management of stress urinary incontinence in women: safety, effectiveness and cost-utility of trans-obturator tape (TOT) versus tension-free vaginal tape (TVT) five years after a randomized surgical trial. BMC Womens Health. 2011;11:34.
Bastani P, Kiadaliri AA. Cost-utility analysis of adjuvant therapies for breast cancer in Iran. Int J Technol Assess Health Care. 2012;28(2):110–4.
Stavem K, Bjørnaes H, Lossius MI. Properties of the 15D and EQ-5D utility measures in a community sample of people with epilepsy. Epilepsy Res. 2001;44(2–3):179–89.
Stavem K, Frøland SS, Hellum KB. Comparison of preference-based utilities of the 15D, EQ-5D and SF-6D in patients with HIV/AIDS. Qual Life Res. 2005;14(4):971–80.
Lunde L. Can EQ-5D and 15D be used interchangeably in economic evaluations? Assessing quality of life in post-stroke patients. Eur J Health Econ. 2013;14(3):539–50.
Saarni SI, Härkänen T, Sintonen H, Suvisaari J, Koskinen S, Aromaa A, et al. The impact of 29 chronic conditions on health-related quality of life: a general population survey in Finland using 15D and EQ-5D. Qual Life Res. 2006;15(8):1403–14.
Vartiainen P, Mäntyselkä P, Heiskanen T, Hagelberg N, Mustola S, Forssell H, et al. Validation of EQ-5D and 15D in the assessment of health-related quality of life in chronic pain. Pain. 2017;158(8):1577–85.
Vainiola T, Pettilä V, Roine RP, Räsänen P, Rissanen AM, Sintonen H. Comparison of two utility instruments, the EQ-5D and the 15D, in the critical care setting. Intensive Care Med. 2010;36(12):2090–3.
Heiskanen J, Tolppanen AM, Roine RP, Hartikainen J, Hippeläinen M, Miettinen H, et al. Comparison of EQ-5D and 15D instruments for assessing the health-related quality of life in cardiac surgery patients. Eur Heart J Qual Care Clin Outcomes. 2016;2(3):193–200.
García-Gordillo M, del Pozo-Cruz B, Adsuar JC, Sánchez-Martínez FI, Abellán-Perpiñán JM. Validation and comparison of 15-D and EQ-5D-5L instruments in a Spanish Parkinson’s disease population sample. Qual Life Res. 2014;23(4):1315–26.
Longworth L, Yang Y, Young T, Mulhern B, Hernández Alava M, Mukuria C, et al. Use of generic and condition-specific measures of health-related quality of life in NICE decision-making: a systematic review, statistical modelling and survey. Health Technol Assess. 2014;18(9):1–224.
Brazier J. Is the EQ-5D fit for purpose in mental health? Br J Psychiatry. 2010;197(5):348–9.
Mulhern BJ, Sampson C, Haywood P, Addo R, Page K, Mott D, et al. Criteria for developing, assessing and selecting candidate EQ-5D bolt-ons. Qual Life Res. 2022;31:3041–8.
Geraerds A, Bonsel GJ, Janssen MF, Finch AP, Polinder S, Haagsma JA. Methods used to identify, test, and assess impact on preferences of bolt-ons: a systematic review. Value Health. 2021;24(6):901–16.
Hungarian Central Statistical Office. Tehetünk az egészségünkért – ELEF2019 gyorsjelentés. https://www.ksh.hu/docs/hun/xftp/idoszaki/elef/te_2019/index.html. Accessed 11 March 2022.
American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 5th ed. Washington, DC; 2013.
Jensen CE, Sørensen SS, Gudex C, Jensen MB, Pedersen KM, Ehlers LH. The Danish EQ-5D-5L value set: a hybrid model using cTTO and DCE data. Appl Health Econ Health Policy. 2021;19(4):579–91.
Roudijk B, Donders ART, Stalmeier PFM. Cultural values: can they explain differences in health utilities between countries? Med Decis Making. 2019;39(5):605–16.
Rencz F, Brodszky V, Gulácsi L, Golicki D, Ruzsa G, Pickard AS, et al. Parallel valuation of the EQ-5D-3L and EQ-5D-5L by time trade-off in Hungary. Value Health. 2020;23(9):1235–45.
Rencz F, Brodszky V, Janssen MF. A direct comparison of the measurement properties of EQ-5D-5L, PROMIS-29+2 and PROMIS Global Health instruments and EQ-5D-5L and PROPr utilities in a general population sample. Value Health. 2023 [In press].
Brazier J, Roberts J, Tsuchiya A, Busschbach J. A comparison of the EQ-5D and SF-6D across seven patient groups. Health Econ. 2004;13(9):873–84.
Janssen MF, Bonsel GJ, Luo N. Is EQ-5D-5L better than EQ-5D-3L? A head-to-head comparison of descriptive systems and value sets from seven countries. Pharmacoeconomics. 2018;36(6):675–97.
Janssen MFB, Birnie E, Bonsel GJ. Evaluating the discriminatory power of EQ-5D, HUI2 and HUI3 in a US general population survey using Shannon’s indices. Qual Life Res. 2007;16(5):895–904.
Shannon CE. A mathematical theory of communication. Bell Syst Tech J. 1948;27(3):379–423.
Shannon C, Weaver W. The mathematical theory of communication. Urbana: Univ. of Illinois Press; 1949. p. 104–7.
Janssen MF, Birnie E, Haagsma JA, Bonsel GJ. Comparing the standard EQ-5D three-level system with a five-level version. Value Health. 2008;11(2):275–84.
Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–63.
Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–10.
Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86(2):420–8.
Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess. 1994;6(4):284.
Swinscow TDV, Campbell MJ. Statistics at square one. London: BMJ; 2002.
Cohen J. A power primer. Psychol Bull. 1992;112(1):155–9.
Rencz F, Janssen MF. Analyzing the pain/discomfort and anxiety/depression composite domains and the meaning of discomfort in the EQ-5D: a mixed-methods study. Value Health. 2022;25:2003–16.
Krabbe PF, Stouthard ME, Essink-Bot ML, Bonsel GJ. The effect of adding a cognitive dimension to the EuroQol multiattribute health-status classification system. J Clin Epidemiol. 1999;52(4):293–301.
Finch AP, Brazier JE, Mukuria C, Bjorner JB. An exploratory study on using principal-component analysis and confirmatory factor analysis to identify bolt-on dimensions: the EQ-5D case study. Value Health. 2017;20(10):1362–75.
Finch AP, Brazier JE, Mukuria C. Selecting bolt-on dimensions for the EQ-5D: examining their contribution to health-related quality of life. Value Health. 2019;22(1):50–61.
Open access funding provided by Corvinus University of Budapest. The data collection was supported by the Higher Education Institutional Excellence Program 2020 of the Ministry of Innovation and Technology in the framework of the Financial and Public Services research project (TKP2020-IKA-02) at the Corvinus University of Budapest. AN’s work was supported by the Higher Education Institutional Excellence Program 2020 of the Ministry of Innovation and Technology in the framework of the Financial and Public Services research project (TKP2020-IKA-02) at the Corvinus University of Budapest. FR’s work was supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences (BO/00304/21) and the New National Excellence Program of the Ministry for Innovation and Technology from the source of the National Research, Development and Innovation Fund (ÚNKP-22-5-CORVINUS-4).
Ethics approval and consent to participate
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Ethical approval was obtained from the Research Ethics Committee of the Corvinus University of Budapest (no. KRH/166/2021). Informed consent was obtained from all participants included in the study.
Consent for publication
Competing interests
VB has nothing to disclose. AN reports grants from the Ministry of Innovation and Technology during the conduct of the study. MFJ reports personal fees from the EuroQol Research Foundation outside the submitted work. FR reports grants from the Hungarian Academy of Sciences and the Ministry of Innovation and Technology during the conduct of the study, and grants from the EuroQol Research Foundation outside the submitted work. MFJ and FR are active members of the EuroQol Group. Views expressed in the article are those of the authors and are not necessarily those of the EuroQol Research Foundation.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Nikl, A., Janssen, M.F., Brodszky, V. et al. A head-to-head comparison of the EQ-5D-5L and 15D descriptive systems and index values in a general population sample. Health Qual Life Outcomes 21, 17 (2023). https://doi.org/10.1186/s12955-023-02096-z