Utility value estimates in cardiovascular disease and the effect of changing elicitation methods: a systematic literature review

Objective: Identify the most recent utility value estimates for cardiovascular disease (CVD) via systematic literature review (SLR) and explore trends in utility elicitation methods in the last 6 years.
Methods: This SLR was updated on January 25, 2018, and identified studies reporting utilities for myocardial infarction (MI), stroke, angina, peripheral artery disease (PAD), and any-cause revascularization by searching Embase, PubMed, Health Technology Assessment Database, and grey literature.
Results: A total of 375 studies reported CVD utilities (pre-2013 vs post-2013: MI, 38 vs 32; stroke, 86 vs 113; stable angina, 8 vs 9; undefined/unstable angina, 23 vs 8; PAD, 29 vs 13; revascularization, 54 vs 40). Median average utilities for MI, stroke, and revascularization increased over time (pre-2013 vs post-2013: MI, 0.71 vs 0.79; stroke, 0.63 vs 0.64; revascularization, 0.76 vs 0.81); angina and PAD showed a decrease in median values over time (stable angina, 0.83 vs 0.72; undefined/unstable angina, 0.70 vs 0.69; PAD, 0.76 vs 0.71). The proportion of utility estimates from trials increased across health states (pre-2013 vs post-2013: 22.5% vs 37.2%), as did the proportion of trials using the EuroQol Five Dimensions Questionnaire (EQ-5D; pre-2013 vs post-2013: 73.8% vs 91.4%). Use of methods such as the standard gamble, time trade-off, and Health Utilities Index has declined.
Conclusions: Health state utilities for cardiovascular health states have changed in the last 6 years, likely due to changes in the types of studies conducted, the patient populations evaluated, and possibly changing utility elicitation methods. The EQ-5D has been used more frequently.


Introduction
Cardiovascular disease (CVD) is the leading cause of mortality worldwide and imposes a significant clinical and economic burden on society. In 2016, CVD accounted for approximately 17.9 million deaths worldwide, representing 31% of all global deaths [1]. In 2010, the estimated global cost of CVD was $863 billion, and it is estimated to rise to $1044 billion by 2030 [2,3]. CVD causes long-term disability, affecting the health-related quality of life (HRQoL) of patients [4].
Although conventional treatments reduce the risk of CVD, there is an ongoing need to explore new drugs that provide clinical and economic value, given the current clinical and economic burden. Economic evaluations, such as cost-effectiveness analyses (CEAs), are important for comparing new and existing treatments; they are used to determine a treatment's economic value and to demonstrate that value to patients, physicians, and third-party payers [5,6]. These analyses often inform healthcare reimbursement decisions [7]. The preferred outcome measure of one type of CEA, the cost-utility analysis, is quality-adjusted life-years (QALYs) gained, where the quality of life adjustment is based on utility values [8]. In order to generate QALYs, health utilities are constructed with values that are usually anchored at 0 and 1, which represent the strength of preferences for health states (ie, 1 represents full health and 0 represents dead). Some methods allow for health states to be regarded as worse than death and have negative valuations [9,10].
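As a minimal sketch of how utility values enter a QALY calculation, the time spent in each health state can be weighted by the utility of that state and summed. The function name and the two-state patient profile below are hypothetical; the utilities 0.79 and 0.64 are the post-2013 medians for MI and stroke reported in this review and are used only for illustration. Discounting, which economic evaluations commonly apply, is omitted.

```python
def qalys(profile):
    """Sum utility-weighted time over (utility, years) pairs.

    Utilities are anchored at 1 (full health) and 0 (dead); values
    below 0 represent states valued as worse than death.
    """
    total = 0.0
    for utility, years in profile:
        if utility > 1:
            raise ValueError("utility cannot exceed 1 (full health)")
        if years < 0:
            raise ValueError("duration must be non-negative")
        total += utility * years
    return total


# Hypothetical patient: 2 years in a post-MI state, then 3 years post-stroke.
example_profile = [(0.79, 2.0), (0.64, 3.0)]
example_qalys = qalys(example_profile)  # roughly 3.5 QALYs over 5 life-years
```

Because the QALY total scales directly with the utility value, a shift of a few hundredths in the chosen utility carries through to the cost-utility result, which is why the source of the utility estimate matters.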
Health state utilities are generated using direct methods, indirect methods, or a combination of the two. The main difference between these methods is that direct methods are used to elicit patients' preferences for health states, whereas indirect methods evaluate patients' current quality of life and apply population preferences to weight these scores to obtain a utility estimate [4,11].
Smith et al. previously conducted a systematic literature review (SLR) in 2012 (including studies published between September 1992 and September 2012) and found that utility values were lower in patients who experienced cardiovascular (CV) events than in patients who did not [12]. Furthermore, the authors suggested that the utility estimates for each individual CV event varied greatly, likely due to differences in assessment methodologies and patient populations. The goal of the current systematic review was to update and expand upon the review by Smith et al., which evaluated utilities for myocardial infarction (MI), angina, and stroke. Specifically, we aimed to identify the most recent utility values for these health states, as well as for revascularization and peripheral artery disease (PAD), and to gain insight into changing trends in utility elicitation methods and values, which can be used to inform or calculate QALYs. In addition, this review identified the methods used to elicit utilities and examined variability among utility values for a given CV health state, and how those values may be affected by factors such as type of respondent, study design, and geographic location.
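To illustrate how an indirect method turns questionnaire responses into a utility by applying population preference weights, the sketch below uses a simple additive scheme: each response level carries a decrement, and the utility is 1 minus the summed decrements. The dimension names follow the five EQ-5D dimensions, but every weight shown is hypothetical and is not a published value set; published value sets can also include additional terms that this sketch leaves out.

```python
# Hypothetical utility decrements per dimension and response level
# (level 1 = no problems). These are NOT a published value set.
HYPOTHETICAL_TARIFF = {
    "mobility":           {1: 0.0, 2: 0.05, 3: 0.30},
    "self_care":          {1: 0.0, 2: 0.05, 3: 0.20},
    "usual_activities":   {1: 0.0, 2: 0.04, 3: 0.10},
    "pain_discomfort":    {1: 0.0, 2: 0.10, 3: 0.35},
    "anxiety_depression": {1: 0.0, 2: 0.07, 3: 0.25},
}


def utility_from_responses(responses, tariff=HYPOTHETICAL_TARIFF):
    """Convert questionnaire levels to a utility: 1 minus the summed decrements."""
    missing = set(tariff) - set(responses)
    if missing:
        raise ValueError(f"missing responses for: {sorted(missing)}")
    return 1.0 - sum(tariff[dim][responses[dim]] for dim in tariff)


# A respondent reporting no problems on any dimension scores full health (1.0);
# the worst level on every dimension scores below 0 under these made-up weights.
full_health = utility_from_responses({dim: 1 for dim in HYPOTHETICAL_TARIFF})
worst_state = utility_from_responses({dim: 3 for dim in HYPOTHETICAL_TARIFF})
```

Swapping in a different set of weights changes the utility assigned to the same responses, which is one reason values from different instruments and countries are not directly interchangeable.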

Search strategy
The methodology of this SLR update was consistent with the original SLR presented in Smith et al. [12]. The SLR was designed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) standards. The search strategies are provided in Supplementary Tables 1, 2, and 3. Only papers published in English, pertaining to humans, and indexed during the search period (September 1992 to January 2018) were eligible for inclusion. To supplement the database searches, grey literature (reports and conference abstracts) presented by relevant scientific organizations or on health technology assessment (HTA) body websites within the last 6 years (2012-2018) was also searched using the same main keywords. Supplementary Table 4 provides an overview of the scientific conferences and HTA websites that were examined for this review.
To understand the changing trends in utility value estimates (median and interquartile ranges [IQRs]) and methods used for utility elicitation, studies from the previous review and updated SLR were compared. This was done by assessing trends in the last 6 years by stratifying the period as pre-2013 (1992-2012) vs post-2013 (2013-2018).

Study review and selection
All abstracts were manually reviewed by a single reviewer, who used prespecified inclusion and exclusion criteria (Supplementary Table 5; participants, interventions, comparators, outcomes, and study design [PICOS] criteria) to select primary studies and systematic reviews that reported utilities for CV health states. All papers accepted during abstract screening were reviewed in full text by 2 independent reviewers, who also used the same prespecified inclusion and exclusion criteria. Any discrepancies in the decisions were reviewed and resolved by a third, senior reviewer.
This review was not limited by the type of utility measure; however, simple visual analogue scale (VAS) methods, which represent a direct approach, were not considered as valid utility elicitation methods [15] and were excluded.

Data extraction
The following data and characteristics were captured from all articles included in the systematic review: publication year, study design, interventions (if applicable), country, CV health state, utility methods, utility values, population of respondents, and sample size. The studies on angina that did not specify whether the patients had stable or unstable angina, or reported on a mixed angina population, were grouped with the unstable angina studies. This was because the utility values for unspecified and unstable angina were similar to each other (average of 0.71 for both), whereas those for stable angina appeared to be slightly higher (average of 0.77).

Qualitative synthesis
Studies that were published before 2013 (1992-2012) were compared with studies published after 2013 (2013-2018) in terms of utility value estimates and utility elicitation methods in order to explore trends over time by means of a qualitative synthesis. As only a qualitative synthesis was prospectively planned, no statistical inference was performed. Additionally, due to the wide range and skewed utility value distributions in the identified studies, median and IQRs were generated from average utility values reported (as either mean or median) in the literature for both time periods.
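The descriptive summary described above can be sketched as follows: study-level average utilities are grouped by publication period and summarized by median and quartiles. The helper names and the study values are made up for illustration, and the quartile interpolation rule (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not state which rule was used.

```python
from statistics import median, quantiles

CUTOFF_YEAR = 2013  # pre-2013 is 1992-2012; post-2013 is 2013-2018


def summarize(values):
    """Return (median, q1, q3) of study-level average utilities."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def summarize_by_period(studies):
    """Group (publication_year, average_utility) pairs into the two periods."""
    groups = {"pre-2013": [], "post-2013": []}
    for year, utility in studies:
        key = "pre-2013" if year < CUTOFF_YEAR else "post-2013"
        groups[key].append(utility)
    # quantiles() needs at least two values, so smaller groups are skipped.
    return {k: summarize(v) for k, v in groups.items() if len(v) >= 2}


# Made-up study-level values, for illustration only.
hypothetical_studies = [
    (2005, 0.60), (2008, 0.70), (2010, 0.72), (2012, 0.80),
    (2014, 0.75), (2015, 0.78), (2016, 0.80), (2017, 0.85),
]
summary = summarize_by_period(hypothetical_studies)
```

Medians and IQRs are less sensitive to extreme study-level values than means and standard deviations, which fits the wide and skewed distributions noted above.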

Definition of minimally important differences
Although HRQoL is currently recognized as an important endpoint in clinical trials, the meaningfulness of HRQoL scores may not be apparent to patients, clinicians, or researchers [16]. Minimally important differences (MIDs) for health state utilities vary by measure and are not well established. It has been suggested that differences among health state utilities of at least 0.05 can be considered clinically important [17].
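As a rough illustration of the 0.05 threshold, the sketch below compares the pre-2013 and post-2013 median average utilities reported in this review against it. The function and variable names are hypothetical. This is arithmetic only: the medians summarize different sets of studies, no statistical inference was performed, and a difference between period medians is not the same as a within-patient change.

```python
MID = 0.05  # threshold suggested in the text [17]

# Median average utilities (pre-2013, post-2013) as reported in this review.
MEDIANS = {
    "MI": (0.71, 0.79),
    "stroke": (0.63, 0.64),
    "stable angina": (0.83, 0.72),
    "undefined/unstable angina": (0.70, 0.69),
    "PAD": (0.76, 0.71),
    "revascularization": (0.76, 0.81),
}


def exceeds_mid(pre, post, mid=MID, tol=1e-9):
    """True if the absolute change is at least `mid`.

    The small tolerance guards against floating-point rounding when the
    difference equals the threshold exactly (e.g. 0.76 vs 0.71).
    """
    return abs(post - pre) >= mid - tol


flags = {state: exceeds_mid(pre, post) for state, (pre, post) in MEDIANS.items()}
```

Under these assumptions, the changes for MI, stable angina, PAD, and revascularization reach the threshold, whereas those for stroke and undefined/unstable angina do not.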

Results
A total of 11,035 citations were identified across the databases. After removing duplicates, 8768 unique citations were eligible for abstract screening. Of these, 1905 were included for full-text screening, during which a further 1549 articles were excluded, as described in the PRISMA diagram (Fig. 1). In total, 375 publications reported qualitative and quantitative utility values in MI, stroke, angina, revascularization, and/or PAD and were included in the SLR.
Table 2 presents the estimated median (IQR) utility values by method of elicitation, type of disease, and publication year. When looking across CV events, regardless of the utility measure used, the median utility values were lower for stroke than for the other CV events of interest. When comparing median (IQR) utility [...]
Direct methods: These are interview-based and used to capture values that patients or the general public assign to a health state [13]. During the interviews, individuals (patients or members of the general public) identify their preferences for either their current health or scenarios (also called vignettes) that describe various health states by engaging in choice-based tasks [4].
Indirect methods (Health Utilities Index [HUI] mark 2 and 3, EQ-5D, SF-6D): These questionnaires typically evaluate domains such as disability, mental health, and pain. Responses are converted to utilities by means of "tariffs" or "weights." Published tariffs are used to weight the scores of each domain based on the importance of that domain to that population or country. Tariffs are available as a result of separate, previous exercises in which various possible health states have been calibrated by means of time trade-off (TTO), standard gamble (SG), or well-known preference-based methods, such as EQ-5D, HUI mark 2, and SF-6D, from a sample of the general population [14]. The indirect measures differ in what dimensions their questionnaires include, how many response levels each question has, and the direct valuation method used to create the tariff.

Trends in utility values over time by type of CV event
Figure 2 presents the distribution of utility values for MI and stroke. Most studies reported utilities from >0.7 to 0.8 for MI in both time periods and values from >0.6 to 0.7 for stroke. These findings varied by method of utility elicitation (Fig. 3). Across CV health states, utilities measured by the EQ-5D and the HUI typically improved over time when comparing studies published before 2013 with those published after 2013, whereas utilities measured by the direct methods (SG and TTO) consistently worsened over time. However, the HUI, SG, and TTO were used by only a limited number of studies within the last 6 years.

Trends in utility values over time by method of utility elicitation
When comparing direct elicitation methods with indirect elicitation methods across CV health states, direct methods yielded, on average, the highest utilities within studies published prior to 2013; in contrast, direct methods yielded the lowest utility value estimates among studies published after 2013. This was largely driven by a reduction in the average utilities generated from direct methods over time (pre-2013 vs post-2013, respectively: 0.85 vs 0.50), as the change in utilities generated via indirect methods was less extreme (0.69 vs 0.73).
Use of the EQ-5D increased over time, which was driven primarily by its increased incorporation for estimating utilities for MI, stable angina, PAD, and revascularization, as its use has actually decreased for stroke and unstable/unspecified angina.
The increase in use of the EQ-5D coincides with an increase in trials generating utility data (Supplementary Fig. 1). The proportion of utility estimates coming from trials increased from 22.5% (pre-2013) to 37.2% (post-2013) across all CV health states. Prior to 2013, prospective cohort studies were the most commonly used study design to derive utilities for CV health states (35.8%), whereas over the last 6 years this decreased to 31.4%. Prior to 2013, trials were more likely than other study designs to measure utilities via the EQ-5D, with 31 of 42 (73.8%) trials and 93 of 145 (64.1%) other studies using the EQ-5D. The increase in use of the EQ-5D can be attributed not only to the increase in the proportion of trials publishing utilities, but also to the fact that trials now use the EQ-5D more often, with 64 of 70 (91.4%) trials publishing CV utility values in the last 6 years using the EQ-5D. An increase in the use of the EQ-5D among surveys and prospective cohorts was also observed in the last 6 years, with 25 of 39 (64.1%) surveys and 54 of 59 (91.5%) prospective cohorts publishing CV utility values using the EQ-5D.
The use of direct methods declined specifically for studies eliciting utilities from patients, but otherwise remained steady (Supplementary Fig. 2). The proportion of studies using indirect methods that evaluated respondents other than patients (general population, caregivers, or mixed) increased slightly from 1.5% among studies published before 2013 to 2.8% among more recent studies. However, it should be noted that only a few studies elicited values from respondents other than patients. Across CV health states, most studies that reported utilities derived them from patients in both time periods.
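The percentages in this section can be reproduced from the reported counts, as in the short check below. The labels are ours, and two readings are assumptions: that the 42 trials and 145 other studies are the pre-2013 counts (they sum to the 187 pre-2013 studies mentioned in the Discussion), and that the 70 trials are taken out of the 188 post-2013 studies.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# (numerator, denominator) pairs taken from the counts reported in this review.
COUNTS = {
    "trials among pre-2013 studies": (42, 187),
    "trials among post-2013 studies": (70, 188),
    "pre-2013 trials using EQ-5D": (31, 42),
    "pre-2013 other studies using EQ-5D": (93, 145),
    "post-2013 trials using EQ-5D": (64, 70),
}

percentages = {label: pct(n, d) for label, (n, d) in COUNTS.items()}
```

Under these assumptions the computed values match the reported 22.5%, 37.2%, 73.8%, 64.1%, and 91.4%.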

Trends in utility over time by geographical region
Across CV health states and in both time periods, most studies were conducted in Europe, followed by the United States (US) and Canada (Supplementary Fig. 3). There was a decrease in studies conducted in the US and Canada in the last 6 years (pre-2013 vs post-2013, respectively: 28.3% vs 14.4%). In contrast, more utility studies emerged from Asia. Among the studies conducted in Asia, the vast majority of data reported were for stroke, and there were a limited number of studies reporting utility values for the other CV health states.

Discussion
The goal of this SLR was to update and expand upon the review conducted by Smith et al. [12] to identify the most recent utility values for stroke, MI, angina, PAD, and revascularization and to gain insight into changing trends in utility values over time and corresponding elicitation methods. The results of the SLR are summarized qualitatively, to depict the variation in utility values observed across studies in this broad SLR. The decision not to conduct a meta-analysis was further confirmed by a 2015 review of utility value meta-analyses, which found substantial differences when direct vs indirect methods were compared, and noted that meta-techniques may not be appropriate given substantial heterogeneity among utility methods [18]. In another systematic review on the EQ-5D utility values in CVD, the authors attempted to conduct a meta-analysis but deemed it was inappropriate to further estimate pooled utility scores via meta-analytic techniques due to the substantial observed heterogeneity with respect to both study design and patient characteristics. Given this observed heterogeneity, effect sizes obtained via meta-analytic techniques are not generalizable to other methods or health states [19].
This SLR reports utility values consistent with values used in several large-scale economic evaluations in CVD [20,21], in particular for MI and stroke. For example, in the study conducted by Ara et al. [20], the authors used the Health Surveys for England (HSEs) conducted in 2003 and 2006 to elicit EQ-5D scores for stroke, heart attack, and angina; the mean EQ-5D score for patients with heart attack was 0.74, which was in line with the median [...]
In comparing the 187 CV utility studies published during or before 2012 with the 188 published during or after 2013, we observed changes in recent years with respect to the actual values being published in CV utility studies; average utility values for MI, stroke, and revascularization increased over this period, whereas utilities for angina and PAD declined. This likely represents changes in the types of populations and health states being measured. Improvements in healthcare over time may have also contributed to the observed changes. Disease characteristics, disease severity, and duration of disease all contribute to substantial variation in utility scores [12,22]. As we did not control for sample characteristics, the higher utility estimates observed for several instruments could have been influenced by the population under evaluation rather than the specific utility method. These changes in recent years underscore the necessity of selecting utility values that precisely represent the health state of interest for a cost-utility analysis.
However, the observed changes in CV values, particularly the increases observed for MI and stroke, may be confounded by changes in the methods used. This review found that the EQ-5D is the most common measure across types of CV health states, its use is increasing, and it yielded higher utilities than direct methods in the last 6 years. In the last 6 years, the average values for indirect measures have risen, whereas the average values for direct methods have declined.
This increase in the use of the EQ-5D appears to be related to the general increase in trials estimating utility values; however, even among trials, the EQ-5D was utilized more frequently in the last 6 years. The National Institute for Health and Care Excellence (NICE) and other payers, such as the Scottish Medicines Consortium (SMC), recommended the EQ-5D as the preferred utility in the reference case for cost-utility analyses in 2004 (to encourage comparability across studies) [23] and clarified recommendations in 2013 [24], and it is possible that these trends reflect increasing uptake of those recommendations in the last 6 years. It is also likely that trials are measuring utility values more often, as the need for cost-utility assessment, and consequently utility values representing the precise patient population in question, has grown in the current health reimbursement market. Our review also observed a substantial increase in the number of CV utility studies conducted in Asia. This likely reflects the growing HTA trend in this part of the world [25,26].
Furthermore, our review observed that the implementation of direct utility elicitation methods has declined substantially in CVD in recent years. This could be due to the ease of implementation and lower cost of a standardized questionnaire, such as the EQ-5D, compared with direct methods, which often require bespoke design. In addition, it is likely that investigators are relying more frequently on the EQ-5D as they have become increasingly comfortable with the validity of the measure. However, given that indirect methods represent an estimation of utility values and do not directly measure patient preference, direct methods should still be considered a valuable tool. We did not include VAS utility measures in our SLR because of their potentially limited use for measuring preferences for health states [15]. Moreover, others have raised concerns regarding the validity of VAS for this purpose [27-29].
The differences in utility values across methods observed in our review are well documented in the literature [30-35]. However, the relationship between direct and indirect utility measures has not been as thoroughly documented. There is a widely held impression among health economists that direct methods tend to yield higher utilities (reflecting better reported health) for given health states compared with indirect methods, regardless of the type of direct or indirect method used (eg, TTO vs SF-6D or EQ-5D vs SF-6D) [36-42]. Different methods of utility elicitation can result in varying scores, even for the same patient population assessed. For example, Hallan et al. [43] used both SG and TTO methods and found significantly higher scores for both minor and major stroke health states using SG compared with TTO assessments. Utility scales also vary in sensitivity, which may further hinder comparisons of utility values across measures. For example, the EQ-5D index score has been shown to have a ceiling effect, and the SF-6D has been observed to have a floor effect [31,44,45]. However, the EQ-5D scale has been reported to be more sensitive than the SF-6D in monitoring values for HRQoL, particularly at the lower end of the scale for patients with chronic obstructive airways disease, osteoarthritis, irritable bowel syndrome, lower back pain, or leg ulcers, and for postmenopausal women and healthy elderly individuals (aged 75+ years) [31]; this trend was also observed in the current SLR, although this is not usually the case for CVD. In addition, the HUI focuses on physical and emotional health and does not include questions on social functioning or satisfaction [46].

Conclusion
This review found that health utility values for MI, stroke, angina, PAD, and revascularization have changed substantially when comparing different time periods (pre-2013 vs post-2013), likely due to changes in the types of studies being conducted (increase in trials eliciting utilities) and the patient populations being evaluated (in particular, changes in disease severity and duration of disease). Changing utility methods may also partially explain the observed changes in utility values. The EQ-5D has been used more frequently, with an increasing number of trials using this measure. Additionally, an increasing number of studies in Asia estimating CV utilities has been observed. With varying values observed across utility methods used and populations, care should be taken when choosing utility values to use in economic evaluations of new technologies. Future analyses that assess changes in utilities by duration of disease and/or treatment could be useful to identify any trends for patients with early vs late stage disease and help inform the choice of utility values for use in economic evaluations of new cardiovascular technologies.