Estimating EQ-5D and OAB-5D health state utilities for patients with overactive bladder

Background Limited utility data on patients suffering from overactive bladder (OAB) are available in the literature. The objectives of this study were to estimate utility values in patients with OAB using the generic EQ-5D questionnaire and the OAB-5D disease specific questionnaire, to investigate the relationship between utilities and symptoms, and to evaluate the sensitivity of the two instruments to changes in symptom severity. Methods Analyses were based on pooled data from three large multicenter randomized 12-week placebo-controlled trials (SCORPIO, ARIES, CAPRICORN). Patients completed a micturition diary, EQ-5D and OAB-q (a quality of life questionnaire from which OAB-5D is derived) at baseline and at weeks 4, 8 and 12. Time trade-off tariffs elicited from UK population were applied to obtain utilities from both instruments. Repeated measures regressions were used to estimate EQ-5D and OAB-5D utilities by micturition frequency and incontinence severity level. As a test of sensitivity of the instruments, utility changes from baseline to week 12 were estimated by symptomatic response (improvement, stable or worsening). Results The sample included 4427 patients. Mean utilities (± standard deviation) across all visits were 0.82 (±0.21) for EQ-5D and 0.86 (±0.09) for OAB-5D. Correlation between EQ-5D and OAB-5D was 0.34 (p < 0.0001). Both OAB-5D and EQ-5D utilities increased as OAB symptoms improved. Utility values were similar for severe levels of symptoms, but higher with OAB-5D than with EQ-5D for mild cases. Micturitions and incontinence had similar impact on EQ-5D utilities, but micturitions had greater impact on OAB-5D utilities than incontinence. Changes from baseline in OAB-5D utilities differed significantly according to symptomatic response. Changes in EQ-5D utilities were not significantly associated with changes in micturition frequency and weakly associated with changes in incontinence severity among patients with mild symptoms at baseline. Conclusions This study showed that both EQ-5D and OAB-5D can detect changes in severity of OAB, especially in severe cases. However, OAB-5D is more sensitive than EQ-5D in measuring differences between treatments in milder cases. Both OAB-5D and EQ-5D–although leading to different results–may be useful to derive utilities from clinical trial data and perform cost-effectiveness analyses. Trial registration Clinical Trials NCT00689104, NCT00662909, NCT00912964.


Background
Overactive bladder (OAB) is a chronic condition defined by the International Continence Society as urinary urgency, with or without urgency incontinence, usually with increased daytime frequency and nocturia [1]. Urgency is a sudden, compelling and difficult to defer desire to pass urine. Symptoms can have significant impact on a person's health related quality of life (HRQoL), interfering with daily routines, causing embarrassment as well as lowered self-esteem. OAB is also associated with an increased risk and prevalence of co morbidities, including falls and fractures, urinary tract and skin infections, and depression [2].
Previous reports from the European Prospective Investigation into Cancer and Nutrition (EPIC) study estimated the overall prevalence of OAB in Europe and Canada to be 12.8% among women and 10.8% among men [3].
Estimates of health utilities associated with different states of OAB are required for economic evaluations of OAB treatments. Utilities represent individual's preferences for specific health states. They are measured on a scale where 0 represents death and 1 perfect health. According to the methodological guidelines of the UK National Institute for Health and Clinical Excellence (NICE) [4] and the US Panel on Cost-Effectiveness in Health and Medicine [5], quality adjusted life years (QALYs) should be obtained by assigning values elicited from the general public, to health states described by persons with OAB, by means of a generic classification of health states. NICE specifically advises the utilisation of EQ-5D, a widely used generic HRQoL instrument. Sets of utilities were generated for EQ-5D health states in many countries using choice-based methods, such as time trade-off (TTO) or standard gamble.
Only few data on utilities in OAB are available. EQ-5D TTO utilities were obtained in a Swedish willingness to pay survey among patients with urge incontinence [6,7]. This survey showed a correlation between EQ-5D utility and urinary symptoms, measured as the sum of the numbers of micturitions and leakages over one day. However, Yang et al. considered OAB as a condition for which the generic dimensions of EQ-5D might be irrelevant or insensitive for capturing small though important clinical changes [8]. EQ-5D has a number of potential limitations, such as the lack of sensitivity in specific disease contexts [9][10][11].
To overcome these limitations, Yang et al. developed a 5-dimensional preference-based disease-specific measure, named OAB-5D [see Additional file 1] and derived from the Overactive Bladder Questionnaire (OAB-q) [8,12]. The OAB-q is a validated and widely used questionnaire to assess symptom bother as well as HRQoL in OAB patients [13][14][15]. Even though OAB-q was not originally designed to measure utilities, Yang et al. built a model for deriving utilities from the 5 items forming the OAB-5D system, based on a survey in the UK general population, using the TTO method.
The objectives of this study were to estimate utility values in patients with OAB using the generic EQ-5D questionnaire and the OAB-5D disease specific questionnaire, to investigate the relationship between utilities and symptoms, and to evaluate the sensitivity to the changes in OAB symptoms of both preference-based instruments based on data from three pooled clinical trials.

Patients and study design
This post-hoc analysis was based on pooled data from three 12-week, randomized, double-blind, parallel group, placebo controlled, multi-centre studies (SCORPIO [16], ARIES [17] and CAPRICORN [18]) assessing the efficacy against placebo of mirabegron, a first in class beta 3 adrenoreceptor agonist, in the treatment of patients with symptoms of OAB. The three clinical trials had similar inclusion and exclusion criteria. Four thousand six hundred and twenty-two (4622) patients were randomized in the three trials conducted across Australia, Europe, United States of America and Canada. Male and female patients ≥18 years of age were screened for enrollment into the studies if they had OAB symptoms for at least 3 months. Patients were excluded if they had clinically relevant stress incontinence or mixed stress/urgency incontinence with stress as the predominant factor; an indwelling catheter; evidence of a symptomatic urinary tract infection, chronic inflammation, bladder stones, previous pelvic radiation therapy, previous or current malignant disease of the pelvic organs; or severe hypertension. At baseline, patients must have experienced an average of ≥8 micturitions/24 hours and ≥3 urgency episodes with/without incontinence over a 3-day period. Patients underwent a 2-week placebo run-in period followed at randomization by a 12-week phase where 1384 patients were administered placebo, 433 patients mirabegron 25 mg, 1379 patients mirabegron 50 mg, 931 patients mirabegron 100 mg, and 495 the anti-muscarinic tolterodine. A 3-day micturition diary, a generic HRQoL questionnaire (EQ-5D) and a disease-specific HRQoL questionnaire (OAB-q) were completed at baseline and weeks 4, 8 and 12. These analyses were based on pooled full analysis sets, which consisted of all randomized patients taking at least one treatment medication dose.

Clinical measures and severity of symptoms
Patients completed a micturition diary over the 3 days preceding each study visit. From the diary, the following endpoints were derived: mean number of micturitions per 24 hours, mean number of urgency episodes (grade 3 and 4) per 24 hours, mean number of incontinence episodes per 24 hours and mean volume voided per micturition. For micturitions frequency and incontinence episodes, 5 severity levels were defined roughly based on the distribution (quintiles) of event observed in the pooled trials, using all visits combined. For the severity incontinence, class 1 (less severe) was composed of continent patients (no incontinence episode reported in the 3-day micturition diary).

Preference based instruments
The EQ-5D instrument consists of 5 questions, relating to mobility, self-care, usual activities, pain/discomfort and anxiety/depression, with 3 possible answers for each item, corresponding to "no problem", "moderate problems" or "extreme problems". It can be used to calculate a utility index score on a scale on which 0 represents a health state equivalent to death and 1 represents full health. The UK TTO tariff was used to derive EQ-5D utilities in this study [19].
The OAB-q is a 33-item questionnaire consisting of an 8-item Symptom Bother scale and a 25-item HRQoL scale that has four subscales: coping, concern, sleep, and social interaction [13]. Responses are based on a six-point Likert scale. The OAB-q has demonstrated good internal consistency, reliability, test-retest reliability, concurrent validity, discriminate validity, and responsiveness to treatment-related change [13][14][15].
To value OAB health states, five items were chosen from the OAB-q to construct a new health state classification system (OAB-5D) namely: urge to urinate, urine loss, sleep impact, coping strategy, and concern with OAB, where each dimension had five levels of severity with level 1 denoting no problem and level 5 indicating an extreme problem [8]. A total of 3125 health states can be defined by the OAB-5D classification system (compared to 243 with EQ-5D), and all OAB-q data which contain these five items can be mapped to a specific OAB-5D health state.

Statistical analysis
Utility scores derived from EQ-5D and OAB-5D were first described as means and standard deviations and then estimated by symptom severity. The Spearman correlation coefficient between EQ-5D and OAB-5D utilities was estimated.
For each type of symptom-micturition frequency and incontinence episodes-a linear regression model was used to estimate EQ-5D and OAB-5D utilities by symptom level, adjusting for gender, age and study as fixed effects, and geographical region as random effect. The correlation between utility values for one individual at different assessment visits was taken into account by means of a random patient effect. Using a similar method, differences between OAB-5D and EQ-5D utilities were estimated by symptom level, and tested for the null hypothesis of equal mean OAB-5D and EQ-5D utilities.
In addition, repeated measures linear models were used to predict mean OAB-5D and EQ-5D utilities according to severity levels of the 2 symptoms, with study identifier and several clinical and socio-demographic covariables. The listed covariates were removed from the models if not significant at 5%: gender, age, race, body mass index (BMI), type OAB (urgency incontinence, frequency or mixed), prior OAB surgery, previous OAB drug and duration of OAB symptoms.
Tobit and beta regression models were also estimated with repeated measurements random effect and the same covariates as above mentionned. Minimizing the Root Mean Squared Error (RMSE) was used as criterion to define the best model.
The purpose of this model was to provide a way to derive utilities from the micturition diary data, which are usually collected in trials of OAB treatments.
Finally, as a test of sensitivity of the instruments, utility changes from baseline to week 12 were estimated according to response in symptoms, defined as 3 possible items: "improvement higher than 1 level of symptoms", "stable" or "worsening higher than 1 level of symptoms". Linear models with covariables gender, age, study and response as fixed effects, and geographical region as random effect were estimated to provide adjusted means (SD) of utility changes from baseline to week 12 by response level. Models were run for overall population, patients with mild symptoms at baseline (level 1 to 3 in micturitions frequency for response in micturitions frequency, or level 1 to 3 in incontinence episodes for response in incontinence) and patients with severe symptoms at baseline (level 4 or 5 in micturitions frequency, or in incontinence episodes).
Mean EQ-5D and OAB-5D utilities steadily decreased with increasing symptom severity. OAB-5D provided higher values than EQ-5D, whereas SD estimates were lower for OAB-5D ( Table 2). Ranges of utilities from the most severe symptom level to the least were slightly larger for OAB-5D (0.10 for both micturitions and incontinence versus 0.07 and 0.09 for micturitions and incontinence levels, respectively for EQ-5D).
Adjusted estimates of utility by symptom frequency level are presented in Figure 1. For severe levels of micturition frequency (4 and 5), difference in utilities between the two instruments was not statistically significant. However, in the lower levels of severity (1 and 2), utilities generated by OAB-5D were higher and differences with EQ-5D were statistically significant (p < 0.0001) ( Table 2).
Similar trends were observed for incontinence ( Figure 1B). Utility values were not significantly different between EQ-5D and OAB-5D for the most severe levels of incontinence (3, 4 and 5). Differences in utilities between the two instruments became highly significant in less severe levels (1 and 2) (p-value < 0.001) ( Table 2).  Equations to predict EQ-5D and OAB-5D utilities from symptom levels Age, gender, BMI, previous OAB treatment (yes/no) and study identifier were selected as adjustment variables in the regression models. For EQ-5D utility modeling, the estimated RMSE for the linear model, the Beta model and the Tobit model were respectively 0.20829, 0.23819 and 0.26237. For OAB-5D utility modeling, the resulting RMSE for the linear model, the Beta model and the Tobit model were respectively 0.08000, 0.09245 and 0.09085. The linear model was consequently selected as the best model for both utility scores.
The estimated regression coefficients confirmed that utilities decreased with symptom severity after adjusting for patient characteristics (Table 3). Micturitions and incontinence appeared to have similar impact on EQ-5D, but micturitions had greater impact on OAB-5D utilities than incontinence. Improvement of incontinence level from 5 to 1 (i.e. from >3 incontinence episodes per day to 0) was associated with an increase of 0.052 in EQ-5D utility (p < 0.0001) and 0.076 in OAB-5D utility (p < 0.0001), at equal micturitions symptoms. Similarly, improvement of micturitions frequency from level 5 to 1 (i.e. from >7 episodes per day to to ≤1) was associated with an increase of 0.056 in EQ-5D utility (p < .0001) and 0.088 in OAB-5D utility (p < 0.0001), at equal incontinence symptoms.
Utility values were estimated by the model for the least severe OAB state (level 1 for both symptoms) and the most severe (level 5 for both symptoms) OAB state and for average age (59.4 years), average BMI (28.9 kg/m2), gender (71.8% female), previous OAB treatment (52.0%) and study (43.1% SCORPIO, 28.7% ARIES, 28.3% CAPRICORN). These were respectively 0.870 and 0.762 based on EQ-5D (a difference of 0.108) and 0.923 and 0.759 based on OAB-5D (difference of 0.164).
The proportions of variance explained by the models (R 2 ) were 26.1% for OAB-5D and 5.1% for EQ-5D.

Responsiveness of HRQoL and clinical symptoms
More than half of the patients (56.4%) reported an improvement of at least 1 level in micturitions while 31.1% of them remained stable (all treatment arms combined). For incontinence episodes, patients who experienced improvement and those who remained stable represented similar proportions (46.3% and 45.6% respectively). Symptom worsening occurred in respectively 12.5% and 8.1% of patients for micturitions and incontinence episodes (Table 4A).
In the overall population, mean changes from baseline in utility were positive in all groups of patients, those with improving, stable or worsening symptoms, and were greater in responders than in non-responders. There were no significant differences in change from baseline utility between worsening response and stable response for EQ-5D (p-values of 0.7291 and 0.0662 for micturitions and incontinence episodes respectively). However, differences became significant when using OAB-5D (p-value < 0.0001 for each symptom) (Table 4A).
In patients with mild micturition frequency (levels 1 to 3), the change from baseline in EQ-5D utility did not significantly differ between improvement and stable response (p-value of 0.3518) (Table 4B). When using OAB-5D, differences in utility were significant between improvement and stable response (p-value < 0.0001) as well as between worsening response and stable response) (p-value < 0.0001).
In severe cases, changes in micturition frequency and incontinence episodes were significantly associated with changes in EQ-5D utility. Thus, differences were significant between improvement and stable response (p = 0.0004 for both symptoms). Similar results were observed for OAB-5D (p < .0001 for both symptoms). No significant differences were observed between worsening response and stable response for both EQ-5D and OAB-5D (p-values of 0.3865 and 0.7158 for micturitions and incontinence episodes, respectively) (Table 4C).
Overall changes in utility from baseline values were greater with OAB-5D for all symptoms, in both improving and stable patients. In addition, differences in utility changes between improving and stable patients were higher with OAB-5D for each symptom (+0.037 with OAB-5D and +0.019 with EQ-5D for micturitions; +0.039 and +0.025 for incontinence) (Table 4A).

Discussion
This study demonstrated that both utilities derived from EQ-5D and OAB-5D increased as OAB symptoms improved. Utilities were similar between the two instruments for severe levels of symptoms, but OAB-5D utilities were greater than EQ-5D utilities for lower levels of severity. Thus, the OAB-5D had a greater range of variation according to symptom severity. It is not surprising that, on average, EQ-5D utilities are lower than OAB-5D utilities since the generic instrument EQ-5D captures effects of patients' co-morbidities on HRQL. The fact that incremental utilities between symptom levels are different between EQ-5D and OAB-5D is more noteworthy, considering that both instruments are expected to provide values on the same scale (where 1 represents perfect health and 0 a health state equivalent to death),  and both sets of values are based on the time trade-off. The correlation between OAB-5D and EQ-5D in general population was moderate (r = 0.34; p < 0.0001). This is likely attributable to the noise around EQ-5D utilities, corroborated by the larger standard deviation around EQ-5D utilities, which may reflect natural variations in health status not related to OAB. UK population norms for EQ-5D from a nationally representative interview survey of 3395 men and women aged 18 or over living in the UK were 0.81 (±0.26) for females and 0.78 (±0.28) for males aged from 55 to 64 years [20]. In the US, population norms for EQ-5D were 0.84 (SE of 0.01) for females and 0.86 (SE of 0.01) for males, in the same age group, 55 to 64 years [21]. In the studied population, the mean EQ-5D utility, based on UK TTO tariff, was 0.84 (±0.21). It is thus slightly higher than general population.
In a cost-effectiveness model evaluating tolterodine compared to no treatment in Sweden, Kobelt et al. [22] used a Markov model with five health states according to the number of micturitions and leakages per day, where state 1 was considered "mild" and state 5 "severe". They obtained utility weights for the states by a linear regression analysis of the correlation between urinary symptoms and EQ-5D scores in a Swedish willingnessto-pay survey [7]. study, a model was provided to predict EQ-5D utilities for different health states, defined according to frequency of micturitions and incontinence episodes. EQ-5D utilities were higher, overall, ranging from 0.762 for the most severe state to 0.870 for the least severe states, but the difference in utilities of 0.108 between the two extreme OAB states was comparable to that reported by Kobelt et al. In addition, an equation was provided to generate utilities for OAB states based on OAB-5D, with adjustment on age and gender. The two equations, based on EQ-5D and OAB-5D, can be used to derive utilities from clinical trials for which no utility instrument is provided. It should be noted that the predictive power of the EQ-5D equation is very low, whereas it appears acceptable for OAB-5D. This implies that a very large number of patients would be required to detect a difference in utility between different groups of patients using the EQ-5D equation. Another limitation of those regression models is that they do not capture potential effects of non-urological dimensions. Measuring the impact of adverse events on utilities using trial data was difficult since the periods during which patients experienced adverse events rarely coincided with dates of assessment visits. Furthermore, the OAB-5D system has no dimension related to the impact of adverse events of OAB treatments on HRQoL, whereas the EQ-5D, as a generic instrument, is expected to capture deteriotations in HRQoL related to adverse events.
An important finding based on the regression analyses performed in this study is that micturition frequency does affect the utility scores, even when adjusting on incontinence episodes. The interaction between micturition frequency and incontinence episodes was found not to be significant within the utility linear regression models (p = 0.1283), meaning that the numbers of micturition frequency and incontinence episodes have effects on utilities independently of each other. The effects of different symptoms were then assumed to be additive.
Both instruments were able to differentiate improving patients from stable patients. However the change in EQ-5D utility did not differ significantly between patients with improving or stable micturitions symptoms, among those with mild micturition frequency at baseline. Moreover, EQ-5D did not detect any significant difference between patients with stable symptoms and those with worsening symptoms in overall population. These findings highlight the greater responsiveness of OAB-5D as compared to EQ-5D, especially for patients with mild OAB. Additionnaly, this analysis revealed that utilities improved in stable patients and even in patients with worsening symptoms, among those with mild symptoms at baseline. Possible explanations for this unexpected trend would be that patients felt quite unwell at the time of entering the trial, and might have learned to better cope with their condition over time.

Conclusions
In conclusion, this study showed that both EQ-5D and OAB-5D can detect changes in severity of OAB, especially in severe cases. Estimated differences in utilities between different OAB states are greater with OAB-5D, thus the choice of the instrument matters. Using EQ-5D to measure utilities has the advantage to generate values that can be compared to other conditions, and it can capture the impact of adverse events. However, large samples of patients are required to detect differences between different treatments or patient groups. EQ-5D may not be as sensitive as OAB-5D in measuring differences between treatments in milder cases of OAB. Contrary to EQ-5D, OAB-5D detects the difference between patients with worsening and stable symptoms. OAB-5D and EQ-5D can both be useful to derive utilities from clinical trial data and perform cost-effectiveness analyses in OAB, although they will lead to different incremental costeffectiveness ratios.