Skip to main content

Development of the Incontinence Utility Index: estimating population-based utilities associated with urinary problems from the Incontinence Quality of Life Questionnaire and Neurogenic Module



Generic utility instruments may not fully capture the impact and consequences of urinary problems. Condition-specific preference-based measures, developed from previously validated disease-specific patient-reported outcomes instruments, may add relevant information for economic evaluations. The aim of this study was to develop a condition-specific preference-based measure, the Incontinence Utility Index (IUI), for valuing health states associated with urinary problems.


A two-step process was implemented. First, an abbreviated health state classification system was developed from the Incontinence Quality of Life Questionnaire (I-QOL) and Neurogenic Module by applying Rasch modelling, classical psychometrical testing and expert criteria to data from two pivotal trials comprised of neurogenic detrusor overactivity (NDO) patients. Criterion, convergent validity and concordance with the original instrument was assessed in the abbreviated version. Then, a multi-attribute utility function (MAUF) was estimated from a representative sample of the UK non-institutionalized adult general population. Visual analogue and time-trade off (TTO) evaluations were applied in the elicitation process. Predictive validity of the MAUF was tested comparing estimated and direct utility scores.


The abbreviated health state classification system generated from the NDO sample contained 5 attributes with 3 levels of response and had adequate psychometrical properties: significant differences in scores according to the reduction in the frequency of urinary incontinence episodes [UIE] (p < 0.001); Spearman correlation coefficient with number of daily UIE = −0.43; p < 0.01 and Intraclass Correlation Coefficient (ICC, 95% CI) with the original version = 0.90 (0.89-0.91; p < 0.001). Next, 442 participants were interviewed (398 cases were valid, generating 2,388 TTO evaluations) to estimate the social preferences for derived health states. Mean age was 44.75 years (interquartile range 33.5-55.5) and 60.1% were female. An overall algorithm for the IUI was estimated and transformed onto a dead = 0.00 and full health = 1.00 scale. Model fits were acceptable (R-squared = 0.923 and 0.978). Predictive validity was adequate: ICC (95% CI) = 0.928 (0.648-0.985) and Mean of Absolute Differences = 0.038.


The newly developed IUI is a preference-based measure for urinary problems related to NDO that provides general population-based utility scores with adequate predictive validity.

Trial registration NCT00461292, NCT00311376.


Urinary problems, particularly when accompanied with urinary incontinence (UI), have been shown to significantly impact different domains of health-related quality of life (HRQoL) such as emotional well-being, performance of daily activities and social interaction [1], and have also been associated with economic burden [2],[3] and lower productivity [1],[4]. Neurogenic detrusor overactivity (NDO) is an etiology of UI that is caused by conditions such as multiple sclerosis (MS) or spinal cord injury (SCI). Detrusor overactivity is an involuntary bladder contraction during the filling phase of cystometry [5]. As a result of a disruption in the regulation of the micturition reflex, NDO patients frequently suffer from urinary symptoms including urgency and urinary urgency incontinence, which negatively affect their HRQoL [6],[7].

A practical approach to evaluating the health states derived from a disease is through administration of existing generic preference-based instruments such as the EQ-5D [8],[9], the Health Utility Index –Mark 2 or Mark 3- (HUI2 or HUI 3, respectively) [10],[11] or the SF-6D [12],[13]. These instruments are suitable across patient populations, regardless of the disease, allowing investigators to describe and compare important aspects of HRQoL and produce preference-based or utility scores. Although there is evidence to suggest that there is an underlying basic construct measured by the three generic instruments, it is well established that they produce different values and are not interchangeable [14]-[18]. Furthermore, there is controversy about their discriminative ability and sensitivity to detect clinically important changes in varying patient populations and consequently, these measures may not be the best choice for certain conditions [19], including urinary incontinence-related problems [20]-[23].

There are a number of condition-specific instruments available for patients with lower urinary tract symptoms and UI. Some commonly-used measures in clinical trials and outcomes research are the Overactive Bladder Questionnaire (OAB-q) [24], the King's Health Questionnaire (KHQ) [25] and the Incontinence Quality of Life Questionnaire (I-QOL) [26]-[28]. These instruments have good psychometric properties in terms of reliability, construct and discriminant validity, and responsiveness [6],[24],[27],[29]. Recently, new efforts have been focused on estimating utilities related to health states derived from these tools by means of surveying different samples of patients or general population from Europe and the US [30]-[32]. However, out of all these measures, only the I-QOL questionnaire includes a specific module for NDO patients developed from the needs-based model [26],[27]. In addition, the validity of the I-QOL has been demonstrated in patients with neurogenic urinary incontinence [6]. Consequently, the overall aim of this research was to generate a preference-based measure from the I-QOL and its neurogenic set, the Incontinence Utility Index (IUI), by means of surveying a representative sample of the general population. This new instrument would represent a more comprehensive measure for valuing health states associated with urinary problems from a range of different etiologies.



The I-QOL Questionnaire and Neurogenic Module

The I-QOL is a self-administered disease-specific instrument comprised of 22 questions (5 point Likert scale) addressing three main domains: Avoidance and Limiting Behavior, Psychosocial Impact, and Social Embarrassment [26],[27]. A global scale score is obtained by summing up the responses to all items and transforming the raw total score to a 0–100 scale (0-worst/100-best HRQoL). As noted before, the I-QOL was developed and validated among patients with stress UI and overactive bladder (OAB) [27] and has since been successfully tested on other patient populations, such as NDO patients [6] or patients with urgency UI who had not been adequately managed with anticholinergic therapy [28]. The additional module for patients with neurogenic bladder consists of 5 items about limiting caffeine drinks, worry about long-term effects of catheterization, accessibility and privacy in public toilets, bother associated with catheterizing, and bother associated with the use of pads or diapers.

A 2-stage process was used to develop the IUI from the I-QOL. The first stage was to use Rasch analysis and classical psychometrical tests to derive an abbreviated health state classification from the I-QOL and Neurogenic Module that is suitable for preference elicitation. The second stage was to conduct a preference elicitation survey to allow the estimation and validation of a multi-attribute utility function (MAUF) for the IUI.

Deriving an abbreviated health state classification system from the I-QOL and the neurogenic module

Multi-Attribute Utility Theory (MAUT) is an approach that assigns utility weights to different outcomes by considering multiple attributes and the associated preferences reported by a given population, and then combining individual values into an overall utility measure. This process involves specifying a particular form for the utility function and the possible preference interactions among the attributes [10],[11],[33]. For MAUF estimation, it is important to include a range of aspects describing relevant consequences of a given disease on patients' lives to ensure accuracy and sensitivity to change. The attributes should not be too large, however, so as not to increase respondents' cognitive burden and make data collection impractical. The I-QOL, along with Neurogenic Module, generates a total of 527 different health states. Hence, a psychometric analysis was required to extract a minimum but valid set of health states [34]. To this end, Rasch analysis and statistics from Classical Test Theory (CTT) were combined.

Rasch analysis is a scaling methodology that allows the examination of the hierarchical structure, unidimensionality and additivity of HRQoL measures [35]. Rasch methods may be used to identify and select items in an instrument that best cover the entire continuum of the underlying construct and remove redundant items [34],[36],[37]. Data dimensionality was investigated using the approach suggested by Linacre (1998) [38] and item responses were analysed using the Partial Credit Model [35], considering model fit, item and category locations, and differential item functioning (DIF) regarding sex, age group and etiology (i.e. MS or SCI). Model fit in the range 0.5-1.5 was considered acceptable, and items with centered locations and ordered categories as spread as possible were preferred. DIF was considered relevant when it was statistically significant and the difficulty difference between groups was over 0.5 logits. Additionally, an expert panel including the developer of the I-QOL and other experts in urology and psychometrics was convened to review the best items according to the results of the analyses. Finally, internal consistency (Cronbach's α > 0.8), criterion validity (statistically significant differences in HRQoL according to the reduction in the frequency of UI episodes) and agreement with the original I-QOL (Intraclass correlation coefficient –ICC- ≥0.75) were tested to ensure that the abbreviated version met an acceptable standard in these properties. The sample used in this first stage has been described elsewhere [39],[40]. Briefly, we pooled data from two randomized trials of onabotuliniumtoxinA (BOTOX®) 200U or 300U vs. placebo in adult patients with UI due to NDO. A total of 691 patients were enrolled in these two trials, 44.9% of whom had SCI and 55.1% of whom had MS. The primary time point was week 6; patients could request a second treatment after week 12 and were followed up to week 52.

Weighting the health states defined by the new abbreviated health state classification system

Elicitation survey

A cross-sectional observational study was conducted between October and December 2012 to survey a representative sample (n = 442) of English-speaking, non-institutionalized adults from the general population in United Kingdom (UK). Participants were eligible to participate if they were willing to complete the interview process and to endorse their compliance with the quality standards of the survey. Written informed consent was obtained prior to participation. Exclusion criteria included cognitive impairment, suspicion of being under the effects of alcohol or narcotics use during the study visit, and any concurrent medical condition limiting their capacity to complete the evaluation.

Sample size was set to a minimum of 338 responders to enable the estimation of mean values with a confidence interval of ±0.032 points, and a standard deviation of 0.3 points, assuming a normal distribution of scores with a confidence interval of 95% (t-value = 1.96) [41]. However, given the complexity of the elicitation process and previous experiences [42], it was estimated that a maximum of 30% of the respondents would not be able to successfully complete all the proposed rating exercises. Hence, a total of 440 participants were interviewed.

Sampling was carried out in two steps: Cluster random sampling was first applied based on UK regions/postcodes. Subjects were then randomly selected from each region/postcode while monitoring other key socio-demographic variables (age, gender, education and employment status). The uniqueness of each participant's identity was verified at recruitment and before the interview using their name and address. Single visits for interviews were face to face and conducted using a Computer Assisted Personal Interviewing methodology, at either a central location in major hubs across the UK, or by visiting respondents at home at an agreed time and date. Respondents were offered £20 for participating in the survey, plus an additional £5 for travel expenses if they were asked to come to a central location to participate.

A total of 10 professional interviewers with required qualifications and relevant experience in conducting face to face interviews collaborated in this research. They received intensive instruction, including role-playing sessions to ensure the quality of interviews. Interviewers were required to record the time needed to complete each interview immediately following each survey, and also to rate the degree of understanding and cooperation from participants along with the overall quality of the interview.

Opinion Health© was in charge of data collection which was conducted according to the Code of Conduct of the Market Research Society [43], European Pharmaceutical Market Research Association [44] and qualitative recruitment best practice outlined by The Association of Qualitative Research [45].

All procedures and materials were tested in a pilot study (n = 13) to identify any practical problems with data collection and to validate instruments and materials used prior to the study. Following the pilot study, additional debriefing sessions were conducted with the interviewers in order to ensure that interviews would be conducted in a systematic way to minimize possible sources of bias.

Modelling space of the new abbreviated health state classification system

All health states were carefully chosen to allow estimation of the 5 single-attribute utility functions, each of the attribute weights in the multi-attribute utility function, and the interaction term (see MAUF estimation below). A total of 16 different states were evaluated (5 single-attributes and 11 multi-attributes).

Respondents were asked to assume a time horizon range of 30 years to better reflect the chronic nature of the health states presented. This period of time is close to the average expected years of life for a middle-age person according to UK life expectancy tables. The time horizon and the chronicity of states were discussed with all the participants before proceeding with the interview.

In addition to basic socio-demographic variables, each participant responded to the following evaluation exercises required for MAUF estimation:

  • Visual Analogue Scale (VAS) rating of single-attribute utility functions. For each attribute (n = 5), VAS rating of the intermediate level was conducted on a thermometer feeling scale anchored at 0- the least desirable or the worst level at each attribute and 100- the most desirable or best level at each attribute.

  • VAS rating of multi-attribute health states: A total of 5 corner states, 3 intermediate or marker states and 3 anchoring states were performed. Intermediate or marker states were chosen to ensure the evaluation of a wide range of levels within the 5 targeted attributes, enhancing the precision of estimations. Anchoring references were: 0- the least desirable health state defined by the attributes (health state W) or dead - and 100-the most desirable health state or perfect health, defined as the conjunction of the top level at each attribute (health state P) [11]. It is important to note the lowest anchor states were chosen depending on each participant's preferences. Therefore, for those respondents who declared that being dead was preferable to health state W, being dead was measured on a scale ranging from 0-health state W to 1- the most desirable/perfect health scale. In contrast, for those respondents who valued being dead as worse than health state W, health state W was then valued on a scale ranging from 0-dead to 1-the most desirable/perfect health scale.

  • Time Trade-Off rating (TTO) for power function estimation and for evaluation of MAUF predictive validity. A total of 6 different states were assessed: 3 corner states and all the intermediate or marker states previously described. During the elicitation process, a "ping-pong" presentation was used to converge on an indifference point between the alternatives (Figure 1). All participants were requested to think about the differences between the health states compared in each exercise, always keeping in mind that all other important, broader factors would remain constant (family, job, friends, income, etc.) under all the presented scenarios.

Figure 1
figure 1

Presentation diagram of the time trade- off technique. Life P= The most desirable health state/Full health/The best health state imaginable. Life A= A given health stated derived from the abbreviated health state classification system.

MAUF estimation

A person-mean utility approach was used to estimate a general utility function based on community responses [10],[11],[42]. MAUF forms include the additive form, the multiplicative form and the multi-linear form [10]. A detailed introduction to the principles for MAUF estimation can be found in the literature [10],[11],[33]. Considering the reduced version of the I-QOL comprises 5 health domains, the multiplicative MAUF is reasonable and has empirical support [10],[33]. Since it is easier for respondents to imagine corner states (with one attribute at its worst level and the rest of attributes at their best level), this function is normally expressed in terms of disutility (u) which is just the complement of the utility:

u - = 1 c Π j = 1 5 1 + c c j u - j Π 1


1+c= Π j = 1 5 1 + c c j

cj is equal to the disutility of the corner state for attribute j and represents the weight attached to that attribute. If the sum of all c j equals 1 then the additive model holds. c is the interaction term and results from solving Equation 2. The methods applied in this research are similar to those followed to develop the HUI-3 [11]. A complete description of the statistical approach can be found in the Additional file 1. Briefly, after confirming the consistency of respondents' ratings, the sample was split in 2 groups according to the health state they considered less preferable (dead or the worst health state possible in the abbreviated health state classification system). Hence, two separate power functions [46] were calculated to convert VAS values (v) into utilities (u) and the adjusted overall person mean scores were calculated with utility values ranging from Dead = 0.00/P = 1.00 scale. Next, the relative weight of each parameter (cj in disutility terms or wj in utility terms) and its interaction form were studied for each group and for the overall sample and finally, the IUI algorithm was estimated.

Predictive validity of the IUI

The accuracy of the algorithm was analysed by comparing estimated and directly measured utilities (TTO) on intermediate states. A number of statistics were computed:

  • Sum of total differences (S differences): S differences = S (predicted u j – observed Person-Mean u j )

  • Mean of differences (MD): MD = [S (predicted u j – observed Person-Mean u j )/nj]

  • Mean of absolute differences (MAD): MAD = [S |(predicted u j – observed Person-Mean u j )|/nj]

  • Overall standard deviation (OSD) of differences: OSD = [(S (predicted u j – observed Person-Mean u j )2)/(nj-1)]

  • ICC between estimated and directly measured scores [47].

For these metrics, values as close to zero as possible are preferred, except for the ICC, interpreted as any correlation. The statistical packages WinSteps software version 3.72.3 and Stata10 along with the spreadsheet Excel 2007 (Microsoft) were used for the analyses presented in this manuscript [48],[49].


I-QOL reduction

Outputs from Rasch analysis are presented in Table 1. No age related DIF was identified. Items 1, 3, 4 and 10 in the I-QOL and 2 and 4 in the Neurogenic Module had etiology related DIF. Item 10 from the I-QOL and 2 and 4 from the Neurogenic Module had sex related DIF. These six items were removed from the selection based exclusively on the results of the Rasch analysis. Additionally, as previously stated, an expert panel then proceeded to consider the results of the analysis jointly with the item content, to reach the final selection of 5 items considered to represent a set of complementary attributes. The 5 response categories were collapsed into 3 to simplify health state valuation, yielding the abbreviated health state classification system (Table 2). This final version proved to be internally consistent and valid for NDO patients according to the psychometric analyses presented in Table 3: at week 6, the abbreviated health state classification proved to have adequate ability to detect changes in those patients who showed a reduction in incontinence episodes (responders), and the association between daily incontinence episodes and health state classification scores was considered adequate. Furthermore, the level of agreement between the original I-QOL and the abbreviated health classification system was high (ICC = 0.90; 95% CI: 0.89-0.91) and statistically significant (p < 0.001).

Table 1 Summary of Rasch outputs
Table 2 The abbreviated health state classification system derived from the I-QOL and Neurogenic Module
Table 3 Psychometric properties of the abbreviated health state classification system in neurogenic detrusor overactivity patients

Weighting the health states derived from the I-QOL

Complete descriptions of the multi-attribute health states and the sample are presented in Tables 4 and 5. A total of 442 interviews were completed, however, 44 cases were withdrawn because they presented at least one inconsistency in their ratings: if VAS values for a given corner state (health states A to E, Table 4) were lower than the VAS values of comparable marker states (M1 to M3, Table 4), n = 50 (please note some participants provided more than one inconsistent answer); or if the value of any corner or marker states were lower than the VAS value of the least desirable health state, n = 24. Only those participants successfully completing all the rating exercises were included, n = 398, generating a total of 2,388 TTO evaluations. With respect to interview quality, 97.7% were performed with full cooperation of the respondent, 84.7% of participants thought carefully before answering, and 84.9% experienced very little or no problems completing the survey. Moreover, mean time required to complete the survey was 30.2 minutes (Standard deviation -SD- 10.9) and the vast majority of interviewers rated the quality as good or very good (94.2%) with less than 1% of interviews being considered of inferior quality.

Table 4 Multi-attribute health states used in preferences elicitation
Table 5 Description of participants in the elicitation survey (valid cases, n= 398)

A majority of respondents were female (60.1%), mean age was 44.75 years (SD14.6); 60.8% had at least a diploma education (2 years of college) and a similar percentage were employed (59.8%). With respect to their health status, 31.7% reported a chronic illness and 8.8% an acute disease. Regarding previous experience, 36.4% declared they had suffered symptoms associated with OAB or UUI and 48.0% recognized some of these problems in their relatives or friends.

With respect to participants' preferences about the worst state described by the abbreviated health state classification system and dead, most of them (n = 294, 73.9%) stated they would prefer living the next 30 years in health state W (Group B), while the rest (n = 104, 26.1%) preferred being dead to living in health state W (Group A).

MAUF estimation and final algorithm of the Incontinence Utility Index

Trimmed values (10%) were fitted separately for each group based on power functions (Equation 3 in Additional file 1) and natural log transformations (Equation 4 in Additional file 1) to convert mean VAS (v) into utility scores (u). Regression models yielded good fit (R2 group A = 0.923 and R2 group B = 0.978) and power functions resulted as follows: Group A, u = 1-(1-v)1.229 and Group B, u = 1-(1-v)0.841. Estimates of the relative weight of each attribute fitted in the perfect health = 0 and worst state = 1 for Group A were: c1 = 0.393, c2 = 0.450, c3 = 0.387, c4 = 0.562 and c5 = 0.283 (Σcj = 2.076; c = −0.911). For Group B: c1 = 0.636, c2 = 0.640, c3 = 0.616, c4 = 0.775 and c5 = 0.490 (Σcj = 3.158; c = −0.994). From these results it was seen that the multiplicative form was an appropriate form.

Final utilities were calculated based on the prevalence proportion in Person-Mean A and Person-Mean B groups (both in W = 0.00/P = 1.00 scale): uj = (104* Person-Mean A uj + 294 * Person-Mean B uj -re-scaled-)/398. A positive linear transformation was applied to re-scale the utilities into a dead = 0.00 / P = 1.00 scale to facilitate comparisons with other utility measures. Table 6 shows utility weights estimated for the multi-attribute health states defined in Table 4. The disutility weights estimated for each attribute from the overall sample were: c1 = 0.470, c2 = 0.484, c3 = 0.456, c4 = 0.590 and c5 = 0.358 (Σcj = 2.357; c = −0.951). Once again the results rejected the linear additive form and showed that all attributes were preference complements. The five single attribute utility coefficients and the overall MAUF are presented in Table 7 with possible scores ranging from 0.036 (worst health state) to 1 (perfect health).

Table 6 Estimated overall utility scores
Table 7 Single and Multi-attribute utilities

Predictive validity of the MAUF

Mean utility scores of marker states directly elicited on the TTO were compared against those estimated by the MAUF to test its predictive validity. The results were as follows: Σ differences = −0.038; MD = −0.013; MAD = 0.038; OSD = 0.004 and ICC (95% CI) = 0.928 (0.648-0.985). Thus, the calculated MAUF showed a very slight tendency to underpredict directly elicited utilities. Moreover, the level of agreement found between both methods (ICC) was good and only 7.2% of variability could not be attributed to subjects.


In this study, a new utility index, the IUI, has been estimated from the abbreviated health state classification system derived from I-QOL and its neurogenic module by means of eliciting preferences from a representative sample of UK adult general population [50]. The abbreviated I-QOL version was internally consistent and able to capture clinically important differences in clinical status of NDO patients with UI (i.e. changes in HRQoL according to reductions in the average number of IU episodes per week). Furthermore, a high level of agreement was found between the reduced version and the original I-QOL, confirming the appropriateness of the abbreviated health state classification system of 5 domains and its modelling space for utility estimation. Moreover, all the psychometric procedures undertaken to reduce the I-QOL have been successfully applied previously [30],[36],[51] and have been recently recommended [34].

Regarding the elicitation process, methods applied are consistent with those used to develop one of the most widespread and robust generic utility measures, the HUI [10],[11]. As has occurred in previous publications, the additive model was rejected in this study [10],[11],[42] and attributes were preference complements: for instance, the perceived limitation associated with being depressed and not having bladder control is greater than the separate effect of being depressed and not having bladder control, but smaller than the sum of these two problems.

In addition, predictive validity of IUI scoring algorithm was confirmed after comparing the direct utility values and those estimated for the final algorithm. Recognizing that IUI algorithm showed a slight tendency to underpredict the directly elicited utilities, error size was small and comparable to those errors reported for other utility instruments [11]. What is more, the ICC between direct and indirect values showed an adequate level of agreement.

Generic preference-based indices have historically been the most commonly used means of estimating utilities across a variety of conditions. However, substantial research has been conducted which shows the limitations of these instruments in different conditions [19]-[22], as well as the lack of concordance between the utility values obtained from their application [14]-[16],[18],[52],[53]. As a result, the development of condition-specific preference-based measures has been gaining ground in recent years [30]-[32],[54].

There are published studies focused on obtaining utility scores from condition-specific instruments for urinary problems. An algorithm has been generated to derive utilities from the KHQ by eliciting preferences from a sample of UI patients [31]. Kay et al. (2013) [32] mapped EQ-5D utility scores from the I-QOL among patients with neurogenic and idiopathic OAB using cross-sectional data from Europe and the US. Finally, Yang et al. estimated a population's preference-based index from the OAB-q, the OAB-5D [30]. Consequently, although the IUI was derived from the I-QOL and its specific module for neurogenic patients, the OAB-5D is the most similar instrument because its modelling space was also obtained from applying Rasch, preference elicitation involved TTO evaluations, and also incorporates general population preferences. Nevertheless, relevant differences lay in the characteristics of the samples used in the reduction process (we specifically used NDO patients) and in the estimation models applied to derive the utility scores since OAB-5D followed the methods described previously for the SF-6D [12] and we computed a MAUF in accordance with the HUI latest versions [10],[11]. Despite these differences, mean absolute error/differences in both measures are comparable (OAB-5D: 0.044 versus IUI: 0.038). Hence, additional research is needed to compare performance of each respective measure in the same populations (i.e. criterion validity, responsiveness and influence on cost-effectiveness ratios).

Despite the fact that the MAUF has proven robust, there are a number of limitations in this research. It should be noted that we used TTO evaluations instead of the Standard Gamble (SG). Although SG is considered the preferred technique to collect subjects' preferences, TTO is a legitimate and extensively used technique, generally considered easier to understand and less time consuming [30],[55]. Preferences were elicited from a UK-specific population, so caution should be used before applying the algorithm to other countries, especially if the population is expected to perceive urinary problems differently. Additionally, as a condition-specific preference-based measure, the IUI may suffer from some potential risks in terms of comparability of results [34]. The risk of focusing effects (i.e. cognitive bias that occurs when participants place too much importance on the problems associated with the health states presented to them compared with other conditions) was obviated as best as possible by clearly stating throughout the preference elicitation process that, apart from the health states described by the reduced version of the I-QOL, other important aspects of life (i.e. family, economic situation, friends, job, etc.) would remain constant.

Another source of limitations referred to as anchoring (defining a specific upper anchor that could make comparability across other preference-based instruments problematic) was also anticipated. Consequently, the upper limits during the evaluation process were defined as the most desirable health state, the best health state imaginable, or full health to best facilitate comparisons with other scales. Finally, while the 30-year time horizon was set to illustrate the chronicity of health states, this time frame may not have been the most appropriate for participants under 30 years of age (18.3%) or, particularly for those older than 60 years (17.6%). Thus, this time horizon may result in some over/underestimations during TTO exercises with these subsamples [56],[57].


The I-QOL and the IUI are valid-in-population measures for measuring HRQoL and utilities, respectively, associated with urinary problems. Although the IUI is the first utility measure that has been developed for a specific subset of patients with urinary symptoms (NDO population), it is important to note that the final selection of attributes included in the IUI is from the original I-QOL, with no items utilized from the Neurogenic Module. Hence, investigators may test its applicability in other relevant subsamples. It is worth noting that the use of a representative sample of general population to value its health states may ease the application of this instrument in new subsets of patients suffering from urinary problems. New research is currently underway to confirm the soundness of the IUI modelling space on idiopathic OAB patients and to study the responsiveness and the minimally important differences of the IUI in both NDO and idiopathic OAB populations. These insights will be of value to future researchers using the IUI instrument which is intended to complement utility estimates provided by generic instruments to support decision-making with reliable, valid and understandable information presented on a similar scale.

Additional file





Classical test theory


Differential item functioning


EuroQol-5 dimension


Health-related quality of life


Health utility index


Intraclass correlation coefficient


Incontinence quality of life questionnaire


Incontinence utility index


King's health questionnaire


Mean of absolute differences


Multi-attribute utility function


Mean of differences


Multiple sclerosis


Neurogenic detrusor overactivity


Overactive bladder


Overactive bladder-5 dimensional


Overactive bladder questionnaire


Overall standard deviation of differences


Principal component analysis


Spinal cord injury


Short form-6 dimension


Time-trade off


Urinary incontinence


United Kingdom


Visual analogue scale


  1. Coyne KS, Sexton CC, Kopp ZS, Ebel-Bitoun C, Milsom I, Chapple C: The impact of overactive bladder on mental health, work productivity and health-related quality of life in the UK and Sweden: results from EpiLUTS. BJU Int 2011, 108: 1459–1471. 10.1111/j.1464-410X.2010.10013.x

    Article  PubMed  Google Scholar 

  2. Ganz ML, Smalarz AM, Krupski TL, Anger JT, Hu JC, Wittrup-Jensen KU, Pashos CL: Economic costs of overactive bladder in the United States. Urology 2010, 75: 526–532. 532 10.1016/j.urology.2009.06.096

    Article  PubMed  Google Scholar 

  3. Reeves P, Irwin D, Kelleher C, Milsom I, Kopp Z, Calvert N, Lloyd A: The current and future burden and cost of overactive bladder in five European countries. Eur Urol 2006, 50: 1050–1057. 10.1016/j.eururo.2006.04.018

    Article  PubMed  Google Scholar 

  4. Wu EQ, Birnbaum H, Marynchenko M, Mareva M, Williamson T, Mallett D: Employees with overactive bladder: work loss burden. J Occup Environ Med 2005, 47: 439–446. 10.1097/01.jom.0000161744.21780.c1

    Article  PubMed  Google Scholar 

  5. Abrams P, Artibani W, Cardozo L, Dmochowski R, van Kerrebroeck P, Sand P: Reviewing the ICS 2002 terminology report: the ongoing debate. Neurourol Urodyn 2009, 28: 287.

  6. Schurch B, Denys P, Kozma CM, Reese PR, Slaton T, Barron R: Reliability and validity of the Incontinence Quality of Life questionnaire in patients with neurogenic urinary incontinence. Arch Phys Med Rehabil 2007, 88: 646–652. 10.1016/j.apmr.2007.02.009

    Article  PubMed  Google Scholar 

  7. Tapia CI, Khalaf K, Berenson K, Globe D, Chancellor M, Carr LK: Health-related quality of life and economic impact of urinary incontinence due to detrusor overactivity associated with a neurologic condition: a systematic review. Health Qual Life Outcomes 2013, 11: 13. doi:10.1186/1477–7525–11–13. 13–11.

  8. EuroQol–a new facility for the measurement of health-related quality of life Health Policy 1990, 16: 199–208. 10.1016/0168-8510(90)90421-9

  9. Dolan P: Modeling valuations for EuroQol health states. Med Care 1997, 35: 1095–1108. 10.1097/00005650-199711000-00002

    Article  CAS  PubMed  Google Scholar 

  10. Torrance GW, Feeny DH, Furlong WJ, Barr RD, Zhang Y, Wang Q: Multiattribute utility function for a comprehensive health status classification system. Health Utilities Index Mark 2. Med Care 1996, 34: 702–722. 10.1097/00005650-199607000-00004

    Article  CAS  PubMed  Google Scholar 

  11. Feeny D, Furlong W, Torrance GW, Goldsmith CH, Zhu Z, DePauw S, Denton M, Boyle M: Multiattribute and single-attribute utility functions for the health utilities index mark 3 system. Med Care 2002, 40: 113–128. 10.1097/00005650-200202000-00006

    Article  PubMed  Google Scholar 

  12. Brazier J, Roberts J, Deverill M: The estimation of a preference-based measure of health from the SF-36. J Health Econ 2002, 21: 271–292. 10.1016/S0167-6296(01)00130-8

    Article  PubMed  Google Scholar 

  13. Brazier J, Usherwood T, Harper R, Thomas K: Deriving a preference-based single index from the UK SF-36 Health Survey. J Clin Epidemiol 1998, 51: 1115–1128. 10.1016/S0895-4356(98)00103-6

    Article  CAS  PubMed  Google Scholar 

  14. Franks P, Hanmer J, Fryback DG: Relative disutilities of 47 risk factors and conditions assessed with seven preference-based health status measures in a national U.S. sample: toward consistency in cost-effectiveness analyses. Med Care 2006, 44: 478–485. 10.1097/01.mlr.0000207464.61661.05

    Article  PubMed  Google Scholar 

  15. O'Brien BJ, Spath M, Blackhouse G, Severens JL, Dorian P, Brazier J: A view from the bridge: agreement between the SF-6D utility algorithm and the Health Utilities Index. Health Econ 2003, 12: 975–981. 10.1002/hec.789

    Article  PubMed  Google Scholar 

  16. Conner-Spady B, Suarez-Almazor ME: Variation in the estimation of quality-adjusted life-years by different preference-based instruments. Med Care 2003, 41: 791–801. 10.1097/00005650-200307000-00003

    Article  PubMed  Google Scholar 

  17. Pinto AM, Kuppermann M, Nakagawa S, Vittinghoff E, Wing RR, Kusek JW, Herman WH, Subak LL: Comparison and correlates of three preference-based health-related quality-of-life measures among overweight and obese women with urinary incontinence. Qual Life Res 2011, 20: 1655–1662. 10.1007/s11136-011-9896-5

    Article  PubMed Central  PubMed  Google Scholar 

  18. McDonough CM, Tosteson AN: Measuring preferences for cost-utility analysis: how choice of method may influence decision-making. Pharmacoeconomics 2007, 25: 93–106. 10.2165/00019053-200725020-00003

    Article  PubMed Central  PubMed  Google Scholar 

  19. Papaioannou D, Brazier J, Parry G: How valid and responsive are generic health status measures, such as EQ-5D and SF-36, in schizophrenia? a systematic review. Value Health 2011, 14: 907–920. 10.1016/j.jval.2011.04.006

    Article  PubMed Central  PubMed  Google Scholar 

  20. Haywood KL, Garratt AM, Lall R, Smith JF, Lamb SE: EuroQol EQ-5D and condition-specific measures of health outcome in women with urinary incontinence: reliability, validity and responsiveness. Qual Life Res 2008, 17: 475–483. 10.1007/s11136-008-9311-z

    Article  PubMed  Google Scholar 

  21. Oh SJ, Ku JH: Is a generic quality of life instrument helpful for evaluating women with urinary incontinence? Qual Life Res 2006, 15: 493–501. 10.1007/s11136-005-2487-6

    Article  PubMed  Google Scholar 

  22. Finkelstein MM, Skelly J, Kaczorowski J, Swanson G: Incontinence Quality of Life Instrument in a survey of primary care physicians. J Fam Pract 2002, 51: 952.

  23. Davis S, Wailoo A: A review of the psychometric performance of the EQ-5D in people with urinary incontinence. Health Qual Life Outcomes 2013, 11: 20.

  24. Coyne K, Revicki D, Hunt T, Corey R, Stewart W, Bentkover J, Kurth H, Abrams P: Psychometric validation of an overactive bladder symptom and health-related quality of life questionnaire: the OAB-q. Qual Life Res 2002, 11: 563–574. 10.1023/A:1016370925601

    Article  CAS  PubMed  Google Scholar 

  25. Kelleher CJ, Cardozo LD, Khullar V, Salvatore S: A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol 1997, 104: 1374–1379. 10.1111/j.1471-0528.1997.tb11006.x

    Article  CAS  PubMed  Google Scholar 

  26. Wagner TH, Patrick DL, Bavendam TG, Martin ML, Buesching DP: Quality of life of persons with urinary incontinence: development of a new measure. Urology 1996, 47: 67–71. 10.1016/S0090-4295(99)80384-7

    Article  CAS  PubMed  Google Scholar 

  27. Patrick DL, Martin ML, Bushnell DM, Yalcin I, Wagner TH, Buesching DP: Quality of life of women with urinary incontinence: further development of the incontinence quality of life instrument (I-QOL). Urology 1999, 53: 71–76. 10.1016/S0090-4295(98)00454-3

    Article  CAS  PubMed  Google Scholar 

  28. Patrick DL, Khalaf KM, Dmochowski R, Kowalski JW, Globe DR: Psychometric performance of the incontinence quality-of-life questionnaire among patients with overactive bladder and urinary incontinence. Clin Ther 2013, 35: 836–845. 10.1016/j.clinthera.2013.04.013

    Article  PubMed  Google Scholar 

  29. Reese PR, Pleil AM, Okano GJ, Kelleher CJ: Multinational study of reliability and validity of the King's Health Questionnaire in patients with overactive bladder. Qual Life Res 2003, 12: 427–442. 10.1023/A:1023422208910

    Article  PubMed  Google Scholar 

  30. Yang Y, Brazier J, Tsuchiya A, Coyne K: Estimating a preference-based single index from the Overactive Bladder Questionnaire. Value Health 2009, 12: 159–166. 10.1111/j.1524-4733.2008.00413.x

    Article  PubMed  Google Scholar 

  31. Brazier J, Czoski-Murray C, Roberts J, Brown M, Symonds T, Kelleher C: Estimation of a preference-based index from a condition-specific measure: the King's Health Questionnaire. Med Decis Making 2008, 28: 113–126. 10.1177/0272989X07301820

    Article  PubMed  Google Scholar 

  32. Kay S, Tolley K, Colayco D, Khalaf K, Anderson P, Globe D: Mapping EQ-5D Utility Scores from the Incontinence Quality of Life Questionnaire among Patients with Neurogenic and Idiopathic Overactive Bladder. Value Health 2013, 16: 394–402. 10.1016/j.jval.2012.12.005

    Article  PubMed  Google Scholar 

  33. Torrance GW, Furlong W, Feeny D, Boyle M: Multi-attribute preference functions. health utilities index. Pharmacoeconomics 1995, 7: 503–520. 10.2165/00019053-199507060-00005

    Article  CAS  PubMed  Google Scholar 

  34. Brazier J, Rowen D, Mavranezouli I, Tsuchiya A, Young T, Yang Y, Barkham M, Ibbotson R: Developing and testing methods for deriving preference-based measures of health from condition-specific measures (and other patient-based measures of outcome). Health Technol Assess 2012, 16: 1–114. 10.3310/hta16320

    Article  CAS  Google Scholar 

  35. Wright B, Masters G: Rating Scale Analysis. Rasch Measurement. MESA Press, Chicago; 1982.

    Google Scholar 

  36. Young T, Yang Y, Brazier JE, Tsuchiya A, Coyne K: The first stage of developing preference-based measures: constructing a health-state classification using Rasch analysis. Qual Life Res 2009, 18: 253–265. 10.1007/s11136-008-9428-0

    Article  PubMed  Google Scholar 

  37. Mulhern B, Smith SC, Rowen D, Brazier JE, Knapp M, Lamping DL, Loftus V, Young TA, Howard RJ, Banerjee S: Improving the measurement of QALYs in dementia: developing patient- and carer-reported health state classification systems using Rasch analysis. Value Health 2012, 15: 323–333. 10.1016/j.jval.2011.09.006

    Article  PubMed  Google Scholar 

  38. Linacre JM: Detecting multidimensionality: which residual data-type works best? J Outcome Meas 1998, 2: 266–283.

    CAS  PubMed  Google Scholar 

  39. Cruz F, Herschorn S, Aliotta P, Brin M, Thompson C, Lam W, Daniell G, Heesakkers J, Haag-Molkenteller C: Efficacy and safety of onabotulinumtoxinA in patients with urinary incontinence due to neurogenic detrusor overactivity: a randomised, double-blind, placebo-controlled trial. Eur Urol 2011, 60: 742–750. 10.1016/j.eururo.2011.07.002

    Article  CAS  PubMed  Google Scholar 

  40. Ginsberg D, Gousse A, Keppenne V, Sievert KD, Thompson C, Lam W, Brin MF, Jenkins B, Haag-Molkenteller C: Phase 3 efficacy and tolerability study of onabotulinumtoxinA for urinary incontinence from neurogenic detrusor overactivity. J Urol 2012, 187: 2131–2139. 10.1016/j.juro.2012.01.125

    Article  CAS  PubMed  Google Scholar 

  41. Torrance G: Social preferences for health states: an empirical evaluation of three measurement techniques. Socioecon Plann Sci 1976, 10: 128–136. 10.1016/0038-0121(76)90036-7

    Article  Google Scholar 

  42. Montejo AL, Correas-Lauffer J, Maurino J, Villa G, Rebollo P, Diez T, Cordero L: Estimation of a multiattribute utility function for the Spanish version of the TooL questionnaire. Value Health 2011, 14: 564–570. 10.1016/j.jval.2010.11.016

    Article  PubMed  Google Scholar 

  43. The Market Research Society M: MRS Code of Conduct. 2010. ., []

  44. European Pharmaceutical Market Research Association E: EphMRA Code of Conduct. 2014. ., []

  45. The Association for Qualitative Research A: Qualitative research recruitment guidelines. 2002. ., []

  46. Stiggelbout AM, Eijkemans MJ, Kiebert GM, Kievit J, Leer JW, De Haes HJ: The `utility' of the visual analog scale in medical decision making and technology assessment. is it an alternative to the time trade-off? Int J Technol Assess Health Care 1996, 12: 291–298. 10.1017/S0266462300009648

    Article  CAS  PubMed  Google Scholar 

  47. Deyo RA, Diehr P, Patrick DL: Reproducibility and responsiveness of health status measures. statistics and strategies for evaluation. Control Clin Trials 1991, 12: 142S-158S. 10.1016/S0197-2456(05)80019-4

    Article  CAS  PubMed  Google Scholar 

  48. Linacre J: Winsteps® Rasch measurement computer program., Beaverton, Oregon; 2012.

    Google Scholar 

  49. Stata Statistical Software: Release 10. StataCorp LP, College Station, Tx; 2007.

  50. Guide to the methods of technology appraisal. 2013. Process and methods guides. NICE, London; 2013.

  51. Young TA, Yang Y, Brazier JE, Tsuchiya A: The use of rasch analysis in reducing a large condition-specific instrument for preference valuation: the case of moving from AQLQ to AQL-5D. Med Decis Making 2011, 31: 195–210. 10.1177/0272989X10364846

    Article  PubMed  Google Scholar 

  52. Hatoum HT, Brazier JE, Akhras KS: Comparison of the HUI3 with the SF-36 preference based SF-6D in a clinical trial setting. Value Health 2004, 7: 602–609. 10.1111/j.1524-4733.2004.75011.x

    Article  PubMed  Google Scholar 

  53. Mulhern B, Meadows K: The construct validity and responsiveness of the EQ-5D, SF-6D and diabetes health profile-18 in type 2 diabetes. Health Qual Life Outcomes 2014, 12: 42.

  54. Brazier JE, Kolotkin RL, Crosby RD, Williams GR: Estimating a preference-based single index for the Impact of Weight on Quality of Life-Lite (IWQOL-Lite) instrument from the SF-6D. Value Health 2004, 7: 490–498. 10.1111/j.1524-4733.2004.74012.x

    Article  PubMed  Google Scholar 

  55. Bansback N, Tsuchiya A, Brazier J, Anis A: Canadian valuation of EQ-5D health states: preliminary value set and considerations for future valuation studies. PLoS One 2012, 7: e31115.

  56. van Nooten FE, Koolman X, Brouwer WB: The influence of subjective life expectancy on health state valuations using a 10 year TTO. Health Econ 2009, 18: 549–558. 10.1002/hec.1385

    Article  CAS  PubMed  Google Scholar 

  57. Dolan P, Roberts J: To what extent can we explain time trade-off values from other information about respondents? Soc Sci Med 2002, 54: 919–929. 10.1016/S0277-9536(01)00066-1

    Article  PubMed  Google Scholar 

Download references


Authors are indebted to Professor David Feeny for reviewing this research and providing valued contributions to this manuscript.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Jesús Cuervo.

Additional information

Competing interests

Financial support for this study was provided entirely by a contract with Allergan, Inc. The funding agreement ensured the authors' independence in designing the study, interpreting the data, writing, and publishing the report. The following authors were employed by the sponsor during the writing of this manuscript: KK, CW and DG. At the time of submission, CW and DG were not employed at Allergan, Inc. Furthermore, according to the purposes and scope of this work, authors hereby declare no other financial conflict of interest.

Authors' contributions

JC, NC designed the two phases of the study, coordinated the study, conducted and reviewed the statistical analysis and drafted the manuscript. DLP supervised the design of the study, critically reviewed the content of the reduced version of the I-QOL and the statistical reports and the manuscript. KK, CW and DG, participated in the design and coordination of this research, supervised the statistical analyses in both phases and collaborated drafting and critically reviewing the manuscript. All authors revised and approved this final version of the manuscript.

Electronic supplementary material


Additional file 1: Estimating a multi-attribute utility function (MAUF) for the abbreviated health state classification resulted from the I-QOL and its neurogenic module. (DOCX 27 KB)

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Cuervo, J., Castejón, N., Khalaf, K.M. et al. Development of the Incontinence Utility Index: estimating population-based utilities associated with urinary problems from the Incontinence Quality of Life Questionnaire and Neurogenic Module. Health Qual Life Outcomes 12, 147 (2014).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: