Minimal clinically important differences for the EQ-5D and QWB-SA in Post-traumatic Stress Disorder (PTSD): results from a Doubly Randomized Preference Trial (DRPT)
© Le et al.; licensee BioMed Central Ltd. 2013
Received: 13 November 2012
Accepted: 26 March 2013
Published: 12 April 2013
To determine the minimal clinically important difference (MCID) for the health-utility measures EuroQol-5 dimensions (EQ-5D) and Quality of Well Being Self-Administered (QWB-SA) Scale in PTSD patients.
Research design and methods
Two hundred patients aged 18 to 65 years with PTSD enrolled in a doubly randomized preference trial (DRPT) examining the treatment and treatment-preference effects of cognitive behavioral therapy and pharmacotherapy with sertraline, and completed the EQ-5D and QWB-SA at baseline and 10 weeks post-treatment. The anchor-based methods utilized the Clinical Global Impression-Improvement (CGI-I) and Clinical Global Impression-Severity (CGI-S) scales. We regressed the changes in EQ-5D and QWB-SA scores on changes in the anchors using ordinary least squares regression. The slopes (beta coefficients), i.e. the changes in EQ-5D and QWB-SA scores per unit change in the anchors, represent our estimates of the MCID. In addition, we performed receiver operating characteristic (ROC) curve analysis to examine the relationship between the changes in EQ-5D and QWB-SA scores and treatment-response status. The MCIDs were estimated from the ROC curve at the cut-off points that best discriminated between treatment responders and non-responders. The distribution-based methods used small to moderate effect sizes, defined as 0.2 and 0.5 of the standard deviation of the pre-treatment EQ-5D and QWB-SA scores.
The anchor-based methods estimated MCID ranges of 0.05 to 0.08 for the EQ-5D and 0.03 to 0.05 for the QWB-SA. The MCID ranges were wider with the distribution-based methods, ranging from 0.04 to 0.10 for the EQ-5D and 0.02 to 0.05 for the QWB-SA.
The established MCID ranges of the EQ-5D and QWB-SA can be a useful tool for researchers and clinicians in assessing meaningful changes in patients’ quality of life, and can assist health-policy makers in making informed decisions about mental health treatment.
Clinical trial registration
Clinicaltrials.gov; Identifier: NCT00127673.
Keywords: EQ-5D; QWB-SA; Minimal clinically important difference; PTSD; Doubly randomized preference trial; Prolonged exposure therapy; Sertraline
Posttraumatic stress disorder (PTSD) is a chronic and debilitating condition, with lifetime prevalence rates ranging from 8%–14% of the US population. Moreover, given recent estimates of PTSD among Operation Iraqi Freedom (OIF) and Operation Enduring Freedom (OEF) veterans, there is an unprecedented need for empirically-supported PTSD treatment for military personnel and veterans. PTSD is associated with poor quality of life in multiple health domains [3–5] and also imposes a substantial financial burden. Greenberg and colleagues (1999) reported that through work impairment, hospitalization, and health visits, PTSD was more costly than any other anxiety disorder. Among the 1.64 million veterans returning from OEF and OIF, it is estimated that approximately 300,000 individuals currently suffer from PTSD or major depression, potentially costing $4.0 to $6.2 billion in a two-year time frame. These considerations highlight the substantial impact of PTSD and the need for reliable and valid measures of improved clinical outcomes.
Clinically, the PTSD Symptom Scale-Interview (PSS-I), PTSD Checklist (PCL), and Clinician-Administered PTSD Scale (CAPS) have been the most commonly used measures for assessing symptomatic improvement/deterioration in clinical trials. In addition, patient-reported outcome (PRO) instruments have been increasingly utilized to supplement the clinical measures and provide additional information on other health-related quality of life (HRQOL) domains (mobility, self-care, usual activities, pain/discomfort, social, emotional, and physical functions) as well as health utilities [5, 11–14]. For example, to justify the cost of a new intervention in PTSD, health-policy makers would need to determine not only whether the new intervention provides significant clinical improvement but also whether it is cost-effective compared with the current standard treatment. Incorporating generic health-utility measures such as the EQ-5D, QWB-SA, Health Utility Index Mark 3 (HUI3), or Short Form-6 dimensions (SF-6D) allows comparisons of burden of disease across health conditions as well as calculation of quality-adjusted life years (QALYs), a HRQOL measure used to evaluate the cost-effectiveness of healthcare interventions. Nevertheless, interpretation of a change in HRQOL score from pre- to post-treatment can be confusing to clinicians and other health professionals due to their unfamiliarity with the PRO instruments. In contrast, repeated experience and familiarity with clinical measures such as Body Mass Index (BMI) or blood pressure allow health professionals to make meaningful interpretations of those measures. Thus, relating the magnitude of change in HRQOL score to a minimal clinically important difference would be helpful and meaningful for health professionals, patients, health-policy makers, and other stakeholders.
In general, the minimal clinically important difference (MCID) of a PRO instrument is defined as the smallest change in a PRO measure that is linked to a clinically relevant difference or change. In other words, the MCID is the smallest difference that patients perceive as beneficial or harmful and that would result in a change in treatment. There are two broad methods for estimating the MCID of a PRO instrument: (1) anchor-based methods, which link changes in HRQOL scores to external indicators, either clinical or patient-based, such as laboratory or physiological measures and clinician or patient ratings; and (2) distribution-based methods, which estimate MCIDs using small to medium effect sizes based on the distribution of HRQOL scores in a relevant sample. Nevertheless, since no single anchor is ideal and no single method is perfect, it is recommended that multiple approaches from both anchor- and distribution-based methods be used to estimate the MCID for a PRO instrument.
Empirical work on MCIDs for the EQ-5D or QWB-SA has been done on several conditions [15, 18–23]; however, it has not been performed in mental health disorders, particularly in PTSD. In the current study, we estimated and compared the smallest changes in HRQOL utility scores of the EQ-5D and QWB-SA that can be regarded as clinically important in PTSD patients using multiple anchor- and distribution-based approaches.
Patients and methods
HRQOL health-utility instruments
The EQ-5D is a five-item self-administered questionnaire and one of the most widely used generic preference-based measures for estimating health utilities. The measure has 5 health domains (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each with three response levels: no problems, some problems, and severe problems. The scoring system of the EQ-5D used in this study was based on the U.S. population-based EQ-5D, ranging from -0.11 (extreme problems reported on all five EQ-5D health domains) to 1 or perfect health (no problems on any of the five EQ-5D domains), in which zero means dead and negative utility scores represent health states worse than dead.
The QWB-SA is also a common generic preference-based HRQOL instrument measuring health utilities. Overall, the QWB-SA includes five parts: (1) Part I asks about 19 chronic symptoms or problems (yes/no question format), 25 acute physical symptoms and 11 mental health symptoms over the last 3 days (in the format of whether the symptom occurred “yesterday,” “2 days ago,” and/or “3 days ago”); (2) Part II uses a similar format but asks about self-care; (3) Part III asks about mobility; (4) Part IV asks about physical functioning; and (5) Part V asks about social activities. In all, the domain scores are combined into a single index score ranging from 0.09 (lowest possible health state) to 1 for perfect health, with zero meaning dead.
The Clinical Global Impression (CGI) is a brief clinician-rated instrument assessing: (1) severity of illness (CGI-S) using a 7-point Likert scale: 1 or “normal, not mentally ill,” 2 or “borderline mentally ill,” 3 or “mildly mentally ill,” 4 or “moderately mentally ill,” 5 or “markedly mentally ill,” 6 or “severely mentally ill,” and 7 or “among the most extremely mentally ill;” and (2) global improvement or change (CGI-I) also using a 7-point Likert scale: 1 or “very much improved,” 2 or “much improved,” 3 or “minimally improved,” 4 or “no change,” 5 or “slightly worse,” 6 or “much worse,” and 7 or “very much worse.”
In addition to CGI-S and CGI-I, we also selected the PTSD Symptom Scale-Interview (PSS-I) as our third anchor. Classification of patients as treatment responders or non-responders at 10 weeks post-treatment was based on the PSS-I and CGI-I. The 17-item PSS-I uses DSM-IV symptom criteria, and each symptom is rated on a 0 (not at all) to 3 (5 or more times per week/very much) scale of frequency and/or severity. Absolute cutoff scores of 23 or less on the PSS-I and 3 or lower on the CGI-I defined clinically meaningful improvement [28–30].
To be included in this analysis, a patient had to have a baseline (pre-treatment) and a follow-up assessment of the EQ-5D, QWB-SA, CGI-S, CGI-I, and PSS-I. For patients who had multiple follow-up visits, the current analysis included the first follow-up assessment at which all measures were completed. All analyses in the study were conducted using Stata release 12.0 (Stata Corp LP, College Station, TX, USA).
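The responder classification described above can be sketched in a few lines. This is only an illustration, not the trial's actual code: the function name `is_responder` and the example records are hypothetical, and it assumes that both cutoffs must be met jointly, which the text does not state explicitly.

```python
# Cutoffs quoted in the text; the joint ("and") rule is an assumption.
PSS_I_CUTOFF = 23  # PSS-I total score of 23 or less
CGI_I_CUTOFF = 3   # CGI-I rating of 3 ("minimally improved") or lower


def is_responder(pss_i_total, cgi_i):
    """Return True when both follow-up cutoffs are met."""
    return pss_i_total <= PSS_I_CUTOFF and cgi_i <= CGI_I_CUTOFF


# Hypothetical follow-up records: (PSS-I total, CGI-I rating)
patients = [(12, 2), (23, 3), (24, 2), (18, 4)]
status = [is_responder(pss, cgi) for pss, cgi in patients]
print(status)
```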
Correlation coefficients between changes in EQ-5D scores and changes in anchor measures were determined to confirm the usefulness of the anchors. A correlation coefficient of 0.30 or more is needed for a measure to be considered a good anchor.
For the CGI-I anchor, we grouped “very much improved” with “very much worse,” “much improved” with “much worse,” and “minimally improved” with “slightly worse;” for those on the worsening side of the scale, the sign of the change in HRQOL health-utility scores was reversed, i.e. negative to positive and vice versa. We regressed the changes in the EQ-5D and QWB-SA scores on the transformed CGI-I using the ordinary least squares method. The slopes (beta coefficients of the regression lines), i.e. the changes in EQ-5D and QWB-SA scores per unit change in the CGI-I anchor, represented the estimates of the MCID. This method prevented the few worsening responses from adversely affecting the slope of the regression line; thus the estimated MCIDs were more stable and applicable to the entire range of EQ-5D and QWB-SA scores, as opposed to separating the scores into worsening and improvement. For the CGI-S anchor, to estimate the MCIDs, we simply regressed the changes in the EQ-5D and QWB-SA scores on the change in CGI-S between the pre-treatment and follow-up visits.
In our second anchor-based approach, we analyzed the relationship between the changes in EQ-5D and QWB-SA scores and the treatment response status using receiver operating characteristic (ROC) curve analysis to estimate the MCIDs. The ROC curves were constructed by plotting the sensitivity (true-positive rate) against one minus the specificity (false-positive rate) at different cut-off points in the continuous HRQOL score changes that distinguished treatment responders from non-responders. The area under the ROC curve (AUC) can be interpreted as the probability of correctly discriminating between treatment responders and non-responders [32–35]. The AUC ranges from 0.5 (corresponding to no discriminatory ability, i.e. random responding as with a coin flip to determine treatment-response status) to 1.0 (corresponding to perfect discriminatory ability, i.e. perfect prediction). Using ROC curve analysis, the MCIDs were determined based on the optimal cut-off points for the changes in HRQOL scores that maximized sensitivity and specificity, i.e. the points that best discriminated between patients who were treatment responders and those who did not respond to treatment [34, 35].
For the distribution-based approach, the MCIDs can be estimated as one half of the standard deviation of the pre-treatment EQ-5D and QWB-SA scores. One half of the standard deviation of a HRQOL measure at baseline (a moderate effect size) has been linked to the MCID and used widely in the literature. Alternatively, a small effect size, defined as 0.2 of the standard deviation of the pre-treatment EQ-5D and QWB-SA scores, was also utilized.
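As a rough illustration of this anchor check, the sketch below computes a Pearson correlation between change scores and compares its magnitude with the 0.30 threshold. The helper names and the data are hypothetical (not trial data), and taking the absolute value is an assumption: a falling CGI-S score paired with a rising utility score yields a negative coefficient.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def is_useful_anchor(r, threshold=0.30):
    """Assumption: the threshold applies to the magnitude of r."""
    return abs(r) >= threshold


# Synthetic change scores (follow-up minus baseline), for illustration only
delta_cgi_s = [-3, -2, -2, -1, 0, 1]
delta_eq5d = [0.20, 0.10, 0.15, 0.05, 0.00, -0.05]

r = pearson_r(delta_cgi_s, delta_eq5d)
print(round(r, 3), is_useful_anchor(r))
```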
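A minimal sketch of the CGI-I folding and the regression step follows. It makes several assumptions that the text does not spell out: the folded anchor is coded 0 ("no change") to 3 ("very much" improved or worse), the regression includes an intercept, and the data are synthetic values constructed to lie on a line with slope 0.05. The function names are hypothetical.

```python
from statistics import mean


def fold_cgi_i(cgi_i, delta_utility):
    """Fold the 7-point CGI-I around 4 ("no change").

    Returns (magnitude, signed_delta): magnitude is 0..3, and the sign of
    the utility change is reversed for ratings on the worsening side (5..7).
    """
    magnitude = abs(4 - cgi_i)
    signed_delta = -delta_utility if cgi_i > 4 else delta_utility
    return magnitude, signed_delta


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Synthetic records: (CGI-I rating, change in utility score)
records = [(1, 0.15), (2, 0.10), (3, 0.05), (4, 0.00), (5, -0.05), (6, -0.10)]
folded = [fold_cgi_i(cgi, delta) for cgi, delta in records]
anchor = [m for m, _ in folded]
change = [d for _, d in folded]

# Utility change per one-category step on the folded CGI-I
mcid_estimate = ols_slope(anchor, change)
print(round(mcid_estimate, 3))
```

With this coding, the slope reads as the change in utility score associated with a one-category step on the folded CGI-I.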
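The cut-off selection can be sketched as below. The text says only that the optimal cut-off maximized sensitivity and specificity; this sketch assumes Youden's index (sensitivity + specificity - 1) as one common way to operationalize that, and it computes the AUC as the proportion of responder/non-responder pairs ranked correctly. Function names and data are hypothetical.

```python
def youden_cutoff(changes, responder):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    A patient is predicted to be a responder when change >= cutoff.
    """
    pos = [c for c, r in zip(changes, responder) if r]
    neg = [c for c, r in zip(changes, responder) if not r]
    best = None
    for cut in sorted(set(changes)):
        sens = sum(c >= cut for c in pos) / len(pos)
        spec = sum(c < cut for c in neg) / len(neg)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cut, j, sens, spec)
    return best


def auc(changes, responder):
    """Share of (responder, non-responder) pairs in which the responder
    has the larger change; ties count as one half."""
    pos = [c for c, r in zip(changes, responder) if r]
    neg = [c for c, r in zip(changes, responder) if not r]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic utility changes and responder flags, for illustration only
changes = [0.04, 0.06, 0.07, 0.08, 0.10, 0.00, 0.01, 0.02, 0.05]
responder = [True, True, True, True, True, False, False, False, False]

cutoff, j, sens, spec = youden_cutoff(changes, responder)
area = auc(changes, responder)
print(cutoff, round(sens, 2), round(spec, 2), round(area, 2))
```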
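The distribution-based estimates reduce to simple arithmetic on the baseline standard deviation, as the sketch below shows. The baseline scores are synthetic, the function name is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import stdev


def distribution_mcid(baseline_scores):
    """Return (small, moderate) MCID estimates: 0.2 and 0.5 of the baseline SD."""
    sd = stdev(baseline_scores)  # sample standard deviation (n - 1 denominator)
    return 0.2 * sd, 0.5 * sd


# Synthetic pre-treatment utility scores, for illustration only
baseline = [0.60, 0.70, 0.80, 0.90, 1.00]
small, moderate = distribution_mcid(baseline)
print(round(small, 4), round(moderate, 4))
```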
Table: Baseline demographic and clinical characteristics. Rows: Number of patients (%); Age in years, mean (SD); Education (college educated) (%); PTSD severity (PSS-I), mean (SD); Re-experiencing, mean (SD); Avoidance, mean (SD); Hyperarousal, mean (SD); CGI-S, mean (SD); EQ-5D, mean (SD); QWB-SA, mean (SD).
Table: Correlation coefficients between the HRQOL health-utility measures (EQ-5D and QWB-SA) and the clinical anchors (CGI-I, CGI-S, PSS-I, and treatment response status). Headers: HRQOL health-utility instrument; Treatment response status.
Table: Changes in HRQOL health-utility scores and clinical anchors between pre-treatment and follow-up.
Table: Estimated minimal clinically important differences (MCIDs) and their 95% confidence intervals (CIs) for the EQ-5D and QWB-SA using both anchor- and distribution-based approaches. Header: Treatment response status†.
Understanding changes in scores and how to interpret them is critical in the field of HRQOL measurement. Because there is no single gold-standard method for estimating the MCID, it is recommended that multiple methods from both anchor- and distribution-based approaches be used and triangulated to establish a plausible range for the MCID. Using data from a doubly randomized preference trial in post-traumatic stress disorder patients (the OPT trial), our analysis suggests that the plausible ranges of MCID values for the EQ-5D and QWB-SA in the population of PTSD patients were 0.04 to 0.10 and 0.02 to 0.05, respectively. Empirical work on MCIDs for the EQ-5D or QWB-SA has been done in several disease states, with estimates ranging from -0.01 to 0.14 [15, 18–21]. However, those MCID estimates for the EQ-5D were based on the U.K. scoring algorithm or the EQ-5D VAS instead of the U.S. scoring method used in the current study. Two studies using the U.S. population-based scoring model reported a similar range of MCID values, between 0.07 and 0.09, for the EQ-5D utility [18, 20]. For the QWB-SA, our range of MCID values was consistent with previous studies [22, 23].
The clinical anchors (CGI-I, CGI-S, and PSS-I for classifying treatment response status) used in our analysis were most appropriate as they were highly clinically relevant and strongly correlated with the EQ-5D and QWB-SA health-utility scores. In addition, the anchor-based approach utilized well-established methods (OLS regression and ROC curve analysis) to estimate the MCIDs and produced rather similar results even with different anchors. The AUCs resulting from the ROC analysis were rather large for both the EQ-5D and QWB-SA, indicating that the HRQOL health-utility measures had great ability to discriminate correctly between treatment responders and non-responders. Although multiple methods are necessary to estimate a range of MCID values, Revicki and colleagues (2008) further recommended that results from the anchor-based approach be given the most weight due to their clinical advantages over the distribution-based approach. That is, it is more likely that the ranges of MCID values in the population of PTSD patients would be between 0.05 and 0.08 for the EQ-5D and between 0.03 and 0.05 for the QWB-SA.
Both the EQ-5D and QWB-SA are assumed to measure the same underlying construct of overall HRQOL in terms of health utility. The primary use of HRQOL health-utility measures is to calculate the quality-adjusted life year (QALY), a function of both quantity and quality of life, which is used in health economic evaluations and decision models to help health-policy makers allocate resources effectively. Therefore, it is important to establish their MCIDs and then compare them between the EQ-5D and QWB-SA. Our results showed that the plausible range of MCID values for the EQ-5D was almost twice that of the range for the QWB-SA. This is most likely because the two HRQOL health-utility instruments: (1) use different health-state descriptive systems and thus have different numbers of possible health states (243 possible health states for the EQ-5D versus 945 for the QWB-SA), (2) assess preferences for the multiple health states using different methods, i.e. the time trade-off method for the EQ-5D and a rating scale for the QWB-SA, and (3) use different scoring functions.
There were, however, some limitations in the current analysis. First, we did not apply multiple imputation methods for the missing data. Instead, we assumed that any missing assessments of the clinical anchors and HRQOL health-utility measures were missing completely at random (MCAR), meaning that our results would be similar whether or not there were missing data. Second, as there were very few worsening cases, the anchor-based methods focused mainly on the responses of those who were clinically improved rather than those who worsened. Future work to assess the MCIDs for those who are clinically worsened is needed. Nevertheless, more often than not the MCID is used in the context of a treatment effect; thus the MCID results in our study can still be applied to detect minimal clinically important improvement in score changes. Finally, in our study, CGI-I questions were given to patients at 10 weeks post-treatment. The main limitation of using an anchor-based approach that relies on global ratings is that these retrospective ratings are potentially susceptible to recall bias. As discussed above, it is important to estimate a range of MCID values from several different methods rather than a point estimate.
Our analysis to determine the plausible ranges of MCID values for the EQ-5D and QWB-SA followed the recommendations of Revicki and colleagues (2008): longitudinal data were obtained from the clinical trial; multiple anchors were used that were highly clinically relevant and strongly correlated with the HRQOL instruments; methodologically sound methods (OLS regression and ROC curve analysis) were applied in the anchor-based approach; and multiple methods from both anchor- and distribution-based approaches were triangulated to produce plausible ranges of MCID values.
To our knowledge, this analysis is the first attempt to use multiple anchor-based approaches as well as a distribution-based approach to determine and compare the MCID ranges for the EQ-5D and QWB-SA in the population of PTSD patients. The information can be helpful in interpreting EQ-5D and QWB-SA scores as well as in planning new trials when estimating power and sample sizes. Furthermore, the established MCID ranges of the EQ-5D and QWB-SA can be a useful tool for researchers and clinicians in assessing meaningful changes in patients’ quality of life, and can assist health-policy makers in making informed decisions about mental health treatment.
Written informed consent was obtained from the patient for publication of this report and any accompanying images.
This research was made possible by grants R01MH066347 and R01MH066348 from the National Institute of Mental Health (“Effectiveness of PTSD Treatment: Prolonged Exposure Therapy vs. Zoloft”).
Primary findings of this study were presented in part at the annual meeting of the Society for Medical Decision Making, Phoenix, Arizona, October 19, 2012.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.