Subjects and study design
This study is part of a multicenter randomized controlled trial comparing cognitive behavioural therapy and short-term psychodynamic therapy for SP (ISRCTN53517394). The trial is part of the Social Phobia Psychotherapy Research Network (SOPHO-NET) [15]. The design and results of the trial have been reported elsewhere [16].
Patients were recruited in five outpatient university clinics across Germany (Bochum, Dresden, Göttingen, Jena, and Mainz), from April 2007 until April 2009. The patient sample can be considered as clinically representative [16]. Inclusion criteria were: (I) diagnosis of SP according to the Structured Clinical Interview for DSM-IV [17] and a Liebowitz Social Anxiety Scale (LSAS) score higher than 30 points [18]; (II) age between 18 and 70; (III) SP being the primary diagnosis based on the severity disorder classification of the Anxiety Disorders Interview Schedule [19]. Exclusion criteria were: (I) psychotic and acute substance-related disorders; cluster A and B personality disorders; prominent risk of self-harm; (II) organic mental disorders; (III) severe medical conditions; (IV) concurrent psychotherapeutic or psychopharmacological treatment [16].
A total of 495 patients were randomized to one of the therapy groups (n = 416) or a waiting list group (n = 79). After treatment had been completed in the therapy groups, waiting list patients were also randomized to one of the therapy groups and treated. Data were collected pre-treatment (T0, n = 495) and post-treatment (T1, n = 364), as well as 6 months (T2, n = 321), 12 months (T3, n = 262), and 24 months (T4, n = 183) after completion of treatment (T1). The time interval between T0 and T1 was at least 6 months but varied due to delays in administrative procedures, vacations, or illness of patients or therapists.
Due to missing data in the EQ-5D questionnaires, we used a subsample of n = 445 (T0), n = 329 (T1), n = 288 (T2), n = 244 (T3) and n = 166 (T4) for our analyses.
Measures
EQ-5D
The EQ-5D contains three concepts of expressing HRQOL [20]: (I) The patient-reported “EQ-5D descriptive system” has five items, so-called “dimensions” (“mobility”, “self-care”, “usual activities”, “pain/discomfort”, “anxiety/depression”). Each of them is recorded with an ordinal three-level code (1: “no problems”, 2: “moderate problems”, 3: “severe problems”), resulting in 243 (3⁵) possible health states. These can be expressed as 5-digit codes (e.g. “11233” refers to no problems in “mobility” and “self-care”, moderate problems in “usual activities”, and severe problems in “pain/discomfort” and “anxiety/depression”).
(II) The 5-digit code can be transformed into a utility weight, the so-called EQ-5D index. The EQ-5D index is based on a valuation of health states by the general population and thus indicates preferences from the general population’s perspective. The maximum utility weight of the EQ-5D index is 1 (full health). Death is valued with 0. The worst possible health state (“33333”) is valued at -0.21 for the German EQ-5D index [21] and at -0.59 for the British EQ-5D index [22], indicating health states valued worse than death. Both EQ-5D index scores were derived by regression analysis, which leads to different valuations of the same health state. In our study we labelled the German EQ-5D index score “EQ-5D index-G” and the British EQ-5D index score “EQ-5D index-UK”. Although we analysed a German patient sample, we used both EQ-5D indexes (being aware of the limited comparability between both populations). The EQ-5D index-G was based on a small population sample (n = 334 for the German vs. n = 2997 for the British index) and must thus be considered less precise for statistical reasons. The German EQ-5D index scores for all 243 EQ-5D health states were estimated from a sample of 36 health states using a regression model. In contrast to the British EQ-5D index, the German EQ-5D index is insensitive to a change from level 1 (“no problems”) to level 2 (“moderate problems”) in the dimension “anxiety/depression” due to omitted regression coefficients. Therefore, the EQ-5D index-G scores must be considered preliminary.
(III) Patients are asked to rate their current health state on a visual analogue scale (EQ VAS) ranging from 0 (worst imaginable health state) to 100 (best imaginable health state). The EQ VAS represents the value of HRQOL from the patients’ perspective.
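As an illustration of the descriptive system described under (I), the following minimal Python sketch decodes a 5-digit health state and enumerates all possible states. The names `DIMENSIONS`, `LEVELS`, `decode_state` and `ALL_STATES` are hypothetical and were chosen for this example only; the sketch does not compute any EQ-5D index, since the value sets [21, 22] are not reproduced here.

```python
from itertools import product

# Dimension order and level labels as given in the description of the
# EQ-5D descriptive system; the identifiers themselves are illustrative.
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = {1: "no problems", 2: "moderate problems", 3: "severe problems"}


def decode_state(code: str) -> dict:
    """Map a 5-digit code such as '11233' to {dimension: level label}."""
    if len(code) != 5 or any(ch not in "123" for ch in code):
        raise ValueError(f"not a valid EQ-5D health state: {code!r}")
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, code)}


# All combinations of three levels over five dimensions (3**5 states).
ALL_STATES = ["".join(levels) for levels in product("123", repeat=5)]
```

Calling `decode_state("11233")` reproduces the example given in the text, and `len(ALL_STATES)` gives the number of possible health states.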
Liebowitz social anxiety scale (LSAS)
The LSAS is a 24-item clinician-administered SP screening instrument, measuring anxiety and avoidance [23]. Both subscales (“LSAS avoidance score” and “LSAS anxiety score”) range from 0 to 72, leading to a total range of 0 to 144 (“LSAS total score”). LSAS total scores below 30 indicate remission of SP, scores between 30 and 59 indicate specific SP, and scores of 60 or above indicate generalised SP [18].
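The scoring rules above can be sketched in a few lines of Python. The function names `lsas_total` and `classify_lsas` are hypothetical. A total score of exactly 60 is assigned to generalised SP here, following the grouping (≥ 60) used in the Analysis section; this boundary handling is an assumption of the sketch.

```python
def lsas_total(anxiety: int, avoidance: int) -> int:
    """Sum the two LSAS subscale scores (each 0 to 72) into the total score."""
    for score in (anxiety, avoidance):
        if not 0 <= score <= 72:
            raise ValueError("LSAS subscale scores range from 0 to 72")
    return anxiety + avoidance


def classify_lsas(total: int) -> str:
    """Classify an LSAS total score using the cut-offs stated in the text."""
    if not 0 <= total <= 144:
        raise ValueError("LSAS total scores range from 0 to 144")
    if total < 30:
        return "remission"
    if total < 60:
        return "specific SP"
    return "generalised SP"  # assumption: 60 itself counts as generalised
```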
Social phobia and anxiety inventory (SPAI)
The SPAI is a self-report SP screening instrument, measuring the disease severity level of SP [24]. The German version of the SPAI contains 22 items [25]. Each item is rated on a 7-point Likert scale ranging from “never” (coded as 0) to “always” (coded as 6). The SPAI score, computed as the mean of all 22 items, ranges from 0 to 6, with higher scores indicating a higher severity level of SP.
Beck Depression Inventory (BDI)
The BDI is a screening instrument for depression [26]. Patients are asked to rate their feelings throughout the last week and today on 21 items. The items range from 0 to 3 with an increasing disease severity level and are added up to a total score ranging from 0 to 63.
Analysis
For statistical analysis, we collapsed “moderate problems” and “severe problems” of the EQ-5D descriptive system into one category “problems” (except for analysing discriminative ability related to the general population), because the number of patients indicating “severe problems” was small.
Discriminative ability reflects the ability of an instrument to discriminate between different health states [27]. We assumed that the EQ-5D discriminates between patients with SP and the general population and between different levels of disease severity in patients with SP. For the comparison with the general population, we used EQ-5D data from a representative survey (n = 3552) in Germany [28], adjusted for age and gender because of the young age of the patient sample. To distinguish between disease severity levels, we grouped patients into quartiles of the LSAS total score and of both of its subscales and, alternatively, into patients with specific SP (30 to 59 LSAS total score) and generalised SP (≥ 60 LSAS total score). We tested for significance using the χ² test and Fisher’s exact test (EQ-5D descriptive system) and the Mann-Whitney U test (EQ-5D index and EQ VAS).
Test-retest reliability reflects the ability of an instrument to produce similar results if the underlying construct has not changed [29, 30]. The LSAS total score was used as clinical anchor. We assumed that the scores of both EQ-5D indexes and the EQ VAS stay constant if the change in LSAS total score stays within ± 0.5 baseline standard deviations, as recommended in [31, 32]; this corresponds to 11 LSAS total score points. Additionally, we split the “no change” group into patients with social phobia and patients without social phobia (< 30 LSAS total score points at both time points).
For the EQ-5D index scores and the EQ VAS score, we analysed test-retest reliability using the intraclass correlation coefficient (ICC) with a two-way mixed model. We considered an ICC ≥ 0.7 as large [30].
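The anchor-based grouping can be sketched as follows. The threshold of 11 LSAS total score points is taken from the text; the function names and the labels “improved” and “worsened” are hypothetical and serve only to illustrate the direction of change. A change of exactly 11 points is treated as “no change”, which is our reading of “within” the ± 0.5 standard deviation range.

```python
THRESHOLD = 11  # 0.5 baseline standard deviations in LSAS points (from the text)


def anchor_group(lsas_first: int, lsas_second: int) -> str:
    """Group a patient by the LSAS total score change between two time points."""
    change = lsas_second - lsas_first
    if abs(change) <= THRESHOLD:
        return "no change"
    # Lower LSAS scores mean fewer symptoms, so a decrease is an improvement.
    return "improved" if change < 0 else "worsened"


def no_change_subgroup(lsas_first: int, lsas_second: int) -> str:
    """Split the 'no change' group by presence of social phobia."""
    if anchor_group(lsas_first, lsas_second) != "no change":
        raise ValueError("patient is not in the 'no change' group")
    if lsas_first < 30 and lsas_second < 30:
        return "without social phobia"
    return "with social phobia"
```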
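For readers who wish to reproduce an ICC outside of SPSS, the following sketch computes one common variant for a two-way mixed model: the single-measure consistency form, often written ICC(3,1). The text does not state whether the consistency or the absolute-agreement definition was used, nor whether single or average measures were reported, so this choice is an assumption. The function name `icc_consistency` is hypothetical.

```python
def icc_consistency(data: list) -> float:
    """Single-measure consistency ICC, i.e. ICC(3,1), from a two-way ANOVA.

    `data` holds one row per patient and one column per measurement
    occasion, e.g. [[t_a, t_b], ...] for two time points.
    """
    n = len(data)      # number of patients
    k = len(data[0])   # number of measurement occasions
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need at least 2 complete rows and 2 columns")

    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("ICC is undefined when there is no variance")
    return (ms_rows - ms_error) / denominator
```

Because the consistency form ignores systematic shifts between occasions, scores that all rise by the same amount still yield an ICC of 1; an absolute-agreement ICC would be lower in that case.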
Construct validity reflects how appropriately the instrument refers to the underlying construct [30]. We assumed that there is an association between the EQ-5D and instruments of psychopathology used as the underlying construct (LSAS total score, LSAS avoidance score, LSAS anxiety score, SPAI score, SPAI No. 22 score, and BDI score). Since neither the EQ-5D index scores nor the EQ VAS score followed a normal distribution, we computed the non-parametric Spearman rank correlation coefficient (rs) for these scores. We defined a correlation as small for 0.1 ≤ |rs| < 0.3, moderate for 0.3 ≤ |rs| < 0.5, and large for |rs| ≥ 0.5 [33].
Responsiveness reflects the ability of an instrument to change, given a change in the underlying construct [30]. Again, the LSAS total score was used as clinical anchor. We assumed that both EQ-5D indexes and the EQ VAS score change over time if the LSAS total score has changed. We defined a relevant change as more than ± 0.5 baseline standard deviations, as recommended in [31, 32]; this corresponds to 11 LSAS total score points. Responsiveness can be assessed in many different ways [34–38]. In our analysis we used the paired t-test statistic and computed the effect size (ES) to assess the association of change in both EQ-5D indexes and the EQ VAS with the change in the LSAS total score. According to Cohen, we defined an ES as trivial for 0.1 ≤ |ES| < 0.2, small for 0.2 ≤ |ES| < 0.5, medium for 0.5 ≤ |ES| < 0.8, and large for |ES| ≥ 0.8 [33]. Alternatively, we calculated the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. An AUC of 0.5 indicates that the instrument detects the true change of patients’ health status no better than chance. The closer the AUC is to 1.0, the better the instrument is able to detect the true change of patients’ health status [39, 40].
As we tested several hypotheses, a Bonferroni correction for the level of significance was applied. Six different instruments were used for analysing construct validity, leading to a corrected level of significance of α = 0.05/6 = 0.0083. Ten different chronological comparison pairs were used for analysing reliability and responsiveness. Thus, the level of significance was defined as α = 0.05/10 = 0.005.
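The rank correlation and its interpretation can be sketched without a statistics package. The sketch below computes Spearman’s coefficient as the Pearson correlation of average ranks (ties receive the mean of their rank positions) and applies the thresholds stated above. All function names are hypothetical, and the label “negligible” for |rs| < 0.1 is our own addition, since the text does not name that range.

```python
from math import sqrt


def average_ranks(values: list) -> list:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this rank
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: list, y: list) -> float:
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def spearman(x: list, y: list) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_strength(rs: float) -> str:
    """Label |rs| using the thresholds given in the text."""
    size = abs(rs)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "small"
    return "negligible"  # label not defined in the text; illustrative only
```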
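The two responsiveness measures can be illustrated with a short sketch. The text does not specify which effect size formula was used; the sketch assumes one common definition, the mean change divided by the standard deviation at baseline, and this is an assumption rather than the confirmed method. The AUC is computed as the proportion of pairs in which a patient with a relevant LSAS change shows a larger score change than a patient without one, with ties counted as one half. All function names are hypothetical.

```python
from statistics import mean, stdev


def effect_size(baseline: list, follow_up: list) -> float:
    """Mean change divided by the baseline SD (assumed ES definition)."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def classify_es(es: float) -> str:
    """Label |ES| using the thresholds given in the text."""
    size = abs(es)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    if size >= 0.1:
        return "trivial"
    return "unlabelled"  # the text defines no label below 0.1


def auc(changed: list, unchanged: list) -> float:
    """Area under the ROC curve via pairwise comparison of score changes.

    `changed` and `unchanged` hold the EQ-5D (or EQ VAS) change scores of
    patients with and without a relevant change on the anchor.
    """
    if not changed or not unchanged:
        raise ValueError("both groups must contain at least one patient")
    wins = sum(
        1.0 if c > u else 0.5 if c == u else 0.0
        for c in changed
        for u in unchanged
    )
    return wins / (len(changed) * len(unchanged))
```

With this pairwise definition, an AUC of 0.5 arises when the change scores of both groups are indistinguishable, and an AUC of 1.0 when every patient with a relevant anchor change shows a larger score change than every patient without one.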
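The arithmetic of the correction can be checked with a few lines of Python. The number of ten comparison pairs is consistent with all pairwise comparisons among the five measurement points T0 to T4; the text does not state this derivation explicitly, so it is offered here as a plausible reading only. The names `bonferroni_alpha` and `TIME_POINTS` are hypothetical.

```python
from math import comb


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Divide the nominal significance level by the number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


TIME_POINTS = ("T0", "T1", "T2", "T3", "T4")

# Assumption: the ten pairs are all pairwise combinations of time points.
n_pairs = comb(len(TIME_POINTS), 2)

alpha_validity = bonferroni_alpha(0.05, 6)           # six instruments
alpha_reliability = bonferroni_alpha(0.05, n_pairs)  # ten comparison pairs
```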
Statistical analyses were conducted using Statistical Package for the Social Sciences (version 18, SPSS Inc., Chicago, IL, USA).