Table 5 Area under the receiver operating characteristic curve (AUC) for depression measures for detecting any improvement

From: Responsiveness of PROMIS and Patient Health Questionnaire (PHQ) Depression Scales in three clinical trials

| Depression measure* | Average accuracy across trials, retrospective Global Rating of Change (GRC): AUC | Average accuracy across trials, prospective GRC: AUC | Retrospective GRC, CAMEO: AUC (95% CI) | Retrospective GRC, SPACE: AUC (95% CI) | Retrospective GRC, SSM: AUC (95% CI) | Prospective GRC, CAMEO: AUC (95% CI) | Prospective GRC, SPACE: AUC (95% CI) | Prospective GRC, SSM: AUC (95% CI) |
|---|---|---|---|---|---|---|---|---|
| PROMIS 4-item | .565 | .658 | .603 (.510–.697) | .586 (.512–.660) | .507 (.436–.579) | .646 (.553–.739) | .699 (.628–.770) | .630 (.544–.716) |
| PROMIS 6-item | .578 | .660 | .634 (.542–.727) | .570 (.493–.646) | .530 (.457–.603) | .638 (.542–.734) | .712 (.642–.782) | .631 (.546–.715) |
| PROMIS 8-item | .574 | .664 | .623 (.530–.716) | .575 (.499–.650) | .523 (.450–.596) | .646 (.550–.741) | .712 (.646–.785) | .633 (.550–.717) |
| PROMIS Short-form | .583 | .671 | .644 (.551–.736) | .577 (.501–.652) | .529 (.456–.602) | .638 (.543–.732) | .741 (.673–.808) | .634 (.551–.717) |
| PHQ-9 | .617 | .649 | .697 (.608–.786) | .587 (.512–.662) | .566 (.493–.639) | .664 (.572–.755) | .644 (.570–.717) | .640 (.557–.723) |
| PHQ-2 | .570 | .627 | .598 (.597–.690) | .576 (.505–.646) | .535 (.464–.606) | .576 (.483–.670) | .686 (.619–.752) | .620 (.537–.703) |

SF-36 Mental Health: .627 (.534–.720); .745 (.661–.829)

    
  1. *AUC is the probability of correctly discriminating between patients who have improved and those who have not. Any improvement ≥ “a little better”; moderate improvement ≥ “moderately better”.
  2. Follow-up was at 6 months for CAMEO and at 3 months for SPACE and SSM. The proportion of patients reporting any improvement by retrospective GRC was 57%, 40%, and 57% in CAMEO, SPACE, and SSM, respectively. The proportion reporting any improvement by prospective GRC was 40%, 39%, and 29% in CAMEO, SPACE, and SSM, respectively.
  3. There were no significant differences at P < .01 (using Bonferroni’s correction for multiple comparisons) between any of the retrospective AUCs. The prospective AUC was significantly lower for the PHQ-2 (P = .003) than for the PROMIS Short-form in the CAMEO trial.
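
The first footnote defines the AUC as a discrimination probability. A minimal Python sketch of that reading follows. It assumes the AUC is computed from each patient's change score as the share of improved/non-improved pairs in which the improved patient shows the larger change, with ties counted as half. The function name `auc_pairwise` and the example change scores are hypothetical and are not trial data. The last lines check that the reported average for the PROMIS 4-item scale is consistent with an unweighted mean of the three retrospective trial AUCs in the table.

```python
from statistics import mean


def auc_pairwise(improved, not_improved):
    """Probability that a randomly chosen improved patient has a larger
    score change than a randomly chosen non-improved patient.
    Ties count as half a win."""
    wins = 0.0
    for a in improved:
        for b in not_improved:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(improved) * len(not_improved))


# Hypothetical reductions in depression score (baseline minus follow-up).
# These numbers are illustrative only; they do not come from the trials.
improved = [6, 4, 3, 1]
not_improved = [4, 2, 0, -1, -2]
auc = auc_pairwise(improved, not_improved)
print(f"AUC = {auc:.3f}")

# Retrospective AUCs for the PROMIS 4-item scale, copied from the table
# (CAMEO, SPACE, SSM), compared with the reported average of .565.
promis4_retro = [0.603, 0.586, 0.507]
promis4_retro_mean = round(mean(promis4_retro), 3)
print(f"Mean retrospective AUC = {promis4_retro_mean:.3f}")
```

An AUC of .5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, which is the scale on which the values in the table should be read.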