- Open Access
Responsiveness of the short-form health survey and the Parkinson’s disease questionnaire in patients with Parkinson’s disease
Health and Quality of Life Outcomes volume 15, Article number: 75 (2017)
The responsiveness of a measurement instrument is important for understanding its ability to detect changes in the progression of a disease. We examined and compared the internal and external responsiveness of the 36-item Short-Form Health Survey (SF-36) and the 39-item Parkinson’s Disease Questionnaire (PDQ-39) in patients with Parkinson’s Disease (PD).
Seventy-four patients with PD were evaluated using the SF-36 and PDQ-39 at baseline and again after one year. In addition, their motor signs, motor difficulties of daily living, and depressive symptoms were assessed as external criteria. The internal responsiveness was examined using effect size, standardized response mean, and the Wilcoxon signed rank test. The external responsiveness was examined using receiver operating characteristic curves, correlation analyses, and regression models.
Both instruments were partially sensitive to changes during the 1-year follow-up and able to discriminate between patients with improved versus deteriorated motor signs. In addition, both were similarly responsive to changes in the motor difficulties of daily living; the SF-36 appeared to be more sensitive than the PDQ-39 to changes in depressive symptoms.
The SF-36 and the PDQ-39 were acceptably internally and externally responsive during the 1-year follow-up.
Parkinson’s disease (PD) is a progressive neurodegenerative disorder; therefore, measuring patients’ health-related quality of life (HRQoL) in a way that reflects the progression of PD is important. HRQoL refers to a person’s health status in terms of physical, emotional, and social well-being [1]. Soh et al. [2] reported that the 36-item Short-Form Health Survey (SF-36) is the most frequently used generic HRQoL instrument, and that the 39-item Parkinson’s Disease Questionnaire (PDQ-39) is the most frequently used disease-specific HRQoL instrument. Both measures have good internal consistency, stability, and discriminant validity [3, 4], and both are recommended by the Movement Disorder Society [4, 5].
The responsiveness of a measurement instrument is important because it reflects the extent to which the instrument can detect changes in the progression of a disease and whether it can show longitudinal validity [6, 7]. Two major types of responsiveness have been recommended: internal and external [8]. Internal responsiveness refers to the ability of an instrument to detect change between two time points. External responsiveness refers to the extent to which changes in an instrument correspond to changes in a reference measure. These two types of responsiveness provide different but complementary information: internal responsiveness reflects patient-level changes, and external responsiveness reflects a different view of patient status using another available measure. Although the responsiveness of HRQoL instruments has been examined in several studies [9, 10], few have examined the comparative responsiveness of the SF-36 and the PDQ-39 in patients with PD. Schrag et al. [11] reported that the PDQ-39 was acceptably responsive to small changes in PD progression during a 1-year follow-up, independent of any intervention or external measure. Using self-reported changes in health status as external criteria, Brown et al. [12] reported that the SF-36 was more responsive than the PDQ-39 in an 18-month follow-up. However, neither study [11, 12] included clinician-judged motor evaluations or psychological measures as external criteria to evaluate the responsiveness of the HRQoL instruments.
We assessed and compared the internal and external responsiveness of the SF-36 and PDQ-39 in patients with PD. Evaluations of motor signs and motor difficulties of daily living and a measure of depressive symptoms were used as external criteria. Many interventions for patients with PD have targeted motor problems and depression [13, 14], but few studies have examined the comparative responsiveness of the SF-36 and the PDQ-39 against these external criteria to determine which one to use in longitudinal outcome research in patients with PD.
Study design and participants
Between September 2014 and August 2015, participants were consecutively recruited from the neurology departments of two medical centers in southern Taiwan. The inclusion criterion was a diagnosis of PD by a neurologist. The exclusion criterion was a Saint Louis University Mental Status Examination (SLUMS) [15] score that indicated severe cognitive impairment (a score < 19 for people with less than a high school education and < 20 for those with a high school education or above). The SLUMS is a 30-point questionnaire used to screen for cognitive impairment; it includes items for orientation, memory, attention, and executive function. Participants were interviewed and evaluated face-to-face at baseline and after 12 months of follow-up. The decision that a 1-year follow-up was sufficient was based on a prior study [16]. Most (80.4%) evaluations were carried out when the participants were in the “on” phase of the medication cycle.
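The SLUMS exclusion rule described above can be written as a small check. This is only an illustrative sketch: the function name and the boolean education flag are hypothetical, and only the two cutoffs (19 and 20) and the 30-point range come from the text.

```python
def excluded_by_slums(score: int, high_school_or_above: bool) -> bool:
    """Return True when a SLUMS score falls below the study's cutoff.

    Cutoffs follow the text: < 19 for less than a high school education,
    < 20 for a high school education or above.
    """
    if not 0 <= score <= 30:
        raise ValueError("SLUMS scores range from 0 to 30")
    cutoff = 20 if high_school_or_above else 19
    return score < cutoff


if __name__ == "__main__":
    # A score of 19 excludes a participant with a high school education
    # but not one with less than a high school education.
    print(excluded_by_slums(19, high_school_or_above=True))
    print(excluded_by_slums(19, high_school_or_above=False))
```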
This study followed the principles of the Declaration of Helsinki. The study protocol was approved by the Institutional Review Board of National Cheng Kung University Hospital (B-ER-101-171) and the Institutional Review Board of E-Da Hospital (EMRP-102-024). Written informed consent was obtained from each participant.
HRQoL was examined using the SF-36 and PDQ-39. The SF-36 [17] includes eight domains: physical functioning, role limitations caused by physical problems, bodily pain, general health perceptions, mental health, role limitations caused by emotional problems, social functioning, and vitality. Each domain score and the summary score range from 0 to 100, with higher scores indicating a better HRQoL. The Taiwanese version of the SF-36 has been tested in middle-aged women [18] and stroke patients [19] and has been reported to be reliable and valid.
The PDQ-39 [20] has eight domains: mobility, activities of daily living, emotional well-being, stigma, social support, cognition, communication, and bodily discomfort. It yields a summary index and eight domain scores, each of which ranges from 0 to 100. A higher score indicates more frequent self-perceived difficulty, that is, a worse HRQoL. The Taiwanese version of the PDQ-39 has been reported to be reliable and valid in patients with PD [21].
Parts II and III of the Movement Disorder Society Revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) [22] and the Geriatric Depression Scale (GDS) [23] were used as external criteria. The MDS-UPDRS Part II includes 13 items that assess motor difficulties of daily living, and Part III contains 18 items that examine motor signs. Each item uses a 5-point Likert scale (0 = “no problems identified” and 4 = “severe problems identified”), and item scores are summed for the total score of each part. The 30-item GDS is a yes/no screening scale for depressive symptoms. The GDS has good psychometric properties and has been reported as efficient in patients with PD [24, 25].
According to Revicki et al. [26], the external criteria for assessing responsiveness might be patient-rated global improvement, clinical measures with established responsiveness, or some combination of clinical and patient-based outcomes. Because it is important to select external criteria that are relevant for the disease indication and to use multiple independent external criteria, we chose the MDS-UPDRS and GDS, which are both recommended by the Movement Disorder Society [22, 24] and frequently used for evaluating motor function and depressive symptoms in patients with PD. In addition, the minimal clinically important difference (MCID) has been established for the MDS-UPDRS Part III, which served as a reference for evaluating the responsiveness of the HRQoL measures in the present study.
We used the MCID of the MDS-UPDRS Part III [27] as a criterion for classifying patients with PD into improving, stable, and worsening groups during the 1-year follow-up. Patients with MDS-UPDRS Part III change scores < −3.25 were considered improving, and those with change scores > 4.63 were considered worsening. The internal responsiveness of the SF-36 and PDQ-39 was then calculated using effect size (ES), standardized response mean (SRM), and the Wilcoxon signed rank test. The latter was used because the change scores of the SF-36 and PDQ-39 did not follow a normal distribution. Significance was set at p < 0.05.
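The grouping rule above can be sketched in a few lines. The two thresholds are taken from the text; the function name, the group labels as strings, and the assumption that the change score is follow-up minus baseline (so that a negative change means fewer motor signs) are illustrative choices rather than details confirmed by the study.

```python
# MCID thresholds for the MDS-UPDRS Part III change score, as stated in the text.
IMPROVE_THRESHOLD = -3.25
WORSEN_THRESHOLD = 4.63


def classify_motor_change(baseline: float, follow_up: float) -> str:
    """Classify a patient by the change in MDS-UPDRS Part III score.

    Assumption: change = follow-up - baseline, with higher scores meaning
    more severe motor signs.
    """
    change = follow_up - baseline
    if change < IMPROVE_THRESHOLD:
        return "improving"
    if change > WORSEN_THRESHOLD:
        return "worsening"
    return "stable"


if __name__ == "__main__":
    for base, follow in [(30, 26), (30, 32), (30, 35)]:
        print(base, follow, classify_motor_change(base, follow))
```

Because both comparisons are strict, a change score exactly equal to either threshold falls in the stable group under this sketch.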
The ES is the mean change score between baseline and follow-up divided by the standard deviation (SD) of the baseline scores. The SRM is the mean change score between baseline and follow-up divided by the SD of the change scores. Negative values of the ES and SRM represent a worse HRQoL, and positive values represent a better HRQoL. For both the ES and SRM, a value of 0.2 represents a small sensitivity to change, 0.5 a moderate sensitivity, and 0.8 a large sensitivity.
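As a rough illustration of the two indices defined above, the following sketch computes the ES and SRM from paired baseline and follow-up scores. Several details are assumptions not stated in the text: the use of the sample SD (n − 1 denominator), the `higher_is_better` flag used to flip the sign for scales such as the PDQ-39 where higher scores mean worse HRQoL, and the "negligible" label for values below 0.2. The example data are invented.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up, higher_is_better=True):
    """Mean change divided by the SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    es = mean(changes) / stdev(baseline)
    # Flip the sign so that positive always means a better HRQoL.
    return es if higher_is_better else -es


def standardized_response_mean(baseline, follow_up, higher_is_better=True):
    """Mean change divided by the SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    srm = mean(changes) / stdev(changes)
    return srm if higher_is_better else -srm


def sensitivity_label(value):
    """Map |ES| or |SRM| to the 0.2 / 0.5 / 0.8 benchmarks."""
    magnitude = abs(value)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "negligible"  # label below 0.2 is an assumption


if __name__ == "__main__":
    baseline = [50, 60, 70, 80]    # invented scores
    follow_up = [55, 62, 78, 85]
    es = effect_size(baseline, follow_up)
    srm = standardized_response_mean(baseline, follow_up)
    print(round(es, 3), sensitivity_label(es))
    print(round(srm, 3), sensitivity_label(srm))
```

The two indices share a numerator and differ only in the denominator, so a domain can show a large SRM but a small ES when patients change by similar amounts despite widely spread baseline scores.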
External responsiveness was evaluated using receiver operating characteristic curves (ROCs), correlation analyses, and regression models. The ROCs were used to assess the ability of a measure to reflect a change or lack of change in the external criteria [28]. The area under the ROC ranges from 0.5 (not accurate for distinguishing improvers from non-improvers) to 1.0 (perfectly accurate) based on the external criteria. When calculating the ROC, the MCID of the MDS-UPDRS Part III was used to judge whether a patient improved, remained stable, or worsened. Correlation analyses of the change scores between the HRQoL measures and the GDS and MDS-UPDRS Part II were done using Spearman’s rank correlation. Correlation coefficients of 0.25–0.49, 0.50–0.74, and ≥ 0.75 show small, moderate, and strong associations, respectively [29]. Simple linear regressions were used for a relative estimate of the degree of variance in the change of the external criterion that could be explained by the change score of the HRQoL instrument. SPSS 17.0 was used for statistical analyses (IBM, Chicago, IL, USA).
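To make the area under the ROC concrete, the sketch below computes it as the proportion of (improved, worsened) patient pairs in which the improved patient has the higher HRQoL change score, counting ties as one half. This pairwise form is a standard equivalent of the area under the curve, not necessarily the routine used in SPSS, and the choice of which two groups to contrast, the function name, and the data are assumptions made for illustration.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as 0.5.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Invented HRQoL change scores (positive = better HRQoL at follow-up).
    improved = [4, 6]
    worsened = [2, 5]
    print(roc_auc(improved, worsened))
```

A value near 0.5 means the change scores do not separate the two groups, and a value near 1.0 means they separate them almost perfectly.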
At the 1-year follow-up, we had completed assessing 74 of the 95 enrolled patients with PD. Twenty-one (22.1%) patients were lost to follow-up: 2 had died, 7 could not be contacted, and 12 had dropped out. The baseline demographics and evaluation scores were not significantly different between the 74 patients who completed the study and the 21 who did not. Most of the patients were male (67.6%) and at Hoehn and Yahr stages I and II (75.6%) at baseline (Table 1). One year of follow-up showed that the severity of the PD and motor signs of the 74 patients had significantly increased.
The MCID of the MDS-UPDRS Part III showed that 16 of the 74 patients had improved and that 34 were worsening (Table 2). The improved group had significant differences between baseline and follow-up scores in the SF-36 general health and PDQ-39 mobility domains. The worsening group had significant differences between baseline and follow-up scores in the SF-36 social functioning, vitality, and total scores, and in the PDQ-39 emotional well-being, social support, communication, bodily discomfort, and summary index scores. Moderate to large responsiveness (ES or SRM > 0.5) was found for the SF-36 general health (ES = 0.88, SRM = 0.65) and PDQ-39 mobility (SRM = 0.72) domains in the improved group, and for the PDQ-39 social support (ES = −1.20, SRM = −0.51) and communication (SRM = −0.55) domains, and for the summary index (SRM = −0.55) in the worsening group.
The values of the area under the ROCs indicated that both the SF-36 and the PDQ-39 discriminated between patients with improved and with worsening motor signs (Table 3). In addition, the change in the MDS-UPDRS Part II scores between baseline and follow-up had similar degrees of association with the changes in the SF-36 (r = −0.40, p < 0.01) and PDQ-39 (r = 0.45, p < 0.01) scores. The changes in the GDS scores were moderately associated with the changes in the SF-36 (r = −0.53, p < 0.01) scores and only slightly associated with the changes in the PDQ-39 (r = 0.29, p < 0.05) scores.
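The correlations reported above are Spearman rank correlations of change scores. A minimal sketch of that computation, using average ranks for ties followed by a Pearson correlation of the ranks, is given below together with the small/moderate/strong cutoffs (0.25, 0.50, 0.75) from the statistical analysis description. The helper names, the label for coefficients below 0.25, and the example data are assumptions.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def association_label(rho):
    magnitude = abs(rho)
    if magnitude >= 0.75:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.25:
        return "small"
    return "little or none"  # label below 0.25 is an assumption


if __name__ == "__main__":
    sf36_change = [5, -3, 0, -8, 2]   # invented change scores
    gds_change = [-2, 1, 0, 4, -1]
    rho = spearman_rho(sf36_change, gds_change)
    print(round(rho, 2), association_label(rho))
```

Under these cutoffs, the reported coefficients of −0.53 and 0.29 fall in the moderate and small bands, respectively, which matches the wording used in the text.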
Regression analyses showed that the changes in the SF-36 and PDQ-39 scores explained more than 20% of the variance in the change scores of the MDS-UPDRS Part II (Table 3). A one-unit change in the SF-36 and in the PDQ-39 scores, on average, corresponded with approximately −0.15 and 0.25 points, respectively, in the changes in the MDS-UPDRS Part II scores. In addition, the change in the SF-36 scores explained 23% of the variance in the change scores of the GDS. A one-unit change in the SF-36 score corresponded with a change of approximately 0.18 points in the GDS score.
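The two quantities reported from the regressions, the slope (points of change in the external criterion per one-unit change in the HRQoL score) and the proportion of variance explained, can be obtained from an ordinary least-squares fit with one predictor. The sketch below is a generic implementation with invented data; the function name and variable roles are assumptions and it is not the SPSS procedure used in the study.

```python
from statistics import mean


def simple_linear_regression(x, y):
    """Ordinary least squares with one predictor.

    x: change scores of the HRQoL instrument (predictor)
    y: change scores of the external criterion (outcome)
    Returns (slope, intercept, r_squared).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_total = sum((b - my) ** 2 for b in y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared


if __name__ == "__main__":
    hrqol_change = [1, 2, 3, 4]       # invented data
    criterion_change = [2, 1, 4, 3]
    slope, intercept, r2 = simple_linear_regression(hrqol_change, criterion_change)
    print(round(slope, 2), round(intercept, 2), round(r2, 2))
```

With a single predictor, the proportion of variance explained equals the square of the Pearson correlation between the two sets of change scores.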
We examined the internal and external responsiveness of the SF-36 and PDQ-39 in a 1-year follow-up of patients with PD. We found that both HRQoL instruments partially detected changes in patients with improved and worsening motor signs, and were able to discriminate between them. In addition, both instruments were similarly responsive to changes in the motor difficulties of daily living, and the SF-36 was more sensitive to change in depressive symptoms.
Our findings support the longitudinal validity of the SF-36 and PDQ-39 for measuring the long-term health outcomes of patients with PD. However, our finding that the PDQ-39 was as responsive as the SF-36 for detecting 1-year change is inconsistent with Brown et al. [12]. The inconsistency might be attributable to the different criteria used: they used a subjective rating of overall HRQoL, but the present study used a clinician-judged evaluation of motor signs. The internal responsiveness of the HRQoL measures also seemed to depend upon the criteria used.
We examined the internal responsiveness of the domain scores to better understand the characteristics of these two HRQoL instruments. The significant and moderate changes in the social support and communication domains of the PDQ-39 in the worsening group signal the continuous decline in social interaction associated with worsening motor signs in patients with PD. The motor signs evaluated in the MDS-UPDRS Part III included speech, facial masking, rigidity, bradykinesia, gait, posture, and tremor. A quantitative study [30] reported motor signs as a significant predictor of communication in the PDQ-39. In addition, much qualitative research [31–33] has vividly described how patients with PD perceived their movement and communication difficulties to be distressing and socially embarrassing. Such stigmatized feelings might subsequently lead to decreased social interaction. Additional research is needed to quantitatively examine the relationships between motor signs, experienced stigma, and social interaction in patients with PD.
The results of the internal responsiveness of the two HRQoL scales might be related to the type of anchor used, the characteristics of the HRQoL scales, and the changes that occurred in our patients. We used the MCID of the MDS-UPDRS Part III as a criterion for classifying patients into improving, stable, and worsening groups. The post hoc calculation of the change scores of the other measures for these three groups indicated that while the stable group had smaller changes in the MDS-UPDRS Part II and PDQ-39 than did the other two groups, they had substantial changes in the GDS and SF-36 that were comparable to those of the other groups (Additional file 1).
Moreover, when we calculated the changes in domain scores (Additional file 2), we found that the stable group had smaller changes than did the other two groups in the SF-36 domains of physical functioning and role-physical, and in the PDQ-39 domains of mobility and ADL. In contrast, the stable group had a notable decline in the SF-36 domains of general health and role-emotional, and in the PDQ-39 domain of emotional well-being, which might be associated with increased depression as measured by the GDS.
Overall, the heterogeneous results among the domains of the HRQoL scales reflect the complex interplay between the physical, psychological, and social factors of an individual. Using the MDS-UPDRS Part III as the anchor has the strength of distinguishing those with changed motor signs, but the results might not be generalizable if other anchors (e.g., GDS) are used.
Using the MCID of the MDS-UPDRS Part III as a gold standard for evaluating external responsiveness, we found that both the SF-36 and the PDQ-39 discriminated between patients with improved and with deteriorated motor signs. In addition, when the MDS-UPDRS Part II was used as the external criterion, the SF-36 and PDQ-39 had similar degrees of sensitivity to changes in the motor difficulties of daily living. Our post hoc correlation analyses of the change scores between the HRQoL domains and the MDS-UPDRS Part II showed significant correlations in the SF-36 physical functioning, bodily pain, social functioning, and vitality domains (rs = −0.233 ~ −0.487) and significant correlations in the PDQ-39 mobility, activities of daily living, emotional well-being, social support, and bodily discomfort domains (rs = 0.300 ~ 0.375). This suggests that the motor difficulties of daily living in patients with PD contribute to compromised HRQoL across physical, psychological, and social domains.
When the GDS was used as an external criterion, the SF-36 appeared to be more responsive to changes in depressive symptoms than did the PDQ-39. This might be because the SF-36 has three domains (mental health, role-emotional, and vitality) that ask questions directly related to psychological health, while the PDQ-39 has only one (emotional well-being). Our post hoc correlation analyses of the change scores between the HRQoL domains and the GDS also found that SF-36 had more domains significantly correlated with the GDS than did the PDQ-39. Significant correlations were found in six of eight domains in the SF-36 (physical functioning, role-physical, general health, mental health, role-emotional, and vitality; rs = −0.237 ~ −0.532), while in the PDQ-39, significant correlations were found in only three of eight domains (mobility, emotional well-being, and social support; rs = 0.285 ~ 0.316).
It is noteworthy that while the change scores of the SF-36 domains were significantly correlated with the change scores of either the MDS-UPDRS Part II or the GDS, some domains (stigma, cognition, and communication) of the PDQ-39 were correlated with neither one. Because the PDQ-39 items were generated from in-depth interviews with patients as concerns of the effect of PD, future research should include other external criteria (e.g., social interaction) to validate these unique domains.
This study has some limitations. First, we used convenience sampling and recruited our participants from two medical centers. The generalizability of our findings might thus be limited to clinic-based patients with PD. In addition, 22.1% of participants at baseline were lost to the 1-year follow-up. Although their baseline demographics and evaluation scores were not significantly different from those of the patients who completed the study, the smaller sample size might affect the validity of our results and conclusion. Moreover, all patients were tested at a single point in time without controlling for the medication cycle phase during the interview or evaluation. We allowed medication effects to vary naturally, because the purpose of this study was not to test responsiveness to any intervention. However, this design might lead to some confounding of the evaluation results. Future research probably should control for participants’ medication status to arrive at a more precise estimation of intervention effectiveness. Finally, psychometric testing of the Taiwanese version of the SF-36 has been done only in middle-aged women and stroke patients [18, 19]. Future research probably should examine the psychometric properties of the SF-36 in Taiwanese patients with PD.
This is the first study that uses a clinician-judged evaluation of motor signs and a psychological measure as external criteria to evaluate the internal and external responsiveness of HRQoL measures in patients with PD. Overall, both the SF-36 and the PDQ-39 were partially sensitive to change in the 1-year follow-up and discriminated between patients with improved and with deteriorated motor signs. In addition, the SF-36 and the PDQ-39 were similarly responsive to changes in the motor difficulties of daily living, and the SF-36 was more sensitive than was the PDQ-39 to changes in depressive symptoms.
GDS: Geriatric depression rating scale
HRQoL: Health-related quality of life
MCID: Minimal clinically important difference
MDS-UPDRS: Movement Disorder Society Revision of the Unified Parkinson’s Disease Rating Scale
PDQ-39: 39-item Parkinson’s disease questionnaire
ROC: Receiver operating characteristic curves
SF-36: 36-item short form health survey
SRM: Standardized response mean
1. Centers for Disease Control and Prevention. Population assessment of health-related quality of life. Atlanta: CDC; 2000.
2. Soh SE, Morris ME, McGinley JL. Determinants of health-related quality of life in Parkinson’s disease: a systematic review. Parkinsonism Relat Disord. 2011;17:1–9.
3. Jenkinson C, Fitzpatrick R. The development and validation of the Parkinson’s disease questionnaire and related measures. In: Jenkinson C, Peters M, Bromberg MB, editors. Quality of life measurement in neurodegenerative and related conditions. New York: Cambridge University Press; 2011. p. 10–23.
4. Martinez-Martin P, Jeukens-Visser M, Lyons KE, Rodriguez-Blazquez C, Selai C, Siderowf A, et al. Health-related quality-of-life scales in Parkinson’s disease: critique and recommendations. Mov Disord. 2011;26:2371–80.
5. Den Oudsten BL, Van Heck GL, De Vries J. The suitability of patient-based measures in the field of Parkinson’s disease: a systematic review. Mov Disord. 2007;22:1390–401.
6. Beaton DE, Bombardier C, Katz JN, Wright JG. A taxonomy for responsiveness. J Clin Epidemiol. 2001;54:1204–17.
7. Terwee C, Dekker F, Wiersinga W, Prummel M, Bossuyt P. On assessing responsiveness of health-related quality of life instruments: guidelines for instrument evaluation. Qual Life Res. 2003;12:349–62.
8. Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol. 2000;53:459–68.
9. Murawski MM, Miederhoff PA. On the generalizability of statistical expressions of health related quality of life instrument responsiveness: a data synthesis. Qual Life Res. 1998;7:11–22.
10. Wiebe S, Guyatt G, Weaver B, Matijevic S, Sidwell C. Comparative responsiveness of generic and specific quality-of-life instruments. J Clin Epidemiol. 2003;56:52–60.
11. Schrag A, Spottke A, Quinn NP, Dodel R. Comparative responsiveness of Parkinson’s disease scales to change over time. Mov Disord. 2009;24:813–8.
12. Brown CA, Cheng EM, Hays RD, Vassar SD, Vickrey BG. SF-36 includes less Parkinson disease (PD)-targeted content but is more responsive to change than two PD-targeted health-related quality of life measures. Qual Life Res. 2009;18:1219–37.
13. Keus SHJ, Bloem BR, Hendriks EJM, Bredero-Cohen AB, Munneke M, Practice Recommendations Dev G. Evidence-based analysis of physical therapy in Parkinson’s disease with recommendations for practice and research. Mov Disord. 2007;22:451–60.
14. Yang S, Sajatovic M, Walter BL. Psychosocial interventions for depression and anxiety in Parkinson’s disease. J Geriatr Psychiatry Neurol. 2012;25:113–21.
15. Tariq SH, Tumosa N, Chibnall JT, Perry MH, Morley JE. Comparison of the Saint Louis University mental status examination and the mini-mental state examination for detecting dementia and mild neurocognitive disorder—a pilot study. Am J Geriatr Psychiatr. 2006;14:900–10.
16. Fitzpatrick R, Peto V, Jenkinson C, Greenhall R, Hyman N. Health-related quality of life in Parkinson’s disease: a study of outpatient clinic attenders. Mov Disord. 1997;12:916–22.
17. Ware Jr JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36): I. Conceptual framework and item selection. Med Care. 1992;30:473–83.
18. Fuh JL, Wang SJ, Lu SR, Juang KD, Lee SJ. Psychometric evaluation of a Chinese (Taiwanese) version of the SF-36 health survey amongst middle-aged women from a rural community. Qual Life Res. 2000;9:675–83.
19. Shyu YIL, Lu JFR, Chen ST. Psychometric testing of the SF-36 Taiwan version on older stroke patients. J Clin Nurs. 2009;18:1451–9.
20. Peto V, Jenkinson C, Fitzpatrick R, Greenhall R. The development and validation of a short measure of functioning and well being for individuals with Parkinson’s disease. Qual Life Res. 1995;4:241–8.
21. Ma HI, Hwang WJ, Chen-Sea MJ. Reliability and validity testing of a Chinese-translated version of the 39-item Parkinson’s disease questionnaire (PDQ-39). Qual Life Res. 2005;14:565–9.
22. Goetz CG, Tilley BC, Shaftman SR, Stebbins GT, Fahn S, Martinez-Martin P, et al. Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (MDS-UPDRS): scale presentation and clinimetric testing results. Mov Disord. 2008;23:2129–70.
23. Yesavage JA, Brink TL, Rose TL, Lum O, Huang V, Adey M, et al. Development and validation of a geriatric depression screening scale-a preliminary report. J Psychiatr Res. 1983;17:37–49.
24. Schrag A, Barone P, Brown RG, Leentjens AFG, McDonald WM, Starkstein S, et al. Depression rating scales in Parkinson’s disease: critique and recommendations. Mov Disord. 2007;22:1077–92.
25. Williams JR, Hirsch ES, Anderson K, Bush AL, Goldstein SR, Grill S, et al. A comparison of nine scales to detect depression in Parkinson disease: which scale to use? Neurology. 2012;78:998–1006.
26. Revicki DA, Cella D, Hays RD, Sloan JA, Lenderking WR, Aaronson NK. Responsiveness and minimal important differences for patient reported outcomes. Health Qual Life Outcomes. 2006;4:70.
27. Horváth K, Aschermann Z, Ács P, Deli G, Janszky J, Komoly S, et al. Minimal clinically important difference on the motor examination part of MDS-UPDRS. Parkinsonism Relat Disord. 2015;21:1421–6.
28. Deyo RA, Centor RM. Assessing the responsiveness of functional scales to clinical change: an analogy to diagnostic test performance. J Chronic Dis. 1986;39:897–906.
29. Portney LG, Watkins MP. Foundations of clinical research: applications to practice. 3rd ed. Upper Saddle River: Prentice Hall; 2009.
30. Simpson J, Lekwuwa G, Crawford T. Predictors of quality of life in people with Parkinson’s disease: evidence for both domain specific and general relationships. Disabil Rehabil. 2014;36:1964–70.
31. Abudi S, Bar-Tal Y, Ziv L, Fish M. Parkinson’s disease symptoms-patients’ perceptions. J Adv Nurs. 1997;25:54–9.
32. Den Oudsten BL, Lucas-Carrasco R, Green AM, Whoqol-Dis G. Perceptions of persons with Parkinson’s disease, family and professionals on quality of life: an international focus group study. Disabil Rehabil. 2011;33:2490–508.
33. Dauwerse L, Hendrikx A, Schipper K, Struiksma C, Abma TA. Quality-of-life of patients with Parkinson’s disease. Brain Inj. 2014;28:1342–52.
We thank all of the participants.
The study was supported by grant NCKUEDA 10214 from E-Da Hospital, Kaohsiung, Taiwan. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Availability of data and materials
Data cannot be shared because our participants consented to have their data used only in this project.
XJT contributed to the study design, collected data, analyzed all data, and drafted the manuscript. WJH contributed to the study design and collected data. SPH collected data. HIM contributed to the interpretation and discussion of the data and made critical revisions to the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
The study protocol was approved by the Institutional Review Board of National Cheng Kung University Hospital (B-ER-101-171) and the Institutional Review Board of E-Da Hospital (EMRP-102-024). Written informed consent was obtained from each participant.