The multiple sclerosis rating scale, revised (MSRS-R): Development, refinement, and psychometric validation using an online community

Background: In developing the PatientsLikeMe online platform for patients with Multiple Sclerosis (MS), we required a patient-reported assessment of functional status that was easy to complete and identified disability in domains other than walking. Existing measures of functional status were inadequate, clinician-reported, focused on walking, and burdensome to complete. In response, we developed the Multiple Sclerosis Rating Scale (MSRS).
Methods: We adapted a clinician-rated measure, the Guy’s Neurological Disability Scale, to a self-report scale and deployed it to an online community. As part of our validation process we reviewed discussions between patients, conducted patient cognitive debriefing, and made minor improvements to form a revised scale (MSRS-R) before deploying a cross-sectional survey to patients with relapsing-remitting MS (RRMS) on the PatientsLikeMe platform. The survey included MSRS-R and comparator measures: MSIS-29, PDDS, NARCOMS Performance Scales, PRIMUS, and MSWS-12.
Results: In total, 816 RRMS patients responded (19% response rate). The MSRS-R exhibited high internal consistency (Cronbach’s alpha = .86). The MSRS-R walking item was highly correlated with alternative walking measures (PDDS, ρ = .84; MSWS-12, ρ = .83; NARCOMS mobility question, ρ = .86). MSRS-R correlated well with comparison instruments and differentiated between known groups by PDDS disease stage and relapse burden in the past two years. Factor analysis suggested a single factor accounting for 51.5% of variance.
Conclusions: The MSRS-R is a concise measure of MS-related functional disability, and may have advantages for disease measurement over longer and more burdensome instruments that are restricted to a smaller number of domains or measure quality of life. Studies are underway describing the use of the instrument in contexts outside our online platform such as clinical practice or trials. The MSRS-R is released for use under creative commons license.


Background
Multiple Sclerosis (MS) is a neurological condition characterised by lesions of myelin sheaths encapsulating the neurons of the brain, spine, and optic nerve, causing transient or progressive symptoms and disability. Measuring MS is challenging; objective measurement requires complex tools (e.g. MRI using an expensive and immobile device), experience (e.g. examination from a specialist neurologist), and/or significant time to complete (e.g. MS Functional Composite, 15 minutes of testing requiring special equipment [1]). Patient-perceived symptoms can fluctuate seasonally [2], daily, hourly, or even in response to variations in temperature [3]; they may be unmasked only on specific tasks, and they may involve complex systems such as vision, cognition, sexual function, and bladder function.
The PatientsLikeMe online data platform (www.patientslikeme.com) was built to allow patients with life-changing illnesses to share data about their experiences of symptoms and disability through structured data collection [4]. Use of the system has shown benefit through improved health literacy, better communication with healthcare professionals, and development of a peer support network [5,6]. The platform has proved useful in developing other patient-reported outcomes (PROs) using patients' own language [7,8].
In expanding the platform in 2007 to include MS, a number of instruments were considered. The MS Impact Scale (MSIS-29, [9]) was not intended solely to measure MS disability; it also included health-related functional impact (e.g. limitations in social and leisure activities). The MS Walking Scale (MSWS-12) has the obvious limitation of focusing only on walking [10,11]. The North American Research Committee on Multiple Sclerosis (NARCOMS) patient registry has developed validated performance scales (PS) [12,13] in areas including walking, fatigue, cognition, and vision. The PS have clearly defined anchor points for each response, but the instrument is long (about 2,500 words), and the inconsistent response format requires close reading to avoid confusion and erroneous reporting, a potential challenge for patients with cognitive issues and fatigue. NARCOMS has also used the patient-determined disease steps (PDDS) [14], which resembles a patient-reported form of the Expanded Disability Status Scale (EDSS) and may have some of the same limitations as that instrument [15].
In the absence of an agreed-upon "gold standard" PRO, we collaborated with an MS specialist to develop the MS Rating Scale (MSRS). This paper describes the development of the original MSRS, as well as work to revise the scale through cognitive debriefing to produce a revised version (MSRS-R), data on psychometric performance, and comparisons with other patient-reported MS scales.

MS Rating Scale (MSRS) Development
The objective of the MSRS was to accurately quantify the level of MS-relevant disability experienced by patients across a range of domains affected by demyelinating lesions. Observation of an MS specialist's clinic at King's College Hospital in London identified seven domains routinely asked about in clinical practice as part of a "top to toe" clinical interview. These domains were intended to reflect the degree of lesion burden for nervous system regions innervating the region of interest (e.g. optic nerves for "vision"). We adapted a modified scoring scheme from the Guy's Neurological Disability Scale (GNDS [18]), which took a relatively consistent approach to scoring each domain, using the first four levels of disability ("0 - Normal status", "1 - Symptoms causing no disability", "2 - Mild disability not requiring help from others", "3 - Moderate disability requiring help from others") and the final level ("4 - Total loss of function, maximal help required"). A consistent scoring scheme was preferred in order to minimize response burden and encourage repeated entry of data longitudinally. Total score was the sum of the 7 items, with a range of 0-28. For website display the patient profile rescales the score to a 0-100 scale (Figure 1). Remaining domains from the GNDS, such as "mood" (split into anxiety and depression), bladder, bowel, sexual dysfunction, fatigue and spasms, were integrated into the existing PatientsLikeMe symptoms system and rated by patients as "none" (0), "mild" (1), "moderate" (2), or "severe" (3).
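The scoring rule described above can be illustrated with a minimal Python sketch. The function names and the example responses are hypothetical, and the linear mapping to 0-100 is an assumption; the text states only that the profile rescales the total to a 0-100 scale.

```python
from typing import Sequence

MAX_ITEM_SCORE = 4  # each item is rated from 0 (normal) to 4 (most severe level)


def msrs_total(item_scores: Sequence[int]) -> int:
    """Sum the item scores after checking that each lies in the 0-4 range."""
    for score in item_scores:
        if not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError(f"item score {score} is outside 0-{MAX_ITEM_SCORE}")
    return sum(item_scores)


def rescale_to_100(total: int, n_items: int) -> float:
    """Linearly rescale a raw total to 0-100 (assumed display mapping)."""
    return 100.0 * total / (n_items * MAX_ITEM_SCORE)


# Hypothetical responses to the seven original MSRS items.
example = [2, 1, 0, 3, 1, 0, 2]
total = msrs_total(example)                     # raw total out of a possible 28
display = rescale_to_100(total, len(example))   # value shown on a 0-100 scale
print(total, round(display, 1))
```

Passing the number of items explicitly means the same sketch would also cover a version of the scale with a different number of items.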
The resulting scale, the MSRS, was felt to have the advantage of tapping a range of important domains for MS patients, not solely focused on walking but including other aspects of function that might be important to monitor over time. Using the MSRS, patients are easily able to create a longitudinal record of their experience of MS to share with others and to help understand the impact of their treatments (Figure 1). At the time of survey invitation (Fall 2010), 15,219 users had completed at least one MSRS survey, for a total of 72,975 reports on the PatientsLikeMe system. Members join the site understanding that their de-identified data will be used for research as part of the terms of service.

Review and Revision of the MSRS
Cognitive Interviews
To test that the instrument captured all domains considered relevant by patients and to identify areas for improvement, patients were recruited for interview by private message on the PatientsLikeMe platform. All patients were local to Boston, Massachusetts and were selected to represent diverse clinical experience. Patients were offered a $50 honorarium for their participation, which took approximately 2 hours. Feedback served as the basis for a revised MSRS (MSRS revised, MSRS-R).
The wording of MSRS response options was clarified, defining "disability" more clearly and changing the highest response category from "total" to "severe" disability, along with minor text changes (see Table 1 for revisions). The domain "upper limb function" was clarified to "using your arms and hands", and "bowel or bladder" dysfunction was added as a domain of functional impairment (see Table 2 for revisions).

Online survey
The PatientsLikeMe survey system was used to test the psychometric properties of the MSRS-R (incorporating feedback from the cognitive debriefings). The survey consisted of the MSRS-R, a fixed list of MS symptoms, and a report about the patient's most recent relapse. Patients who chose to report on that relapse were asked for its start and end dates, to rate its severity ("mild", "moderate", or "severe"), and whether it had required hospitalization or treatment with steroids, or resulted in any permanent loss of function. The patient then used the MSRS-R to describe their level of disability when they were feeling worst during the relapse.
The remaining sections of the survey were composed of scales identified as being used as secondary outcome measures in clinical trials: the MSIS-29, NARCOMS PS, PDDS, and MSWS-12. Quality-of-life instruments such as the MSQOL-54 were not used because of their predominant focus on QOL as opposed to disability. The PRIMUS consists of 3 components: 22 symptoms, 15 activities, and 22 quality-of-life statements. We made an error in implementing the quality-of-life component and included only the first 12 items, but fortunately these items sample the full range of item locations on the quality-of-life scale, as described by the PRIMUS developers [16]. Those who had completed the survey within one week of initial invitation were asked to complete a 1-week retest, which included only the PatientsLikeMe measures. For this follow-up survey the patients were asked to respond retrospectively about how they were feeling at the time of the first survey. Another instrument, the PRIMUS, was also administered but is not reported here due to the instrument being fielded incorrectly.
Upon site registration, PatientsLikeMe users agree that they may be asked to participate in research; as this was a research study using only online questionnaires with minimal risk, IRB approval was not sought. However, in accordance with the Declaration of Helsinki, participants were informed about the aims of the study and were given the option to opt in without incentive and to opt out without any negative consequences. In order to target active users, we invited patients accessing their accounts during the 90 days prior to 24 August 2010. Five days after the initial invitation, a reminder was sent to all invited patients who had not yet completed the survey. At that point an invitation to the retest was also sent to all who had completed the baseline survey within the first six days of the field period. Retest participants also received a reminder 5 days after the retest invitation.

Statistical Analysis
Data were analyzed using Statistical Package for the Social Sciences (SPSS) V20. Descriptive statistics document the distribution of responses and measurement properties of the MSRS-R. Principal component analysis was used to identify the structure of the MSRS-R. The number of factors was left unconstrained, with eigenvalues >1 initially considered worthy of further interest. We also conducted a parallel analysis, using the procedure of Horn [19] and SPSS syntax [20], in order to compare the magnitude of observed eigenvalues against that generated by random arrangements of the same data.
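The logic of a parallel analysis can be sketched in Python with NumPy. This is an illustration only, not the SPSS syntax used in the study: the function names, the 95th-percentile criterion (Horn's original proposal used the mean of the random eigenvalues), the seeds, and the simulated one-factor data are all assumptions.

```python
import numpy as np


def correlation_eigenvalues(data: np.ndarray) -> np.ndarray:
    """Eigenvalues of the item correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_replications=100, percentile=95.0, seed=0):
    """Return (observed eigenvalues, criterion eigenvalues, components retained)."""
    rng = np.random.default_rng(seed)
    observed = correlation_eigenvalues(data)
    random_eigs = np.empty((n_replications, data.shape[1]))
    for r in range(n_replications):
        # Shuffle each column independently: every item keeps its own
        # distribution, but the correlations between items are destroyed.
        shuffled = np.column_stack([rng.permutation(col) for col in data.T])
        random_eigs[r] = correlation_eigenvalues(shuffled)
    criterion = np.percentile(random_eigs, percentile, axis=0)
    # Count leading components up to the first one that fails the criterion.
    exceeds = observed > criterion
    n_retained = len(observed) if exceeds.all() else int(np.argmin(exceeds))
    return observed, criterion, n_retained


# Simulated data (not study data): 816 respondents, 8 items scored 0-4,
# all driven by a single latent severity factor plus noise.
sim_rng = np.random.default_rng(1)
latent = sim_rng.normal(size=(816, 1))
items = np.clip(np.round(latent + sim_rng.normal(size=(816, 8)) + 1.5), 0, 4)
observed, criterion, n_retained = parallel_analysis(items)
print(n_retained)
```

A component is retained only if its observed eigenvalue exceeds what shuffled data produce at the same position, which is a stricter test than the eigenvalue > 1 rule alone.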
Internal consistency was assessed using item-to-item Spearman correlations and Cronbach's alpha, which should be above 0.7 to be considered adequate. Concurrent validity was tested using correlations between the MSRS-R and other scales, in addition to subscales of the MSIS-29, MSWS-12, and PRIMUS. Test-retest reliability was assessed first with Spearman correlations and then with a Bland-Altman plot.
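For reference, Cronbach's alpha for k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). A minimal standard-library sketch follows; the function name and the tiny example dataset are hypothetical and are not study data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents by 3 items, each scored 0-4.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 4],
    [4, 4, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values above the 0.7 threshold mentioned in the text would be read as adequate internal consistency.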
Known-group validity was assessed by comparing the MSRS-R scores of patients grouped by level of impairment on the self-reported PDDS, which is known to correlate highly with the EDSS, a widely used clinician-rated scale in MS trials. In addition, we used the patient's estimate of the number of relapses they had experienced in the past two years, on the basis that relapses in relapsing-remitting MS contribute to a worsening burden of disability [21]. ANOVA (with Bonferroni corrected post-hoc tests) was used to compare MSRS-R scores in the groups, and it was hypothesized that patients in more severe PDDS groups or with a higher number of relapses in the past two years would have worse (higher) MSRS-R scores. Given the prominence of walking measurements in MS, we also performed similar analyses for the MSRS-R walking item. Other between-group differences were assessed using ANOVA, Student's t-test, and Kruskal-Wallis tests as appropriate. Clinician-assessed validity and responsiveness to change are the subjects of future investigations.
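The one-way ANOVA used for the known-group comparison can be sketched with the standard library alone. The function name and the three small groups are hypothetical, and the sketch computes only the F statistic and its degrees of freedom (k − 1 and N − k for k groups and N respondents); the p-value and the Bonferroni post-hoc tests are not covered.

```python
from statistics import mean
from typing import Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F statistic, between-groups df, within-groups df)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical total scores for three disability groups (not study data).
mild, moderate, severe = [2, 3, 4], [5, 6, 7], [9, 10, 11]
f_stat, df_b, df_w = one_way_anova([mild, moderate, severe])
print(round(f_stat, 2), df_b, df_w)
```

A large F relative to its degrees of freedom indicates that the group means differ by more than the within-group spread would predict.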

Responder characteristics
Data reported here describe patients who self-reported a diagnosis of relapsing-remitting MS (RRMS). All prospective members of the site are invited to add their age and sex to their profile, but not all had chosen to do so by the time of survey. Using available profile data from non-completers, patients who completed the baseline survey were around 3 years older than non-completers (see Table 3). The groups differed significantly on sex, although this difference became a non-significant trend after removing patients without ascertained sex (χ²(1) = 3.476, p = .063). Baseline completers were also more likely to have reported more relapses on their profile in the 2 years prior to the survey, but this may represent different levels of engagement on the website rather than true disease severity; a similar explanation may underlie the similar pattern for most recent MSRS score from the patients' profiles.
A little over half of patients reported a recent relapse (52%, N = 424/816). Of these, 33% (N = 140/424) reported a relapse within the month prior to survey completion, 23% (N = 97/424) reported a relapse between one and three months prior, and 44% (N = 187/424) reported a relapse three or more months prior. The vast majority of relapses (98.7%) were reported from the past decade (2000-2010).
On 14 September, invitations for the 1-week retest were sent to the 391 patients who had completed the survey by this point. 192 RRMS patients (49.1%) completed the retest survey; 27 (6.9%) provided incomplete answers, and 10 (2.6%) opted out at this stage. The remaining 162 (41.4%) made no response to the retest survey invitation. Participants who completed the retest did not differ on sex, age, disease duration, or disease severity from other eligible participants who took the baseline survey (Table 3), or from those who had not completed the baseline survey in time to be eligible for the retest (not shown). Both surveys were closed to further participation on 23 September 2010.

MSRS-R Psychometric Characteristics
The revised MSRS-R measure added "bowel or bladder" as a functional area. The revised measure also characterized levels of impairment using more patient-friendly terms around "activity limitation" in contrast to "functional disability", and characterized the most disabled state as "severe" rather than "total disability". Although we did not design the study to compare severity of disability using the original MSRS and the MSRS-R, we did have a small number of respondents (n = 211) who had used the original MSRS to populate their site profile within a month of completing the baseline MSRS-R for this study. Table 4 shows somewhat greater use of the extreme disability category ("severe") for the MSRS-R compared with the extreme disability category ("total disability") for the original MSRS.

Factor analysis
We assessed the dataset for suitability for factor analysis. Given the baseline respondent sample size of 816, we had approximately 102 participants per variable. Correlations between items were all above ρ = 0.3. Bartlett's test of sphericity [22] was significant at p < 0.001, supporting the factorability of the correlation matrix. The Kaiser-Meyer-Olkin value of sampling adequacy was 0.884, exceeding the recommended value of 0.6 [23,24]. PCA revealed a single factor with an eigenvalue of 4.2 accounting for 51.5% of variance; the second highest eigenvalue was 0.9. Further, the results of a parallel analysis [19] using the same dataset showed no components exceeding the corresponding criterion values (8 variables x 816 respondents x 100 replications); the highest eigenvalue produced by the parallel analysis was 2.0. Table 5 shows the MSRS-R item factor loadings.

Internal consistency
Cronbach's alpha (.86) indicated acceptable internal consistency. Item convergent validity, as assessed by Spearman correlations between items and total score, was also acceptable, ranging from ρ = 0.68 for walking to ρ = 0.77 for using arms and hands.

Test-retest
A Bland-Altman plot of the differences in total score at baseline and 1-week retest (N = 192) showed no systematic pattern and no outliers (see Figure 2). The plot showed that 182 of 192 cases (95%) lie within two standard deviations of the mean (mean difference = 0.74, SD = 2.7, limits of agreement (+/− 2 SD): −4.6 to 6.1). Examination of item-level differences using a Wilcoxon signed rank test revealed significant differences only for the "Numbness, Tingling, Burning Sensation, or Pain" item (z = −4.438, p < 0.001) with a small effect size (ρ = .23) and mean difference of 1.6 points (SD: 1.1).
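The limits of agreement in a Bland-Altman analysis are the mean difference plus or minus a multiple of the standard deviation of the differences. The sketch below uses hypothetical function names and hypothetical paired scores, and the direction of the difference (second administration minus first) is an assumption. Applied to the rounded summary values reported above (0.74 and 2.7), it gives limits close to the reported −4.6 and 6.1; the small gap is consistent with the reported limits having been computed from unrounded values.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def limits_from_summary(
    mean_diff: float, sd_diff: float, n_sd: float = 2.0
) -> Tuple[float, float]:
    """Lower and upper limits of agreement from summary statistics."""
    return mean_diff - n_sd * sd_diff, mean_diff + n_sd * sd_diff


def bland_altman_limits(
    first: Sequence[float], second: Sequence[float], n_sd: float = 2.0
) -> Tuple[float, float]:
    """Limits of agreement from paired scores (second minus first, assumed)."""
    diffs = [s - f for f, s in zip(first, second)]
    return limits_from_summary(mean(diffs), stdev(diffs), n_sd)


# Limits implied by the rounded summary values reported in the text.
lower, upper = limits_from_summary(0.74, 2.7)
print(round(lower, 2), round(upper, 2))

# Hypothetical paired total scores for four respondents (not study data).
demo_lower, demo_upper = bland_altman_limits([10, 12, 8, 15], [11, 12, 10, 14])
print(round(demo_lower, 2), round(demo_upper, 2))
```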

Known group validity
We hypothesized that higher total MSRS-R scores (worse disability) would be observed for those with worsening PDDS status and a higher number of relapses. Between-groups differences on MSRS-R by PDDS levels were found to be significant using ANOVA (F(7,808) = 77.250, p < .001). Post-hoc Bonferroni tests of all pair-wise comparisons revealed significant differences between MSRS-R for normal and mild PDDS from all other PDDS levels (p < .001). "Moderate" and "Gait disability" PDDS levels were not significantly different from one another (p > 0.05). Higher levels of mobility impairment ("early cane", "late cane", "bilateral support") differed significantly from "normal", "mild", and "moderate" disability, but not from each other (p > 0.05). "Use of a wheelchair or scooter" differed only from "normal" or "mild" disability (p < 0.05) on the MSRS-R. No respondents endorsed the most severe category on the PDDS ("Bedridden"). Examination of individual MSRS-R items revealed a much stronger step-wise relationship between PDDS and the MSRS-R walking item than other items, although "bladder & bowel" shows a similar, though less marked pattern (Table 8).

MSRS-R in retrospectively reported relapses
For retrospectively reported relapses captured actively in the survey (as opposed to passively in the patient's site profile), Figure 3 shows the difference between MSRS-R scores at baseline and at the most recent relapse, according to the perceived severity of the relapse ("mild", "moderate", or "severe", N = 424). The largest differences were reported, on average, for walking, upper limb function, and the sensations (numbness, tingling, burning, and pain), followed by vision, speaking, and then swallowing.

Discussion
The original MSRS was designed to minimize respondent burden, using a minimum number of items to cover relevant aspects of patient experience and simple, clear language in both the questions and response options. Patients have indicated in qualitative interviews that the questions are relevant to their experience, easy to understand, and easy to respond to, and provide an accurate profile of their experience of MS over time. Its deployment on PatientsLikeMe led to widespread use by thousands of MS patients, who report that using the site has produced a number of benefits including improved understanding of their condition and improved communication with their healthcare providers [5]. Following cognitive debriefing, a number of small modifications were made to produce the MSRS-R (Revised). After fielding in a survey, statistical analysis shows the MSRS-R exhibits desirable psychometric properties in terms of ceiling and floor effects, internal consistency, factor structure, test-retest, and known-group validity. Importantly, the MSRS-R correlated in expected ways with alternative measures in widespread use (MSIS-29, PDDS, NARCOMS PS, and MSWS-12), suggesting acceptable concurrent validity and potential use as a research tool.
The MSRS-R has the advantage of being more concise than any of the other instruments fielded in this study; for instance, the PDDS requires a patient to read approximately 360 written words to gauge walking disability; the MSRS-R walking item has 33 words and produces very similar results. Furthermore, our analysis confirmed that the PDDS, like the EDSS it is based upon, is predominantly focused on walking; by contrast, the entire MSRS-R covers eight domains but is only 53 words long and uses a consistent response format, which makes it less burdensome for patients to read and complete.
Currently, MS trial design focuses on the frequency of relapses but is uninformed by the nature of these relapses, and so an attack that leaves one patient unable to walk and one that leaves another unable to see are counted the same. Analysis of retrospective relapses in the current study demonstrated that the nature of relapses experienced in this population could be characterized by changes from baseline within specific domains of function using the MSRS-R. This may be useful in improving our understanding of MS characterization, progression, and response to therapy. In addition, our known group validity analysis suggested that the MSRS-R might be more sensitive to cumulative burden of disability resulting from recurrent relapses than the PDDS; further study could compare MSRS-R against other measures of cumulative burden such as magnetic resonance imaging. Following this psychometric validation and upgrade of the existing MSRS to the newer MSRS-R, passively collected profile data in the PatientsLikeMe platform could be studied as a form of observational registry combining demographic, social networking, treatment, and symptom data. Such data would extend to a larger number of patients than described here, and to MS disease types other than RRMS, and could illuminate the real-world impact of newer therapies for MS. For instance, it may be particularly interesting to retrieve the prospective data for patients who did not report any treatment at the time of creating their account and study the amount of MSRS-R change that triggers initiation of treatment, or to gauge the effectiveness of treatment in stabilizing or reducing disability relative to similar patients who did not start treatment.
With regard to administration, although we did not explicitly test for differences between, e.g., paper-and-pencil and online questionnaires, we expect that there would be no or minimal difference between data collected in these modes. The cognitive interviews did not suggest any significant difference between patients' responses on paper and how they would have responded (or how they had responded previously) using the report tools on the PatientsLikeMe platform. The online form is two-dimensional, and no wording or format changes are required to adapt the MSRS-R, symptoms, or relapse questions for paper-and-pencil administration.
The limitations of this study are shared by many postal or internet-based questionnaire designs. We have no independent validation that respondents actually do have MS; however, as there was no incentive for participating, there would be little incentive to enter false data. Our analysis of responders found them to be a little older and more affected by MS than non-respondents; this is perhaps unsurprising given that sicker patients may be more inclined to return to PatientsLikeMe to seek support. One advantage of this data collection platform is that we can systematically describe the population of non-responders. It is likely that the entire PatientsLikeMe population may differ systematically from the broader MS population (see [8] for a comparison with the Sonya Slifka Longitudinal MS Study); therefore, these findings should be generalized cautiously. That said, our findings on the response characteristics of the comparison instruments used proved similar to their own validation studies. The response rate (19%) was relatively low but was not atypical for a survey of this online community [5]. The most significant limitation was that we lacked independent clinical assessment of disease severity from a clinician experienced in the field; we are seeking to address this in future studies.
A further limitation that underscores the difference between passively collected profile data and actively sampled survey data is the mismatch between the 345 completers (43%) who had at least one relapse recorded on their profile and the 424 completers (52%) who reported having at least one relapse when polled on the survey. Passively collected data provides a large body of longitudinal data but suffers from attrition bias; actively collected data provides a more accurate cross-section at one or a few points in time but is more costly to collect and may suffer from responder bias.
Evaluating the quality of the test-retest with a Bland-Altman plot is difficult in the absence of a gold standard and the lack of agreed standards for measurement variability in MS. It is possible that in performing the test-retest, some patients may not have clearly read the instructions to report retrospectively to the first time they completed the survey; if so, the degree of test-retest agreement reported here would be an underestimate and should be investigated further.
A copy of the MSRS-R is included as an appendix to this manuscript, and the instrument is distributed with a Creative Commons "Attribution-ShareAlike 3.0 Unported" license, meaning it can be used freely (including commercially), altered, transformed, or built upon, so long as all derivative work is licensed in the same fashion and proper attribution is made (Additional file 1).

Conclusion
The MSRS-R has been shown to be a useful tool for measuring the impact of MS and may help patients and clinicians understand the course of disease, the impact of their treatments, side effects, and relapses. It is hoped that an enhanced understanding of these aspects of MS may help improve patients' outcomes.