Using the Fatigue Severity Scale to inform healthcare decision-making in multiple sclerosis: mapping to three quality-adjusted life-year measures (EQ-5D-3L, SF-6D, MSIS-8D)



Fatigue has a major influence on the quality of life of people with multiple sclerosis. The Fatigue Severity Scale is a frequently used patient-reported measure of fatigue impact, but does not generate the health state utility values required to inform cost-effectiveness analysis, limiting its applicability within decision-making contexts. The objective of this study was to use statistical mapping methods to convert Fatigue Severity Scale scores to health state utility values from three preference-based measures: the EQ-5D-3L, SF-6D and Multiple Sclerosis Impact Scale-8D.


The relationships between the measures were estimated through regression analysis using cohort data from 1056 people with multiple sclerosis in South West England. Estimation errors were assessed and the predictive performance of the best models was tested in a separate sample (n = 352).


For the EQ-5D and the Multiple Sclerosis Impact Scale-8D, the best performing models used a censored least absolute deviation specification, with Fatigue Severity Scale total score, age and gender as predictors. For the SF-6D, the best performing model used an ordinary least squares specification, with Fatigue Severity Scale total score as the only predictor.


Here we present algorithms to convert Fatigue Severity Scale scores to health state utility values based on three preference-based measures. These values may be used to estimate quality-adjusted life-years for use in cost-effectiveness analyses and to consider the health-related quality of life of people with multiple sclerosis, thereby informing health policy decisions.


Over the last two decades, various disease-modifying and symptomatic treatments have been developed for people with Multiple Sclerosis (MS). Meanwhile, increasing emphasis has been placed on achieving “value for money” within healthcare systems [1]. Clinical trials of interventions that target particular symptoms frequently use symptom-specific outcome measures in order to maximise sensitivity and responsiveness to change. Fatigue is the most common symptom experienced by people with MS, and has a considerable impact on quality of life [2]. The Fatigue Severity Scale (FSS) [3] is frequently used in clinical trials of interventions for fatigue in people with MS, including carnitine, amantadine, aspirin, modafinil and cognitive behavioural therapy [4,5,6,7]. Symptom-specific outcome measures, such as the FSS, provide a standardised means of describing “health states” that may be experienced by patients, but do not provide data in the format required by many decision-making bodies to assess cost-effectiveness [1].

The quality-adjusted life-year (QALY) is recommended for use as an outcome measure for cost-effectiveness analyses by several national decision-making bodies, eg the National Institute for Health and Care Excellence (NICE) [8,9,10]. QALYs combine quantity and quality of life in a single measure, by adjusting the number of life-years lived according to the quality-of-life experienced during those years [1]. In order to estimate QALYs, numerical values must be assigned to reflect the quality of life experienced when living in particular health states. These values are commonly obtained using preference-based measures (PBMs) of health-related quality of life [11].
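As a minimal illustration of the QALY arithmetic described above, the sketch below multiplies the time spent in each health state by the utility value of that state and sums the results. The function name and the example profile are hypothetical, and discounting of future years is deliberately left out.

```python
def qalys(periods):
    """Sum utility-weighted time over (years, utility) pairs.

    Each pair gives the time spent in a health state (in years) and the
    health state utility value assigned to that state.
    """
    return sum(years * utility for years, utility in periods)


# Hypothetical profile: 2 years at utility 0.7, then 1 year at utility 0.5.
example = qalys([(2.0, 0.7), (1.0, 0.5)])
```

Under these assumed values, three life-years are adjusted down to roughly 1.9 QALYs.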

However, many clinical trials do not include a PBM, limiting the ability to conduct economic evaluations. In such cases, statistical procedures may be used to “map” scores on non-preference based outcome measures, such as the FSS, to health state utility values (HSUVs) derived from PBMs. “Mapping” involves regression analysis, using a dataset containing responses to both measures from the same sample, to derive an algorithm that can be used to convert data from non-preference-based measures into HSUVs. Over recent years, the use of mapping has increased considerably [11]. Previous studies have reported on mapping from MS-specific outcome measures including the Multiple Sclerosis Impact Scale and the Multiple Sclerosis Walking Scale-12 [12,13,14]. However, no approach has been reported that uses fatigue measures to map to HSUVs in the context of MS.


This paper uses statistical techniques to map from the FSS (the “source measure”) to HSUVs derived from three preference-based measures: the EQ-5D, SF-6D and MSIS-8D (the “target measures”). The aim is to derive algorithms to convert FSS scores into HSUVs for use in assessing the cost-effectiveness of treatments for fatigue in people with MS. The statistical approach presented here is based on good practice methodology, and is consistent with the recommendations regarding mapping methods from NICE in the UK [15] and the international ISPOR Good Practices for Outcomes Research Task Force [16].


The Fatigue Severity Scale (FSS) has acceptable reliability, internal consistency, sensitivity and responsiveness for people with MS [3, 17,18,19,20,21]. It comprises nine statements, describing the severity and impact of fatigue, with a scale of possible responses ranging from 1 (“strongly disagree”) to 7 (“strongly agree”). FSS total scores are usually reported as the mean score over the nine items; a higher score indicates greater severity.

The EuroQoL EQ-5D-3L has five dimensions (mobility, self-care, usual activities, pain/discomfort, anxiety/depression) with three response levels per dimension - no problems, some problems or extreme problems/confined to bed. HSUVs were derived from the preferences of a representative sample of the UK general population, using a variant of the time trade-off (TTO) technique, and range from − 0.594 to 1.000 [22]. The EQ-5D is widely used in economic evaluations, particularly in the UK, where NICE recommend it as the preferred measure of health outcomes for cost effectiveness analyses [8].

The Short-Form 6D (SF-6D) enables HSUVs to be estimated from a popular non-preference based measure of health-related quality of life (HRQoL), the Short-Form 36 (SF-36). It consists of six dimensions (physical functioning, role limitations, social functioning, pain, mental health, vitality) with between four and six response levels. Preferences were elicited from a representative sample of the UK general population using the standard gamble technique and values range from 0.301 to 1.000 [23]. The dataset used for analysis includes responses to Version 1 of the SF-36 from earlier waves of data collection, before this was replaced by SF-36 Version 2, which was developed to address concerns about the structure and wording of some items [24]. Given that the component items of the SF-6D classification system differ between the two versions, we only included responses to Version 2 of the SF-36 in this analysis, in order to ensure consistency.

The Multiple Sclerosis Impact Scale 8D (MSIS-8D) enables HSUVs to be estimated from responses to a MS-specific outcome measure, the Multiple Sclerosis Impact Scale (MSIS-29). It includes eight dimensions (physical function, social and leisure activities, mobility, daily activities, mental fatigue, emotional well-being, cognition, depression) with four response levels each [25]. HSUVs were derived from a TTO survey with a sample of the UK general population. Values range from 0.079 to 0.882. It was not assumed that the best health state described by the MSIS-8D classification system (ie “no problems” on all dimensions) was equivalent to perfect health, therefore the value of this health state was not constrained to 1 [26]. The MSIS-8D was derived from Version 2 of the MSIS-29 [21], which has four response levels per item, rather than Version 1 of the MSIS-29, which has five response levels [27]. Therefore, although earlier waves of data collection used Version 1 of the MSIS-29, only responses to Version 2 were included in this analysis.


The South West Impact of Multiple Sclerosis (SWIMS) project is a longitudinal cohort study of people with MS aged 18 or over, living in Devon and Cornwall. Respondents complete six-monthly questionnaires, including several patient-reported outcome measures alongside clinical and demographic characteristics. The study was approved in the UK by the Cornwall and Plymouth and South Devon Research Ethics Committees, and written informed consent is obtained from all participants.

This analysis used SWIMS data received between August 2004 and October 2012. Only data collected at baseline were included, as this is the only point at which the FSS, EQ-5D, SF-36 and MSIS-29 are completed simultaneously. A random sample of 75% of the baseline data was used as the estimation dataset (n = 1056), with the remaining 25% constituting the validation dataset (n = 352) [11, 28]. As Table 1 shows, there were no significant differences (at the p < 0.05 level) between the datasets in terms of mean FSS total scores, mean HSUVs, or recorded demographic or clinical characteristics. The mapping algorithms were derived using data from respondents who provided answers to all questions required to produce both a FSS total score and a HSUV from the target PBM: 1023 respondents for the EQ-5D, 607 for the SF-6D and 650 for the MSIS-8D (response numbers are lower for the SF-6D and the MSIS-8D as only Version 2 of these questionnaires was included). All statistical analysis was undertaken in Stata 14.
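The 75%/25% random split described above can be sketched as follows. This is an illustration in Python only (the study itself used Stata 14); the function name, the fixed seed and the use of integer respondent identifiers are assumptions made for the example.

```python
import random


def split_estimation_validation(ids, estimation_share=0.75, seed=0):
    """Randomly partition respondent ids into estimation and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * estimation_share)
    return shuffled[:cut], shuffled[cut:]


# 1056 + 352 = 1408 baseline respondents, as reported in the text.
estimation, validation = split_estimation_validation(range(1408))
```

With 1408 respondents, a 75% share gives exactly the reported sample sizes of 1056 and 352.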

Table 1 Summary of respondent characteristics, comparison of estimation and validation datasets

Preliminary assessment of measures

Two key conditions must be met for mapping: there should be conceptual overlap between the source and target measures, and the target measure should demonstrate discriminative validity with respect to the severity of the condition captured by the source measure [11, 29]. To assess conceptual overlap, the FSS items and the dimensions of the PBMs were allocated to a multi-dimensional conceptual framework, which was developed for this study in order to provide a structure for comparing the content of the measures. The measurement concept underpinning the three PBMs is health-related quality of life (HRQoL) [22, 23, 25]. Therefore, the conceptual framework was structured around the commonly agreed key dimensions of HRQoL, which comprise physical and mental domains alongside a third domain relating to social and role function and participation [30,31,32]. The framework was constructed based on a systematic literature review of qualitative research into the impact of fatigue on people with MS (details of this review are included as Additional file 1: A).

Pearson correlation coefficients were assessed between the total FSS score and HSUVs from each of the PBMs, while Spearman correlation coefficients were assessed between FSS total scores and individual dimension scores for each PBM, and between HSUVs and individual FSS item scores. Assuming that these instruments measure distinct but related concepts, we expected to find relationships of moderate strength, ie correlation coefficients between 0.3 and 0.6 [33]. To assess the discriminative validity of the PBMs, respondents were categorised into fatigue severity groups: “mild/ no fatigue” (FSS total ≤ 35), “moderate fatigue” (36 ≤ FSS total ≤ 52) and “severe fatigue” (FSS total ≥ 53). The definition of “mild/ no fatigue” was based on the published cut-off point for the FSS [17]. The ability of the PBMs to differentiate between the three groups was investigated using ANOVA and standardised effect sizes. Effect sizes can be assessed as small (0.20–0.49), moderate (0.50–0.79) or large (0.80 or over) [34].
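The severity grouping above can be expressed as a short sketch. It assumes that the cut-offs refer to the summed FSS score (nine items scored 1 to 7, giving a range of 9 to 63) rather than to the mean item score; the function names are hypothetical.

```python
def fss_total(item_scores):
    """Sum the nine FSS item scores (each 1-7), giving a total from 9 to 63."""
    if len(item_scores) != 9:
        raise ValueError("the FSS has nine items")
    if any(not 1 <= s <= 7 for s in item_scores):
        raise ValueError("each FSS item is scored from 1 to 7")
    return sum(item_scores)


def fatigue_group(total):
    """Assign the severity group used in the discriminative validity analysis."""
    if total <= 35:
        return "mild/no fatigue"
    if total <= 52:
        return "moderate fatigue"
    return "severe fatigue"
```

For example, a respondent answering 4 on every item has a summed score of 36 and falls just inside the "moderate fatigue" group.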

Development of mapping algorithms

Exploration of model specifications

The relationships between the source and target measures were examined using statistical conventions reported in the mapping literature [29, 35]. The distribution of scores on each of the measures was explored by producing histograms, and the relationship between each of the PBMs and the FSS total score was investigated by producing scatterplots. Five regression models were estimated for each PBM. HSUVs were regressed on the:

  • Total FSS score (Model A);

  • Total FSS score and total FSS score squared (Model B);

  • Total FSS score, age and gender (Model C);

  • FSS item scores (Model D);

  • FSS item scores, age and gender (Model E).
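To make the simplest specification concrete, the sketch below fits Model A (HSUV regressed on the FSS total score alone) by ordinary least squares using the closed-form solution for a single predictor. It is an illustration only: the study fitted its models in Stata, the function name is hypothetical, and the toy data are invented for the example and are not SWIMS data. Models B to E add further predictors and would need a multiple regression routine.

```python
def fit_model_a(fss_totals, utilities):
    """OLS fit of: utility = intercept + slope * FSS total."""
    n = len(fss_totals)
    mean_x = sum(fss_totals) / n
    mean_y = sum(utilities) / n
    # Sum of squares of the predictor and cross-product with the outcome.
    sxx = sum((x - mean_x) ** 2 for x in fss_totals)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(fss_totals, utilities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented toy data for illustration only (not SWIMS data).
toy_fss = [9, 18, 27, 36, 45, 54, 63]
toy_utility = [0.90, 0.82, 0.80, 0.70, 0.66, 0.55, 0.52]
intercept, slope = fit_model_a(toy_fss, toy_utility)
```

A negative slope corresponds to the expected direction of the relationship: higher fatigue scores map to lower utility values.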

The majority of mapping studies estimate algorithms using ordinary least squares (OLS) models [35]. However, OLS models can predict values outside the possible range for a PBM, and can lack predictive accuracy for extreme HSUVs. To address this, Tobit models were also considered, specifying an upper limit of 1 [29]. OLS and Tobit models rely on an assumption of no heteroscedasticity. Where this assumption was violated according to White’s test for heteroscedasticity, the ‘vce(robust)’ option was used in conjunction with the ‘regress’ command for the OLS analyses, and Censored Least Absolute Deviation (CLAD) estimation methods [36] were used instead of Tobit models, employing the ‘clad’ command with a specified upper limit of 1.

Predictive ability was assessed using the following estimation errors: mean absolute error (MAE), root mean squared error (RMSE) and the proportions of estimates that fell within 0.05, 0.10 and 0.25 of the observed HSUV. MAE was selected as the primary criterion for selection of the preferred models [11]. However, if coefficients had unexpected signs these models were not selected. In instances where model MAEs were the same, the model with the best profile of estimates falling within 0.05, 0.10 and 0.25 of the observed HSUV was selected.
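The estimation errors listed above can be computed as in the following sketch. The function name is hypothetical, and treating "within" a threshold as inclusive (absolute error less than or equal to the threshold) is an assumption, since the text does not say how boundary cases were handled.

```python
import math


def estimation_errors(observed, predicted, thresholds=(0.05, 0.10, 0.25)):
    """Return MAE, RMSE and the share of absolute errors within each threshold."""
    abs_errors = [abs(o - p) for o, p in zip(observed, predicted)]
    n = len(abs_errors)
    mae = sum(abs_errors) / n
    rmse = math.sqrt(sum(e ** 2 for e in abs_errors) / n)
    # Assumption: an error exactly equal to the threshold counts as "within".
    within = {t: sum(e <= t for e in abs_errors) / n for t in thresholds}
    return mae, rmse, within
```

Because RMSE squares each error before averaging, it penalises a few large misses more heavily than MAE does, which is why the two statistics can rank models differently.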

Two researchers decided independently which models to take forward for validation. Where discrepancies arose, these were resolved through discussion until consensus was reached. Demographic variables may not be included in the datasets from which HSUVs are to be estimated. Therefore, where the best performing models included demographic variables, the best performing model without demographic variables was also selected.

Validation and model selection

Estimation errors were assessed according to the severity of the health state. The selected models were applied to the validation dataset and their performance was assessed using the criteria outlined above.


Preliminary assessment of measures

The conceptual framework that was developed to assess conceptual overlap between the measures is illustrated in Fig. 1. Most of the themes that had been identified in the original qualitative research studies fitted into the three domains of HRQoL that were defined a priori. There were two notable exceptions. Several of the themes described the experience of fatigue itself, rather than its effect on HRQoL. This experience was clearly of great importance to the people with MS who contributed to the original research, and underpinned the ways in which fatigue impacts upon HRQoL. Therefore, an additional domain was added: “Descriptions of fatigue”. In terms of the links between themes, a clear relationship emerged between “functioning and participation” and “psychological well-being”. People with MS specifically identified negative effects on their psychological well-being that were caused by the impact of their fatigue on their functioning and participation. These stood alongside, but distinct from, the direct impact of fatigue on psychological well-being. Therefore, this became a domain in its own right.

Fig. 1 Conceptual framework

In terms of conceptual overlap, the FSS and all PBMs cover the three primary domains of the conceptual framework (Physical, Mental and Participation Effects) (Table 2). Coverage of Participation Effects is strong across all four measures. The FSS, SF-6D and MSIS-8D capture a wide range of Physical Effects, whereas the EQ-5D includes only specific dimensions for pain/discomfort and mobility. In terms of Mental Effects, the FSS includes one item relating to motivation, while the PBMs describe other specific symptoms eg depression or anxiety. Only the MSIS-8D includes cognitive effects. The MSIS-8D and SF-6D include dimensions relating specifically to fatigue or vitality.

Table 2 Comparison of measures against conceptual framework

Significant (p < 0.0001) moderate correlations were evident between the FSS total score and HSUVs derived from the EQ-5D (r = − 0.455) and the MSIS-8D (− 0.590). There was a large significant correlation (p < 0.0001) between the FSS total score and HSUVs derived from the SF-6D (− 0.647). The FSS total score was significantly correlated with all individual dimensions of the PBMs, and HSUVs derived from each of the PBMs were significantly correlated with all individual items of the FSS (p < 0.0001). Most correlations were moderate, as anticipated, and all had the expected negative sign, ie higher FSS scores are related to lower HSUVs (Table 3).

Table 3 Correlations between Fatigue Severity Scale and preference-based measures

28.4% of respondents with a valid FSS total score were in the “mild/ no fatigue” category, 36.6% were in the “moderate fatigue” category and 35.0% were in the “severe fatigue” category. All PBMs discriminated significantly between fatigue severity groups (p < 0.0001). The SF-6D performed particularly well, with large standardised effect sizes (≥0.80). Overall, standardised effect sizes were higher for the MSIS-8D than for the EQ-5D (Table 4).

Table 4 Discriminative validity

As a result of the preliminary assessments, it was judged that conceptual overlap and discriminative validity were sufficient to proceed with the estimation of mapping models. Overall, the SF-6D and MSIS-8D provide a better fit with the FSS.

Results of mapping analysis

Exploration of model specifications

In order to allow for heteroscedasticity, skewness and kurtosis identified in the data, we fitted robust OLS models and used a CLAD rather than a Tobit specification. (The distribution of scores on each of the measures, and the relationships between scores on the PBMs and the FSS total score, are shown in Additional file 2: B and Additional file 3: C.) Thirty models were considered, with Models A to E estimated for each PBM, using both OLS and CLAD specifications.

There was little difference between the predictive ability of the models based on FSS total scores and individual FSS items. In all models, item FSS-08 had a significant coefficient with an unexpected sign, and a majority of the FSS items (ranging from five to seven of the nine items) were not significant predictors of HSUVs. Furthermore, data on individual FSS items may not be available in all potential applications of the mapping algorithms. Therefore selection was restricted to algorithms based on the FSS total score.


For the EQ-5D, CLAD C had the lowest MAE and the highest proportion of individuals with small prediction errors. We also selected CLAD A, as the model with the lowest MAE among those that did not include demographic variables.


For the SF-6D, OLS B and CLAD B had coefficients with unexpected signs and were, therefore, not selected. We selected CLAD C as it had the next lowest MAE, and OLS A and CLAD A, as they did not include demographic variables.


For the MSIS-8D, CLAD B and OLS B had the lowest MAEs; however, these had unexpected signs for FSS total and so were not selected. The model with the next lowest MAE and the highest proportion of individuals with small prediction errors was CLAD C. As this model included demographic variables, we also selected the model with the next lowest MAE (0.117), CLAD A.

Details of the selected models are presented in Table 5. All model results are provided in Additional file 4: D.

Table 5 Models mapping from FSS total to PBMs using estimation dataset

Validation and model selection

The validation dataset was used to assess estimation errors for the selected models (Table 6). Table 7 shows MAEs for ‘poor’ and ‘good’ health states by model. The models predicting HSUVs for the EQ-5D and MSIS-8D had larger MAEs for poorer health states, indicating that these models performed less well at estimating scores for those in poorer health states. The opposite was true for the SF-6D models, although the difference in MAEs here was less marked. (Please see Additional file 5: E and Additional file 6: F).

Table 6 Models mapping from FSS total to PBMs using validation dataset
Table 7 Mean absolute errors by severity group


Here we describe and demonstrate a method for converting responses to the FSS, a frequently-used measure of fatigue severity, into HSUVs, which can be used to estimate QALYs for use in cost-effectiveness analyses, and hence to inform decision-making regarding the availability of treatments for MS-related fatigue. According to the Oxford Health Economics Research Centre’s Mapping Database, last updated in April 2019 [37], no previous published studies have attempted mapping from the FSS. In addition, we have found no previous studies which have investigated correlations between the FSS and the SF-6D or the FSS and the MSIS-8D, and just two which have explored the relationship between the FSS and the EQ-5D [38, 39]. Rosa et al. [39] correlated FSS total scores with participants’ scores on the EQ-5D visual analogue scale, rather than with the EQ-5D HSUVs that are relevant for mapping, and Tremmas et al. [38] found no statistically significant correlation between the FSS and EQ-5D scores of people with lung cancer.

The ability of the models selected in the current study to predict SF-6D and MSIS-8D values is in keeping with results reported in other mapping studies [35]. There are currently no guidelines regarding acceptable limits for estimation errors [13], but MAEs ranging from 0.0011 to 0.19 have been previously described [35]. In the current study, the SF-6D MAEs of 0.078 and 0.077 and the MSIS-8D MAEs of 0.117 and 0.116 fall well within this range and, specifically in the context of MS, they are in keeping with the MAE of 0.058 reported by Hawton et al. [12] when the MSIS-29 was mapped to the SF-6D.

Results for the EQ-5D algorithms were less convincing. The prediction errors of 0.175 and 0.173 are towards the higher end of MAEs reported in previous mapping studies [35], and are also high in the context of MS mapping studies. Versteegh et al. [13] mapped from version 1 of the MSIS-29 to the EQ-5D, with resulting MAEs of 0.13 and 0.16, and Hawton and colleagues [12] mapped from version 2 of the same measure to the EQ-5D with a MAE of 0.147. In addition, when testing the external validity of the Versteegh et al. [13] algorithm, Ernstsson et al. [40] reported a MAE of 0.12.

Information is inevitably lost in the process of mapping, as the resulting algorithm will only reflect the areas of content that overlap between the starting and target measures. This information loss is accentuated when a domain-specific, condition-specific measure, such as the FSS, is mapped to a generic, multi-dimensional measure, such as the EQ-5D. Therefore, greater prediction errors might be anticipated when mapping from a uni-dimensional scale such as the FSS than when mapping from a multi-dimensional scale such as the MSIS-29 [41]. However, this does not appear to hold in the MS mapping literature to date, with Hawton et al. [14] reporting a MAE of 0.148 when they mapped from the MS Walking Scale-12 (a mobility-specific, MS-specific measure) to the EQ-5D, and Sidovar et al. [42] describing an error statistic of 0.109 when mapping between these same measures.

In the current study, the EQ-5D algorithms were particularly problematic for HSUVs below 0.65. They did not predict any values below 0.54 (assuming an age of 50 years and female gender for CLAD Model C), which is of particular concern for a measure with a minimum value of − 0.594.

On the basis of the statistical assessments reported here, the qualitative assessments of conceptual validity, and setting our findings in the context of other mapping studies in MS and mapping studies more generally, we suggest the use of the following algorithms for mapping from the FSS to HSUVs.

SF-6D estimate = 0.897 − 0.006 × FSS total score

MSIS-8D estimate = 1.084 − 0.008 × FSS total score − 0.001 × age − 0.024 × gender [0 male, 1 female]

or, if age and gender are not available:

MSIS-8D estimate = 0.985 − 0.007 × FSS total score

Based on these same assessments, we suggest the EQ-5D algorithms are far less likely to produce accurate or valid estimates of EQ-5D scores.
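The suggested SF-6D and MSIS-8D algorithms can be applied as in the sketch below. The function and parameter names are hypothetical, and the FSS total is assumed to be the summed score over the nine items (9 to 63), in line with the severity cut-offs used earlier. The coefficients are the rounded values given in the text, so results are approximate.

```python
def sf6d_from_fss(fss_total):
    """SF-6D estimate from the FSS total score (total score only)."""
    return 0.897 - 0.006 * fss_total


def msis8d_from_fss(fss_total, age=None, female=None):
    """MSIS-8D estimate; uses age and gender only when both are supplied."""
    if age is None or female is None:
        # Fallback algorithm for datasets without demographic variables.
        return 0.985 - 0.007 * fss_total
    gender = 1 if female else 0  # coding from the text: 0 male, 1 female
    return 1.084 - 0.008 * fss_total - 0.001 * age - 0.024 * gender
```

For a hypothetical respondent with an FSS total of 36, this gives an SF-6D estimate of about 0.681 and an MSIS-8D estimate of about 0.733 without demographics, or about 0.722 for a 50-year-old woman.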

There are a number of potential limitations of this work. Firstly, the SWIMS data were collected prior to the development and use of the EQ-5D-5L and the mapping algorithms were based on the ‘older’ EQ-5D-3L. It may have been expected that the EQ-5D-5L would supersede the EQ-5D-3L as it was developed with five, rather than the original three, levels in an attempt to improve its responsiveness. However, the English HSUV set for the EQ-5D-5L is not in common use, and if using the EQ-5D-5L descriptive system, the current ‘position statement’ of NICE is to use a cross-walk algorithm to provide HSUVs from the EQ-5D-3L value set. Secondly, the SF-6D value set is based on the use of standard gamble to elicit preferences for health states. This may result in higher HSUVs (than the EQ-5D), as respondents tend to be risk averse. Thirdly, we did not explore the performance of some of the ‘newer’ mapping model specifications, such as limited dependent variable mixture models or beta-based regression, which may have better accounted for the bi-modal nature of the EQ-5D data. There is some empirical evidence in support of these models, but the ISPOR Task Force report [16] does not advocate any specific regression approach for mapping, recognising that the performance of different methods will vary depending on a number of factors including the nature of the starting/target measures, the disease, and the patient population. The report suggests it is wise to use a model type for which there is existing evidence of good performance. In the context of MS, mapping algorithms which have used the same regression approaches that we have used here have been reported with MAEs of 0.058 [12], 0.13 and 0.16 [13], 0.147 [12], 0.12 [40], 0.148 [14] and 0.109 [42]. Brazier et al.’s [35] systematic review of mapping studies reported MAEs of 0.0011 to 0.19. Therefore, the regression approaches in the current paper have a track record of use and acceptability in the context of MS.
The MAEs reported here for the SF-6D and MSIS-8D are in keeping with those reported in these other mapping studies. The poor performance of the EQ-5D algorithms is likely to be a function of the limited conceptual overlap between the EQ-5D and the FSS. The limited shared conceptual content of these measures will not be altered by using a different form of regression analysis. Fourthly, algorithms to predict HSUVs from individual FSS items, rather than the total score, were not generated by this study. This was, in part, due to an anomaly affecting item FSS-08 (Fatigue is among the most disabling of my symptoms). While the item correlated negatively (as expected) with HSUVs when considered in isolation, it had a positive coefficient when included as an independent variable in regression analysis. Further research would be required to understand the mechanisms behind this; in the meantime, it is not possible to determine whether this item is suitable for inclusion in a mapping algorithm.

A particular strength of this study is the nature of the SWIMS dataset. It has provided comprehensive data on which to base the estimation and validation of these mapping algorithms. Importantly, the cohort is comparable with other UK-based samples of people with MS in terms of age, gender, relapse rates and duration of illness [8, 43,44,45,46,47], meaning the algorithms should apply generally to people with MS, rather than just to specific sub-groups. In addition, the work undertaken to explore the content overlap between the measures provided a form of ‘triangulation’ in assessing the appropriateness of the mapping algorithms. Drawing on good quality qualitative research findings regarding the impacts of fatigue on HRQoL and developing a conceptual framework, provided unique insights into why the measures did and did not map well.

It is acknowledged that mapping methods are a second-best option to directly collected HSUVs for estimating QALYs [29, 41, 48]. Use of mapping increases the uncertainty and error around estimates of HSUVs [29], and is particularly problematic when there is little content overlap or relationship between the measures being mapped to and from [41]. However, when PBM data are not collected directly in a trial, empirically-evidenced mapping algorithms may be used. With the exception of the EQ-5D, the algorithms reported here can be used to support improvements in decision-making where primary PBM data are unavailable.


We present statistical algorithms that allow data from the FSS, a fatigue-specific patient-reported outcome measure, to be used in the estimation of QALYs, which are a suitable and policy-relevant measure for use in cost-effectiveness analyses. This will enable the results of studies using the FSS to inform decision-making in a health technology assessment context.

Availability of data and materials

The data that support the findings of this study are available from the SWIMS Data-Sharing Committee.



Abbreviations

CLAD: Censored least absolute deviation
EQ-5D: EuroQoL EQ-5D-3L
FSS: Fatigue Severity Scale
HSUV: Health state utility value
MS: Multiple sclerosis
MSIS-8D: Multiple Sclerosis Impact Scale-8D
NICE: National Institute for Health and Care Excellence
OLS: Ordinary least squares
PBM: Preference-based measure
QALY: Quality-adjusted life-year
SF-6D: Short-Form 6D
SWIMS: South West Impact of MS study
TTO: Time trade-off


  1. Brazier J, Ratcliffe J, Salomon J, Tsuchiya A. Measuring and valuing health benefits for economic evaluation. Oxford: Oxford University Press; 2007.

  2. Zajicek J, Freeman J, Porter B. Multiple sclerosis care: a practical manual. Oxford: Oxford University Press; 2007.

  3. Flachenecker P, Kümpfel T, Kallmann B, Gottschalk M, Grauer O, Rieckmann P, et al. Fatigue in multiple sclerosis: a comparison of different rating scales and correlation to clinical parameters. Mult Scler. 2002;8:523–6.

  4. Tomassini V, Pozzilli C, Onesti E, Pasqualetti P, Marinelli F, Pisani A, et al. Comparison of the effects of acetyl l-carnitine and amantadine for the treatment of fatigue in multiple sclerosis: results of a pilot, randomised, double-blind, crossover trial. J Neurol Sci. 2004;218:103–8.

  5. Shaygannejad V, Janghorbani M, Ashtari F, Zakeri H. Comparison of the effect of aspirin and amantadine for the treatment of fatigue in multiple sclerosis: a randomized, blinded, crossover study. Neurol Res. 2012;34:854–8.

  6. Rammohan K, Rosenberg J, Lynn D, Blumenfeld A, Pollak C, Nagaraja H. Efficacy and safety of modafinil (Provigil®) for the treatment of fatigue in multiple sclerosis: a two Centre phase 2 study. J Neurol Neurosurg Psychiatry. 2002;72:179–83.

  7. van Kessel K, Moss-Morris R, Willoughby E, Chalder T, Johnson M, Robinson E. A randomized controlled trial of cognitive behavior therapy for multiple sclerosis fatigue. Psychosom Med. 2008;70:205–13.

  8. Jones K, Ford D, Jones P, John A, Middleton R, Lockhart-Jones H, et al. How people with multiple sclerosis rate their quality of life: an EQ-5D survey via the UK MS register. PLoS One. 2013;8(6):e65640.

  9. Guidelines for the economic evaluation of health technologies 4th edition. Ottawa: Canadian Agency for Drugs and Technologies in Health (CADTH). 2017; pp.1–76.

  10. Guidelines for preparing submissions to the Pharmaceutical Benefits Advisory Committee (Version 5.0). Barton, Australia: Pharmaceutical Benefits Advisory Committee, Australian Government, Department of Health and Ageing; 2016.

  11. Petrou S, Rivero-Arias O, Dakin H, Longworth L, Oppe M, Froud R, et al. The MAPS reporting statement for studies mapping onto generic preference-based outcome measures: explanation and elaboration. Pharmacoeconomics. 2015;33:993–1011.

  12. Hawton A, Green C, Telford C, Zajicek J, Wright D. Using the multiple sclerosis impact scale to estimate health state utility values: mapping from the MSIS-29, version 2, to the EQ-5D and the SF-6D. Value Health. 2012;15:1084–91.

  13. Versteegh M, Rowen D, Luime J, Boggild M, Uyl-de Groot C, Stolk E. Mapping QLQ-C30, HAQ, and MSIS-29 on EQ-5D. Med Decis Mak. 2012;32:554–68.

  14. Hawton A, Green C, Telford C, Wright D, Zajicek J. The use of multiple sclerosis condition-specific measures to inform health policy decision-making: mapping from the MSWS-12 to the EQ-5D. Mult Scler. 2012;18:853–61.

  15. Guide to the methods of technology appraisal 2013. London: National Institute for Health and Care Excellence; 2013; pp.1–93.

  16. Wailoo A, Hernandez-Alava M, et al. Mapping to estimate health-state utility from non–preference-based outcome measures: an ISPOR good practices for outcomes research task force report. Value Health. 2017;20(1):18–27.

  17. Krupp L, LaRocca N, Muir-Nash J, Steinberg A. The fatigue severity scale. Application to patients with multiple sclerosis and systemic lupus erythematosus. Arch Neurol. 1989;46:1121–3.

  18. Learmonth Y, Dlugonski D, Pilutti L, Sandroff B, Klaren R, Motl R. Psychometric properties of the fatigue severity scale and the modified fatigue impact scale. J Neurol Sci. 2013;331:102–7.

  19. Valko P, Bassetti C, Bloch K, Held U, Baumann C. Validation of the fatigue severity scale in a Swiss cohort. Sleep. 2008;31(11):1601–7.

  20. Armutlu K, Korkmaz N, Keser I, Sumbuloglu V, Akbiyik DI, Guney Z, Karabudak R. The validity and reliability of the fatigue severity scale in Turkish multiple sclerosis patients. Int J Rehabil Res. 2007;30:81–5.

  21. Hjollund N, Andersen J, Bech P. Assessment of fatigue in chronic disease: a bibliographic study of fatigue measurement scales. Health Qual Life Outcomes. 2007;5(12):1–5.

  22. Dolan P. Modeling valuations for EuroQol health states. Med Care. 1997;35:1095–108.

  23. Brazier J, Roberts J, Deverill M. The estimation of a preference-based measure of health from the SF-36. J Health Econ. 2002;21:271–92.

  24. Jenkinson C, Stewart-Brown S, Petersen S, Paice C. Assessment of the SF-36 version 2 in the United Kingdom. J Epidemiol Community Health. 1999;53:46–50.

  25. Goodwin E, Green C. A quality-adjusted life-year measure for multiple sclerosis: developing a patient-reported health state classification system for a multiple sclerosis-specific preference-based measure. Value Health. 2015;18:1016–24.

  26. Goodwin E, Green C, Spencer A. Estimating a preference-based index for an eight dimensional health state classification system derived from the multiple sclerosis impact scale (MSIS-29). Value Health. 2015;18:1025–36.

  27. Hobart J, Cano S. Improving the evaluation of therapeutic interventions in multiple sclerosis: the role of new psychometric methods. Health Technol Assess. 2009;13(12):1–177.

  28. Dakin H, Petrou S, Haggard M, Benge S, Williamson I. Mapping analyses to estimate health utilities based on responses to the OM8-30 otitis media questionnaire. Qual Life Res. 2010;19:65–80.

  29. Longworth L, Rowen D. Technical support document 10: the use of mapping methods to estimate health state utility values. National Institute for Health and Care Excellence Decision Support Unit; 2011.

  30. Riazi A. Patient-reported outcome measures in multiple sclerosis. Int MS J. 2006;13:92–9.

  31. Ware J. Conceptualization and measurement of health-related quality of life: comments on an evolving field. Arch Phys Med Rehabil. 2003;84(Suppl 2):S43–51.

  32. European Medicines Agency. Reflection paper on the regulatory guidance for the use of health-related quality of life (HRQL) measures in the evaluation of medicinal products. EMEA/CHMP/EWP/139391/2004. London: European Medicines Agency; 2005.

  33. Nunnally J, Bernstein I. Psychometric theory. New York: McGraw-Hill; 1994.

  34. Cohen J. Statistical power analysis for the behavioural sciences. Hillsdale: Lawrence Erlbaum Associates; 1988.

  35. Brazier J, Yang Y, Tsuchiya A, Rowen D. A review of studies mapping (or cross walking) non-preference based measures of health to generic preference-based measures. Eur J Health Econ. 2010;11:215–25.

  36. Powell J. Least absolute deviations estimation for the censored regression model. J Econom. 1984;25:303–25.

  37. HERC database of mapping studies Version 7.0. 24th April 2019.

  38. Tremmas I, Petsatodis G, Potoupnis M, Laskou S, Giannakidis D, Mantalovas S, et al. Monitoring changes in quality of life in patients with lung cancer under treatment with chemotherapy and co administration of zoledronic acid by using specialized questionnaires. J Cancer. 2018;9(10):1731–6.

  39. Rosa K, Fu M, Gilles L, Cerri K, Peeters M, Bubb J, et al. Validation of the Fatigue Severity Scale in chronic hepatitis C. Health Qual Life Outcomes. 2014;12(90):1–12.

  40. Ernstsson O, Tinghög P, Alexanderson K, Hillert J, Burström K. The external validity of mapping MSIS-29 on EQ-5D among individuals with multiple sclerosis in Sweden. MDM Policy Pract. 2017;2:1–9.

  41. Round J, Hawton A. Statistical alchemy: conceptual validity and mapping to generate health state utility values. Pharmacoecon Open. 2017;1(4):233–9.

  42. Sidovar M, Limone B, Lee S, Coleman C. Mapping the 12-item multiple sclerosis walking scale to the EuroQol 5-dimension index measure in North American multiple sclerosis patients. BMJ Open. 2013;3:1–6.

  43. Confavreux C, Compston A. The natural history of multiple sclerosis. In: Compston A, editor. McAlpine's multiple sclerosis. Philadelphia: Churchill Livingstone Elsevier; 2006.

  44. Ford H, Gerry E, Airey C, et al. The prevalence of multiple sclerosis in the Leeds health authority. J Neurol Neurosurg Psychiatry. 1998;64:605–10.

  45. Forbes R, Wilson S, Swingler R. The prevalence of multiple sclerosis in Tayside, Scotland: do latitudinal gradients really exist? J Neurol. 1999;246:1033–40.

  46. Fox C, Bensa S, Bray I, Zajicek J. The epidemiology of multiple sclerosis in Devon: a comparison of new and old classification criteria. J Neurol Neurosurg Psychiatry. 2004;75:56–60.

  47. Robertson N, Deans J, Fraser M, et al. Multiple sclerosis in the north Cambridgeshire districts of East Anglia. J Neurol Neurosurg Psychiatry. 1995;59:71–6.

  48. McCabe C, Edlin R, Meads D, Brown C, Kharroubi S. Constructing indirect utility models: some observations on the principles and practice of mapping to obtain health state utilities. Pharmacoeconomics. 2013;31(8):635–41.


The authors are grateful to the SWIMS Project participants for allowing access to data they provided for the SWIMS Project. The authors acknowledge the SWIMS Project Team for delivering these data. This publication is the work of the authors, who will serve as guarantors for the contents of this publication. This publication does not necessarily reflect the views of the SWIMS Project Team nor the SWIMS Data-Sharing Committee.


This work was supported by the Multiple Sclerosis Society of Great Britain and Northern Ireland and the UK NIHR Collaboration for Leadership in Applied Health Research and Care of the South West Peninsula (PenCLAHRC). The funding agreements ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.

The views expressed in this publication are those of the authors and not necessarily those of the Multiple Sclerosis Society, the UK NIHR or the Department of Health.

The Multiple Sclerosis Society of Great Britain and Northern Ireland and the Peninsula Medical School Foundation provided support for the SWIMS Project.

Author information

All authors conceived the idea for the research. EG conducted the data analysis with support and supervision from AH. EG drafted the article, and CG and AH provided suggestions and edits. All authors approved the final version of the paper.

Corresponding author

Correspondence to A. Hawton.

Ethics declarations

Ethics approval and consent to participate

The SWIMS study was approved in the UK by the Cornwall and Plymouth and South Devon Research Ethics Committees, and written informed consent was obtained from all participants.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Development of a conceptual framework describing the impact of fatigue on people with MS: a systematic review of the literature. (DOCX 163 kb)

Additional file 2:

Histograms of source and target measures. (DOCX 119 kb)

Additional file 3:

Scatterplots of FSS and PBM scores. (DOCX 204 kb)

Additional file 4:

All model results. (XLSX 52 kb)

Additional file 5:

Scatterplots of observed vs predicted HSUVs. (DOCX 320 kb)

Additional file 6:

Observed versus predicted HSUVs by severity. (XLSX 359 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Goodwin, E., Hawton, A. & Green, C. Using the Fatigue Severity Scale to inform healthcare decision-making in multiple sclerosis: mapping to three quality-adjusted life-year measures (EQ-5D-3L, SF-6D, MSIS-8D). Health Qual Life Outcomes 17, 136 (2019).
