Validation of a three-item Fatigue Severity Scale for patients with substance use disorder: a cohort study from Norway for the period 2016–2020
Health and Quality of Life Outcomes volume 19, Article number: 69 (2021)
Little attention has been paid to customising fatigue questionnaires for patients with Substance Use Disorders (SUDs). The present study aims to validate and shorten the nine-item Fatigue Severity Scale (FSS-9) and Visual Analogue Fatigue Scale (VAFS) for use with this population.
We used data from a nested cohort with annual health assessments with responses on the FSS-9 and VAFS. During the period 2016–2020, 917 health assessments were collected from 655 patients with SUD in Bergen and Stavanger, Norway. A total of 225 patients answered the health assessment at least twice. We defined baseline as the first annual health assessment when the health assessments were sorted chronologically per patient. We checked for internal consistency, and we used longitudinal confirmatory factor analysis (CFA) and linear mixed model (LMM) analysis to validate and shorten the FSS-9 and VAFS.
The internal consistency of the FSS-9 was excellent with a Cronbach’s α of 0.94 at baseline and 0.93 at the second annual health assessment. When shortening the FSS-9 to a three-item FSS (FSS-3, items 5–7), the Cronbach’s α was 0.87 at baseline and 0.84 at the second health assessment. The internal consistency was not affected when the VAFS was added to the FSS-3 and the FSS-9. The longitudinal CFA model showed a well-fitting model for the FSS-3 (χ2 = 13.33, degree of freedom = 8, P = 0.101). The LMM analysis showed equal linear changes at the individual level for the FSS-3 (slope: 0.00, P > 0.05) and FSS-9 (slope: 0.01, P > 0.05) between the health assessments.
The FSS-9 could be shortened to the FSS-3 with high validity and reliability for patients with SUDs and the addition of VAFS did not provide much added variability.
Fatigue is a common subjective health complaint that significantly affects patients with chronic diseases [1,2,3]. For patients with substance use disorders (SUDs), fatigue is associated with underlying physical and mental health problems, such as hepatitis C virus infection, depression, and an increased risk of suicide [4,5,6]. However, evidence about fatigue in the SUD population is scarce, and validated questionnaires for surveying fatigue and investigating how it varies over time do not exist for this patient group. To ensure high-quality clinical trials focusing on fatigue, developing a validated fatigue questionnaire customised for patients with SUDs is therefore of particular interest.
Patients with SUDs often live in a chaotic situation with extensive medical and psychosocial health challenges, including polysubstance use, substance intoxications and withdrawals, psychiatric comorbidities (e.g., attention deficit hyperactivity disorder, psychosis disorders, or personality disorders), chronic viral hepatitis, financial risk, and temporary living situations [7, 8]. This might make surveying fatigue with questionnaires particularly challenging and might influence the patients’ fatigue experiences and how they respond to questionnaires [9,10,11]. Using simple wording and phrases, avoiding questions that differ only subtly and therefore appear repetitive, and administering measures that produce reliable and valid scores from very few questions might be essential to ensuring reliable results in clinical trials on fatigue in this population.
The nine-item Fatigue Severity Scale (FSS-9) and the Visual Analog Fatigue Scale (VAFS) are two well-known fatigue questionnaires. Evidence for the validity of the FSS-9 has been obtained for a wide range of chronic infectious and neurological diseases, such as hepatitis C virus infection, Parkinson’s disease, myasthenia gravis, systemic lupus erythematosus, and stroke, while the VAFS has shown evidence of good validity and reliability in post-stroke patients. In two validation studies, the VAFS was used alongside the FSS-9 with high correlations [15, 16]. Furthermore, excellent validity and reliability were achieved when the FSS-9 was shortened into a seven-item version [17, 18], while uncertain results were reported when a five-item version was validated.
With this validation study we sought to address two gaps in the literature on the assessment of fatigue in patients with SUDs: (1) to evaluate whether the scores produced by a shortened version of FSS-9 can be interpreted similarly to the scores produced by the FSS-9, and (2) to determine whether there is value to adding the VAFS to either the FSS-9 or the shortened version of FSS-9.
We hypothesised that a short version of the FSS-9, derived using the FSS-9 and VAFS questionnaires, would demonstrate strong evidence of reliability and validity when used with patients who have SUDs.
We drew data from a nested cohort from the INTRO-HCV trial that collected data on patients with SUDs in Bergen and Stavanger, Norway. We recruited patients receiving opioid agonist therapy in Bergen and Stavanger and patients with SUDs receiving healthcare from the primary health clinics in the municipality of Bergen. This study included all patients in the cohort who had answered the FSS-9 and/or VAFS in the study period from May 2016 to January 2020.
All included patients were invited to an annual health assessment, including FSS-9 and VAFS measurements and a survey of their current sociodemographic situation. We collected all data in a health register using data collection software (Checkware®) under the supervision of research nurses.
We conducted 917 health assessments of 655 patients, and this included 916 FSS-9 measurements and 915 VAFS measurements during the study period. We defined a measurement as a health assessment in which at least one of the items in the FSS-9 or the VAFS was answered. Baseline was defined as the first health assessment including measurements of the FSS-9 or VAFS when the health assessments were listed chronologically. The FSS-9 and VAFS were completely answered in 914 health assessments. For the remaining three health assessments, one patient only completed the VAFS and not the FSS-9, one only answered five of the nine items on the FSS-9 and did not complete the VAFS, and a third completed the FSS-9 and not the VAFS. Of the 655 included patients, 188 completed two health assessments, while 37 patients completed three health assessments. The time intervals between the annual health assessments varied with a mean of 12 months (standard deviation (SD): 4 months) (Additional file 1). Due to the relatively small number of patients with three health assessments, we used two health assessments when estimating internal consistency reliability and construct validity. For patients with three annual health assessments, we only included the first (baseline) and second health assessments in these analyses.
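The patient and assessment counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text and assumes that "two" and "three" mean exactly two and exactly three assessments; the variable names are illustrative and do not come from the study's data files.

```python
# Counts reported in the text.
patients_total = 655
patients_two = 188     # completed exactly two health assessments (assumed reading)
patients_three = 37    # completed exactly three health assessments (assumed reading)

# Patients with at least two assessments (the abstract reports 225).
patients_repeat = patients_two + patients_three

# Everyone else completed a single (baseline) assessment.
patients_one = patients_total - patients_repeat

# Total assessments implied by these counts (the text reports 917).
assessments_total = patients_one + 2 * patients_two + 3 * patients_three

print(patients_repeat, patients_one, assessments_total)  # prints: 225 430 917
```

Under that reading, the counts are internally consistent with the 225 repeat responders and 917 assessments reported.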
We used the FSS-9 and VAFS to measure the level of fatigue. The FSS-9 measures fatigue during the past week, and it includes items regarding mental and physical functioning, motivation, exercise, carrying out certain duties, and interference with work, family, or social life. The VAFS measures the patient’s general experience of fatigue. The FSS-9 was answered on a Likert scale from 1 (no fatigue) to 7 (worst fatigue), and the VAFS was answered by placing a mark on a line from 0 (no fatigue) to 10 (worst fatigue) that represents the fatigue level. The data collection software only allowed valid responses for each question and prompted for responses to unanswered questions before submission in order to minimise missing data. In a previous study, the US-English version of the FSS-9 was translated into Norwegian by a qualified native Norwegian-speaking translator and back-translated into US-English by an authorised native US-English-speaking translator (Additional file 2).
We used Stata/SE 16.0 (StataCorp, TX, USA) for descriptive analysis and IBM SPSS version 24.0 and Mplus version 8.4 for reliability analysis (Cronbach’s α if-item-deleted and Item-Total correlation), for confirmatory factor analysis (CFA), and for linear mixed model (LMM) analysis (Mplus: TwoLevel analysis). The threshold for statistical significance was set to P < 0.05 for all analyses unless otherwise stated.
Internal consistency of the FSS-9 and the shortened version of FSS-9 (FSS-3), and these scales including the VAFS
We calculated the internal consistency of the FSS-9 and this scale including the VAFS at baseline and at the second health assessment. As a part of the validation study, we explored whether there was value in adding the VAFS to the FSS-9 by evaluating whether the VAFS added more information than was captured by the FSS-9. Internal consistency was considered good if Cronbach’s α was above 0.70 [22,23,24]. We then shortened the FSS-9 by deleting the item whose removal resulted in the highest Cronbach’s α value for the remaining items (alpha-if-item-deleted analysis). The remaining items’ Cronbach’s α coefficients were recalculated, and the next item was deleted. If the remaining scale showed almost equal Cronbach’s α values after removing one or another item, clinical experience was used in deciding which item to remove. We deleted items that were less adaptable to patients with SUDs, for example, items about employment (unemployment was common in this population) and items with complex phrases and wordings that could be difficult to understand for patients with SUDs when they were intoxicated or went through substance withdrawals. Furthermore, we calculated Cronbach’s α for the VAFS plus the shortened version of FSS-9 (FSS-3) at baseline and at the second health assessment. Due to strong inter-item correlations and good reliability in previous studies evaluating the FSS-9 alongside the VAFS [15, 16], we expected that the VAFS would not provide much variability beyond that captured by the FSS-9 and FSS-3 questionnaires at baseline and at the second health assessment.
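The alpha-if-item-deleted step described above can be sketched in a few lines of Python. This is a minimal illustration using the standard formula for Cronbach’s α, not the SPSS or Mplus procedure used in the study. The function names and the toy responses are made up for the example, and the clinical judgement used to break near-ties is not modelled.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha of the remaining items after dropping each item in turn."""
    k = len(rows[0])
    return {
        j: cronbach_alpha([[v for i, v in enumerate(r) if i != j] for r in rows])
        for j in range(k)
    }


# Made-up responses (5 respondents x 3 items), NOT study data.
toy = [[1, 2, 1], [3, 3, 4], [5, 4, 5], [6, 6, 7], [7, 7, 6]]
alpha_all = cronbach_alpha(toy)
alpha_drop = alpha_if_item_deleted(toy)

# Candidate for removal: the item whose deletion leaves the highest alpha.
drop_candidate = max(alpha_drop, key=alpha_drop.get)
```

Repeating the last step on the reduced item set, one item at a time, mirrors the iterative shortening described in the text.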
Longitudinal confirmatory factor analysis for evaluating the fit of the FSS-3 and FSS-9, and these scales including the VAFS
We used CFA models to test the structure of the items in the FSS-3 and FSS-9, and these scales including the VAFS at baseline and at the second health assessment in order to evaluate the relationships between the items and their underlying latent factors [22, 25,26,27,28]. We expected that both the FSS-3 and FSS-9 should support one-dimensional models. The VAFS was added to the FSS-3 and FSS-9 to examine whether the VAFS captured variability in fatigue beyond that captured by the FSS-3/FSS-9. This should be indicated by a less-than-perfect correlation between the FSS-3/FSS-9 and VAFS. Further, we used longitudinal CFA in order to test measurement invariance for the FSS-3. First, we estimated a free model with all unique parameter values. Second, we tested for constraints in the model by setting the factor loadings within each item equal to each other at baseline and at the second health assessment. Third, we tested for equality within the residuals over time. The last model constrained the intercept values for the indicators. We used the Wald test to compare model restrictions. All CFA models were evaluated with standard fit measures: χ2, degrees of freedom, p values, Comparative Fit Index, Tucker Lewis Index, Root Mean Square Error of Approximation with 90% confidence interval, and the probability of close fit. A well-fitted model should have a statistically non-significant χ2, values of the Comparative Fit Index and Tucker Lewis Index should be above 0.95, and the Root Mean Square Error of Approximation should preferably be below 0.05 (close fit). A Root Mean Square Error of Approximation above 0.10 is considered to indicate a poorly fitted model. We used the modification index to explore model improvements if the goodness of fit measures indicated a poorly fitted model (χ2 difference test).
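Two of the fit measures listed above are easy to compute by hand, which can help when reading the reported results. The sketch below is illustrative only: the helper names are invented, the p-value uses a closed form that holds for even degrees of freedom, and the RMSEA uses the common textbook point estimate, which may differ slightly from what Mplus reports under robust estimation. The sample size in the RMSEA call is hypothetical.

```python
import math


def chi2_p_value_even_df(chi2, df):
    """Upper-tail p-value of a chi-square statistic (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form used here needs a positive, even df")
    half = chi2 / 2
    terms = sum(half**i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


def rmsea(chi2, df, n):
    """Textbook RMSEA point estimate; some software uses n instead of n - 1."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Values reported for the longitudinal FSS-3 model in the abstract.
p_fss3 = chi2_p_value_even_df(13.33, 8)

# n = 500 is a hypothetical sample size, chosen only for illustration.
rmsea_example = rmsea(13.33, 8, 500)
```

With χ2 = 13.33 and 8 degrees of freedom, `p_fss3` comes out at approximately 0.101, in line with the P value reported for the FSS-3 model.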
We analysed all variables as continuous variables due to the relatively high number of categories in the ordinal variables (FSS-9 items ranged from 1 to 7, and the VAFS ranged from 0 to 10). The CFAs were run using the Robust Maximum Likelihood estimator. Based on previous studies showing good reliability and strong inter-item correlations of the FSS-9 and this scale including the VAFS [15, 16], we expected increased support for the FSS-3 reflecting fatigue as one dimension with stronger levels and homogeneity in the factor loadings than for the FSS-9, also when including the VAFS at baseline and at the second health assessment. This should be the consequence of reducing the scale by using the most relevant measurement indicators in this population of respondents. Furthermore, we expected that the longitudinal CFA would support measurement invariance over time for the FSS-3. Measurement invariance is indicated if each measurement indicator is equally important for the underlying factor over time, with equal factor loadings and intercepts within each item over time. In addition, strict invariance would be supported if residuals for each item are equal over time.
Linear mixed model analysis for evaluating changes in the FSS-3 and FSS-9 sum scores and the VAFS score
We used an LMM analysis (Mplus multilevel modelling: TwoLevel) to evaluate linear changes from baseline in the sum scores of the FSS-3 and FSS-9, the scores in the separate FSS-9 items, and the VAFS score. We included all 917 health assessments. First, we estimated a full random intercept random slope model, which gave us the mean and individual variance in terms of both level and change together with the relationship between level and change. We re-estimated the model as a random intercept fixed slope model if the covariance between the intercept and the slope variance was statistically non-significant. We used the Mplus Maximum Likelihood Robust estimator to correct standard errors for potential deviation from normality. In addition, intraclass correlations were estimated. We used full information maximum likelihood in order to use all available measurements. Full information maximum likelihood assumes that data are ‘missing at random’. Because the separate variables are indicators of the same underlying construct, we expected changes over time in the indicators similar to those seen in the FSS-9 sum score.
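To make the intraclass correlation concrete, the sketch below computes a one-way random-effects ICC for balanced repeated measurements. This is a simplified ANOVA-based estimator, not the Mplus TwoLevel estimate used in the study, and it assumes every patient has the same number of assessments. The function name and the toy scores are invented for illustration.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way random-effects ICC(1) for balanced data (k scores per patient)."""
    n = len(groups)
    k = len(groups[0])
    grand = mean(v for g in groups for v in g)
    # Mean square between patients and mean square within patients.
    ms_between = k * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up sum scores for three patients at two assessments, NOT study data.
toy_scores = [[4, 6], [10, 12], [16, 20]]
icc_toy = icc_oneway(toy_scores)
```

A value near 1 means that most of the variation lies between patients rather than between a patient's own assessments; lower values mean that scores vary more within patients over time.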
Ethics approval and consent to participate
The study was reviewed and approved by the Regional Ethical Committee for Health Research (REC) West, Norway (reference number: 2017/51/REK Vest, dated 29.03.2017/20.04.2017). Each patient provided written informed consent prior to enrolling in the study.
Patient characteristics at baseline
Seventy-one percent were males, and the mean age was 43 years (Table 1). Half had more than primary school as the highest level of education. Eighty-three percent received opioid agonist therapy, and 42% had injected substances in the last 30 days leading up to the health assessment. The mean values of the FSS-9 items varied from 4.43 to 5.38 at baseline (Table 2). For the VAFS, the mean value was 5.19 at baseline. The FSS-9 and VAFS variables were slightly left-skewed (skewness ranged from −1.14 to −0.29) and tended towards a flattened distribution (kurtosis ranged from −1.39 to −0.09).
Internal consistency of the FSS-3 and FSS-9, and these scales including the VAFS
The nine-item Fatigue Severity Scale’s Cronbach’s α was 0.94 at baseline and 0.93 at the second health assessment (Additional file 3). For the FSS-3, retaining items 5–7 from the FSS-9, Cronbach’s α was 0.87 at baseline and 0.84 at the second health assessment. The internal consistency was not substantially affected when we added the VAFS to the FSS-3 and FSS-9 at baseline (FSS-3 plus VAFS: α = 0.87; FSS-9 plus VAFS: α = 0.94), and at the second health assessment (FSS-3 plus VAFS: α = 0.85; FSS-9 plus VAFS: α = 0.93).
Longitudinal confirmatory factor analysis for evaluating the fit of the FSS-3 and FSS-9, and these scales including the VAFS
The results from the CFAs comparing the fit of items in the FSS-3 and FSS-9, and these scales including the VAFS at baseline and at the second health assessment are shown in Table 3. At baseline, the unidimensional model for the FSS-9 with unique factor loadings, residuals, and intercept values resulted in a borderline fitted model, with a Comparative Fit Index and Tucker Lewis Index below the suggested levels and a Root Mean Square Error of Approximation point estimate just at the level of a poor fit. The modification index showed that estimating the residual covariance between items 2 and 3 gave the largest improvement in model fit (Δχ2 = 80.1, degree of freedom (df) = 1, P < 0.001). The factor loadings for the nine items of the FSS-9 in this model were (items 1 to 9, in order): 0.66, 0.74, 0.77, 0.78, 0.85, 0.79, 0.84, 0.84, and 0.84. The FSS-3 showed a well-fitted model. The FSS-3 and FSS-9 were highly correlated with r = 0.95, P < 0.001, giving 90% explained variance after estimating factor scores. The VAFS, together with the FSS-3 and FSS-9, respectively, gave identical results. The correlations with VAFS were r = 0.68, P < 0.001 (FSS-3) and r = 0.70, P < 0.001 (FSS-9). We obtained relatively similar results for the factor models at the second health assessment for both the FSS-3 and FSS-9 with and without the VAFS. For the full FSS-9 version, the factor loadings were (items 1 to 9, in order): 0.59, 0.66, 0.75, 0.69, 0.84, 0.74, 0.85, 0.80, and 0.79. The longitudinal CFA model based on the FSS-3 supported time-invariant equal factor loadings and equal residuals between the baseline and the second health assessment (Fig. 1). The correlation between the models at baseline and at the second measurement was r = 0.52, P < 0.001. A small reduction in the model fit was found if the intercept values were constrained to be equal within each measured item between health assessments. However, this simpler model was still a well-fitted model.
Linear mixed model analysis for evaluating changes in the FSS-3 and FSS-9 sum scores and the VAFS score
The linear mixed model analysis showed considerable intra-individual clustering for the FSS-3 and FSS-9 sum scores, the VAFS score, and the separate items’ scores (Table 4). However, the intraclass correlation coefficients ranged from 0.18 to 0.52, showing more variation over time in some variables than in others. The random slope and covariance between the intercept and the slope were statistically non-significant for all models. This indicated an equal linear change from baseline at the individual level. The re-estimated linear random intercept fixed slope models showed a small increase in items 1–4, while no similar change was found in the other items. A mean change was also found in the VAFS variable.
The present study shows that the FSS-9 can be shortened to the FSS-3, with most of the included variance and validity retained. Adding the VAFS to the FSS-3 and FSS-9 did not provide much added variability. Questionnaires that are easily understood and have few items are essential to ensure high completion rates among patients with SUDs. Our findings showed that the full-scale FSS-9 had good internal reliability when tested empirically based on internal consistency. The internal consistency of the FSS-3 was nearly as high as that of the FSS-9, even though it was just below a suggested 0.90 threshold [24, 32]. This supports the use of the FSS-3 as a patient-reported outcome measure in group-level comparisons, but its use in clinical practice would be controversial. The FSS-3 results from the longitudinal CFA showed a well-fitted unidimensional model with equal factor loadings and equal residuals when comparing baseline to the second health assessment. The FSS-3 and FSS-9 were almost perfectly correlated, which was in line with what we expected considering the homogeneity of the FSS-9. The factor analysis supported longitudinal stability in the indicators and confirmed the longitudinal measurement invariance. Although the fatigue level varied, the LMM analysis showed that the scoring structure was substantially stable and equal over time from baseline, which supported the validity of the indicators. This finding was somewhat surprising considering the chaotic life situations and extensive substance use of many patients, which could contribute to substantial individual changes in item scores over time.
The reliability analyses showed high internal consistency for the FSS-3 and FSS-9, and the internal consistency only decreased from 0.94 to 0.87 when reducing the number of items from nine (FSS-9) to three (FSS-3). Cronbach’s α depends on the ratio between the mean covariance and the total variance and on the number of items, which makes the use of α thresholds to some degree arbitrary. Our results showed that the reductions in Cronbach’s α were accounted for by the number of items and not the covariance/total variance ratio, which is a less problematic reason for this small reduction in internal consistency reliability. A homogeneous and substantially equal internal consistency was also found in studies evaluating the FSS-9 in other chronic diseases such as stroke, hepatitis C virus infection, and multiple sclerosis, as well as studies that have shortened the Fatigue Severity Scale from nine to seven items [17, 18, 33]. In the present study, most patients were polysubstance users, of whom roughly 40% had injected substances during the past 30 days. This may contribute to substantial changes in medical and psychosocial factors affecting the health assessment, for example, being affected by substances, living temporarily on the street, or having a lack of income, thus making the present highly reliable and very short FSS-3 questionnaire useful in further fatigue surveys.
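The point that the drop in α is attributable to the number of items can be illustrated with the standardized form of α (the Spearman-Brown relationship). The sketch below is a rough check rather than the authors' calculation: it treats the reported α values as standardized α and assumes all items share the same average inter-item correlation. The function names are invented for this example.

```python
def mean_r_from_alpha(alpha, k):
    """Average inter-item correlation implied by a standardized alpha with k items."""
    return alpha / (k - alpha * (k - 1))


def alpha_from_mean_r(mean_r, k):
    """Standardized alpha for k items sharing the same average correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Baseline alpha reported for the nine-item scale.
r_bar = mean_r_from_alpha(0.94, 9)

# Alpha expected for any three items if only the item count changed.
alpha_three_expected = alpha_from_mean_r(r_bar, 3)

print(round(r_bar, 2), round(alpha_three_expected, 2))  # prints: 0.64 0.84
```

Under these assumptions, cutting a nine-item scale with α = 0.94 to three items would be expected to give α of about 0.84. The observed baseline value of 0.87 is no lower than this, which is consistent with the reduction being driven by the item count rather than by weaker covariance among the retained items.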
The confirmatory factor analyses demonstrated unidimensional models for the FSS-9 at baseline and at the second health assessment, which was improved when adding the residual covariance between items 2 and 3. This means that the single-factor model did not fully capture these items’ responses. The explanations for this might be related to similar phrasing and wording, as well as the order of items 2 and 3, which might affect patients’ perception and interpretation, and increase their confusion. Moreover, the unidimensional FSS-9 factor model was in line with models reported in other studies that have validated the FSS-9 [2, 21, 33]. In those studies, however, small study samples have been a substantial limitation, contributing to a potential risk of overlooking underlying multidimensional models [2, 33]. When using a relatively large cohort of patients with SUDs, the present study showed that the unidimensional factor models of the FSS-3 and FSS-9 were maintained. Therefore, regardless of sample sizes, one can assume that the unidimensional factor models are generally well-fitted for the FSS-3 and FSS-9.
The linear mixed model analysis showed that items 5–7, which make up the FSS-3, remained substantially stable and equal over time between the annual health assessments compared to the separate items 1–4 and the VAFS. This result might further support the validity of the FSS-3 questionnaire, and the FSS-3 is assumed to be less sensitive to fluctuations than the FSS-9. This suggests that the FSS-3 might be preferred when evaluating changes in fatigue over time in the SUD population.
Strengths and limitations
This study had some strengths. First, we collected data from patients who are difficult to reach in both research and health care. Of those, 225 patients were followed up by two or three annual health assessments, making longitudinal analyses possible. Second, patients recruited to this study answered the FSS-9 and VAFS questionnaires under different mental and psychosocial conditions, for example, when they were going through substance intoxications or withdrawal and living on the street, which might increase the generalisability of the results. The study also had some limitations. First, the majority of the patients were recruited from opioid agonist therapy, making this validation study more transferable to opioid agonist therapy populations than other substance dependence populations. Second, the FSS-9 questionnaire used was translated from US-English into Norwegian and back-translated into US-English by native Norwegian and US-English-speaking translators, but as far as we are aware, no specific protocol for high-quality translations was used. This might slightly reduce the external validity of our results. Third, the time intervals between the baseline and the second health assessment and between the second and the third health assessment varied, which could have affected how patients scored the FSS-9 and VAFS. Fourth, compared to the FSS-9, the FSS-3 might increase the risk of common method bias considering that three items are more likely to be recalled and more accessible in the short-term memory than nine items. Previous studies have not evaluated the impact of common method bias; however, high reliability was achieved when validating a shortened FSS-9 into seven items in various study samples [17, 37]. Nevertheless, these studies detected cross-sample differences between items 3, 5, 6, and 9, which corresponded to two items (5 and 6) in the FSS-3 questionnaire.
This points to the need for further validation and shortening studies on the FSS-3 when adapting it to other populations. Fifth, the psychometric properties of the FSS-3 have not been thoroughly investigated [34, 38]. We assessed the construct validity of the FSS-3; however, we do not know whether its criterion validity is equivalent to that of the FSS-9, and this scale including the VAFS. For further research, we call for more longitudinal data on the SUD population to improve the validity and the psychometric properties of the FSS-3 and make it even more useful for clinicians and SUD researchers.
The present study demonstrates that the FSS-9 can be shortened to the FSS-3 among patients with SUDs. Adding the VAFS to the FSS-3 and FSS-9 did not provide much added variability. We found that the FSS-3 was more consistent in the structure of changes in fatigue levels compared to the FSS-9. The FSS-3 seems to be a useful patient-reported outcome measure of fatigue in this population at a group level, although the clinical relevance at an individual level remains more controversial.
Availability of data and materials
No additional data are available due to data protection requirements.
CFA: Confirmatory factor analysis
FSS-3: Three-item Fatigue Severity Scale
FSS-9: Nine-item Fatigue Severity Scale
LMM: Linear mixed model
REC: The Regional Ethical Committee for Health Research
SUD: Substance use disorder
VAFS: Visual Analog Fatigue Scale
Rosa K, Fu M, Gilles L, Cerri K, Peeters M, Bubb J, Scott J. Validation of the Fatigue Severity Scale in chronic hepatitis C. Health Qual Life Outcomes. 2014;12:90.
Alekseeva TM, Gavrilov YV, Kreis OA, Valko PO, Weber KP, Valko Y. Fatigue in patients with myasthenia gravis. J Neurol. 2018;265(10):2312–21.
Ozyemisci-Taskiran O, Batur EB, Yuksel S, Cengiz M, Karatas GK. Validity and reliability of Fatigue Severity Scale in stroke. Top. 2019;26(2):122–7.
Just-Østergaard E, Mortensen EL, Tolstrup JS, Flensborg-Madsen T. Vital exhaustion and risk of alcohol use disorders: a prospective cohort study. J Psychosom Res. 2018;114:25–30.
Obeid S, Akel M, Haddad C, Fares K, Sacre H, Salameh P, Hallit S. Factors associated with alcohol use disorder: the role of depression, anxiety, stress, alexithymia and work fatigue—a population study in Lebanon. BMC Public Health. 2020;20(1):245.
Teuber G, Schäfer A, Rimpel J, Paul K, Keicher C, Scheurlen M, Zeuzem S, Kraus MR. Deterioration of health-related quality of life and fatigue in patients with chronic hepatitis C: association with demographic factors, inflammatory activity, and degree of fibrosis. J Hepatol. 2008;49(6):923–9.
van Emmerik-van OK, van de Glind G, van den Brink W, Smit F, Crunelle CL, Swets M, Schoevers RA. Prevalence of attention-deficit hyperactivity disorder in substance use disorder patients: a meta-analysis and meta-regression analysis. Drug Alcohol Depend. 2012;122(1–2):11–9.
van Emmerik-van OK, van de Glind G, Koeter MW, Allsop S, Auriacombe M, Barta C, Bu ET, Burren Y, Carpentier PJ, Carruthers S, et al. Psychiatric comorbidity in treatment-seeking substance use disorder patients with and without attention deficit hyperactivity disorder: results of the IASP study. Addiction (Abingdon). 2014;109(2):262–72.
Lappan SN, Brown AW, Hendricks PS. Dropout rates of in-person psychosocial substance use disorder treatments: a systematic review and meta-analysis. Addiction (Abingdon). 2020;115(2):201–17.
McGovern MP, Morrison DH. The Chemical Use, Abuse, and Dependence Scale (CUAD). Rationale, reliability, and validity. J Subst Abuse Treat. 1992;9(1):27–38.
Scherer M, Furr-Holden CD, Voas RB. Drug use disorder (DUD) questionnaire: scale development and validation. Eval Rev. 2013;37(1):35–58.
Siciliano M, Chiorri C, De Micco R, Russo A, Tedeschi G, Trojano L, Tessitore A. Fatigue in Parkinson’s disease: Italian validation of the Parkinson Fatigue Scale and the Fatigue Severity Scale using a Rasch analysis approach. Parkinsonism Relat Disord. 2019;65:105–10.
Krupp LB, LaRocca NG, Muir-Nash J, Steinberg AD. The Fatigue Severity Scale. Application to patients with multiple sclerosis and systemic lupus erythematosus. Arch Neurol. 1989;46(10):1121–3.
Tseng BY, Gajewski BJ, Kluding PM. Reliability, responsiveness, and validity of the visual analog fatigue scale to measure exertion fatigue in people with chronic stroke: a preliminary study. Stroke Res Treat. 2010. https://doi.org/10.4061/2010/412964.
Ferentinos P, Kontaxakis V, Havaki-Kontaxaki B, Dikeos D, Lykouras L. Psychometric evaluation of the Fatigue Severity Scale in patients with major depression. Qual Life Res. 2011;20(3):457–65.
Kleinman L, Zodet MW, Hakim Z, Aledort J, Barker C, Chan K, Krupp L, Revicki D. Psychometric evaluation of the Fatigue Severity Scale for use in chronic hepatitis C. Qual Life Res. 2000;9(5):499–508.
Lerdal A, Kottorp A, Gay C, Aouizerat BE, Portillo CJ, Lee KA. A 7-item version of the Fatigue Severity Scale has better psychometric properties among HIV-infected adults: an application of a Rasch model. Qual Life Res. 2011;20(9):1447–56.
Lerdal A, Kottorp A. Psychometric properties of the Fatigue Severity Scale-Rasch analyses of individual responses in a Norwegian stroke cohort. Int J Nurs Stud. 2011;48(10):1258–65.
Mills RJ, Young CA, Nicholas RS, Pallant JF, Tennant A. Rasch analysis of the Fatigue Severity Scale in multiple sclerosis. Mult Scler. 2009;15(1):81–7.
Fadnes LT, Aas CF, Vold JH, Ohldieck C, Leiva RA, Chalabianloo F, Skurtveit S, Lygren OJ, Dalgard O, Vickerman P, et al. Integrated treatment of hepatitis C virus infection among people who inject drugs: study protocol for a randomised controlled trial (INTRO-HCV). BMC Infect Dis. 2019;19(1):943.
Lerdal A, Wahl A, Rustoen T, Hanestad BR, Moum T. Fatigue in the general population: a translation and test of the psychometric properties of the Norwegian version of the Fatigue Severity Scale. Scand J Public Health. 2005;33(2):123–30.
Pedhazur EJ, Schmelkin LP. Measurement, design, and analysis: an integrated approach: Hillsdale. New York: Lawrence Erlbaum Associates; 1991.
Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16(3):297–334.
Frost MH, Reeve BB, Liepa AM, Stauffer JW, Hays RD. What is sufficient evidence for the reliability and validity of patient-reported outcome measures? Value Health. 2007;10(Suppl 2):S94–S105.
Bollen KA. Latent variables in psychology and the social sciences. Annu Rev Psychol. 2002;53:605–34.
Kline RB. Principles and practice of structural equation modeling. 4th ed. New York: The Guilford Press; 2016.
Schumacker R, Lomax R. A beginner’s guide to structural equation modeling. 4th ed. New York: Psychology Press; 2004.
Widaman KF, Ferrer E, Conger RD. Factorial invariance within longitudinal structural equation models: measuring the same construct across time. Child Dev Perspect. 2010;4(1):10–8.
Hox JJ, Moerbeek M, Van de Schoot R. Multilevel analysis: techniques and applications. New York: Routledge; 2018.
Wang J, Wang X. Structural equation modeling: applications using Mplus. West Sussex: Wiley; 2012.
Enders CK. Applied missing data analysis. New York: The Guilford Press; 2010.
Aaronson N, Alonso J, Burnam A, Lohr KN, Patrick DL, Perrin E, Stein RE. Assessing health status and quality-of-life instruments: attributes and review criteria. Qual Life Res. 2002;11(3):193–205.
Feng C, He Q, Wu Y, Hu X, Wu J, He X, Zhao S. Psychometric properties of fatigue severity scale in Chinese systemic lupus erythematosus patients. Health Qual Life Outcomes. 2019;17(1):71.
Reise SP, Morizot J, Hays RD. The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Qual Life Res. 2007;16(Suppl 1):19–31.
Wild D, Grove A, Martin M, Eremenco S, McElroy S, Verjee-Lorenz A, Erikson P. Principles of good practice for the translation and cultural adaptation process for patient-reported outcomes (PRO) measures: report of the ISPOR task force for translation and cultural adaptation. Value Health. 2005;8(2):94–104.
Podsakoff PM, MacKenzie SB, Lee JY, Podsakoff NP. Common method biases in behavioral research: a critical review of the literature and recommended remedies. J Appl Psychol. 2003;88(5):879–903.
Johansson S, Kottorp A, Lee KA, Gay CL, Lerdal A. Can the Fatigue Severity Scale 7-item version be used across different patient populations as a generic fatigue measure—a comparative study using a Rasch model approach. Health Qual Life Outcomes. 2014;12:24.
Fung SF, Chow EO, Cheung CK. Development and validation of a brief self-assessed wisdom scale. BMC Geriatr. 2020;20(1):54.
We thank the devoted clinical staff for their enthusiasm during the planning of the project. We also thank Nina Elisabeth Eltvik and Christer Kleppe for valuable help and input during the planning and preparation phases.
INTRO-HCV Study Group participating investigators:
Bergen: Christer Frode Aas, Vibeke Bråthen Buljovcic, Fatemeh Chalabianloo, Jan Tore Daltveit, Silvia Eiken Alpers, Lars T. Fadnes (principal investigator), Trude Fondenes Eriksen, Per Gundersen, Velinda Hille, Kristin Holmelid Håberg, Kjell Arne Johansson, Rafael Alexander Leiva, Siv-Elin Leirvåg Carlsen, Martine Lepsøy Bonnier, Lennart Lorås, Else-Marie Løberg, Mette Hegland Nordbotn, Cathrine Nygård, Maria Olsvold, Christian Ohldieck, Lillian Sivertsen, Hugo Torjussen, Jørn Henrik Vold, Jan-Magnus Økland. Stavanger: Tone Lise Eielsen, Nancy Laura Ortega Maldonado, Ewa Joanna Wilk. proLAR: Ronny Bjørnestad, Ole Jørgen Lygren, Marianne Cook Pierron. Oslo: Olav Dalgard, Håvard Midgard, Svetlana Skurtveit. Bristol: Peter Vickerman, Aaron Lim.
This work was supported by The Norwegian Research Council (BEHANDLING, Contract No. 269855) and the Western Norway Regional Health Authority («Åpen prosjektstøtte»), with the Department of Addiction Medicine, Haukeland University Hospital, Bergen, Norway as the responsible institution. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Two of the authors are funded by the research grant BEHANDLING, related to the INTRO-HCV project, from the Norwegian Research Council. The other authors are funded by their respective affiliations.
Ethics approval and consent to participate
The study was reviewed and approved by the Regional Ethical Committee for Health Research (REC) West, Norway (Reference No. 2017/51/REK Vest, dated 29.03.2017/20.04.2017). Each patient provided written informed consent prior to enrolling in the study.
Consent for publication
Participants have consented to publication.
The number of months from baseline to the second or third measurements. No.: Number of patients; SD: Standard deviation; ref.: Reference. The table displays the number of patients with one, two, and three health assessments that included the Fatigue Severity Scale and the Visual Analogue Fatigue Scale measurements, together with the time intervals from baseline to the second and third health assessments.
The US-English and the Norwegian versions of the nine-item Fatigue Severity Scale. FSS-9: Nine-item Fatigue Severity Scale. All items in the FSS-9 are answered with a Likert scale from 1 to 7 where 1 indicates “strongly disagree” and 7 indicates “strongly agree”.
The Cronbach’s α if-item-deleted and Item-Total correlation for the FSS-3 and FSS-9, and these scales including the VAFS. FSS-9: Nine-item Fatigue Severity Scale; FSS-3: Three-item Fatigue Severity Scale; VAFS: Visual Analogue Fatigue Scale; No.: Number of patients. *Due to high internal consistency and almost equal Cronbach’s α of the remaining items, we deleted item 8 using clinical judgement. Item 8 could be difficult to answer if patients were affected by substances or were going through substance withdrawals. **We deleted item 9 using clinical judgement due to high internal consistency and almost equal Cronbach’s α in the remaining items and the fact that unemployment was frequently reported among patients with SUDs. Tables a) and c) show the Cronbach’s α if-item-deleted for the FSS-3 and FSS-9, and these scales including the VAFS using data from the first health assessment (baseline) (Table a) and the second health assessment (Table c). Tables b) and d) show the Item-Total correlation for the FSS-3 and FSS-9, and these scales including the VAFS using data from the first health assessment (baseline) (Table b) and the second health assessment (Table d). The values in bold represent the FSS-3’s Cronbach’s α if-item-deleted and Item-Total correlation.
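The two statistics named in this caption can be illustrated with a minimal sketch. It is not the authors' analysis code: the score matrix `demo_scores` is invented purely for illustration, the function names are hypothetical, and the item-total correlation is computed in its "corrected" form (each item against the sum of the remaining items), which is an assumption, since the caption does not say which variant was used.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]


def corrected_item_total_correlation(items):
    """Pearson correlation of each item with the sum of the other items.

    Assumption: the 'corrected' variant is used here; the caption does not
    specify whether the item itself was included in the total.
    """
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return [np.corrcoef(items[:, j], total - items[:, j])[0, 1]
            for j in range(items.shape[1])]


# Invented 7-point Likert responses (6 respondents x 3 items), NOT study data.
demo_scores = np.array([
    [2, 3, 2],
    [5, 5, 6],
    [7, 6, 7],
    [1, 2, 1],
    [4, 4, 5],
    [6, 7, 6],
])

print("alpha:", cronbach_alpha(demo_scores))
print("alpha if item deleted:", alpha_if_item_deleted(demo_scores))
print("item-total correlations:", corrected_item_total_correlation(demo_scores))
```

An item whose removal leaves α almost unchanged adds little to internal consistency, which matches the reasoning the caption gives for deleting items 8 and 9.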
About this article
Cite this article
Vold, J.H., Gjestad, R., Aas, C.F. et al. Validation of a three-item Fatigue Severity Scale for patients with substance use disorder: a cohort study from Norway for the period 2016–2020. Health Qual Life Outcomes 19, 69 (2021). https://doi.org/10.1186/s12955-021-01708-w
- Substance-related disorders
- Quality of life
- Fatigue Severity Scale
- Visual Analogue Fatigue Scale
- Activities of daily living