Health-related quality of life in facial palsy: translation and validation of the Dutch version Facial Disability Index
Health and Quality of Life Outcomes volume 18, Article number: 256 (2020)
Patient-reported outcome measures are essential in the evaluation of facial palsy. The aim of this study was to translate and validate the Facial Disability Index (FDI) for use in the Netherlands.
The FDI was translated into Dutch according to a forward-backward method. Construct validity was assessed by formulating 22 hypotheses regarding associations of FDI scores with the Facial Clinimetric Evaluation scale, the Synkinesis Assessment Questionnaire, the Short Form-12 and the Sunnybrook Facial Grading System. Validity was considered adequate if at least 75% (i.e. 17 out of 22) of the hypotheses were confirmed. Additionally, confirmatory factor analysis was performed. Cronbach’s α was calculated as a measure of internal consistency. Participants were asked to fill out the FDI a second time after 2 weeks to analyse test-retest reliability. Lastly, smallest detectable change was calculated.
In total, 19 hypotheses (86.4%) were confirmed. Confirmatory factor analysis showed acceptable fit for the two factor structure of the original FDI (root mean square error of approximation = 0.064, standardized root mean square residual = 0.081, comparative fit index = 0.925, Chi-square = 50.22 with 34 degrees of freedom). Internal consistency for the FDI physical function scale was good (α > 0.720). Internal consistency for the FDI social/well-being scale was slightly less (α > 0.574). Test-retest reliability for both scales was good (intraclass correlation coefficients > 0.786). Smallest detectable change at the level of the individual was 17.6 points for the physical function and 17.7 points for the social/well-being function, and at group level 1.9 points for both scales.
The Dutch version FDI shows good psychometric properties. The relatively large values for individual smallest detectable change may limit clinical use. The translation and widespread use of the FDI in multiple languages can help to compare treatment results internationally.
Facial palsy results in functional and social problems related to the inability to control the muscles of facial expression [1,2,3,4]. Additionally, the altered function and appearance of the face may increase feelings of depression and anxiety, and may negatively influence self-image and quality of life [3,4,5,6,7]. The latter describes not so much the condition affecting the individual, but rather the individual's perception of their position in life, including their social environment and mental health, in the context of the condition. Evaluation of facial palsy should thus include not only facial movements and disabilities, but also patient-perceived disability and quality of life.
The Facial Disability Index (FDI) is a patient-reported outcome measure comprising ten items, each with a six-point ordinal answering scale. Two FDI domain scores, the physical function and the social/well-being function, can be calculated, each ranging from 0 (worst) to 100 (best). Since the introduction of the original FDI in the 1990s, the FDI has been translated into and validated in Spanish, Swedish, Italian, German, French, and Brazilian Portuguese. However, some previous studies did not include a pilot test stage [11, 12] or pre-determined hypotheses for adequate construct validity [9, 10, 12,13,14], some did not assess test-retest reliability [9, 12], and none determined the smallest detectable change of the FDI [9,10,11,12,13,14]. The aim of this study was to translate the FDI into Dutch and culturally validate the Dutch version of the FDI (FDI-NL) for use in Dutch-speaking populations.
Materials and methods
Our study protocol was reviewed by the medical ethics committee of our institution. The medical ethics committee deemed full and formal testing of our study protocol not necessary under current Dutch law. Patients from the outpatient departments of our institution provided written consent prior to participation. The developers of the original FDI granted permission to translate it into Dutch.
The FDI-NL was created using a forward-backward translation method (Fig. 1). Two native Dutch speakers who are also fluent in English translated the English FDI into Dutch (B.t.H. and C.V., acknowledgements). A three-person committee (the first, penultimate and last authors, all native Dutch speakers with excellent proficiency in English) with experience in treating facial palsy and in translating questionnaires then combined both forward translations into one consensus version of the FDI-NL. The consensus version was translated back into English by two native English speakers who were also fluent in Dutch (S.B. and N.T., acknowledgements). The same three-person committee compared the backward translations to the original FDI and the consensus version of the FDI-NL. A second consensus version of the FDI-NL was created and pilot tested in 10 patients with facial palsy and 10 healthy individuals. Pilot test participants were asked to critically review the wording, phrasing and overall comprehensibility of the questionnaire, after which the final FDI-NL was constructed. Pilot testing was performed with 10 patients and 10 healthy individuals because facial palsy is relatively rare and the condition does not affect reading and language capabilities.
Adult patients with facial palsy who visited our department between January 2007 and January 2018 were invited to participate in our study. The patients were asked to visit our institution to fill out the questionnaires and to have their current facial function measured. Patients filled out the questionnaires independently, without a researcher in the room.
Validity of the FDI-NL was analysed by comparing FDI scores to several PROMs validated in Dutch (the Facial Clinimetric Evaluation scale (FaCE scale) [15, 16], the Short Form 12 (SF-12), and the Synkinesis Assessment Questionnaire (SAQ) [18, 19]) and to the Sunnybrook Facial Grading System (Sunnybrook) as a measure of the severity of facial palsy. The FaCE scale is a 15-item facial palsy-related quality of life questionnaire that comprises a total score and six domain scores [15, 16]. The SF-12 is a measure of general health-related quality of life and comprises two domains: physical health and mental health. The SAQ was used as a patient-reported measure of the severity of synkinesis [18, 19]. The Sunnybrook score was used to establish clinician-graded facial function. All Sunnybrook scoring was done by one investigator (second author) based on a video from the clinic visit. Working hypotheses for the magnitude of the associations between the FDI-NL and FaCE, SF-12, SAQ and Sunnybrook scores were established based on those reported in the literature (Table 1) [9,10,11,12,13, 16, 19]. Based on the minimum and maximum reported associations, we established a range in which we expected each association to fall. We assumed adequate construct validity of the FDI-NL if at least 75% (i.e. 17 out of the 22) of the hypotheses were confirmed.
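The hypothesis-testing logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' SPSS procedure: the function names, the toy scores and the expected range passed to `within_expected_range` are hypothetical. Only the 75% rule (17 of 22 hypotheses) and the use of Spearman's rank correlation are taken from the text.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def within_expected_range(rho, low, high):
    """A hypothesis is confirmed when rho falls inside the expected range."""
    return low <= rho <= high

def hypotheses_needed(total, threshold=0.75):
    """Smallest number of confirmed hypotheses that reaches the threshold."""
    return math.ceil(total * threshold)

def validity_adequate(confirmed, total, threshold=0.75):
    return confirmed >= hypotheses_needed(total, threshold)

# Hypothetical toy scores for six patients (not study data).
fdi_physical = [55, 70, 40, 85, 60, 75]
face_total = [50, 72, 45, 80, 58, 66]
rho = spearman(fdi_physical, face_total)
# The range 0.40 to 0.90 is a made-up example of a pre-determined range.
print(within_expected_range(rho, 0.40, 0.90))
print(hypotheses_needed(22))  # 17, matching the threshold stated in the text
```

Each of the 22 hypotheses would be checked in this way against its own pre-determined range, and the count of confirmed hypotheses passed to `validity_adequate`.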
Reliability of the FDI-NL was examined by assessing internal consistency, item-total correlations and test-retest reliability for the FDI-NL scales. Internal consistency was examined at the test moment. Patients with a stable facial function (e.g. excluding patients in the recovery phase of Bell’s palsy or with reconstructive surgery planned in the near future) were asked to fill out the FDI-NL a second time after 2 weeks to assess test-retest reliability of the FDI-NL.
The smallest detectable change (SDC) was calculated at an individual and a group level to yield a value for FDI-NL scores beyond which a change can be considered actual change rather than measurement error. An SDC at the level of the individual (SDCind) was calculated, which can be used when interpreting the change scores of one individual. The group level SDC (SDCgroup) can be used to interpret changes at a group level.
Statistical tests were performed in SPSS version 23 (IBM, New York, USA). Data are presented as frequencies and percentages, medians and interquartile ranges (IQR), and means and standard deviations (SD) as appropriate. Associations were analysed using Spearman’s rank correlation coefficients. A confirmatory factor analysis was performed using R software (version 3.4; R Foundation for Statistical Computing) to evaluate construct validity.
Cronbach’s α was calculated to analyse the internal consistency of the FDI-NL physical and social/well-being function scales. Additionally, Cronbach’s α was calculated for the FDI-NL scales with each item once excluded to evaluate if internal consistency would improve if that item was removed. Lastly, inter-item correlations were calculated to evaluate correlation between items.
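As an illustration of the internal consistency analysis described above, the following sketch computes Cronbach's α and the α obtained when each item is excluded once. It uses the standard formula α = k/(k−1) × (1 − Σ item variances / variance of the total score). The function names and the toy scores are hypothetical and are not study data; the authors performed this analysis in SPSS.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_if_item_deleted(rows):
    """Cronbach's alpha recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(r) if j != i] for r in rows])
        for i in range(k)
    ]

# Hypothetical answers of six respondents to a five-item scale.
toy_scores = [
    [5, 4, 5, 3, 4],
    [3, 3, 4, 2, 3],
    [4, 4, 4, 5, 4],
    [2, 1, 2, 4, 1],
    [5, 5, 4, 2, 5],
    [1, 2, 1, 3, 2],
]

print(round(cronbach_alpha(toy_scores), 3))
for item, alpha in enumerate(alpha_if_item_deleted(toy_scores), start=1):
    print(f"alpha without item {item}: {alpha:.3f}")
```

An item whose removal raises α noticeably fits the scale less well than the others, which is the criterion applied to the FDI-NL items in this study.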
Test-retest reliability was analysed using an intraclass correlation coefficient (ICC, two-way random effects model, single measures, absolute agreement). The SDC was calculated as follows. First, the standard error of measurement (SEMagreement) was calculated by taking the square root of the error variance. Next, the SDCind was calculated using the formula 1.96 × √2 × SEMagreement. The SDCgroup was calculated as SDCind / √n. Missing data for questionnaire items were estimated using multiple imputation.
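The reliability calculations above can be sketched as follows. This is a minimal illustration with several assumptions. The text only states that SEMagreement is the square root of the error variance; the sketch assumes this error variance is the sum of the between-occasion and residual variance components from a two-way ANOVA. It also assumes that n in the SDCgroup formula is the number of retest participants (87), an assumption that is consistent with the reported values, since 17.6 / √87 and 17.7 / √87 both round to 1.9. Function names and toy scores are hypothetical.

```python
import math

def two_way_anova(scores):
    """scores: one [test, retest] pair per patient. Returns n, k and mean squares."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(r) / k for r in scores]
    col_means = [sum(r[j] for r in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between occasions
    ss_total = sum((x - grand) ** 2 for r in scores for x in r)
    ss_err = max(ss_total - ss_rows - ss_cols, 0.0)  # guard against rounding
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return n, k, ms_rows, ms_cols, ms_err

def icc_agreement(scores):
    """ICC, two-way random effects, single measures, absolute agreement."""
    n, k, msr, msc, mse = two_way_anova(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def sem_agreement(scores):
    """Square root of the (assumed) occasion plus residual variance."""
    n, k, msr, msc, mse = two_way_anova(scores)
    var_occasion = max((msc - mse) / n, 0.0)  # negative estimates set to 0
    return math.sqrt(var_occasion + mse)

def sdc_individual(sem):
    return 1.96 * math.sqrt(2) * sem

def sdc_group(sdc_ind, n):
    return sdc_ind / math.sqrt(n)

# Hypothetical test and retest scores on a 0 to 100 scale (not study data).
toy = [[50, 55], [70, 65], [30, 35], [90, 85], [60, 60]]
sem = sem_agreement(toy)
print(round(icc_agreement(toy), 3), round(sdc_individual(sem), 1))

# Reported SDCind values combined with the assumed n = 87 retest participants.
print(round(sdc_group(17.6, 87), 1), round(sdc_group(17.7, 87), 1))
```

Because SDCgroup shrinks with √n while SDCind does not, a scale can detect small average changes in a cohort and still be too imprecise to interpret the change score of a single patient.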
Questionnaire translation and pilot testing
The FDI-NL was created according to the steps described above. No problems in the wording and phrasing of the consensus version of the FDI-NL were identified during pilot testing. Seventeen out of 20 pilot test participants preferred to have the answer options presented in a long format instead of the two columns in the original version. For further testing, the long format answer options were used (Appendix – FDI-NL final version).
After pilot testing, 118 adult patients with unilateral facial palsy were included in this study. Eighty-seven (73.7%) patients also completed the retest FDI questionnaire 2 weeks after the visit to our institution. Sixty-two patients (52.5%) were female, and the median (IQR) age of the patients was 62.6 (48.8; 71.6) years. The most common cause of facial palsy was an acoustic neuroma (n = 29 (24.6%)), followed by trauma (n = 12 (10.2%)) (Table 2). All patients had long-standing and irreversible facial palsy and had completed treatment for the underlying condition.
Nineteen of the 22 validity associations (86.4%) were within the pre-determined range (Table 3). The correlations between both FDI-NL scales and the Sunnybrook total score, and between the FDI-NL physical function scale and the FaCE Lacrimal Control subscale, did not confirm our hypotheses.
Confirmatory factor analysis examining the fit of the original two latent factors of the FDI showed an acceptable level of fit for the Dutch version FDI with a root mean square error of approximation of 0.064, standardized root mean square residual of 0.081, comparative fit index of 0.925, and Chi-square value of 50.22 with 34 degrees of freedom [23,24,25,26]. Least fitting items were item 4 (‘How much difficulty did you have with your eye tearing excessively or becoming dry?’) in the physical function scale and item 8 (‘How much of the time did you get irritable toward those around you?’) in the social/well-being function scale (Table 4).
Internal consistency of the FDI-NL physical function scale was considered good, with a Cronbach’s α > 0.7. Cronbach’s α for the social/well-being function was 0.574 and 0.607 (Table 5). The ICC for test-retest reliability was good for both scales, with an ICC of 0.845 and 0.786 for the physical and social/well-being function respectively. On the 0 to 100 point FDI-NL scales, SDCind was 17.6 points for the physical function and 17.7 points for the social/well-being function. SDCgroup was 1.9 points for both FDI scales (Table 6).
Cronbach’s α was higher if item 4 (‘How much difficulty did you have with your eye tearing excessively or becoming dry?’) was deleted from the physical function scale, and if item 8 (‘How much of the time did you get irritable toward those around you?’) and item 9 (‘How often did you wake up early or wake up several times during your nighttime sleep?’) were deleted from the social/well-being function scale (Table 5). Inter-item correlations were deemed acceptable: the highest inter-item correlations were found within each subscale, and items were not highly correlated in general (Table 5).
The FDI-NL has good construct validity and test-retest reliability, and acceptable internal consistency. Associations between the FDI-NL scales and the Sunnybrook total score fell below the range of correlations expected from the literature. In the Swedish, Italian and French validation studies, the association with Sunnybrook was 0.63, 0.44 and 0.30 respectively for the FDI physical function, and 0.40, 0.19 and 0.21 respectively for the FDI social/well-being function [10, 11, 13]. We found correlations of 0.072 and 0.023 respectively. A possible explanation is that we see relatively severe cases at our department, which may not be the case at the otolaryngology departments where the other studies were performed. The association between the FDI-NL and Sunnybrook was still positive, although much smaller than reported elsewhere. This difference may partly be due to the long duration of facial palsy in our study. The median duration of facial palsy in our study was 12.4 years, much longer than the 29 months in the validation study of the Dutch version FaCE scale, the 22 months in the validation study of the Swedish version FDI, the 140 days in the French validation study, and the mean duration of 3.5 years in the validation study of the Italian version FDI. Patients with facial palsy may learn to cope with their disability over time, so the association between patient-perceived disability and quality of life on the one hand and clinician-graded severity of facial palsy on the other may change.
The internal consistency of the FDI-NL physical function scale was good, with a Cronbach’s α above 0.7 at both the test and retest moment. The internal consistency of the social/well-being function scale was slightly lower and did not reach the level of 0.7. Further analysis showed that removing item 9 (‘How often did you wake up early or wake up several times during your nighttime sleep?’) from the questionnaire would improve internal consistency of the scale the most. However, the median age of our study sample was 62.9 years compared to a mean age of 46.8 years in the original development study of the FDI. The lower internal consistency caused by this question might be related to sleeping problems due to older age rather than to a symptom of depression resulting from facial palsy. Additionally, removing item 4 from the questionnaire would increase the internal consistency of the physical function scale, although much less drastically. We believe this is related to the nature of the question itself: item 4 asks about eye-related complaints, while the other items are related to the mouth or midface. Perhaps further dividing the physical function scale into a scale related to the mouth and a scale related to the eye, such as in the FaCE scale, would have solved this issue. Removing item 8 from the questionnaire improved the internal consistency only slightly and only at the test moment. Since we did not develop the FDI, but solely translated and validated it for use in the Netherlands, we chose to keep the questionnaire as it is.
Similar to the internal consistency analysis, we found items 4, 8 and 9 of the FDI-NL to be the least fitting items in our confirmatory factor analysis, most likely for the same reasons as described above. However, the FDI-NL as a whole still showed an acceptable level of fit.
Test-retest reliability of the FDI-NL scales was good, with ICC point estimates of 0.845 and 0.786 and a confidence interval lower limit above 0.7 for the physical function scale. However, when using an instrument for individual decision making, an ICC of 0.9 is advised. We did not reach that level of test-retest reliability in our study. Recall bias because of the two-week time interval between the test and retest measurements could partly have influenced the ICC values. A two-week interval is generally considered long enough to avoid recall bias, but short enough to avoid clinical improvement or deterioration.
The SDC is important for the interpretation of changes in scores. It indicates the point from which a change can be considered a true change and not due to measurement error. The FDI-NL SDCgroup values were quite small in the present study. However, at an individual level the large SDC values of both the physical and social/well-being function limit clinical applicability. SDC values for other facial palsy-specific PROMs, such as the FaCE scale and SAQ, have not yet been reported, so comparison is impossible.
We did not perform a formal sample size calculation for this study. However, we assumed approximately 60 participants would be needed in our retest sample for adequate power of our test-retest reliability. Anticipating a participation rate of 50% in the retest, we set out to include approximately 120 patients in our study. Based on the literature, our actual sample size of 118 patients, with 87 retest participants, can be considered good to excellent [28, 29].
Although the FDI-NL has several limitations, the developed Dutch version allows for standardised measurement of patient-perceived disability and quality of life in a Dutch-speaking population. Furthermore, it can be used to compare results to the international literature or to combine patient data from different countries. The large values for the SDCind limit its use in a clinical setting. Future research should focus on the development of a facial palsy-specific PROM that is suitable for individual follow-up.
The Dutch version FDI is a valid, reliable and easy to use questionnaire for the assessment of patient-perceived disability and quality of life in facial palsy. Although limited in clinical use in individuals, the FDI-NL provides the possibility to compare between clinics and so further increase knowledge about facial palsy and its effect on quality of life.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on request.
Kim JH, Fisher LM, Reder L, Hapner ER, Pepper J. Speech and communicative participation in patients with facial paralysis. JAMA Otolaryngol-Head Neck Surg. 2018;144(8):686. https://www.ncbi.nlm.nih.gov/pubmed/29955841. https://doi.org/10.1001/jamaoto.2018.0649.
Moverare T, Lohmander A, Hultcrantz M, Sjogreen L. Peripheral facial palsy: Speech, communication and oral motor function. Eur Ann Otorhinolaryngol Head Neck Dis. 2017;134(1):27. https://doi.org/10.1016/j.anorl.2015.12.002.
Coulson SE, O'dwyer NJ, Adams RD, Croxson GR. Expression of emotion and quality of life after facial nerve paralysis. Otol Neurotol. 2004;25(6):1014–9. https://doi.org/10.1097/00129492-200411000-00026.
Dey JK, Ishii LE, Nellis JC, Boahene KDO, Byrne PJ, Ishii M. Comparing patient, casual observer, and expert perception of permanent unilateral facial paralysis. JAMA Facial Plastic Surg. 2017;19(6):476–83. https://doi.org/10.1001/jamafacial.2016.1630.
VanSwearingen JM, Cohn JF, Bajaj-Luthra A. Specific impairment of smiling increases the severity of depressive symptoms in patients with facial neuromuscular disorders. Aesthet Plast Surg. 1999;23(6):416–23.
Pouwels S, Beurskens CH, Kleiss IJ, Ingels KJ. Assessing psychological distress in patients with facial paralysis using the hospital anxiety and depression scale. J Plast Reconstr Aesthet Surg. 2016;69(8):1066–71. https://doi.org/10.1016/j.bjps.2016.01.021.
Nellis JC, Ishii M, Byrne PJ, Boahene KDO, Dey JK, Ishii LE. Association among facial paralysis, depression, and quality of life in facial plastic surgery patients. JAMA Facial Plastic Surg. 2017;19(3):190–6. https://doi.org/10.1001/jamafacial.2016.1462.
VanSwearingen JM, Brach JS. The facial disability index: Reliability and validity of a disability assessment instrument for disorders of the facial neuromuscular system. Phys Ther. 1996;76(12):1288–300. https://doi.org/10.1093/ptj/76.12.1288.
Gonzalez-Cardero E, Infante-Cossio P, Cayuela A, Acosta-Feria M, Gutierrez-Perez JL. Facial disability index (FDI): adaptation to spanish, reliability and validity. Med Oral Patol Oral Cir Bucal. 2012;17(6):1006.
Marsk E, Hammarstedt-Nordenvall L, Engstrom M, Jonsson L, Hultcrantz M. Validation of a swedish version of the facial disability index (FDI) and the facial clinimetric evaluation (FaCE) scale. Acta Otolaryngol. 2013;133(6):662–9. https://doi.org/10.3109/00016489.2013.766924.
Pavese C, Cecini M, Camerino N, et al. Functional and social limitations after facial palsy: Expanded and independent validation of the italian version of the facial disability index. Phys Ther. 2014;94(9):1327–36. https://doi.org/10.2522/ptj.20130254.
Volk GF, Steigerwald F, Vitek P, Finkensieper M, Kreysa H, Guntinas-Lichius O. Facial disability index and facial clinimetric evaluation scale: Validation of the german versions. Laryngorhinootologie. 2015;94(3):163–8. https://doi.org/10.1055/s-0034-1381999.
Barry P, Mancini J, Alshukry A, Salburgo F, Lavieille J, Montava M. Validation of french versions of the facial disability index and the facial clinimetric evaluation scale, specific quality of life scales for peripheral facial palsy patients. Clin Otolaryngol. 2019;44(3):313–22. https://onlinelibrary.wiley.com/doi/abs/10.1111/coa.13294. https://doi.org/10.1111/coa.13294.
Graciano AJ, Bonin MM, Mory MR, Tessitore A, Paschoal JR, Chone CT. Translation, cultural adaptation and validation of the facial disability index into brazilian portuguese. Braz J Otorhinolaryngol. 2019. https://www.sciencedirect.com/science/article/pii/S1808869418303902. https://doi.org/10.1016/j.bjorl.2019.04.003.
Kahn JB, Gliklich RE, Boyev KP, Stewart MG, Metson RB, McKenna MJ. Validation of a patient-graded instrument for facial nerve paralysis: The FaCE scale. Laryngoscope. 2001;111(3):387–98. https://doi.org/10.1097/00005537-200103000-00005.
Kleiss IJ, Beurskens CH, Stalmeier PF, Ingels KJ, Marres HA. Quality of life assessment in facial palsy: Validation of the dutch facial clinimetric evaluation scale. Eur Arch Otorhinolaryngol. 2015;272(8):2055–61. https://doi.org/10.1007/s00405-015-3508-x.
Ware J, Kosinski M, Keller SD. A 12-item short-form health survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34(3):220–33.
Mehta RP, WernickRobinson M, Hadlock TA. Validation of the synkinesis assessment questionnaire. Laryngoscope. 2007;117(5):923–6. https://doi.org/10.1097/MLG.0b013e3180412460.
Kleiss IJ, Beurskens CH, Stalmeier PF, Ingels KJ, Marres HA. Synkinesis assessment in facial palsy: Validation of the dutch synkinesis assessment questionnaire. Acta Neurol Belg. 2016;116(2):171–8. https://doi.org/10.1007/s13760-015-0528-7.
Ross BG, Fradet G, Nedzelski JM. Development of a sensitive clinical facial grading system. Otolaryngol Head Neck Surg. 1996;114(3):380–6. https://doi.org/10.1016/S0194-5998(96)70206-1.
Terwee CB, Bot SD, de Boer MR, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60(1):34–42.
de Vet HCW, Bouter LM, Bezemer PD, Beurskens AJHM. Reproducibility and responsiveness of evaluative outcome measures - theoretical considerations illustrated by an empirical example. Int J Technol Assess Health C. 2001;17(4):479. https://www.narcis.nl/publication/RecordID/vu2:oai:dare.ubvu.vu.nl:1871%2F22283.
Cheung GW, Rensvold RB. Evaluating goodness-of-fit indexes for testing measurement invariance. Struct Equ Model Multidiscip J. 2002;9(2):233–55.
Fan X, Sivo SA. Sensitivity of fit indices to model misspecification and model types. Multivar Behav Res. 2007;42(3):509–29.
Hoyle RH, Duvall JL. Determining the number of factors in exploratory and confirmatory factor analysis. In: Kaplan D, editor. The SAGE Handbook of Quantitative Methodology for Social Sciences; 2004. p. 301–15.
Marsh HW, Hau KT, Wen Z. In search of Golden rules: comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Struct Equ Model. 2004;11(3):320–41.
Nunnally JC. Psychometric theory. New York: McGraw-Hill; 1978.
Terwee CB, Mokkink LB, Knol DL, Ostelo RWJG, Bouter LM, de Vet HCW. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual Life Res. 2012;21(4):651–7.
Park MS, Kang KJ, Jang SJ, Lee JY, Chang SJ. Evaluating test-retest reliability in patient-reported outcome measures for older people: a systematic review. Int J Nurs Stud. 2018 Mar;79:58–69.
The authors thank Britt ten Hoope, Charlotte Veuger, Sam Barclay, and Natali Talukder for their time and help in the translation process, and Britt van Veen with her help in performing confirmatory factor analysis.
No funding was received to perform this work.
Ethics approval and consent to participate
All procedures performed in studies involving human participants were in accordance with the ethical standards of the Institutional Review Board of the University Medical Centre Groningen (METc number 2018/562) and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
Consent for publication
The authors declare that they have no conflict of interest.
Cite this article
van Veen, M.M., Bruins, T.E., Artan, M. et al. Health-related quality of life in facial palsy: translation and validation of the Dutch version Facial Disability Index. Health Qual Life Outcomes 18, 256 (2020). https://doi.org/10.1186/s12955-020-01502-0
- Facial palsy
- Facial disability index
- Quality of life
- Smallest detectable change