Health outcome assessment is an important component of patient care. Patient-reported outcome (PRO) measures [1, 2] are primarily used to objectively reflect a patient’s health or functional status at a given time and to detect changes in this status in response to an intervention. This assists clinicians’ understanding of the effects of a condition or disease on a patient’s capabilities, functioning and symptoms. Traditionally, clinical signs and symptoms were used as outcomes, and studies that wished to reflect patient health status employed generic quality of life measures. However, over the last two decades, region-specific PROs that represent the three key body regions of the upper limb, lower limb and spine have been used more frequently to assess the functional status of musculoskeletal patients. The Upper Limb Functional Index (ULFI) is a recent example. It was initially published in a dichotomous format, then updated and modified to a three-point scale. These regional PRO measures are argued to provide greater sensitivity and a better representation of the individual’s functional status than joint- or condition-specific measures [7–9]. Though various region-specific PROs have been used to assess upper limb functional status, it is accepted that ‘there is no gold standard’ [8, 10–12]. These tools also guide treatment decisions and assess the effectiveness of interventions, including direct comparisons between pre- and post-operative status, and subsequently during rehabilitation.
Several regional upper limb PROs are advocated and recommended, through institutional websites and subject-related journals, by national associations and organizations around the world for Physical, Occupational and Hand Therapy, Orthopedics and Surgery. The Disabilities of the Arm, Shoulder and Hand (DASH) [14–16] and its shortened version, the QuickDASH, are two prominent examples. However, the DASH has excessive internal consistency, with a documented Cronbach’s alpha value >0.95 [6, 8, 12, 19], the recognized upper limit beyond which ‘item redundancy’ is indicated, that is, too many items are too similar to enable a valid change to be detected. The factor structure has also been challenged [21–23], which further questions validity. A questionnaire must have a single-factor structure so that its items can be summated to provide a single or summary score, and that score should not be influenced by other constructs such as psychological or emotional status [24, 25]. The QuickDASH, derived from 11 extracted DASH items, has also been challenged. Its factor structure has not been consistently shown to be one-dimensional [7, 26–28], which raises concerns about its validity, and it has been found to underestimate symptoms and overestimate disability. Several other regional PROs are also advocated. The Upper Extremity Functional Scale (UEFS) has been shown to lack reliability and methodological criteria [5, 31]. The Upper Extremity Functional Index (UEFI) is criticized for its development methodology, which used a specific worker population in a small data set with a high average age [6, 8]. It has subsequently been independently validated but uses a matrix response format that is prone to completion and scoring errors. It is also reported to have no advantage over the DASH for measuring clinical change. The Neck and Upper Limb Index (NULI) has been shown to have item redundancy from excessive internal consistency, and concerns have been raised about its development. There are also a significant number of joint- and condition-specific scales, but these cannot be used regionally as they do not consider the upper limb as a single kinetic chain [8, 18].
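To make the internal consistency criterion discussed above concrete, Cronbach’s alpha for a questionnaire with k items can be computed from the item variances and the variance of the summed score, alpha = k/(k − 1) × (1 − Σ item variances / total score variance). The following is a minimal illustrative sketch only, not the analysis used in any of the cited studies; the function name, the example responses and the use of the 0.95 value as a hard cut-off are assumptions made for illustration.

```python
from statistics import variance

# Upper limit for internal consistency cited in the text; treating it as a
# hard cut-off is an assumption made for this illustration.
REDUNDANCY_THRESHOLD = 0.95


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a list of respondents' item scores.

    `scores` is a list of rows, one row per respondent, each row holding
    that respondent's score on every item (hypothetical data layout).
    """
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of the summed (total) score across respondents.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical responses: four respondents, four items that are
    # answered identically, i.e. fully redundant items.
    responses = [
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [1, 1, 1, 1],
    ]
    alpha = cronbach_alpha(responses)
    flag = "possible item redundancy" if alpha > REDUNDANCY_THRESHOLD else "acceptable"
    print(f"alpha = {alpha:.3f} ({flag})")
```

In this sketch, items that are answered identically by every respondent drive alpha to its maximum, which is the pattern the item redundancy criterion is intended to flag.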
The three-point response option improved both the responsiveness and the practicality of the ULFI. The ULFI was shown to have strong psychometric properties for reliability, validity, responsiveness, measurement error, and internal consistency that approximated or exceeded those of the DASH and UEFS. The ULFI’s practical characteristics of brevity, ready transferability to a 100-point scale, and ease and rapidity of completion and scoring reinforced the methodological consistency [7, 26, 38]. Comparative analyses in separate studies suggest that the ULFI is preferable to the criterion tools of the DASH [6, 17, 38], UEFS and QuickDASH [7, 26] due to a combination of enhanced psychometric and practical characteristics. A further consideration was that the ULFI has a single-factor structure and an acceptable level of internal consistency in all studies.
The ULFI has also been accepted by the international PRO database ‘PROQUOLID’. A Spanish version of the ULFI had not been developed or validated to date. This is significant given that Spanish is one of the five most spoken languages and the second most widely distributed geographically. Consequently, a Spanish version of the ULFI (ULFI-Sp) was developed to meet this need. The four published studies to date investigating the ULFI suggest that its practical characteristics, along with its responsiveness and error range [4, 8], are consistently defined [6, 7, 26, 38]. Therefore, the aims of this paper were to describe the process of translation and cross-cultural adaptation of the original ULFI to Spanish, and subsequently to assess the four critical psychometric properties of reliability, factor structure, internal consistency, and concurrent criterion validity for clinical use with Spanish speakers.