An evaluation of the psychometric properties of the SF-12v2 health survey among adults with hemophilia



This study examined the psychometric properties of version 2 of the SF-12 Health Survey (SF-12v2) among adults with hemophilia in the United States.


This study employed a cross-sectional design using web-based and paper-based self-administered surveys. Hemophilia patients were recruited using an online panel and at a hemophilia treatment clinic. The psychometric properties of the SF-12v2 were assessed in terms of construct validity, internal consistency reliability, and presence of floor and ceiling effects.


A total of 218 adults with hemophilia completed the survey, with most recruited via the online panel (78%). Confirmatory factor analysis using the WLSMV estimator in Mplus supported a two-factor model for the SF-12v2 where the physical functioning, role physical, bodily pain, and general health items loaded onto a latent physical factor (LPF) and the role emotional, mental health, social functioning, and vitality items loaded onto a latent mental factor (LMF). Model fit statistics for the two-factor model were: Chi-square [df] = 172.778 [48]; CFI = 0.972; TLI = 0.962; RMSEA [90% CI] = 0.109 [0.092–0.127]; WRMR = 0.947. Correlated residuals for items belonging to similar domains were estimated and there was a significant correlation between LPF and LMF. All standardized factor loadings were strong and statistically significant, indicating adequate convergent validity. Item-to-other scale correlations were lower than item-to-hypothesized scale correlations, suggesting good item discriminant validity. Model testing revealed that LPF and LMF were not perfectly correlated, suggesting adequate construct discriminant validity. Increasing levels of symptom severity were associated with significant decreases in physical component summary (PCS) and mental component summary (MCS) scores, supporting known-groups validity. Internal consistency reliability was satisfactory, with Cronbach’s alpha of 0.848 for the LPF and 0.785 for the LMF items. Finally, none of the participants received the lowest or highest possible PCS or MCS score, indicating the absence of floor and ceiling effects.


Overall, the SF-12v2 was found to have adequate psychometric validity in our sample of adults with hemophilia. These results add to the growing evidence of psychometric validity of the SF-12v2 in different patient populations including hemophilia.


Hemophilia is a rare X-linked chronic genetic blood coagulation disorder seen predominantly among males. It is caused by a deficiency of clotting factor VIII or IX in blood plasma. It affects about 400,000 people worldwide and about 20,000 in the United States (US) [1, 2]. Patients with hemophilia experience bleeding into joints and muscles, which in severe cases can lead to chronic pain, reduce the range of joint motion, and eventually progress to chronic arthritis [3].

For patients living with hemophilia, merely treating and preventing bleeding episodes and other physical symptoms using clotting factor concentrates is not enough. Patients with hemophilia must be careful about participating in activities such as contact sports because immediate bleeding may ensue. Long-term impairments in mobility and functional status due to reduced range of joint motion may also limit the activities in which patients can participate. This can affect social participation and peer integration [4, 5]. Employment and occupational disabilities can occur as well. The disease can also affect the mental well-being of patients, among whom signs of depression, anxiety, and psychological distress are common [6]. Thus, the physical, mental, and social consequences of the disease reduce patients’ health-related quality of life (HRQOL). HRQOL assessment is therefore now recognized as an important health outcomes endpoint that can help guide and optimize treatment among patients with hemophilia. Overall, HRQOL is a multidimensional, subjective concept that incorporates physical functioning, psychological functioning, social interaction, and somatic sensation [7].

One key aspect of measuring the HRQOL of a population is the selection of an appropriate instrument. The SF-12 Health Survey version 2 (SF-12v2) is a generic measure of HRQOL [8]. Generic instruments allow for comparison of patients’ health status across disease states and conditions [9]. However, generic HRQOL measures may be less sensitive to key aspects or symptoms of a particular disease state and, as a result, may not capture small changes in the HRQOL of patients with that disease [10]. Disease-specific HRQOL measures, on the other hand, focus on problems that may be specific to a disease population, but they cannot be used to compare HRQOL across different disease states. Such comparisons may be important to clinicians and policy makers in making key treatment and resource allocation decisions. Given their underlying utility, it is necessary to obtain evidence about the appropriateness of use of generic HRQOL measures (such as the SF-12v2) in different patient populations [11].

Initial evidence regarding the reliability and validity of the SF-12 in the general US population was provided by Ware and colleagues in 1996 using data from the National Survey of Functional Health Status (NSFHS) and the Medical Outcomes Study (MOS) [12]. The instrument has since been evaluated for use among general populations in several countries, including Denmark, Germany, the United Kingdom, the Netherlands, and the United States [13,14,15,16], as well as among patients with different conditions, including Parkinson’s disease, stroke, diabetes mellitus, and inflammatory rheumatic disease, and among patients undergoing hemodialysis [17,18,19,20,21]. The results of these studies suggest that the SF-12 has good psychometric properties. The SF-12v2 is an abbreviated version of the SF-36, which is one of the most commonly used generic HRQOL measures [9].

Although the SF-12v2 has been used to assess the HRQOL of hemophilia patients [22, 23], its psychometric properties have never been established among patients with hemophilia. To ascertain that the SF-12v2 is appropriate for use among hemophilia patients, its psychometric properties must be established in this population. Therefore, this study evaluated the psychometric properties of the SF-12v2 among adult patients with hemophilia in the US. The psychometric properties of the SF-12v2 assessed included: convergent validity, discriminant validity, known-groups validity, factorial validity using confirmatory factor analysis, and internal consistency reliability. Presence of floor and ceiling effects was also examined.



This study used a cross-sectional design in which a web-based, self-administered survey was distributed to a national convenience sample of adults with hemophilia in the United States. Study approval was obtained from the University of Mississippi Institutional Review Board under the exempt status.

Potential participants were sent an email explaining the objective and scope of the study. This email assured the respondents that their information would be kept confidential. The email also contained a URL link to the survey which was programmed in Qualtrics [24]. The survey was open from October 31, 2015 to January 31, 2016. All respondents were provided $10 Amazon gift cards for participation in the study.


The sample included adults (≥ 18 years of age) with hemophilia A or B. Patients with other blood coagulation disorders such as Von Willebrand’s disease were excluded from the study sample. The sample was recruited with the help of a market research vendor called Rare Patient Voice [25], which maintains a panel of hemophilia patients who were primarily recruited at hemophilia-related conferences and patient advocacy group meetings across the US. Because hemophilia is a rare disease, patients were also recruited through a Facebook community of hemophilia patients called Hemo Friends and at the University of Mississippi Medical Center (UMMC) hemophilia treatment center (HTC) to maximize the analyzable sample size for the current study. In this study, 169 (77.5%) patients were recruited using the Rare Patient Voice panel, 44 (20.2%) from the Hemo Friends Facebook community, and 5 (2.3%) from UMMC. Given the nature of the statistical analysis plan for this study (i.e., confirmatory factor analysis), an a priori sample size of 200 patients with hemophilia was considered to be adequate [26].


Patients with hemophilia were asked to describe their HRQOL using the SF-12 Health Survey Version 2 (SF-12v2). The SF-12 is the shorter version of the SF-36 [12]. The SF-12v2 is a generic health profile instrument with 12 items that cover 8 health concepts [8]. These eight sub-domains are: physical functioning (PF), role physical (RP), bodily pain (BP), general health (GH), vitality (VT), social functioning (SF), role-emotional (RE), and mental health (MH). The eight sub-domain scores can be weighted and summarized into two component scores: the physical component summary (PCS) score and the mental component summary (MCS) score. According to the theoretical test model, the items from the physical functioning, role-physical, bodily pain, and general health sub-domains are primarily indicators of PCS, while the vitality, social functioning, role-emotional, and mental health items are primarily indicators of MCS [18]. For the SF-12v2, the norm-based PCS and MCS scores for the general US population have a mean of 50 and a standard deviation of 10, with higher scores indicating better health status [8]. PCS, MCS, and sub-domain scores were calculated using the scoring software available from Optum (using 2009 US norms).

Hemophilia-related symptom severity was reported using the Patient Global Impression of Severity (PGI-S). The PGI-S is a single self-reported item that asks respondents to rate the severity of their disease condition. In this study, the PGI-S was worded: “When thinking about all of the hemophilia-related symptoms that you may have experienced during the past 4 weeks, please indicate the one option that best describes how your symptoms overall have been: (1) no symptoms, (2) mild symptoms, (3) moderate symptoms, or (4) severe symptoms.” A similar self-reported symptom severity measure has been used in studies of males with lower urinary tract symptoms secondary to benign prostatic hyperplasia and women with stress urinary incontinence [27, 28].


A descriptive analysis of the individual SF-12v2 items was conducted in terms of means and standard deviations (SD). Missing data, if any, were reported as frequencies and percentages at the item level. Kurtosis and skewness coefficients were also calculated, and variables with an absolute skew index > 3.0 and kurtosis index > 10.0 were considered to be non-normal [26]. Descriptive statistics were calculated for all other study variables in the form of frequencies and percentages for categorical variables and means and standard deviations for continuous variables.
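The normality screen described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: it assumes moment-based skewness and excess kurtosis, treats an item as non-normal if either cutoff is exceeded, and uses made-up item responses. The function names are hypothetical.

```python
from statistics import fmean


def skew_and_kurtosis(values):
    """Return moment-based skewness and excess kurtosis (assumed definitions)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def is_non_normal(values, skew_cut=3.0, kurt_cut=10.0):
    """Flag an item when |skew| > 3.0 or |kurtosis| > 10.0 (assumed 'either' rule)."""
    skew, kurt = skew_and_kurtosis(values)
    return abs(skew) > skew_cut or abs(kurt) > kurt_cut


# Hypothetical responses to one 5-point item
item = [1, 2, 2, 3, 3, 3, 4, 4, 5]
print(skew_and_kurtosis(item), is_non_normal(item))
```

Other software may report bias-corrected sample estimates, so values can differ slightly from this sketch in small samples.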

Confirmatory factor analysis (CFA) was used to evaluate the factor structure of the SF-12v2 among patients with hemophilia. CFA is a structural equation modeling technique which can be used to evaluate the fit of a theoretically-based measurement model. Three measurement models were tested. The first was a 1-factor model (Model 1) which forced all the SF-12v2 items to load on a single latent factor. The second was a 2-factor model based on the approach adopted by Okonkwo and colleagues (Model 2) [18], where the PF, RP, and BP items were specified to load on a latent physical health factor (LPF), the RE and MH items were specified to load on a latent mental health factor (LMF), and the three GH, VT, and SF items were allowed to load on both latent factors. The residuals for the two PF items were allowed to correlate in both the 1-factor and 2-factor models, as modeled by Okonkwo et al. [18]. The third was a 2-factor model employed by Maurischat and colleagues (Model 3) [20], where the GH, PF, RP, and BP items were specified to load on the LPF and the RE, MH, VT, and SF items were specified to load on the LMF. The residuals for each of the two PF, RP, RE, and MH items were allowed to correlate, as modeled by Maurischat et al. [20]. In both Models 2 and 3, LPF and LMF were allowed to correlate.

Considering that the items on the SF-12v2 are ordered categorical variables with limited response options, along with the possibility of item responses skewed toward one end (i.e., the presence of floor or ceiling effects), weighted least squares estimation (WLSMV) for categorical indicators was used to quantify the hypothesized relationships [29]. All CFA models were estimated using Mplus version 7.31 (Muthén & Muthén, Los Angeles, CA). Model fit for each model was assessed using the following five fit statistics: the χ2 statistic, the root mean square error of approximation (RMSEA), the Tucker Lewis Index (TLI), the comparative fit index (CFI), and the weighted root mean square residual (WRMR). Bagozzi and Yi (2012) suggest that for a well-fitting model, the RMSEA, TLI, and CFI should be ≤ 0.08, ≥ 0.92, and ≥ 0.93, respectively [30]. For a good-fitting model, WRMR should be less than or equal to 1 [31].

Given that the LPF and LMF loadings are sample specific, an additional model was fit to estimate the correlations of the latent factors with PCS and MCS scores from the standard algorithm.

Factor loadings from the CFA models and item-scale correlations were used to assess convergent validity among the items. The size of the factor loading is an indication of the amount of variance in a particular item that is explained by the latent construct. For the current study, standardized factor loadings that were statistically significant and greater than 0.5 were considered to be indicative of good convergent validity [26, 31]. Statistical significance of the factor loadings was considered as a minimum requirement because a significant loading could be weak or moderate in strength.

Higher item-scale correlations (Pearson’s correlation between the score on an individual item in a sub-domain and the total score on the underlying sub-domain) indicate that expected items in the same sub-domains correlate strongly with each other. This approach to establishing convergent validity has been used in previous studies [32]. Item-scale correlations of 0.1–0.29 were considered small, 0.3–0.49 moderate, and ≥ 0.50 strong [33]. A strong correlation of the items belonging to the GH, PF, RP, and BP sub-domains with PCS was hypothesized. Similarly, a strong correlation between items representing the RE, MH, VT, and SF sub-domains and MCS was hypothesized.
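An item-scale correlation of the kind described above can be computed as follows. This is a minimal sketch with hypothetical data and function names; the labelling function assumes the conventional Cohen-style cut-points of 0.1, 0.3, and 0.5, which is an assumption of this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Label |r| using assumed cut-points of 0.1, 0.3, and 0.5."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "small"
    return "negligible"


# Hypothetical item responses and summary scale scores for six respondents
item_scores = [1, 2, 2, 3, 4, 5]
scale_scores = [30.0, 35.0, 38.0, 44.0, 50.0, 58.0]
r = pearson_r(item_scores, scale_scores)
print(round(r, 3), strength(r))
```

The same function applied to an item and the summary score it is not hypothesized to measure gives the item-to-other scale correlation used for item discriminant validity.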

To assess latent construct discriminant validity, the fit of the best fitting 2-factor model obtained from the factorial validity analysis was compared to that of a similar model where the latent factor correlation (i.e., correlation between LPF and LMF) was fixed to 1; this test was carried out using the DIFFTEST option in Mplus [34, 35]. A significant difference in the model fit (χ2 statistic) between the two models was suggestive of discriminant validity [36].

Item discriminant validity was assessed using item-to-other scale correlations [37]; correlations ≤ 0.40 were considered suggestive of adequate discriminant validity. The reasoning behind this technique is that items from different domains should have low or no correlations with each other. A weak correlation of the items belonging to the GH, PF, RP, and BP sub-domains with MCS was hypothesized. Similarly, a weak correlation between items representing the RE, MH, VT, and SF sub-domains and the PCS summary scale score was hypothesized.

Known-groups validity is the ability of an instrument to differentiate among individuals who have varying levels of disease severity. One-way ANOVA was used to compare mean PCS and MCS scores from the SF-12v2 across hemophilia patients with different symptom severity levels measured using the PGI-S.
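The F statistic behind a one-way ANOVA can be sketched as below. The scores are hypothetical and the function name is illustrative; the p-value is omitted because it requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) would supply.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean
    ss_within = sum(
        (score - mean) ** 2
        for group, mean in zip(groups, group_means)
        for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical PCS scores for four PGI-S groups (none, mild, moderate, severe)
pcs_by_severity = [
    [52.0, 49.5, 51.0],
    [47.0, 48.5, 46.0],
    [41.0, 40.0, 42.5],
    [35.0, 36.5, 34.0],
]
print(one_way_anova_f(pcs_by_severity))
```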

In order to evaluate the internal consistency reliability for the SF-12v2, Cronbach’s alpha (α) was calculated for the LPF and the LMF items. An α ≥ 0.70 was considered to be suggestive of adequate internal consistency reliability [32].
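Cronbach's alpha for a set of items can be computed from the item variances and the variance of the summed score. The sketch below uses the standard formula with hypothetical responses; the function name and data are illustrative, not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items, same order)."""
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from five respondents to four items
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 4, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3), alpha >= 0.70)
```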

To assess the floor and ceiling effects of the SF-12v2, the percentages of adults with hemophilia who received the lowest possible and the highest possible PCS and MCS scores were determined. Floor and ceiling effects were considered to be present if more than 20% of the respondents received the lowest or the highest possible PCS or MCS score [32, 38]. Given the estimation technique used for the CFA and the treatment of the items as categorical indicators, floor and ceiling effects of the individual SF-12v2 items were not considered to be problematic.
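The floor and ceiling rule can be expressed as a short check. This is a sketch with hypothetical respondent scores; the PCS bounds used in the example (4.92 and 69.24) are the minimum and maximum reported in the Results, and the function name is illustrative.

```python
def floor_ceiling(scores, lowest, highest, threshold=20.0):
    """Percent of respondents at the scale bounds, flagged against a threshold."""
    n = len(scores)
    pct_floor = 100.0 * sum(1 for s in scores if s <= lowest) / n
    pct_ceiling = 100.0 * sum(1 for s in scores if s >= highest) / n
    return {
        "pct_floor": pct_floor,
        "pct_ceiling": pct_ceiling,
        "floor_effect": pct_floor > threshold,
        "ceiling_effect": pct_ceiling > threshold,
    }


# Hypothetical PCS scores; bounds are the PCS minimum and maximum cited in the Results
pcs_scores = [16.95, 33.4, 43.7, 51.2, 67.63]
print(floor_ceiling(pcs_scores, lowest=4.92, highest=69.24))
```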


The final study sample consisted of 218 adults with hemophilia (Table 1). Most participants had hemophilia A (77.5%), were male (79.5%), and were Caucasian (68.5%). The mean age of the study sample was 35.45 years (SD = 12.3). Hepatitis C (36.5%) and depression (38.4%) were the most commonly reported comorbidities.

Table 1 Demographic and clinical characteristics of the sample

Table 2 shows the mean scores and the skewness and kurtosis coefficients on a per-item basis for the SF-12v2. The skewness and kurtosis coefficients for all items on the SF-12v2 ranged from −1.00 to 0.63. The mean PCS was 43.68 (SD = 10.20) and the mean MCS was 46.48 (SD = 10.09) among adults with hemophilia. Mean PCS and MCS scores were lower than the norm scores for the general healthy US population. This indicated that adults with hemophilia had a worse overall HRQOL as compared to the US norm population. There were no missing data for any of the SF-12v2 items.
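Because the norm-based scores have a mean of 50 and a standard deviation of 10 in the general US population, the sample means above can be expressed in norm standard deviation units. The following is a simple arithmetic sketch; the function name is illustrative.

```python
NORM_MEAN = 50.0  # norm-based mean for the general US population
NORM_SD = 10.0    # norm-based standard deviation


def sd_units_from_norm(score):
    """Distance of a norm-based score from the US norm mean, in SD units."""
    return (score - NORM_MEAN) / NORM_SD


# Sample means reported in this study
for label, sample_mean in (("PCS", 43.68), ("MCS", 46.48)):
    print(label, round(sd_units_from_norm(sample_mean), 3))
```

On this scale, the mean PCS lies roughly 0.63 SD and the mean MCS roughly 0.35 SD below the norm mean.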

Table 2 SF-12v2 item-level characteristics and PCS/MCS scores among adults with hemophilia

The three measurement models tested to examine the factorial validity of the SF-12v2 among adults with hemophilia can be found in Fig. 1. The model fit indices for the three models can be found in Table 3. The two-factor model based on the approach used by Maurischat et al. [20] had the best fit among the three models (Chi-square [df] = 270.183 [49]; CFI = 0.952; TLI = 0.935; RMSEA [90% CI] = 0.144 [0.127–0.162]; WRMR = 1.250). Based on the modification indices for Model 3, residuals for items 9 and 10 (i.e., MH09 and VT10) were correlated in addition to the correlated residuals already specified in the Maurischat et al. model. This significantly improved model fit of the final model (Chi-square [df] = 172.778 [48]; CFI = 0.972; TLI = 0.962; RMSEA [90% CI] = 0.109 [0.092–0.127]; WRMR = 0.947).
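The reported fit indices can be compared with the cutoffs stated in the Methods using a short sketch. The RMSEA function uses the textbook point-estimate formula, which is an assumption here; the exact computation Mplus applies under WLSMV may differ slightly. Function names are illustrative.

```python
from math import sqrt


def rmsea_point_estimate(chi_square, df, n):
    """Textbook RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def meets_cutoffs(rmsea, tli, cfi, wrmr):
    """Compare fit indices with the cutoffs cited in the Methods."""
    return {
        "RMSEA": rmsea <= 0.08,
        "TLI": tli >= 0.92,
        "CFI": cfi >= 0.93,
        "WRMR": wrmr <= 1.0,
    }


# Fit indices reported for the final model (N = 218)
final_model = meets_cutoffs(rmsea=0.109, tli=0.962, cfi=0.972, wrmr=0.947)
print(final_model)
print(round(rmsea_point_estimate(172.778, 48, 218), 3))
```

Under these assumptions, the formula reproduces the reported RMSEA values for both the initial and the final Model 3 to three decimal places. By the cutoffs as stated, the CFI, TLI, and WRMR of the final model fall within range, while the RMSEA remains above 0.08.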

Fig. 1 a: Single-Factor Model (Model 1) for the SF-12v2. b: Two-Factor Model (Model 2) for the SF-12v2 based on Okonkwo et al. c: Two-Factor Model (Model 3) for the SF-12v2 based on Maurischat et al.

Table 3 Summary of model fit indices for the SF-12v2 confirmatory factor models

For the best fitting model (i.e., Model 3 in Fig. 1c), the correlation between LPF and the PCS score was 0.996 (p < 0.0001) while the correlation between LMF and MCS was > 0.999 (p < 0.0001). This suggested that there was a high and significant correlation between the sample-specific latent factors (i.e., LPF and LMF) and the PCS and MCS scores calculated using population-based weighting coefficients.

The standardized factor loadings for the final study model (Fig. 1c) can be found in Table 4. All factor loadings were statistically significant (p < 0.05), and all but one (MH09 on LMF) were greater than 0.5. Table 5 depicts the item-scale correlation matrix. Items comprising the PF, RP, GH, and BP sub-domains had a strong and statistically significant correlation with the PCS, while the RE, MH, VT, and SF items were strongly correlated with the MCS summary scale score. Overall, the standardized factor loadings and item-scale correlations suggested acceptable convergent validity for the SF-12v2 among adults with hemophilia.

Table 4 Standardized factor loadings for the final two-factor model of HRQOL (Model 3) for the SF-12v2 among adults with hemophilia
Table 5 Item-scale correlations for the SF-12v2 among adults with hemophilia

The fit of the final two-factor model (Model 3), in which the correlation between LPF and LMF was freely estimated, was compared to that of a model in which the correlation between LPF and LMF was fixed to one. Although the correlation between LMF and LPF was high (r = 0.83), the test yielded a significant difference in the chi-square value (Δχ2 [df] = 18.686 [1]; p < 0.0001), suggesting that LPF and LMF are not perfectly correlated (i.e., latent construct discriminant validity) [34, 36]. Items comprising the PF, RP, GH, and BP subdomains had a weak correlation with the MCS summary scale score, while the RE, MH, VT, and SF items had a weak to moderate correlation with the PCS summary scale score, supporting item discriminant validity. Overall, the SF-12v2 was found to have acceptable discriminant validity among adults with hemophilia.
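The p-value attached to the chi-square difference reported above can be checked by hand. For one degree of freedom, the upper-tail probability of a chi-square statistic equals erfc(sqrt(x / 2)). The sketch below takes the reported difference statistic as given; it does not reproduce the adjustment that the Mplus DIFFTEST procedure applies when deriving that statistic, and the function name is illustrative.

```python
from math import erfc, sqrt


def chi_square_p_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2.0))


# Chi-square difference reported for the free vs. fixed-correlation models
p_value = chi_square_p_df1(18.686)
print(p_value, p_value < 0.0001)
```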

The ability of the SF-12v2 to discriminate among hemophilia patient groups defined by the PGI-S (i.e., no symptoms, mild symptoms, moderate symptoms, and severe symptoms) was assessed using a one-way ANOVA (Table 6). Differences in PCS and MCS scores between individual groups were assessed using Tukey’s honestly significant difference (HSD) tests. The mean PCS (50.10 vs 47.39 vs 40.79 vs 35.24; p < 0.0001) and MCS (50.61 vs 46.80 vs 46.96 vs 42.29; p = 0.007) scores were significantly different across the four symptom severity levels. A definite gradation was observed in terms of PCS and MCS mean scores with increasing levels of symptom severity on the PGI-S.

Table 6 Known-groups validity for the SF-12v2 components among adults with hemophilia

The internal consistency reliability for the SF-12v2 was found to be satisfactory with the Cronbach’s alpha value of 0.848 for LPF and 0.785 for LMF.

Less than 20% of the study sample received the lowest or highest possible PCS or MCS summary scale score, which was indicative of the absence of floor and ceiling effects. The minimum and maximum PCS scores for the study sample were 16.95 and 67.63, respectively. The minimum and maximum MCS scores were 15.75 and 68.91, respectively. The minimum and maximum PCS scores for the general US population as per the SF-12v2 scoring manual are 4.92 and 69.24, respectively [8], while the minimum and maximum MCS scores for the US norm population are 8.14 and 73.24, respectively. Therefore, none of the respondents from our study sample received the lowest or highest possible score as compared to the general US population.


As HRQOL continues to evolve as a key endpoint among patients with hemophilia, so does the need for psychometrically-sound generic instruments which measure HRQOL. Such instruments not only allow one to ascertain the burden of hemophilia on patient HRQOL, but also compare their HRQOL to the healthy US population and across subgroups of individuals suffering from other diseases. The current study assessed the validity (factorial, convergent, discriminant, and known-groups) and internal consistency reliability of the SF-12v2, a generic measure of HRQOL, among adults with hemophilia.

Factorial validity of the SF-12v2 was tested by examining the model fit indices across three different models. A two-factor model based on the approach adopted by Maurischat and colleagues [20] was found to be the best fitting model in this population. Previous studies have also conceptualized the SF-12v2 as a two-factor model in which items related to the GH, PF, RP, and BP subdomains loaded onto a LPF, items related to the RE, MH, VT, and SF subdomains loaded onto a LMF, and the error covariances for items belonging to the same subdomain (PF, RP, RE, and MH) were correlated. Items belonging to the same subdomain were expected to have additional commonality not explained by the latent factors due to similarities in item wording (i.e., a shared method effect), which warranted the specification of residual correlations for these items. A similar two-factor model for the SF-12v2 was found to have acceptable fit among patients with inflammatory rheumatic disease [19] and diabetes mellitus [20]. In the current study, modification indices suggested an additional residual correlation between MH09 (felt calm and peaceful) and VT10 (had a lot of energy). Residuals for these items on the SF-12 have previously been shown to be correlated by McBride et al. [39] in a sample of diagnostic orphans (i.e., adults with a type of alcohol dependence or use disorder) and by Fleishman and Lawrence [40] in a population of non-institutionalized US civilians.

The SF-12v2 was found to have good convergent and discriminant validity among adults with hemophilia. These findings were supported by factor loadings, the latent factor correlation, and correlations between the individual items and SF-12 subdomains. Although the latent factor correlation was high (0.83), the test for construct discriminant validity suggested that this correlation was significantly different from one. Additionally, a one-factor model had the worst model fit in the factorial validity analysis. These results provide evidence that a two-factor HRQOL model was appropriate and that the two latent factors (LPF and LMF) did indeed measure distinct concepts. This is important as a two-factor model forms the basis of the commonly reported PCS and MCS scores. Because other studies have reported smaller correlations between latent physical and mental factors using the SF-12 (i.e., range 0.5–0.7) [18,19,20], future research should examine reasons for the higher latent factor correlation found in the current study. The high correlations between the sample-specific latent factors and the PCS and MCS scores calculated using the standard scoring approach also provide support for the use of the summary scores in HRQOL research with adults with hemophilia. Such high correlations have also been observed in studies with different populations and slightly different factor structures for the latent variables [18], providing evidence of the generalizability of the standard scoring approach for the component summaries.

The results of the current study lend support to the known-groups validity of the SF-12v2 in terms of its ability to discriminate across different symptom severity levels among adults with hemophilia. PCS and MCS means were found to be significantly different across the four symptom severity groups. Additionally, significant decreases in PCS scores were associated with increasing levels of symptom severity. Although the severe symptoms group and no symptoms group were notably different on MCS scores as expected, the overall linear trend observed with PCS scores and symptom severity was not seen in the case of MCS scores. The mean MCS for the moderate symptom severity group was slightly greater, although not statistically different, than the MCS for the mild symptom severity group.

The internal consistency reliability of the LPF and LMF summary scales was found to be good. The PCS and MCS scale scores did not indicate the presence of any floor or ceiling effects. These results may indicate that the SF-12v2 is sensitive in capturing the variation in HRQOL among adults with hemophilia.

The results of the current study must be interpreted in the light of certain limitations. The cross-sectional nature of the study precluded the assessment of the predictive validity as well as test-retest reliability of the SF-12v2. Future studies should adopt a longitudinal design in order to explore these aspects of the psychometric profile of the SF-12v2. Adults with hemophilia who participated in this study are likely to have higher physical functioning because of their ability to participate in survey research. Also, future studies must examine the measurement invariance of the SF-12v2 among adults with hemophilia in addition to testing its psychometric properties in order to ensure the appropriateness of its use in this patient population.

This was the first study to assess the psychometric properties of the SF-12v2 among adults with hemophilia. Because hemophilia is a rare genetic disorder, most previously published reports have employed smaller sample sizes. To the best of our knowledge, this is the first US-based study to capture the HRQOL of such a large population of adults with hemophilia. The study sample included an even distribution of patients from all regions of the country, which supports the generalizability of the study results to most adults with hemophilia in the US.


This study provides evidence of the acceptable psychometric properties of the SF-12v2 among adults with hemophilia in the US. The SF-12v2 was found to be a valid and reliable generic measure of HRQOL among adults with hemophilia. The scale demonstrated adequate factorial, convergent, discriminant, and known-groups validity. It also had adequate internal consistency reliability, and no evidence of floor or ceiling effects was found. Overall, the results provide a basis for the future use of the SF-12v2 among adults with hemophilia and for incorporating the HRQOL information obtained from such studies into health policy and clinical decision making.



CFA: Confirmatory Factor Analysis

CFI: Comparative Fit Index

GH: General Health

HRQOL: Health-related Quality of Life

HTC: Hemophilia Treatment Center

LMF: Latent Mental Factor

LPF: Latent Physical Factor

MCS: Mental Component Summary

MH: Mental Health

MI: Modification Indices

PCS: Physical Component Summary

PF: Physical Functioning

PGI-S: Patient Global Impression of Severity

RE: Role Emotional

RMSEA: Root Mean Square Error of Approximation

RP: Role Physical

SEM: Structural Equation Modeling

SF: Social Functioning

SRMR: Standardized Root Mean Square Residual

TLI: Tucker Lewis Index

WLSMV: Weighted Least Squares Minimum Variance


  1. Centers for Disease Control and Prevention. Hemophilia - Data and Statistics. 2014 [cited 2015 May 9]. Available from:

  2. National Hemophilia Foundation. Fast facts about bleeding disorders. 2014 [cited 2015 Feb 10]. Available from:

  3. Dolan G, Hermans C, Klamroth R, Madhok R, Schutgens REG, Spengler U. Challenges and controversies in haemophilia care in adulthood. Haemophilia. 2009;15(Suppl 1):20–7.

  4. Aznar JA, Magallón M, Querol F, Gorina E, Tusell JM. The orthopaedic status of severe haemophiliacs in Spain. Haemophilia. 2000;6:170–6.

  5. Mackensen S. Quality of life and sports activities in patients with haemophilia. Haemophilia. 2007;13:38–43.

  6. Ghanizadeh A, Baligh-Jahromi P. Depression, anxiety and suicidal behaviour in children and adolescents with Haemophilia. Haemophilia. 2009;15:528–32.

  7. Schipper H, Clinch JJ, Olweny CLM. Quality of life studies: definitions and conceptual issues. In: Spilker B, editor. Quality of life and Pharmacoeconomics in clinical trials. Philadelphia: Lippincott-Raven Publishers; 1996. p. 11–23.

  8. Ware JE, Kosinski M, Turner-Bowker DM, Gandek B. How to score version 2 of the SF-12 health survey (with a supplement documenting version 1). Lincoln: QualityMetric Incorporated; 2002.

  9. Coons SJ, Rao S, Keininger DL, Hays RD. A comparative review of generic quality-of-life instruments. PharmacoEconomics. 2000;17:13–35.

  10. Guyatt GH, Feeny DH, Patrick DL. Measuring health-related quality of life. Ann Intern Med. 1993;118:622–9.

  11. Patrick DL, Deyo R. Generic and disease-specific measures in assessing health status and quality of life. Med Care. 1989;27:S217–32.

  12. Ware JE, Kosinski M, Keller SD. A 12-item short-form health survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34:220–33.

  13. Hanmer J, Lawrence WF, Anderson JP, Kaplan RM, Fryback DG. Report of nationally representative values for the noninstitutionalized US adult population for 7 health-related quality-of-life scores. Med Decis Mak. 2006;26:391–400.

  14. Kontodimopoulos N, Pappa E, Niakas D, Tountas Y. Health and quality of life. Health Qual Life Outcomes. 2007;5:1–9.

  15. Gandek B, Ware JE, Aaronson NK, Apolone G, Bjorner JB, Brazier JE, et al. Cross-validation of item selection and scoring for the SF-12 health survey in nine countries: Results from the IQOLA Project. J Clin Epidemiol. 1998;51:1171–8.

  16. Montazeri A, Vahdaninia M, Mousavi SJ, Omidvari S. The Iranian version of 12-item short form health survey (SF-12): factor structure, internal consistency and construct validity. BMC Public Health. 2009;9:341.

  17. Jakobsson U, Westergren A, Lindskov S, Hagell P. Construct validity of the SF-12 in three different samples. J Eval Clin Pract. 2012;18:560–6.

  18. Okonkwo OC, Roth DL, Pulley L, Howard G. Confirmatory factor analysis of the validity of the SF-12 for persons with and without a history of stroke. Qual Life Res. 2010;19:1323–31.

  19. Maurischat C, Ehlebracht-König I, Kühn A, Bullinger M. Factorial validity and norm data comparison of the short form 12 in patients with inflammatory-rheumatic disease. Rheumatol Int. 2006;26:614–21.

  20. Maurischat C, Herschbach P, Peters A, Bullinger M. Factorial validity of the short form 12 (SF-12) in patients with diabetes mellitus. Psychol Sci Q. 2008;50:7–20.

  21. Pakpour AH, Nourozi S, Molsted S, Harrison AP, Nourozi K, Fridlund B. Validity and reliability of short form-12 questionnaire in Iranian hemodialysis patients. Iran J Kidney Dis. 2011;5:175–81.

  22. Poon J-L, Doctor JN, Nichol MB. Longitudinal changes in health-related quality of life for chronic diseases: an example in hemophilia A. J Gen Intern Med. 2014;29:760–6.

  23. Duncan N, Shapiro A, Ye X, Epstein J, Luo MP. Treatment patterns, health-related quality of life and adherence to prophylaxis among haemophilia A patients in the United States. Haemophilia. 2012;18:760–5.

  24. Qualtrics. Available from:

  25. Rare Patient Voice: Helping Patients with Rare Diseases Voice Their Opinions. Available from:

  26. Kline RB. Principles and practice of structural equation modeling. 3rd ed. New York: Guilford Press; 2011.

  27. Viktrup L, Hayes RP, Wang P, Shen W. Construct validation of patient global impression of severity (PGI-S) and improvement (PGI-I) questionnaires in the treatment of men with lower urinary tract symptoms secondary to benign prostatic hyperplasia. BMC Urol. 2012;12:30.

  28. Yalcin I, Bump RC. Validation of two global impression questionnaires for incontinence. Am J Obstet Gynecol. 2003;189:98–101.

  29. Muthén B. A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika. 1984;49:115–32.

  30. Bagozzi RP, Yi Y. Specification, evaluation, and interpretation of structural equation models. J Acad Mark Sci. 2012;40:8–34.

  31. Yu CY. Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Los Angeles: University of California; 2002. Available from:

  32. Khanna R, Jariwala K, West-Strum D. Validity and reliability of the medical outcomes study short-form health survey version 2 (SF-12v2) among adults with autism. Res Dev Disabil. 2015;43–44:51–60.

  33. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. New York: Academic Press; 1988.

  34. Brown T. Confirmatory factor analysis for applied research. 2nd ed. New York: The Guilford Press; 2015.

  35. Muthén LK, Muthén BO. Mplus User’s Guide. 7th ed. Los Angeles, CA: Muthén & Muthén; 2015.

  36. Anderson JC, Gerbing DW. Structural equation modeling in practice: a review and recommended two-step approach. Psychol Bull. 1988;103:411–23.

  37. Gandek B, Sinclair SJ, Kosinski M, Ware JE Jr. Psychometric evaluation of the SF-36® health survey in Medicare managed care. Health Care Financ Rev. 2004;25:5–25.

  38. Holmes WC, Shea JA. Performance of a new, HIV/AIDS-targeted quality of life (HAT-QoL) instrument in asymptomatic seropositive individuals. Qual Life Res. 1997;6:561–71.

  39. McBride O, Adamson G, Bunting BP, McCann S. Assessing the general health of diagnostic orphans using the short form health survey (SF-12v2): a latent variable modelling approach. Alcohol Alcohol. 2009;44:67–76.

  40. Fleishman JA, Lawrence WF. Demographic variation in SF-12 scores: true differences or differential item functioning? Med Care. 2003;41:III75–86.


Acknowledgements

The authors thank the Rare Patient Voice panel for providing pro bono assistance with data collection.


Funding

No funding was received for this study.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Author information

Contributions

RS, BB, EH, AP, MB, RK, and JB designed and developed the research study. RS coordinated data collection and data management. RS and JB conducted the statistical analysis. All authors contributed to the interpretation of results and manuscript preparation, with RS taking the lead in manuscript preparation. JB critically reviewed the manuscript. All authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Ruchitbhai M. Shah.

Ethics declarations

Authors’ information

At the time of project completion, Ruchit Shah was a graduate student at the University of Mississippi.

Ethics approval and consent to participate

Ethical approval for the survey was granted by the University of Mississippi Institutional Review Board (UM-IRB). Upon opening the survey, panel members were directed to an information page about the study, including contact details for the Principal Investigator and UM-IRB. Respondents then provided consent to participate by clicking a link to start the survey.

Consent for publication

Not applicable. The manuscript does not contain data from any individual person.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Shah, R.M., Banahan, B.F., Holmes, E.R. et al. An evaluation of the psychometric properties of the sf-12v2 health survey among adults with hemophilia. Health Qual Life Outcomes 16, 229 (2018).
