Our study validated FACT-Ga in a clinically heterogeneous sample of Singaporean Chinese patients with GC. The study sample covered the full spectrum of clinical cases, which would allow the validated questionnaire to be applied to various diagnostic groups. To the best of our knowledge, this is the first study dedicated to validating FACT-Ga with the Chinese as the target population. Given that the target population lives in a bilingual culture, both the English and Chinese versions of the questionnaire were validated so that the FACT-Ga would be applicable in mainland Chinese as well as overseas Chinese populations.
For this sample of outpatients, the measurement ability of the FACT-Ga appears to be limited in evaluating the QoL outcomes of patients who survived GC relatively well. A ceiling effect was observed for all scores, especially the PWB, SWB and EWB subscales, for which the percentage of patients rating themselves in perfect health exceeded the notable threshold of 15% (Table 2). The core module FACT-G also showed a ceiling effect when applied to patients with other types of cancer from the same study population. A ceiling effect above 15% could have a negative impact on other psychometric properties of an instrument, for example, the sensitivity to change, as supported by the finding that FACT-G is weak in detecting improvement in a patient’s health status [9, 17]. The evidence thus far seems to suggest that the ceiling effect of FACT-Ga compromises its potential as an evaluative QoL instrument.
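The ceiling check described above can be sketched in a few lines. This is only an illustration, not the analysis code used in the study: the function names are hypothetical, the scores are made-up toy values, and the maximum of 28 assumes a seven-item subscale scored 0 to 4 per item.

```python
def ceiling_percentage(scores, max_score):
    """Percentage of respondents scoring exactly the maximum possible score."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_max = sum(1 for s in scores if s == max_score)
    return 100.0 * at_max / len(scores)


def has_ceiling_effect(scores, max_score, threshold_pct=15.0):
    """Flag a ceiling effect when more than threshold_pct percent sit at the maximum."""
    return ceiling_percentage(scores, max_score) > threshold_pct


# Toy subscale scores (hypothetical, NOT study data); assumed range 0-28.
toy_scores = [28, 28, 20, 15, 28, 10, 22, 25, 18, 12]
print(ceiling_percentage(toy_scores, max_score=28))   # 3 of 10 respondents at the maximum
print(has_ceiling_effect(toy_scores, max_score=28))
```

The 15% cut-off is passed as a parameter so that a different threshold can be tried without changing the function.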
The FACT-Ga questionnaire showed sensitivity to the clinical characteristics of different patient groups, supporting FACT-Ga as a discriminative QoL instrument. Treatment Intent and Clinical Stage are important concerns for doctors to consider when making a clinical decision. The QoL profile described by the FACT-Ga scores corresponded very well to the clinical classification by the two variables (Table 3). The patients in a more severe situation, i.e., those treated for palliation or those with the disease in the advanced stage, rated their QoL as worse. For both overall and specific QoL aspects, the group differences were in the direction theoretically hypothesized and consistent with the findings from the EQ-5D.
Over and above their reflection of the clinical severity of GC malignancy, the FACT-Ga scores exhibited differential sensitivity to clinical status. As suggested by the varying effect sizes and significance levels of the t-test for each FACT-Ga score, the EWB and FWB subscales were sensitive to Treatment Intent, while the GCS and potentially the SWB subscales were sensitive to the patient’s clinical stage. The effects of the clinical classifications on the QoL outcomes were moderate (Table 3). Furthermore, the FACT-Ga instrument revealed an interesting finding about the social aspect of the patient’s life, which was not measured in EQ-5D. The FACT-Ga SWB subscale scores were higher for severe cases than for less severe cases for both Treatment Intent and Clinical Stage. This direction was opposite to that demonstrated by the other scales. It would be too simplistic to ascribe the finding to random variation. We speculate instead that Chinese culture played a part in this observation, considering that our study population was Chinese. Sympathy is the essence of Chinese value systems, and it could naturally be inferred that severe GC patients would receive more love and care from the people close to them. The SWB subscale is a measure of a patient’s self-perception of family support and emotional closeness to friends. Therefore, the reverse trend of SWB scores is not unexpected and is possibly culturally specific.
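For readers unfamiliar with effect sizes for two-group comparisons, the sketch below computes a standardized mean difference. It assumes Cohen's d with a pooled standard deviation, which the text does not confirm as the effect size used in Table 3; the function name and the toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy subscale scores for two hypothetical clinical groups (NOT study data).
curative = [2, 4, 6]
palliative = [1, 3, 5]
print(cohens_d(curative, palliative))  # 0.5
```

By a common convention, values of d around 0.5 are described as moderate, which is the sense in which a group difference can be moderate in size and still fail to reach significance in a small sample.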
In our study, the FACT-Ga instrument demonstrated excellent reliability in measuring the QoL of Chinese GC patients. Except for the EWB subscale, Cronbach’s α values indicating internal consistency reliability were greater than 0.80 for the overall instrument and the other subscales. The EWB subscale had a Cronbach’s α of 0.62, comparable to the 0.60 previously reported, yet below the generally accepted standard of 0.70. As the Cronbach’s α of a scale is computed from the inter-correlations among its constituent items, the extremely low item-to-scale correlation of item GE2 with the EWB subscale (r = 0.08) is likely to account for the suboptimal reliability of the EWB subscale. After excluding item GE2, the EWB subscale had an improved Cronbach’s α of 0.72.
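The effect of removing a poorly fitting item on Cronbach’s α can be illustrated with a minimal sketch. It uses the standard variance-based formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total score), on complete cases only. The function names and the toy responses are hypothetical and do not reproduce the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete cases.

    rows: one list of k item scores per respondent.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*rows))                        # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])    # variance of the summed scale score
    return k / (k - 1) * (1 - item_var_sum / total_var)


def alpha_if_item_deleted(rows, drop):
    """Alpha recomputed after removing the item at column index `drop`."""
    reduced = [[x for j, x in enumerate(r) if j != drop] for r in rows]
    return cronbach_alpha(reduced)


# Toy responses (NOT study data): items 0 and 1 agree; item 2 tracks them only loosely.
toy = [[1, 1, 3], [2, 2, 1], [3, 3, 2], [4, 4, 4]]
print(round(cronbach_alpha(toy), 3))                # 0.818
print(round(alpha_if_item_deleted(toy, 2), 3))      # 1.0
```

In this toy case, dropping the loosely related third item raises α, which mirrors the pattern reported for GE2 in the EWB subscale.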
The Cronbach’s α of the SWB subscale was 0.82, indicating excellent reliability, based on the 67 participants who completed the first six items of the subscale. The seventh item of the SWB subscale, GS7, which asks about the patient’s sex life, had a non-response rate of 48% (n = 32), which greatly reduced the sample size available for computing the reliability index. The Cronbach’s α would drop to 0.77 based on the 35 participants with complete information for all seven SWB items. Non-response to item GS7 was common in FACT-G validation studies in different cancer populations [30, 31]. Excluding GS7 from the calculation of Cronbach’s α for the SWB subscale is an established practice for minimizing the detriment of missing information. Doing so also prevented an unstable estimate of the reliability index due to an insufficient sample size, as demonstrated by the Cronbach’s α varying from 0.26 to 0.86 in five small samples (n = 15) validating the FACT-Ga.
Construct validity of FACT-Ga was explored internally by examining the item-to-scale correlations for each item and externally by contrasting FACT-Ga with EQ-5D. As hypothesized a priori, most items converged on their own subscales, as required for good convergent validity. However, the Pearson correlation coefficient of item GE2 with the EWB subscale was only 0.08, with a 95% CI below 0.4. Item GE2 was also associated with a definite scaling error, implying that it should be included in the FWB subscale rather than the EWB subscale when FACT-Ga is used in Chinese GC patients. This finding supports the results of previous cross-cultural studies investigating the factor structure of the FACT-G in Asian populations [15, 32].
External validation involved an MTMM correlation matrix correlating the FACT-Ga subscales with the EQ-5D domains. The similar QoL constructs which were specified separately for EQ-5D and FACT-Ga yielded stronger monotrait-multimethod correlations than the multitrait-multimethod correlations in the heteromethod block (Table 4). As shown by the monotrait-multimethod correlation coefficients, the PWB, EWB and GCS subscales measure the QoL aspects as intended. They are also able to discriminate between the different aspects of a patient’s life. With regard to a patient’s functionality, the FACT-Ga FWB subscale was not strongly related to the Usual Activity domain of EQ-5D as we had hypothesized, but to two other EQ-5D domains, Mobility and Anxiety/Depression, with similar strengths of correlation (r = −0.58 and r = 0.52 respectively). This may reflect the fact that the FWB subscale score is an integration of the mental and physical functions of a patient’s life. The QoL constructs which are covered by only one questionnaire had the lowest cross-instrument correlations, for example, the SWB subscale of FACT-Ga and the Self-care domain of EQ-5D. These results confirmed and substantiated the convergent and discriminant validity of FACT-Ga.
However, several limitations of this study must be noted. We were only able to recruit outpatients, who usually have a better current QoL and prognosis than those hospitalized for radical treatments or bedridden at home. We acknowledge that the validity and reliability estimates were influenced by the narrow sampling due to logistical difficulties. However, considering the statistical properties of Cronbach’s α and of the correlations indicating construct validity, a more heterogeneous study sample drawn from a wider patient pool would strengthen the current estimates. With a cross-sectional sample, we were unable to assess test-retest reliability and the responsiveness of FACT-Ga measures to QoL change over time. Finally, the sample size is sufficient, yet modest, for a study validating a cancer-specific questionnaire. This may explain in part why some t-tests comparing subscale scores were non-significant (Table 3). A bigger cohort with follow-up information would be necessary to consolidate the current findings.