Psychometric properties of the Chinese version of Five Facet Mindfulness Questionnaire—short form in cancer patients: a Bayesian structural equation modeling approach

Background Mindfulness has emerged as an important correlate of well-being in various clinical populations. The present study evaluated the psychometric properties of the 20-item short form of the Five Facet Mindfulness Questionnaire (FFMQ-SF) in the Chinese context. Methods The study sample was 127 Chinese colorectal cancer patients who completed the FFMQ-SF and validated physical and mental health measures. Factorial validity of the FFMQ-SF was assessed using Bayesian structural equation modeling (BSEM) via informative priors on cross-loadings and residual covariances. Linear regression analysis examined its convergent validity with the health measures on imputed datasets. Results The five-factor BSEM model with approximate zero cross-loadings and one residual covariance provided an adequate model fit (PPP = 0.07, RMSEA = 0.06, CFI = 0.95). Satisfactory reliability (ω = 0.77–0.85) was found in four of the five facets (except nonjudging). Acting with awareness predicted lower levels of perceived stress, negative affect, anxiety, depression, and illness symptoms (β = − 0.37 to − 0.42) and better quality of life (β = 0.29–0.32). Observing, nonjudging, and nonreacting did not show any significant associations (p > .05) with health measures. Acting with awareness was not significantly correlated (r < 0.15) with the other four facets. Conclusion The present findings provide partial support for the psychometric properties of the FFMQ-SF in colorectal cancer patients. The nonjudging facet showed questionable validity and reliability in the present sample. Further studies with larger sample sizes are needed to elucidate the viability of FFMQ-SF as a measure of mindfulness facets in cancer patients.

adjuvant chemotherapy and radiotherapy. Colorectal cancer patients are at risk of both physical symptoms (fatigue, lethargy, and insomnia) and emotional symptoms (loss of control, anxiety, depression, and fear of recurrence) following cancer treatments [3].
Mindfulness-based interventions have received tremendous interest from researchers and practitioners in health settings [4]. Mindfulness is a psychological construct that originated from the West. It involves bringing one's complete attention to experiences that occur in the present moment in a non-judgmental and accepting way [5]. Evidence from a meta-analysis [6] has found beneficial effects of mindfulness on physical and mental wellbeing in clinical populations. Among cancer patients, mindfulness interventions have shown improvements in psychosocial adjustment [7], cognitive functioning [8], cortisol slopes, and telomerase length [9]. A viable and valid tool for measuring mindfulness is essential for evaluating the mechanisms of the mindfulness effects [10].
A review [11] found eight available self-report measures of mindfulness, with examples such as the Kentucky Inventory of Mindfulness Skills [12], Mindful Attention Awareness Scale [13], Freiburg Mindfulness Inventory [14], Cognitive and Affective Mindfulness Scale-Revised [15], and Southampton Mindfulness Questionnaire [16]. The Five Facet Mindfulness Questionnaire (FFMQ) [17] was a 39-item multifaceted scale developed from the above five validated mindfulness questionnaires. The FFMQ has a comprehensive conceptualization by integrating the five validated mindfulness scales via measuring five distinct mindfulness facets. The FFMQ has shown good psychometric properties in terms of construct validity and reliability in a systematic review of mindfulness instruments [18]. A 20-item short form of the FFMQ (FFMQ-SF) was developed under comprehensive criteria in a community sample of healthy adults and a clinical sample with psychological distress [19].
Both the FFMQ and FFMQ-SF have been validated primarily in samples of healthy adults, meditators, and depressive patients [20][21][22][23][24][25][26]. In the context of cancer patients, the only psychometric study [27] revealed a six-factor structure rather than the original five-factor structure for the FFMQ in a sample of prostate cancer patients. To our knowledge, no existing studies have examined the validity and reliability of the FFMQ-SF in cancer populations. The FFMQ has shown measurement non-invariance in the form of scalar noninvariance (different intercepts) between clinical and non-clinical samples [28] and metric non-invariance (different loadings) between meditators and non-meditators [29]. These findings imply that the factor structure and measurement parameters of the FFMQ-SF could differ across the clinical contexts. There is a practical need to evaluate the psychometric properties of the FFMQ-SF to authenticate its use in the cancer populations.
The majority of validation studies on the FFMQ-SF focused on confirmatory factor analysis via the maximum likelihood approach. This traditional approach depends on the unrealistic assumption of zero cross-loadings [30] and limits the researchers' ability to investigate cross-loadings and residual covariance parameters [31]. This approach could lead to inflated model rejections and contribute to the discrepancy in the dimensionality of the FFMQ in the previous psychometric study in cancer samples [27]. Bayesian structural equation modeling (BSEM) is a new approach that can flexibly specify these minor parameters to be approximately zero using informative priors [32]. The present study aimed to evaluate the psychometric properties of the FFMQ-SF in a sample of colorectal cancer patients. The present study hypothesized that the five-factor structure would provide adequate factorial validity in terms of an adequate model fit, satisfactory reliability, and convergent validity with health outcomes.

Participants and procedures
The study sample comprised 127 Chinese patients with colorectal cancer. The participants were recruited by convenience sampling via doctors' referrals and newsletter advertisements in cancer patient resource centers in two local hospitals and three community organizations serving cancer patients in Hong Kong. Trained research assistants conducted the screening based on the following inclusion criteria: diagnosis of stage 0 to III colorectal cancer, expected survival time of at least 12 months, Chinese speaking, aged 18 or above, and 0.5 to 5 years following cancer treatment. The exclusion criteria were the presence of severe cachexia, bone pain, nausea, or diagnosis of a major psychiatric disorder or other forms of cancer. All of the participants provided voluntary written informed consent before joining the study. The participants completed a set of paper-and-pencil questionnaires comprising the FFMQ-SF and validated self-report measures of perceived stress, affect, anxiety, depression, illness symptoms, and quality of life. Table 1 summarizes the psychometric information (number of items, number of factors, score range, and reliability) of the adopted scales. Completion of the questionnaires took around 20-25 min. Ethical approval was obtained from the human research ethics committee of the author's university.

Measurements

Mindfulness
The Chinese version of the FFMQ-SF is a 20-item self-report questionnaire [19] that assesses five facets of mindfulness: observing one's reaction (observing), ability to describe this reaction (describing), acting with awareness, nonjudging of inner experience (nonjudging), and nonreacting to inner experience (nonreacting). Four items measure each facet. The 20 items were selected from the original 39-item FFMQ using criteria such as maintenance of reliability, item-total correlation of at least 0.40, and minimum cross-loadings. Participants rate the degree to which each statement applies to them on a 5-point Likert scale (1 = never, 5 = very often). Items on two facets (acting with awareness and nonjudging) are reverse coded. Each facet score is the average of the four items, with higher scores (theoretical range = 1-5) indicating higher mindfulness levels. The original 5-factor structure of the FFMQ-SF can be found in Additional file 1 for reference.
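The facet scoring described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' scoring script: the item-to-facet mapping `FACET_ITEMS` is a hypothetical placeholder (the actual assignment is given in Additional file 1), and only the reverse-coding rule (6 minus the raw rating on the 1-5 scale) and the four-item averaging follow from the text.

```python
from statistics import mean

# Hypothetical item-to-facet mapping for illustration only; the real
# assignment of the 20 items is given in Additional file 1.
FACET_ITEMS = {
    "observing": [1, 2, 3, 4],
    "describing": [5, 6, 7, 8],
    "acting_with_awareness": [9, 10, 11, 12],
    "nonjudging": [13, 14, 15, 16],
    "nonreacting": [17, 18, 19, 20],
}
# Facets whose items are reverse coded, as stated in the text.
REVERSED_FACETS = {"acting_with_awareness", "nonjudging"}


def score_facets(responses):
    """Return mean facet scores (1-5) from a dict {item number: rating}."""
    scores = {}
    for facet, items in FACET_ITEMS.items():
        ratings = [responses[i] for i in items]
        if facet in REVERSED_FACETS:
            # On a 1-5 scale, reverse coding maps 1->5, 2->4, ..., 5->1.
            ratings = [6 - r for r in ratings]
        scores[facet] = mean(ratings)
    return scores
```

With this rule, a respondent who rates every item as 2 obtains a score of 2 on the three directly coded facets and 4 on the two reverse-coded facets.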

Perceived stress
The Chinese version [33] of the Perceived Stress Scale [34] measures the participants' stress level over the past week. This 10-item scale asks about the degree to which the respondent appraises life events as stressful on a 5-point (0 = never, 4 = very often) Likert scale. Higher scores (theoretical range = 0-40) indicate higher levels of perceived stress. The scale showed a satisfactory level of reliability (α = 0.76) in the present study.

Affect
The Chinese version [35] of the 20-item Positive and Negative Affect Schedule [36] was used to assess the participants' ambient moods over the past two weeks. The scale consists of 10 items each to measure positive affect (e.g., active, determined, inspired, interested) and negative affect (e.g., guilty, irritable, nervous, upset) on a 5-point (1 = never, 5 = very often) Likert scale. The higher the score (theoretical range = 10-50) on a subscale, the more the respondent experienced that affective state.
Both positive and negative affect showed good reliability (α = 0.89) in the present study.

Anxiety and depression
The Chinese version [37] of the Hospital Anxiety and Depression Scale [38] was used to measure anxiety and depressive symptoms over the past week. This 14-item scale consists of two subscales: anxiety (seven items) and depression (seven items) measured on a 4-point (0-3) Likert scale. The item sum scores provide the total subscale scores and higher scores (theoretical range = 0-21) indicate higher distress. The scale showed good reliability (α = 0.78-0.87) in this study.

Cancer-related illness symptoms
The Chinese version [39] of the Memorial Symptom Assessment Scale [40] was used to measure participants' cancer-related symptom distress. This instrument evaluates 32 symptoms (such as pain, nausea, insomnia, diarrhea, and drowsiness) experienced over the past week regarding frequency, severity, and distress, each measured on a 4-point Likert scale. If a symptom is not experienced, the score for each domain and for the symptom overall is zero. If a symptom is experienced, the score is calculated as the average of the frequency, severity, and distress domains. The symptom scores for all 32 symptoms are averaged to produce the total scale score (theoretical range = 0-4). The total score showed excellent reliability (α = 0.94) in the present study.
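The scoring rule described above can be sketched as follows. The function names and the dictionary layout of a symptom record are hypothetical and chosen only for illustration; the logic simply restates the rule in the text (zero for an absent symptom, otherwise the mean of the three domains, then the mean over all symptoms).

```python
def symptom_score(present, frequency=0.0, severity=0.0, distress=0.0):
    """Score one symptom: zero if absent, else the mean of the three domains."""
    if not present:
        return 0.0
    return (frequency + severity + distress) / 3.0


def total_symptom_score(symptoms):
    """Average the symptom scores over all listed symptoms (32 in the scale)."""
    scores = [symptom_score(**s) for s in symptoms]
    return sum(scores) / len(scores)
```

For example, a respondent reporting a single symptom with domain ratings of 2, 3, and 4 and no other symptoms would have a symptom score of 3 for that symptom and a total score of 3/32.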

Quality of life (QoL)
The Chinese version [41] of the short-form Health Survey (SF-12) is a commonly used self-report measure of health-related QoL [42]. The 12 items produce summary scores for the physical and mental health domains using standardized scoring (theoretical range = 0-100). The scale showed acceptable levels of reliability (α = 0.72-0.77) in the present study. A previous study in Hong Kong [43] established separate population norms for 1493 persons without chronic diseases (mean physical QoL = 53.2, SD = 6.3; mean mental QoL = 50.6, SD = 8.8) and 917 persons with chronic diseases (mean physical QoL = 44.7, SD = 11.0; mean mental QoL = 49.1, SD = 10.5). The respondents' physical and mental QoL were compared to these population norms to understand their well-being levels.

Statistical analysis
Exploratory factor analysis (EFA) was first conducted to determine the dimensionality of the FFMQ-SF by comparing models with an increasing number of factors (from 2 to 6). EFA is considered a less-restrictive and more realistic variant of confirmatory factor analysis by allowing cross-loadings. The underlying factor structure was evaluated via BSEM [30] using Mplus 8.4. The BSEM approach does not assume normal distributions for the model parameters [32], but instead allows direct estimation of the posterior distributions based on the data and prior distribution. The Bayesian approach was found to outperform the maximum likelihood approach in terms of accuracy of parameter estimates in factor analysis with small sample sizes (N < 200) [44]. In the present study, all of the 20 items showed skewness and kurtosis values less than one and fulfilled the normality testing requirements for Bayesian estimation. The primary factor loadings should be at least 0.40 and substantial factor loadings should have magnitudes of ≥ 0.50. The rationale and methodological details of BSEM are available to the readers in relevant literature [30][31][32][45]. In the present study, we fitted the BSEM models following a series of prior specifications: 1) uninformative priors, 2) informative priors for approximate zero cross-loadings, and 3) informative priors for approximate zero residual covariances. Sensitivity analysis was conducted by varying the informative priors in the BSEM analysis [31]. The prior variance for the cross-loadings varied from 0.01 to 0.04, and the degrees of freedom (d) ranged from 10 to 30. Model estimation was conducted using Markov chain Monte Carlo methods with two chains and at least 20,000 iterations. The first half of the iterations were used as burn-in phase and discarded, and the latter half derived the empirical distribution of the model parameters.
Model convergence was checked via the potential scale reduction criterion [46] with values of < 1.05 implying small between-chain variation.
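The convergence check can be illustrated with a small Python sketch of a potential scale reduction statistic for a single parameter. It uses the common Gelman-Rubin form, which is an assumption here: the exact formula implemented in the software may differ in detail. The function name and input layout are hypothetical.

```python
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Gelman-Rubin PSR for one parameter, given equal-length chains of draws."""
    # Discard the first half of each chain as burn-in, as described in the text.
    kept = [chain[len(chain) // 2:] for chain in chains]
    n = len(kept[0])
    within = mean(variance(c) for c in kept)         # W: mean within-chain variance
    between = n * variance([mean(c) for c in kept])  # B: n * variance of chain means
    pooled = (n - 1) / n * within + between / n      # pooled posterior variance estimate
    return (pooled / within) ** 0.5
```

Values close to 1 indicate that the chains agree; two chains centered on clearly different values give a statistic well above the 1.05 threshold.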
The model fit of the BSEM models was assessed using posterior predictive checking via the posterior predictive p value (PPP), root mean square error of approximation (RMSEA), and comparative fit index (CFI). A positive lower 95% posterior predictive (PP) limit denoted a poor model fit between the real and replicated data. For the PPP, a small value (< 0.05) denotes rejection of an exact model fit with a higher value indicating a better fit. The prior posterior predictive p value (pppp) was used to evaluate the tenability of the minor parameters, with small values (pppp < 0.05) providing evidence against the prior specification. Comparison of the BSEM models was based on the deviance information criterion (DIC), which penalizes model deviance with the estimated number of model parameters and avoids model overfitting. A lower DIC value denotes a better model fit with greater parsimony. The optimal BSEM model should provide a good model fit (PPP > 0.05, RMSEA ≤ 0.06, and CFI ≥ 0.95) and have the lowest DIC. Traditional factor analysis was conducted under the maximum likelihood approach to supplement the Bayesian results. The results under the maximum likelihood approach are shown in Additional file 2, and Additional file 3 displays the CFA model of the 5-factor structure.
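For reference, the DIC is commonly defined as below. This is the standard textbook definition and is stated here as an assumption, since the text does not give the exact formula used by the software.

\[
D(\theta) = -2 \log p(y \mid \theta), \qquad
p_D = \overline{D(\theta)} - D(\bar{\theta}), \qquad
\mathrm{DIC} = \overline{D(\theta)} + p_D
\]

Here \(\overline{D(\theta)}\) is the posterior mean of the deviance, \(D(\bar{\theta})\) is the deviance evaluated at the posterior mean of the parameters, and \(p_D\) is the estimated number of parameters that serves as the complexity penalty.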
The present study examined the convergent validity of the FFMQ-SF under a three-step approach. The first step used multiple imputations to generate 50 imputed datasets for the plausible values of FFMQ-SF factors in the BSEM model. An interval of 100 iterations was used for thinning to derive the imputed datasets. The second step performed univariate regression analysis to select potential demographic factors of the derived FFMQ-SF factor scores at p < 0.10 level. The univariate regression model included gender, age, cancer stage, education level, marital status, comorbid illness, and income as potential model covariates. The third step conducted multivariate regression analysis by regressing the physical and mental health measures (perceived stress, affect, anxiety, depression, illness symptoms, and quality of life) on the FFMQ-SF factor scores using the imputed datasets. The regression analyses were averaged over the 50 imputed datasets to improve the results' reliability.
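Averaging regression results over imputed datasets can be sketched as follows. The text states only that the analyses were averaged over the 50 datasets; the use of Rubin's rules for the pooled standard error is an assumption made for this illustration, and the function name is hypothetical.

```python
from statistics import mean, variance


def pool_estimates(estimates, standard_errors):
    """Pool one coefficient across imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                          # average point estimate
    within = mean(se ** 2 for se in standard_errors)  # mean sampling variance
    between = variance(estimates)                     # variance across imputations
    total = within + (1 + 1 / m) * between            # total variance
    return pooled, total ** 0.5
```

The pooled standard error is larger than the average within-dataset standard error whenever the estimates vary across imputations, which reflects the uncertainty in the plausible values.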
The composite reliability of the FFMQ-SF factors was assessed by the Omega (ω) coefficient, with values of ω ≥ 0.75 indicative of acceptable reliability [47]. The participants completed the FFMQ-SF again four weeks later to evaluate the test-retest reliability. Intraclass correlation coefficient (ICC) values above 0.75 indicated good test-retest reliability [48]. The amount of missing data was minimal in the FFMQ-SF, with 125 of the 127 participants completing all 20 items. Eight participants had missing data on some of the health measures and around one-tenth of the sample did not provide information on their cancer stage or income (N = 12-16). Missingness in cancer stage and income was significantly and positively associated (r = 0.20, p = 0.023). Participants who reported missing data in cancer stage or income were likely to be older, have lower education level, have lower levels of perceived stress and negative affect, and have higher mental quality of life. Missing data were handled using full information maximum likelihood under the missing-at-random assumption [49].
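The omega coefficient used for composite reliability can be illustrated with a short sketch. It assumes standardized loadings, uncorrelated residuals, and no cross-loadings, which is a simplification of the BSEM models described above; the loadings in the usage note are hypothetical values, not results from this study.

```python
def omega(loadings):
    """Composite reliability from the standardized loadings of one factor."""
    common = sum(loadings) ** 2
    # With standardized items, each residual variance is 1 - loading^2.
    residual = sum(1 - l ** 2 for l in loadings)
    return common / (common + residual)
```

For instance, four items that each load 0.70 on their factor give ω = 7.84 / 9.88 ≈ 0.79 under these assumptions.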

Sample profile
The majority of the sample was female (58%), married (76%), and had at least an upper secondary education (64%). The sample mean age was 63.8 years (SD = 8.9) and around two-thirds had been diagnosed with stage II or III colorectal cancer. Most participants (96%) received surgical treatment and less than half received chemotherapy or radiotherapy. The median monthly income of the sample was 5.5 thousand HKD. Around one-fourth of the sample reported comorbid illness such as hypertension and diabetes. Table 2 lists the sample means (SD) for perceived stress, positive affect, negative affect, anxiety, depression, illness symptoms, physical QoL, and mental QoL. The present sample showed significantly lower levels of physical QoL than persons without chronic diseases (t = − 13.7, p < 0.01, d = 1.56) and persons with chronic diseases (t = − 1.99, p < 0.05, d = 0.15). The present sample displayed comparable levels (p > 0.05, d < 0.15) of mental QoL as persons with and without chronic diseases.

Dimensionality
For the Bayesian EFA models (Table 3), the two-, three-, and four-factor models provided mediocre fits to the data with positive lower PP limits and PPP < 0.001. The five-factor EFA model showed an adequate model fit with a negative lower PP limit and lower DIC. The six-factor EFA model provided the best model fit in terms of lower PP limit and DIC. However, the sixth factor had only one significant factor loading and was not correlated with the other five factors, suggesting the potential over-extraction of factors. Overall, the results lent support to the presence of five factors for the FFMQ-SF. Table 4 shows the fit indices of the five-factor BSEM models with various prior specifications. Using uninformative priors, Model 1 provided a poor fit (high lower PP limit and PPP < 0.001). Informative priors were added for approximate zero cross-loadings in Models 2-4. Model 2 was rejected by the positive lower PP limit (PPP < 0.01). Model 3 showed a good prior PP p value (> 0.5) and a negative lower PP limit. Increasing the prior variance to 0.04 in Model 4 resulted in trivial improvements in the PP limits and PPP, but not the RMSEA or CFI. Because Model 3 had a lower DIC and allowed smaller cross-loadings (± 0.28 instead of ± 0.40) than Model 4, the prior variance for the cross-loadings was set to 0.02 in the subsequent analysis. In Model 3, item 5 ("shouldn't be feeling the way I'm feeling") did not load significantly on "nonjudging" (λ = 0.23, 95% CI = − 0.05 to 0.50) but instead loaded on "acting with awareness" (λ = 0.30, 95% CI = 0.13 to 0.46). After this re-specification, Model 5 showed a slightly better model fit than Model 3 in terms of the PP limits, PPP, and DIC. These models failed to provide an adequate model fit with PPP < 0.05, RMSEA > 0.06, and CFI < 0.95, which necessitated further investigation of the model misfit source. Informative priors were added on the items' residual covariances.
Models 6-8 provided an exact fit to the data with negative lower PP limits and PPP ~ 0.5. However, these models showed higher DICs than Model 5, with a great increase in the number of estimated parameters.
Out of these three residual correlations, only the last residual correlation contained items with a shared underlying factor (Nonreacting). The re-specification adopted only this residual correlation but not the other two. Model 9 with this added residual correlation between item 6 and item 19 showed an adequate model fit with a negative lower PP limit, PPP > 0.05, RMSEA ≤ 0.06 and CFI ≥ 0.95, and the lowest DIC out of the nine models. As shown in Table 5, 15 out of the 20 items showed statistically significant and substantial (λ = 0.54-0.96, p < 0.05) factor loadings. The main factor loadings for the remaining five items (items 5, 8, 15, 16, and 19) were statistically significant but less than 0.50 (λ = 0.38-0.49, p < 0.05). All of the cross-loadings were small and fell within the specified range of ± 0.28.
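The cross-loading ranges quoted for the prior variances of 0.02 and 0.04 are consistent with reading them as two prior standard deviations around a mean of zero. The worked arithmetic below assumes a normal prior \(N(0, \sigma^2)\) on each cross-loading, which is an assumption for illustration; the text reports only the prior variances and the resulting ranges.

\[
\sigma^2 = 0.02: \quad 2\sigma = 2\sqrt{0.02} \approx 0.28, \qquad
\sigma^2 = 0.04: \quad 2\sigma = 2\sqrt{0.04} = 0.40
\]

A smaller prior variance therefore keeps the cross-loadings closer to zero, which is why the model with variance 0.02 is the more restrictive of the two.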

Reliability and factor correlations
As shown in Table 6, acting with awareness was not associated with the other factors (r = 0.00-0.15, p > 0.05). The factor correlations derived in EFA via the robust maximum likelihood estimator displayed a similar pattern.

Convergent validity
The upper portion of Table 7 shows the standardized estimates of the FFMQ-SF factors on the potential demographic factors in the univariate regression analysis. Age, gender, marital status, and comorbid illness did not have any significant effects (p > 0.05) on the FFMQ-SF factors. Participants at an advanced cancer stage scored higher on acting with awareness and those with higher incomes scored lower on observing and describing. More educated participants displayed higher levels of describing and nonreacting. The lower portion of Table 7 shows the standardized estimates of the physical and mental health outcomes on the FFMQ-SF factors. Observing, nonjudging, and nonreacting did not significantly predict any of the health outcomes (p > 0.05). Describing significantly predicted greater positive affect (β = 0.38, p < 0.05). Acting with awareness was a significant predictor of lower perceived stress, negative affect, anxiety, depression, and illness symptoms (β = − 0.37 to − 0.42) and of better quality of life (β = 0.29-0.32). As shown in Table 7, the FFMQ facets explained an additional 6.5% to 18.0% of the variance in the physical health outcomes and 14.2% to 26.3% of the variance in the mental health outcomes.

Table 6 Descriptive statistics, reliability, and intercorrelations of the FFMQ-SF factors. N = 127; SD = standard deviation; ω = omega coefficient; ICC = intraclass correlation coefficient for absolute agreement; CI = confidence interval. Factor correlations above and below the diagonal were derived via robust maximum likelihood and Bayesian estimators, respectively.

Discussion
The present study examined the psychometric properties of the FFMQ-SF in a Chinese sample of colorectal cancer patients using the Bayesian approach. The 5-factor structure supported by EFA was rejected by CFA under the maximum likelihood approach in the same sample. The rejection could be attributed to the stringent assumption of exact-zero cross-loadings in the CFA model, which failed to properly account for the small cross-loadings that exist in the EFA model. The Bayesian results supported the five-factor model with approximate zero cross-loadings to represent the underlying structure.
Only two of the 80 cross-loadings were statistically significant, and all of them were smaller than 0.28 (M + 2 SD). However, five out of the 20 items did not display substantial loadings (λ < 0.50) on the respective factor. In particular, two of the three items (items 8 and 16) in the nonjudging facet did not have substantial loadings. These factor loadings likely contributed to the lower composite reliability (ω = 0.68) for this factor. These results raise concerns over the construct validity of this factor and its use in the present sample. Though specification of informative priors for residual covariances led to an exact fit, the significantly increased number of estimated parameters resulted in a less parsimonious model compared to Model 9. Interestingly, item 5 ("I tell myself I should not be feeling the way I am feeling") belonged to the acting with awareness facet instead of the nonjudging facet. This discrepancy could be attributed to the translation issue of the FFMQ-SF. The phrase "the way" might not be translated into Chinese accurately. Despite sharing similar wordings with item 13 ("should not be thinking the way I am thinking"), the two items loaded on separate factors. The word "should" in these items implies the awareness of experiencing a specific thought or feeling. Interpretation of the experience is more important than the experience itself and results in corresponding feelings [50]. Awareness of habitual cognitive patterns leading to different experiences is considered an adjustable mental capacity trainable through mindfulness practice. Awareness of thought might be deemed more manageable with alertness toward the habitual thinking pattern. In comparison, feelings tend to arise spontaneously as an outcome that might be considered a recognizable rather than a controllable condition. Thus, item 5 could manifest as recognizing the mindless moments while item 13 could represent a nonjudging cognitive capacity. 
Acting with awareness consistently predicted various health measures and their close linkages are in line with previous findings [19,23,27]. There may be differences in the mental activities assessed by the FFMQ-SF facets. Acting with awareness measures the mindless, distracted moments in daily life through reverse-scored items. People without exposure to or understanding of mindfulness may understand such awareness as irritations in daily life [51]. Novice practitioners are commonly surprised by the mindless moments that occur with a heightened sense of awareness. The other four facets appear to measure capacities that are trained through exposure to mindfulness practices. Because the participants in this study did not have such exposure, they might have found the other four facets less comprehensible than acting with awareness. Differential semantic understanding and interpretations of the items may lead to response biases across levels of mindfulness experience [52].

Practical implications
Researchers [11,53] have expressed doubts about the construct validity of the FFMQ and whether it could measure all relevant aspects of mindfulness. Grossman [51] questioned whether the describing facet, which assesses the ability to describe an experience verbally, is an essential mindfulness dimension. The lack of significant associations between acting with awareness and the other four facets in the present sample resembles previous findings [19]. The observing facet of the FFMQ-SF focuses on bodily sensation and external perception, and only emotional awareness was found to correlate with psychological symptoms [54]. The lack of item coverage of the observing facet on emotional awareness could explain the lack of association between this facet and acting with awareness. Similarly, the nonjudging facet did not show any significant associations with the covariates or health outcomes in the multivariate regression. The failure to demonstrate adequate convergent validity for this facet could partly be attributed to its low composite reliability.
In the previous psychometric study on FFMQ in prostate cancer patients [27], acting with awareness emerged as two separate and moderately correlated factors of awareness and attention. On the one hand, the sixth factor could reflect differences in the conceptualization of mindfulness in the cancer population. On the other hand, the present study did not reveal such findings in the FFMQ-SF. The discrepancy could be attributed to differences in the cancer type (prostate versus colorectal), estimation technique (CFA versus BSEM), and questionnaire length (39-item FFMQ versus 20-item FFMQ-SF). Compared with previous validation studies, the present sample of colorectal cancer patients was older, less educated, and had more males. The present study found that patients with more advanced cancer showed higher levels of acting with awareness. The intensity of illness experience and lengthy treatment procedures could induce a heightened sense of mind-body dissonance and higher risks of emotional distress [55] for these patients.
One should note that mindfulness is far from a unitary construct and could have both trait and state components. Dispositional or trait mindfulness is typically assessed by the FFMQ as a self-report questionnaire. Mindfulness practice has been shown to lead to both trait and state changes in mindfulness [56] and disposition towards mindfulness could change over a longer period of time through repeated mindfulness practice. Longitudinal studies with repeated follow-up assessments are needed to distinguish between the intervention effects on dispositional mindfulness and state mindfulness. Future research could explore potential personality, genetic, and environmental factors that moderate the effects of mindfulness practice.

Study limitations
Several limitations of this study should be noted. First, sample sizes equal to four times the number of parameters are needed [44] to achieve accurate parameter estimates and reliable goodness-of-fit tests under Bayesian estimation. The final BSEM model with informative priors suffered from an inadequate ratio (N:pD) of the small sample size (N = 127) to the number of parameters (pD = 109). Despite the proper convergence of all the BSEM models, the low N:pD ratio might still affect the reliability of parameter estimations and goodness-of-fit tests. The use of BSEM with a small sample could inadvertently model idiosyncratic sample characteristics. Both exploratory and confirmatory factor analyses were carried out using data from the same sample. The lack of independent samples suggests the need to confirm the modifications of the factor loadings and residual correlations of the FFMQ-SF in further studies with larger samples.
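The size of this shortfall can be made concrete with the figures reported above, taking the four-to-one guideline at face value:

\[
\frac{N}{p_D} = \frac{127}{109} \approx 1.17, \qquad
N_{\text{required}} \approx 4 \times p_D = 4 \times 109 = 436
\]

Under that guideline, the present sample is therefore less than one-third of the size suggested for a model with this number of estimated parameters.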
Second, the convenience sampling method in the present study is subject to common-method and self-selection biases. The lack of meditation experience among the participants restricts the generalizability of the results to non-meditators. Further studies are recommended to examine the measurement invariance of the Chinese FFMQ-SF across meditators and non-meditators. Third, the present study did not consider potential cultural factors that influence the role mindfulness plays in the Chinese context [57]. A recent study [58] found more robust relationships between mindfulness and grit in the