This research was a secondary analysis of the New South Wales (NSW) Cancer Survival Study, a population-based, cross-sectional study of the physical and psychosocial well-being of long-term cancer survivors [29, 30]. The Human Research Ethics Committees of the University of Newcastle and Cancer Council NSW approved the study.
Data collection occurred between April 2002 and October 2003. Survivors who agreed to be contacted about the study were mailed a pen-and-paper survey with a reply-paid envelope for its return. The survey consisted of a series of instruments measuring survivors’ physical, psychological, and social well-being. Survivors who did not respond to the initial survey received a reminder survey after three weeks and a reminder telephone call three weeks thereafter. Return of the completed survey to the researchers indicated voluntary consent to participate.
Demographic and clinical characteristics including age, sex, cancer type, and spread of disease were collected from the Registry. Self-report survey items assessed the number of adults and children residing with the survivor, gross family income, current work status, highest educational qualification, marital status, health insurance status, remission status, treatments ever received and in last month, time since last hospital admission to receive cancer treatment, and treatment for psychiatric illness.
The 29-item mini-MAC was administered to assess five cancer-specific coping strategies: 1) helplessness-hopelessness (8 items), 2) anxious preoccupation (8 items), 3) fighting spirit (4 items), 4) cognitive avoidance (4 items), and 5) fatalism (5 items). Each item is rated on a 4-point scale ranging from 1=‘Definitely does not apply to me’ to 4=‘Definitely applies to me’. A higher subscale score indicates stronger use of the coping strategy. The mini-MAC has demonstrated reliability, with Cronbach’s alpha coefficients for each domain ranging from 0.62 to 0.88. The mini-MAC does not distinguish between state- and trait-like coping responses.
Rasch analysis is a modern and rigorous psychometric approach increasingly used to obtain an in-depth understanding of a scale’s measurement properties [27, 31], and to identify measurement issues not easily detected by traditional analyses (e.g., item bias, response format). Rasch analysis involves testing a scale against a mathematical measurement model developed by the Danish mathematician Georg Rasch. The Rasch measurement model assumes that the probability of a participant endorsing an item is a logistic function of the relative difference between the item’s location (difficulty of the item) and the person’s location (ability of the person). The mathematical Rasch model is considered the formal representation of ‘proper’ measurement against which data are examined. Hence, the overall objective of the analysis is to test the extent to which the observed pattern of item responses conforms to Rasch model expectations [33, 34]. The Rasch procedures and guidelines used in this analysis are consistent with those recommended by Pallant and Tennant, and Tennant and Conaghan, and with other analyses conducted by our team.
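The logistic relationship described above can be illustrated with a short sketch for the simplest, dichotomous case (the function and variable names are ours, for illustration only; RUMM implements the full polytomous model):

```python
import math

def rasch_probability(person_location, item_location):
    """Probability of endorsing a dichotomous item under the Rasch
    model: a logistic function of the difference between the person's
    location (ability) and the item's location (difficulty)."""
    return 1.0 / (1.0 + math.exp(-(person_location - item_location)))
```

A person located exactly at an item's difficulty has a 0.5 probability of endorsing it; the probability rises toward 1 as the person's location increasingly exceeds the item's.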
The initial step in Rasch analysis is to decide which mathematical derivation of the Rasch model should be chosen. When items have three or more response options, as in the case of the mini-MAC, one of two Rasch models needs to be chosen: the Rating Scale Model or the Partial Credit Model. The principal difference between the two is that the Rating Scale Model expects the distance between thresholds (a threshold is the point between two response categories where either response is equally probable) to be equal across items. That is, the metric distance between the threshold separating categories 1 and 2, for example, and the one separating categories 2 and 3 is the same across all items. To determine which model to use, a likelihood ratio test was conducted in RUMM for each subscale. The likelihood ratio test assessed how many times more likely the data are under the Rating Scale Model than under the Partial Credit Model. If the p-value of the test is not significant, the Rating Scale Model can be adopted for the analysis. In the present analysis, the p-value of the likelihood ratio test was significant for all subscales (p<.001), indicating that the distances between thresholds vary across items and that the Partial Credit Model was the more appropriate choice.
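Because the Rating Scale Model is nested within the Partial Credit Model, the comparison takes the usual likelihood-ratio form. A minimal sketch, assuming the two log-likelihoods and the difference in parameter counts are already available (RUMM reports this test directly; the closed-form survival function below covers only even degrees of freedom, which suffices for the illustration):

```python
import math

def chi2_sf_even_df(x, df):
    """Survival function of the chi-square distribution for even df,
    via the closed-form series P(X > x) = e^(-x/2) * sum (x/2)^k / k!."""
    assert df % 2 == 0 and df > 0
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

def likelihood_ratio_test(loglik_rsm, loglik_pcm, df_diff):
    """Compare the restricted Rating Scale Model against the more
    general Partial Credit Model; the PCM log-likelihood is always
    at least as high because the RSM is nested within it."""
    statistic = 2.0 * (loglik_pcm - loglik_rsm)
    return statistic, chi2_sf_even_df(statistic, df_diff)
```

With made-up log-likelihoods such as `likelihood_ratio_test(-1250.0, -1220.0, 10)`, the large improvement under the PCM yields a small p-value, favouring the Partial Credit Model, mirroring the decision reported in the text.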
The mini-MAC was analysed in two stages. Only participants with responses to all items in a given subscale were included in the analyses. First, the five original mini-MAC subscales were analysed separately; then the appropriateness of using the broader subscales of Adaptive (cognitive avoidance, fighting spirit, and fatalism subscales combined) and Maladaptive (helplessness-hopelessness and anxious preoccupation subscales combined) coping was examined. For all subscales there was an assessment of 1) overall model fit, 2) person separation index, 3) individual item and person fit residual standard deviation (SD), 4) response format (threshold maps), 5) local dependency, 6) targeting (person-item threshold maps), 7) differential item functioning (DIF), and 8) dimensionality.
The overall fit of the scale was evaluated using chi-square statistics and the summary item and person fit residual mean values and SDs [27, 28]. As an indication of good fit, it was expected that the chi-square probability value would be non-significant (using a Bonferroni alpha value adjusted for the number of items). At the summary level, a perfect fit for items and persons is represented by a mean of zero and an SD of 1. A maximum SD of 1.5 was accepted as indicative of good fit. Given the sensitivity of the chi-square statistics to large sample sizes (in this case n=851), the residual statistics were used primarily to guide decision-making concerning fit.
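The summary-level criterion can be sketched as a small check on a set of fit residuals (an illustration of the rule stated above, not of RUMM's internals; the dictionary layout is ours):

```python
import statistics

def summary_fit(fit_residuals, max_sd=1.5):
    """Summarise item or person fit residuals. Perfect Rasch fit
    corresponds to a mean of 0 and an SD of 1; an SD above max_sd
    (1.5, per the criterion in the text) flags misfit."""
    mean = statistics.mean(fit_residuals)
    sd = statistics.stdev(fit_residuals)
    return {"mean": mean, "sd": sd, "acceptable": sd <= max_sd}
```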
The Person Separation Index (PSI) provides an indication of the internal consistency of the scale and the power of the measure to discriminate amongst respondents with different levels of the trait being measured. The PSI is interpreted in a comparable way to Cronbach's alpha coefficient, where 0.70 is considered a minimal value for group or research use and 0.85 for individual or clinical use.
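One common formulation of the PSI expresses it as the proportion of observed person-estimate variance that is not measurement error; a simplified sketch of that formulation (an assumption on our part — RUMM computes the index itself):

```python
def person_separation_index(person_variance, mean_error_variance):
    """PSI in a common Rasch formulation: the share of variance in
    person location estimates that is not measurement error.
    Interpreted like Cronbach's alpha (0.70 minimal for group use,
    0.85 for individual use, per the text)."""
    return (person_variance - mean_error_variance) / person_variance
```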
Individual item and person fit residual values were also inspected to identify items and/or persons that might be contributing to misfit (i.e., values outside the range ±2.5). High positive fit residual values indicate misfit, while high negative fit residuals suggest item redundancy.
Threshold maps were examined to identify disordered thresholds. When individuals do not use the response categories in a manner consistent with their level of the trait being measured, this often results in disordered thresholds. If a disordered threshold was detected, item rescoring was considered, informed by the item’s category probability curve.
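The category probability curves used in this step can be sketched for a single Partial Credit Model item as follows (an illustrative implementation; RUMM produces these curves directly, and the function names are ours):

```python
import math

def pcm_category_probs(theta, thresholds):
    """Partial Credit Model category probabilities for one item.
    `thresholds` are the item's thresholds; adjacent categories are
    equally probable when theta equals the threshold between them."""
    numerators = [1.0]
    cumulative = 0.0
    for tau in thresholds:
        cumulative += theta - tau
        numerators.append(math.exp(cumulative))
    total = sum(numerators)
    return [v / total for v in numerators]

def thresholds_ordered(thresholds):
    """Disordered thresholds appear as a non-increasing sequence."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))
```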
The presence of local dependency was also investigated. Local independence means that the response to any item is unrelated to any other item when the level of the underlying construct is controlled for. To identify local dependency, the residual correlation matrix generated in RUMM was examined and pairs of items with correlations exceeding 0.3 were taken to indicate dependency. If local dependency was detected, sub-test analysis was performed to examine whether this level of correlation artificially inflated the reliability of the subscale.
It is important, particularly in clinical practice, that a measure is well targeted. Comparing the mean location score obtained for persons with the value of zero set for the items provides an indication of how well targeted the items are for the individuals in the sample. It was expected that for a well-targeted measure (i.e., not too easy, not too hard), the mean location for persons, as indicated by the person-item threshold distribution maps, would be around zero. A negative mean value indicates that the sample as a whole was located at a lower level than the average of the scale (floor effect), while a positive value would suggest the opposite (ceiling effect) [27, 37].
Potential item bias (i.e., DIF) can occur when different groups within the sample, despite equal levels of the underlying characteristic being measured, respond in a different manner to an individual item. When one group shows a consistent difference in its responses to an item across the whole range of the attribute being measured, this is referred to as uniform DIF. When the difference between the groups varies across levels of the attribute, this is referred to as non-uniform DIF. Every item was examined for DIF across three subgroups within the sample (referred to as ‘person factors’ in RUMM): age (four groups: 18–49, 50–59, 60–69, 70 and older), sex (male, female), and cancer type (breast, prostate, or colorectal cancer, melanoma, or other). For the purpose of this analysis, the small subgroup of individuals with head and neck cancer was excluded (n=30). To assess DIF in RUMM, an analysis of variance (ANOVA) of the standardized response residuals for each item was conducted across each level of the factors and each class interval (i.e., at different levels of the trait). A Bonferroni-adjusted alpha level was then used to determine statistical significance. In addition, the importance of DIF was judged graphically. When an item was found to exhibit DIF (statistically and graphically), deletion was considered, particularly if removal improved overall model fit.
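RUMM runs a two-way factor-by-class-interval ANOVA for this purpose; the sketch below shows only the one-way main effect underlying uniform DIF, computed from first principles for one item's standardized residuals grouped by one person factor (an illustration, not RUMM's implementation):

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: standardized item residuals
    split by a person factor (e.g., sex). A large F suggests the
    groups respond differently to the item despite equal trait
    levels, i.e., a uniform-DIF main effect."""
    grand = mean(v for g in groups for v in g)
    k = len(groups)
    n = sum(len(g) for g in groups)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```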
Last, if the subscale included enough items (i.e., more than three), dimensionality analyses were conducted. To examine the dimensionality of the subscales, principal component analysis (PCA) of the residuals was performed to identify the two subsets of items that differed most from one another (i.e., the positively and the negatively loading items). Person estimates (location values) derived from the most positively loading set of items were compared, for each person in the sample, against those derived from the most negatively loading set using t-tests. The number of significant t-tests, those outside the ±1.96 range, indicates whether the scale is unidimensional. If more than 5% of these tests are significant (or, more precisely, if the lower bound of the binomial confidence interval exceeds 5%), the scale is multidimensional. This approach has been shown to be robust to simulated levels of multidimensionality in polytomous scales.
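The decision rule at the end of this procedure can be sketched as follows, taking the per-person t values as given; the normal-approximation confidence interval here stands in for the exact binomial interval RUMM reports:

```python
import math

def unidimensionality_check(t_values, critical=1.96):
    """Proportion of per-person t-tests (comparing person estimates
    from the positively vs negatively loading item subsets) falling
    outside +/-1.96, with the lower bound of a normal-approximation
    95% binomial CI. Multidimensionality is flagged when that lower
    bound exceeds 5%, per the criterion in the text."""
    n = len(t_values)
    significant = sum(1 for t in t_values if abs(t) > critical)
    p = significant / n
    ci_lower = p - 1.96 * math.sqrt(p * (1 - p) / n)
    return {"proportion": p, "ci_lower": ci_lower,
            "multidimensional": ci_lower > 0.05}
```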
To conduct the above analyses, mini-MAC data were entered into SPSS 19.0 and then exported into RUMM2030. In this study, 851 (out of the possible 863) survivors were included, which is adequate for the Rasch analyses conducted.