Measuring self-reported ability to perform activities of daily living: a Rasch analysis

Wæhrens, Eva Ejlersen; Kottorp, Anders; Nielsen, Kristina Tomra

doi:10.1186/s12955-021-01880-z

Health and Quality of Life Outcomes

Table 1 Overview of the Rasch analysis

Steps in the analysis	Procedures	Indicators/criteria
1. Selecting a Rasch Measurement model	Evaluation of the log likelihood ratio	A non-significant (p > 0.05) log likelihood ratio indicates that the data fit an interval scale model, i.e. the Rasch Rating Scale Model
2. The psychometric properties of the ADL-I rating scale	Following Linacre’s guidelines [28, 29, 30]	Frequency distribution across response categories should be either uniform or peak in central or extreme categories to illustrate optimal use of the categories
		Average category measures should advance monotonically up the rating scale, indicating that persons who experience higher quality of performance have higher item ratings
		Scale category outfit mean square (MnSq) values should be ≤ 2.0
		Threshold calibrations should advance monotonically, with no threshold disordering
		Thresholds should increase by at least 1.4 logits to show distinction between categories, but by no more than 5 logits to avoid large gaps in the variable [29, 30]
3. Principal Component Analysis (PCA)	Identification of possible secondary dimensions within the data	The proportion of variance explained by the measure must be > 50%. The largest secondary dimension should have an eigenvalue < 2.0 (i.e. less than two items) to support unidimensionality [33]
	Examination of potential secondary dimensions: division of ADL-I items into three clusters based on item loadings, estimation of a measure for each person on each cluster and performance of Pearson correlations between measures	A disattenuated correlation (correlation based on measures adjusted for their standard error) > 0.7 between clusters would support unidimensionality [33]
4. Item goodness-of-fit	Examining infit and outfit statistics. Underfitting items were removed one at a time, in order of highest MnSq values, considering high infit MnSq values first	MnSq values between 0.7 and 1.3 logits, combined with z values < 2.0, indicated item fit [34]
	Removal of underfitting items was planned to stop when all items met the criteria for acceptable goodness-of-fit	Assuming the PCA does not support the presence of a secondary dimension in the data, an instrument is generally considered to be unidimensional when no more than 5% of the items fail to fit the Rasch model (p < 0.05) [32]
5. Person goodness-of-fit	Evidence of person-response validity was evaluated by examining the person goodness-of-fit statistics	The criterion for acceptable person goodness-of-fit was infit MnSq values < 1.3 logits associated with a z value of < 2.0 [35, 36]. It was accepted that, by chance, up to 5% of the sample would fail to demonstrate acceptable goodness-of-fit without a serious threat to validity [36, 37]
6. After removal of misfitting items	Persons with maximum scores on this shorter version were removed, and analyses of rating scale properties, PCA and person goodness-of-fit repeated	Determine if scale properties and unidimensionality had improved
7. Differential Item Functioning (DIF)	Determine whether item difficulty estimates vary across gender and diagnostic groups	An item was considered to display DIF when the difference in item difficulty estimates between groups was > 0.50 logits [38] and statistically significant (p < 0.01) [33, 39, 40]
8. Differential Test Functioning (DTF)	Scatterplots of the variance of person ability measures across versions were produced	A criterion was set that no more than 5% of the participants should differ significantly (z-values exceeding ± 1.96) between the two measures [41]
9. Reliability and precision	Determine if the mean item difficulty measure was appropriately targeted to the mean person ability measure	The mean person ability measure would be close to zero for a well-targeted instrument [23]
	Examining the item-person map	Dispersion of item difficulty and person ability measures was evaluated for a reasonable match
	Precision was evaluated by overall separation and reliability indices	Separation indices should be at least 2.0 to obtain a desired reliability coefficient of 0.80 for replicability of person ability and item difficulty ordering [42]. The closer the reliability index was to 1.0 (range 0.0 to 1.0) the better [43]
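
Three of the numeric criteria in the table can be illustrated with a short Python sketch. It is not the authors' analysis code: the function names are hypothetical, the disattenuated correlation is assumed to follow the standard correction for attenuation (observed correlation divided by the square root of the product of the two reliabilities), and the DTF comparison is assumed to use a z statistic built from the two measures and their standard errors.

```python
import math


def reliability_from_separation(g: float) -> float:
    """Reliability implied by a separation index G: R = G^2 / (1 + G^2)."""
    return g * g / (1.0 + g * g)


def disattenuated_correlation(r_observed: float, rel_a: float, rel_b: float) -> float:
    """Correct an observed Pearson correlation for measurement error.

    Assumption: standard correction for attenuation, using the
    reliabilities of the two item clusters being compared.
    """
    return r_observed / math.sqrt(rel_a * rel_b)


def proportion_differing(measures_a, ses_a, measures_b, ses_b, z_crit=1.96):
    """Share of persons whose two measures differ significantly.

    Assumption: z = (a - b) / sqrt(se_a^2 + se_b^2) for each person,
    flagged when |z| exceeds z_crit.
    """
    flagged = 0
    for ma, sa, mb, sb in zip(measures_a, ses_a, measures_b, ses_b):
        z = (ma - mb) / math.sqrt(sa ** 2 + sb ** 2)
        if abs(z) > z_crit:
            flagged += 1
    return flagged / len(measures_a)


if __name__ == "__main__":
    # Step 9: separation of 2.0 corresponds to reliability 0.80.
    print(round(reliability_from_separation(2.0), 2))

    # Step 3: compare the corrected correlation with the 0.7 criterion.
    r = disattenuated_correlation(0.6, 0.8, 0.9)
    print(r > 0.7)

    # Step 8: compare the flagged share with the 5% criterion
    # (made-up measures in logits, for illustration only).
    share = proportion_differing(
        [1.0, 0.5, -0.2, 2.0], [0.3, 0.3, 0.3, 0.3],
        [1.1, 0.4, -0.1, 0.9], [0.3, 0.3, 0.3, 0.3],
    )
    print(share <= 0.05)
```

With a separation index of 2.0 the first function returns 0.80, which is the link between the two thresholds stated in step 9. The input values in the other two calls are invented for illustration and do not come from the study.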

ISSN: 1477-7525