Measurement property | Rating | Criteria |
---|---|---|
Main category: Validity | | |
Content validity | + | ≥ 85% of the items are relevant for the construct of interest, the target population, and the context of use AND no key concepts are missing (comprehensiveness) AND > 85% of the items are comprehensible for the population of interest<sup>b</sup> |
| | ? | Not all information for ‘+’ reported |
| | − | Criteria for ‘+’ not met |
Structural validity<sup>c</sup> | + | CTT:<br>CFA: CFI or TLI or comparable measure > 0.95 OR RMSEA < 0.06 OR SRMR < 0.08<br>IRT/Rasch:<br>No violation of unidimensionality: CFI or TLI or comparable measure > 0.95 OR RMSEA < 0.06 OR SRMR < 0.08 OR (for item banks only) Bifactor model: Standardized loadings on common factor (H) are > 0.30 and larger than loadings on group factors OR high coefficient omega (> 0.80) and a high ECV (> 0.60)<br>AND (for item banks: OR) No or limited violation of local independence: Residual correlations among the items after controlling for the dominant factor < 0.20 in ≥ 95% of item pairs OR in < 95% of item pairs but evidence shown that impact is negligible OR Q3 values < 0.37<br>AND No violation of monotonicity: Adequate looking graphs OR item scalability (H<sub>i</sub>) > 0.30<br>AND (not for item banks) Adequate model fit: IRT: χ² p-value > 0.001; Rasch: infit and outfit mean squares ≥ 0.5 and ≤ 1.5 OR Z-standardized values > −2 and < 2 |
| | ? | Not all information for ‘+’ reported OR residual correlations among the items after controlling for the dominant factor < 0.20 in < 95% of item pairs but no evidence shown on the impact |
| | − | Criteria for ‘+’ not met |
Hypothesis testing for construct validity | + | Result is in accordance with hypothesis<sup>d</sup> |
| | ? | No hypothesis defined (by the review team) |
| | − | Result is not in accordance with hypothesis<sup>d</sup> |
Cross-cultural validity/measurement invariance | + | No important differences found between group factors (such as age, gender, language) in multiple group factor analysis OR DIF in ≤ 5% of item pairs for group factors (e.g., McFadden’s R² < 0.02) OR DIF in > 5% of item pairs but evidence shown that impact is negligible |
| | ? | No multiple group factor analysis OR DIF analysis performed, OR DIF in > 5% of item pairs and no evidence shown on impact |
| | − | Important differences between group factors OR DIF was found in > 5% of item pairs with no mention of impact or evidence showing that impact is not negligible |
Main category: Reliability | ||
Internal consistency/measurement precision | + | CTT:<br>At least low evidence<sup>e</sup> for sufficient structural validity<sup>f</sup> AND Cronbach’s alpha(s) ≥ 0.70 for each unidimensional scale or subscale<br>IRT:<br>At least low evidence<sup>e</sup> for sufficient structural validity<sup>f</sup> AND reliability coefficient ≥ 0.90 over a range of at least two standard deviations around the average of the study population (or ≥ 68% of the study population) |
| | ? | Criteria for “At least low evidence<sup>e</sup> for sufficient structural validity<sup>f</sup>” not met |
| | − | At least low evidence<sup>e</sup> for sufficient structural validity<sup>f</sup> AND other criteria for ‘+’ not met |
Reliability | + | ICC or weighted Kappa ≥ 0.70 |
| | ? | ICC or weighted Kappa not reported |
| | − | ICC or weighted Kappa < 0.70 |
Measurement error | + | SDC or LoA < MIC<sup>e</sup> |
| | ? | MIC not defined |
| | − | SDC or LoA > MIC<sup>e</sup> |
Main category: Responsiveness | ||
Responsiveness | + | Result is in accordance with hypothesis<sup>d</sup> OR AUC ≥ 0.70 |
| | ? | No hypothesis defined (by the review team) |
| | − | Result is not in accordance with hypothesis<sup>d</sup> OR AUC < 0.70 |
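
Two of the simpler rows can be expressed as decision rules, which may help when applying the criteria consistently across many studies. The following Python sketch is illustrative only: the function names and the use of `None` for "not reported" are assumptions, not part of the criteria. It also assumes that, for the CTT structural validity row, "not all information reported" means that none of the three fit indices is available.

```python
from typing import Optional


def rate_reliability(icc_or_kappa: Optional[float]) -> str:
    """Rate the 'Reliability' row from an ICC or weighted Kappa value.

    None stands for 'not reported' (an assumption of this sketch).
    """
    if icc_or_kappa is None:
        return "?"
    return "+" if icc_or_kappa >= 0.70 else "-"


def rate_structural_validity_ctt(
    cfi_or_tli: Optional[float] = None,
    rmsea: Optional[float] = None,
    srmr: Optional[float] = None,
) -> str:
    """Rate the CTT part of the 'Structural validity' row.

    '+' if any one reported index meets its threshold (the criteria are
    joined by OR). '?' if no index is reported at all (assumption).
    '-' if at least one index is reported and none meets its threshold.
    """
    meets_threshold = [
        cfi_or_tli is not None and cfi_or_tli > 0.95,
        rmsea is not None and rmsea < 0.06,
        srmr is not None and srmr < 0.08,
    ]
    if any(meets_threshold):
        return "+"
    if cfi_or_tli is None and rmsea is None and srmr is None:
        return "?"
    return "-"


if __name__ == "__main__":
    print(rate_reliability(0.82))
    print(rate_structural_validity_ctt(cfi_or_tli=0.90, rmsea=0.05))
```

The ASCII hyphen `"-"` stands in for the table's minus sign. Rows that depend on reviewer judgment (content validity, hypothesis testing) or on graded evidence are deliberately left out of the sketch.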