The validity and reliability of quality of life questionnaires in patients with ankylosing spondylitis and non-radiographic axial spondyloarthritis: a systematic review and meta-analysis



Patients with ankylosing spondylitis (AS) or non-radiographic axial spondyloarthritis (nr-axSpA) often have poor quality of life (QoL), and research on suitable questionnaires for assessing QoL has increased substantially. This systematic review aims to examine the validity and reliability of QoL questionnaires in patients with AS/nr-axSpA.


Randomized controlled trials (RCTs), cohort studies, and cross-sectional studies were retrieved by searching seven databases. Primary outcomes included test–retest reliability and construct validity. Secondary outcomes included internal consistency, structural validity, responsiveness, and floor and ceiling effects. Data extraction and analyses were conducted according to the Cochrane standards. The Agency for Healthcare Research and Quality (AHRQ) checklist was used to assess the risk of bias of each included study. We used the Consensus-based Standards for the Selection of Health Status Measurement Instruments (COSMIN) to assess the methodological quality and measurement properties of the included instruments. The quality of evidence for the pre-specified outcomes was assessed with the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach.


22 publications containing 10 self-rating instruments were included in this study. Most studies were cross-sectional in design and a total of 3,085 participants were enrolled. 19 studies had moderate to high test–retest reliability. Cronbach’s alpha (α) coefficients were generally high (0.79–0.97) for overall scales. The ankylosing spondylitis quality of life (ASQOL) and evaluation of ankylosing spondylitis quality of life (EASi-QoL) questionnaires showed the strongest measurement properties in high-quality studies. The correlation coefficient for test–retest reliability of the ASQOL questionnaire was 0.85 (95% CI 0.80 to 0.89). The pooled Cronbach’s α coefficients of the ASQOL questionnaire and the EASi-QoL questionnaire were high. The Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) and the Bath Ankylosing Spondylitis Functional Index (BASFI) were used as two validity criteria. For the ASQOL and EASi-QoL questionnaires, pooled convergent validity associations with BASDAI and BASFI ranged from low to strong (0.24–0.81).


This study indicated acceptable reliability and stability of included QoL questionnaires. The ASQOL and the EASi-QoL questionnaires are validated and reliable disease-specific questionnaires for the assessment of QoL in patients with AS/nr-axSpA.


Ankylosing spondylitis (AS) and non-radiographic axial spondyloarthritis (nr-axSpA) are common forms of chronic inflammatory arthritis affecting the axial skeleton [1]. They are characterized by chronic low back pain, radiographic sacroiliitis, excess spinal bone destruction and aberrant bone formation, and are generally associated with positive HLA-B27. Current estimates indicate that AS affects 0.1–1.4% of the adult population worldwide [2], while data for nr-axSpA are not currently available. An updated review shows that the prevalence of AS ranges from 9 to 30 per 10,000 persons, and the risk of mortality seems to be increased [3]. The prevalence of AS is higher in males than in females [4], with sex ratios of around 3.8:1 in Europe and 2.3:1 in Asia [2].

According to the 2019 ACR recommendations, the primary medical treatments for AS/nr-axSpA are nonsteroidal anti-inflammatory drugs (NSAIDs) and tumor necrosis factor inhibitors (TNFi) [5]. Moreover, non-pharmacological management such as back exercise also helps to relieve back pain and morning stiffness. However, approximately 40% of patients do not achieve adequate disease control [6]. Bone destruction and aberrant bone formation often result in serious impairment of spinal mobility and physical function in patients with AS/nr-axSpA. The onset is usually in early adulthood, or even in late adolescence. Patients with AS/nr-axSpA often suffer from disability and incapacity for work for most of their lives. Meanwhile, social problems, depression, and sexual difficulties have been reported among patients with AS/nr-axSpA. Thus, there is sufficient evidence that patients with AS/nr-axSpA often have poor health-related quality of life (QoL).

The World Health Organization (WHO) definition of QoL covers physical, psychological, and social domains [7]. With growing concern about QoL, it has become an important outcome in studies. Measuring QoL may help to improve health care, evaluate the safety of specific therapies, predict disease activity and identify proper targets for treatment. Several tools have been developed to assess patients’ self-reported QoL, including the generic Short Form-36 (SF-36) survey, the World Health Organization quality of life (WHOQoL) pilot instrument, the disease-specific ankylosing spondylitis quality of life (ASQOL) questionnaire [8] and the evaluation of ankylosing spondylitis quality of life (EASi-QoL) questionnaire [9]. However, no systematic review or meta-analysis in the past five years has concentrated on the comparative reliability and validity of QoL questionnaires. The aim of this systematic review is to fill this gap.

Methods and analysis

This systematic review is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines checklist.

Protocol and registration

The protocol of this systematic review is documented in PROSPERO (ID = CRD42021218489).

Eligibility criteria

Types of study

Studies were included if they used questionnaires to assess QoL in patients with AS/nr-axSpA, with no restrictions on language or year of publication. Randomized controlled trials (RCTs), cohort studies, and cross-sectional studies were included.

Types of participants

Adult patients (≥ 18 years old) who met standardized diagnostic criteria, such as the Assessment of SpondyloArthritis international Society (ASAS) imaging criteria for axSpA [1] or the 1984 modified New York criteria for AS/nr-axSpA [8], with no restrictions on gender or ethnicity.

Types of outcome measure

Primary outcomes

The test–retest reliability, and construct validity of the included health-related QoL questionnaires.

Secondary outcomes

The internal consistency, structural validity, responsiveness, and the floor and ceiling effects of included QoL tools.

Exclusion criteria

  1. Conference abstracts, editorials, opinion articles, scientific statements, guidelines, protocols, animal trials, retractions, reviews, or duplicate publications.

  2. Studies that could not provide available data.

Search methods

Electronic searches

The following online databases were searched from inception to October 31, 2020: PubMed, EMBASE, the Cochrane Library, China National Knowledge Infrastructure (CNKI), Chinese Scientific Journal Database (VIP), Wanfang Database, and SinoMed Database. The following search terms were used: ankylosing spondylitis, axial spondyloarthritis, quality of life, reliability, validity, internal consistency, questionnaires, surveys, scales, index, SF-36, and short forms, both in Chinese and English.

Searching other resources

We also screened reference lists of retrieved articles to identify potential missing studies.

Search strategies

Details of search strategies in English databases were provided in Additional file 1.

Study selection

Titles/abstracts and full articles were screened independently by two reviewers (JQ Chen, JY Yang) according to the eligibility criteria. Disagreements were resolved by consensus or by discussion with a third review author (J Luo). The full selection process is presented in a flow diagram.

Data extraction

A predesigned data extraction form was used to extract relevant data by four reviewers (JQ Chen, JY Yang, CH Yao, and CQ Xu) independently. The following information was included:

  1. General information (title, the first author, year of publication, funding, country, study design, sample size, setting)

  2. Participants (disease, diagnostic standard, age, gender, disease duration)

  3. Properties of target questionnaires (instruments and version, number of items, internal consistency, test–retest reliability, convergent validity and discriminative validity, structural validity)

The missing information was sought by contacting the original authors if possible. Any discrepancies were resolved by consulting a third reviewer (J Luo).

Quality assessment

Risk of bias assessment

The tool recommended by the Agency for Healthcare Research and Quality (AHRQ) [10] was adopted to assess the risk of bias of the included studies. The following criteria were assessed: selection bias and confounding, performance bias, attrition bias, detection bias, reporting bias, and other bias (the risk of bias graph is provided in Additional file 2). Each item was judged as low risk of bias, high risk of bias or unclear by consensus between two reviewers (Q He and JQ Chen). Disagreements were resolved by consulting a third reviewer (J Luo).

Evaluation of the methodological quality and measurement property

First, the risk of bias checklist of the Consensus-based Standards for the selection of health status Measurement Instruments (COSMIN) [11] was used to assess the methodological quality of the instruments. Each item was rated as very good, adequate, doubtful or inadequate. Then, two authors (Q He and JQ Chen) independently awarded a score of positive (+), negative (−) or indeterminate (?) to each measurement property, based on the quality criteria for good measurement properties. Finally, the reviewers graded the quality of evidence using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach [12]. The quality of evidence for each outcome was judged as high, moderate, low, or very low. Disagreements were discussed with a third reviewer (J Luo). Details are given in Additional file 2.

Statistical analysis and data synthesis

Meta-analysis of the extracted reliability and validity coefficients was performed when data were available from at least two studies. The Chi-square test was used to test heterogeneity. If the I2 value was low (I2 ≤ 50%), a fixed-effects model was used. If heterogeneity was noticeable (50% < I2 < 75%), a random-effects model was used to pool the effect sizes. Data were not pooled when heterogeneity was high (I2 ≥ 75%). Subgroup analysis was used to explore potential reasons for heterogeneity according to different study characteristics. Sensitivity analysis was conducted if heterogeneity was significant. Funnel plots were used to detect publication bias if more than eight studies were included.
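The decision rule above can be illustrated with a minimal Python sketch. It is not the Stata procedure used in this review: the function names (`cochran_q_and_i2`, `choose_model`) and the input values are hypothetical, and I2 is computed with the usual definition based on Cochran's Q, which is assumed here rather than taken from the text.

```python
def cochran_q_and_i2(effects, variances):
    """Return Cochran's Q and I2 (in percent) for study-level effects."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def choose_model(i2):
    """Map an I2 percentage to the pooling decision described in the text."""
    if i2 <= 50.0:
        return "fixed-effects"
    if i2 < 75.0:
        return "random-effects"
    return "not pooled"


# Hypothetical Fisher's Z values with variances 1 / (N - 3); not study data.
z_values = [1.10, 1.25, 1.40]
z_variances = [1.0 / (60 - 3), 1.0 / (85 - 3), 1.0 / (120 - 3)]
q, i2 = cochran_q_and_i2(z_values, z_variances)
print(f"Q = {q:.2f}, I2 = {i2:.1f}%, decision: {choose_model(i2)}")
```

Keeping the thresholds in one small function makes the boundary cases explicit: exactly 50% falls under the fixed-effects model and exactly 75% means no pooling.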

Quantitative synthesis was conducted with Stata V.16.0 software. Internal consistency was reported as Cronbach's alpha (α) coefficient, and test–retest reliability was reported as the intraclass correlation coefficient (ICC) or Spearman’s correlation coefficient. All coefficients were transformed before meta-analysis. Cronbach's α coefficient and ICC were transformed with the method proposed by Hakstian and Whalen [13] (\(\text{Transformed Cronbach's } \alpha = \left( 1 - \alpha \right)^{\frac{1}{3}}\)). Spearman’s or Pearson’s correlation coefficients were transformed to the Fisher's Z scale (\(r_{s} = \frac{6}{\pi}\sin^{-1}\left( \frac{r}{2} \right)\), where \(r_{s}\) = Spearman’s correlation coefficient and \(r\) = Pearson’s correlation coefficient; \(\text{Fisher's } Z = 0.5 \times \ln\left( \frac{1 + r_{s}}{1 - r_{s}} \right)\); \(v_{z} = \frac{1}{N - 3}\); \(s_{E} = \sqrt{v_{z}}\)). The pooled effects and confidence intervals were transformed back to the original scale (\(\text{Summary } r = \frac{e^{2Z} - 1}{e^{2Z} + 1}\), where \(Z\) = summary Fisher's Z) to evaluate the measurement properties of the QoL tools.
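The correlation transformations described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' Stata code: the function names and the example correlations and sample sizes are hypothetical. The Pearson-to-Spearman conversion assumes the standard bivariate-normal relation r_s = (6/π)·arcsin(r/2), and the confidence interval assumes a normal approximation with a fixed-effects (inverse-variance) pooled estimate.

```python
import math


def pearson_to_spearman(r):
    """Convert Pearson's r to Spearman's r_s (bivariate-normal relation)."""
    return 6.0 / math.pi * math.asin(r / 2.0)


def fisher_z(r):
    """Fisher's Z = 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def fisher_z_variance(n):
    """Sampling variance of Fisher's Z for a study with n participants."""
    return 1.0 / (n - 3)


def inverse_fisher_z(z):
    """Back-transform Fisher's Z to the correlation scale."""
    return (math.exp(2.0 * z) - 1.0) / (math.exp(2.0 * z) + 1.0)


def pool_fixed(correlations, sample_sizes):
    """Fixed-effects pooled correlation with an approximate 95% CI."""
    zs = [fisher_z(r) for r in correlations]
    weights = [1.0 / fisher_z_variance(n) for n in sample_sizes]
    z_pooled = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    lower, upper = z_pooled - 1.96 * se, z_pooled + 1.96 * se
    return inverse_fisher_z(z_pooled), inverse_fisher_z(lower), inverse_fisher_z(upper)


# Hypothetical correlations and sample sizes; not data from the included studies.
summary_r, ci_low, ci_high = pool_fixed([0.80, 0.85, 0.90], [50, 80, 120])
print(f"summary r = {summary_r:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Pooling on the Z scale and back-transforming only at the end keeps the confidence limits inside the interval from −1 to 1, which is the main reason for using the transformation.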

A ‘Summary of findings’ table was created using the GRADE profiler (V.3.6.1). Detailed description of correlation coefficients and Fisher’s Z calculations is given in Additional file 2. Funnel plots are given in Additional file 3.


Study selection

A total of 2115 publications were retrieved in the search, of which 608 duplicates were removed. A further 1459 articles were excluded: 1449 did not focus on our topic and 10 were not clinical trials. As a result, 48 publications were selected after title and abstract screening. 22 articles [8, 14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34] were included according to the inclusion criteria, and 15 [8, 15,16,17,18,19,20,21,22,23,24,25, 28, 29, 31] of them were included in the meta-analysis. A PRISMA flow diagram describes the study selection process (Fig. 1).

Fig. 1 Flow diagram of study search and identification

Characteristics of included studies

Characteristics of the studies included in the literature review are provided in Table 1. Two articles [32, 34] were published in Chinese, one in Spanish [19] and one in French [24]; the others were published in English. 18 studies [8, 14, 17,18,19,20,21,22, 24,25,26,27,28,29,30,31,32] were cross-sectional in design, two [33, 34] were cohort studies and two [16, 23] were RCTs. Data were extracted from 3,085 participants with an average disease duration of 15 years. The percentage of male participants ranged from 28.9 to 89.0%.

Table 1 Characteristics of studies included in this systematic review

A total of 12 self-rating instruments were included in this study. 13 studies [8, 15,16,17,18,19,20,21,22,23,24,25, 31] administered the ASQOL questionnaire and two [28, 29] assessed the EASi-QoL questionnaire. One study each reported the revised Leeds disability questionnaire (RLDQ) [20], the combined AS questionnaire for quality of life (CASQ-QoL) [30], the EuroQol questionnaire [14], the patient-generated index (PGI) [27], the short form-36 health survey (SF-36) [26], the short form-12 health survey (SF-12) [14], and the modified ankylosing spondylitis-arthritis impact measurement scales 2 (AS-AIMS2) [34]. One study attempted to develop an AS patient quality of life measurement scale (SQOL-AS) [32] in the Chinese population, and another compared the characteristics of the EQ-5D and SF-6D scales [33].

All the questionnaires could be completed within 10 min, and the ASQOL questionnaire usually took 2.4 to 5 min. The cross-cultural adaptation of these instruments was confirmed. The disease-specific ASQOL questionnaire and the generic SF-36 could assess QoL multidimensionally, covering physical functioning, role physical, bodily pain, general health, vitality, social functioning, role emotional, and mental health. The physical component summary (PCS) and mental component summary (MCS) summarize physical and mental health. The SF-12 survey is a simplified version of the SF-36 survey, which may be more suitable for reporting QoL in the general population or evaluating changes of condition in specific patients. The AS-AIMS2 and RLDQ concentrate on disabling conditions, while the modified AS-AIMS2 also explores mental health, emotional well-being and social interaction. The PGI has been validated to estimate the life expectancy of patients and the nominated areas of their life affected by disease. Properties of the included instruments are given in Additional file 2.

Risk of bias in the included studies

The risk of bias summary for each study is shown in Fig. 2. The overall risk of bias was evaluated as low. Across the individual studies, performance bias and reporting bias accounted for most of the risk of bias. The risk of bias graph is given in Additional file 3.

Fig. 2 Risk of bias summary of included studies

Evaluation of the methodological quality and measurement property

According to the COSMIN criteria, 18 studies [14,15,16,17,18,19,20,21,22, 24,25,26, 28,29,30,31,32, 34] were found to have “very good” methodological quality for internal consistency and one [8] was rated “inadequate”. For reliability, 11 studies [15, 16, 18,19,20, 24,25,26,27,28, 30] were rated as “very good”, 6 [8, 14, 17, 21, 22, 29] as “adequate” and 3 [31, 33, 34] as “doubtful”. 11 studies [8, 14, 16, 18, 20, 22, 28,29,30,31] were found to have “adequate” methodological quality for structural validity. Construct validity was evaluated in 7 studies [8, 15, 16, 19, 20, 28, 29], all of which received a “very good” methodological quality rating. The pooled results per patient-reported outcome measure (PROM), rated against the same quality criteria for good measurement properties, are shown in Table 2 (details are given in Additional file 2).

Table 2 Methodological quality of PROMs and quality of measurement properties

Test–retest reliability

19 studies [8, 14,15,16,17,18,19,20,21,22, 24,25,26,27,28,29,30,31, 34] had moderate to high test–retest reliability (ICC values of 0.82 to 0.96 or Spearman’s correlation coefficients of 0.70 to 0.98). Adequate to very good test–retest reliability with high quality of evidence was found for the ASQOL questionnaire (ICC values of 0.44 to 0.933), and very good reliability with moderate quality of evidence was found for the EASi-QoL questionnaire (ICC range from 0.88 to 0.935) [28, 29], the RLDQ (ICC = 0.95) [20], the CASQ-QoL (r = 0.9), the Euro-QoL questionnaire (closed format: ICC = 0.88, blind format: ICC = 0.82) [14], the EQ-5D scale (ICC = 0.55) [33], the SF-6D scale (ICC = 0.68) [33], and the PGI (closed format: ICC = 0.88, blind format: ICC = 0.82) [27].

A sensitivity analysis was performed to reduce the significant heterogeneity (I2 reduced from 88.6% to 60.47%). The pooled Fisher's Z estimate for the ASQOL scales was 1.26 (95% CI 1.10 to 1.41), and the pooled correlation coefficient for test–retest reliability was 0.85 (95% CI 0.80 to 0.89).
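As a quick arithmetic check, and assuming the back-transformation stated in the statistical analysis section (the inverse Fisher transform, which is the hyperbolic tangent), the pooled Fisher's Z and its confidence limits map to the reported correlation scale as follows:

\(r = \frac{e^{2 \times 1.26} - 1}{e^{2 \times 1.26} + 1} \approx \frac{12.43 - 1}{12.43 + 1} \approx 0.85\)

\(r_{lower} = \frac{e^{2 \times 1.10} - 1}{e^{2 \times 1.10} + 1} \approx \frac{9.03 - 1}{9.03 + 1} \approx 0.80\); \(r_{upper} = \frac{e^{2 \times 1.41} - 1}{e^{2 \times 1.41} + 1} \approx \frac{16.78 - 1}{16.78 + 1} \approx 0.89\)

These values agree with the reported summary correlation of 0.85 (95% CI 0.80 to 0.89).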

Construct validity

Construct validity, which indicates the association with validity criteria, includes convergent validity and discriminative validity. The Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) and Bath Ankylosing Spondylitis Functional Index (BASFI) are commonly used to assess the disease activity of patients with AS/nr-axSpA. The convergent validity of the ASQOL questionnaire was weak to good. The summary r values for the association between the ASQOL questionnaire and BASDAI were 0.78 (95% CI 0.74 to 0.82) in Europe and 0.54 (95% CI 0.47 to 0.61) in regions beyond Europe. Subgroup analysis suggested that the ASQOL questionnaire was more valid and reliable for evaluating QoL in Europe than in other regions. The pooled summary r value for the association between the ASQOL questionnaire and BASFI was 0.62 (95% CI 0.57 to 0.68). The funnel plot was symmetrical. The EASi-QoL questionnaire focuses on four dimensions: physical function, disease activity, emotional well-being, and social participation. Its convergent validity was rated “very good” with moderate quality of evidence; the summary r values for its association with BASFI scores and the ASQOL questionnaire were 0.65 (95% CI 0.71 to 0.75) and 0.70 (95% CI 0.64 to 0.74), respectively.

Internal consistency

In 20 studies [8, 14,15,16,17,18,19,20,21,22, 24,25,26,27,28,29,30,31,32, 34], internal consistency of most included scales was generally high, with Cronbach’s α coefficients of 0.79 to 0.97. Only one study [16] reported poor internal consistency, with a Cronbach’s α coefficient of 0.44. An ωH value of 0.82 was reported as a measure of reliability in one article [25].

The overall effects of the transformed Cronbach’s α coefficient for the ASQOL and EASi-QoL questionnaires were 0.48 (95% CI 0.43 to 0.52) and 0.46 (95% CI 0.42 to 0.49), respectively. The pooled Cronbach’s α coefficients of these two scales, rated as high and moderate quality of evidence, were 0.89 (95% CI 0.86 to 0.92) and 0.91 (95% CI 0.88 to 0.93), respectively. These values indicate good internal consistency and stability. No heterogeneity (I2 = 0.0) was detected, so the fixed-effects model was chosen for the meta-analysis.
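The link between the transformed and back-transformed α values can be checked with a short Python sketch. It assumes the cube-root transformation stated in the statistical analysis section; the function names are hypothetical, and the inputs are the pooled ASQOL values reported above.

```python
def hakstian_whalen(alpha):
    """Transform Cronbach's alpha: T = (1 - alpha) ** (1/3)."""
    return (1.0 - alpha) ** (1.0 / 3.0)


def inverse_hakstian_whalen(t):
    """Back-transform to the alpha scale: alpha = 1 - T ** 3."""
    return 1.0 - t ** 3


def back_transform_interval(t, t_low, t_high):
    """Return (alpha, alpha_low, alpha_high) from a pooled T and its CI.

    The transformation is decreasing in alpha, so the upper limit of T
    gives the lower limit of alpha and vice versa.
    """
    return (
        inverse_hakstian_whalen(t),
        inverse_hakstian_whalen(t_high),
        inverse_hakstian_whalen(t_low),
    )


# Pooled transformed alpha reported for the ASQOL questionnaire.
asqol = back_transform_interval(0.48, 0.43, 0.52)
print([round(value, 2) for value in asqol])  # [0.89, 0.86, 0.92]
```

Because a smaller transformed value corresponds to a larger α, the confidence limits swap positions when they are converted back.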

Structural validity

Item response theory (IRT)/Rasch models were used in five articles to test the structural validity of the ASQOL and EASi-QoL questionnaires. Two publications [28, 29] used exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) models for the EASi-QoL questionnaire; the factor loadings were higher than 0.40 and the item-total correlations ranged from 0.66 to 0.84. Principal component analysis (PCA) [20] was performed in five studies to assess the dimensionality of the instruments. A 2-parameter Rasch model confirmed unidimensionality (chi-square fit p = 0.86) with good item discrimination of the RLDQ [20].

Other properties

One study [22] reported the responsiveness of the ASQOL questionnaire. One study each evaluated the responsiveness of the CASQ-QoL questionnaire [30] and the PGI [27]. Five articles [18, 20, 27, 30, 31] reported the floor and ceiling effects and missing data for the ASQOL questionnaire, the PGI, and the CASQ-QoL questionnaire (Table 3).

Table 3 Meta-analysis of the ASQOL and EASi-QoL questionnaires


This is the first PRISMA-compliant systematic review and meta-analysis of the measurement properties of QoL questionnaires in AS/nr-axSpA populations. In this systematic review, 11 identified QoL questionnaires in 22 publications were summarized and the reliability and validity of the different questionnaires were outlined. Reliability represents the consistency and stability of a scale across time points and populations. Validity is an estimate of the accuracy of the test, and includes content validity, construct validity, and structural validity. This systematic review suggested that the identified questionnaires generally have excellent internal consistency and test–retest reliability, and usually have moderate or good convergent validity. The ASQOL questionnaire was the most widely studied questionnaire. It was initially developed in parallel in the United Kingdom and the Netherlands on the basis of a conceptual model. The Cronbach’s α coefficient and ICC were highest for it. The next most commonly used tool was the EASi-QoL questionnaire. The ICC was highest for the physical function domains. The SF-36 survey contains 36 items divided into eight domains, covering physical function, social function, and mental health. It is convenient for comparing QoL among individuals with different health conditions. However, only one included study [26] examined its measurement properties, in the Singapore population. Convergent validity and discriminative validity also varied across the clinical measures. The assessment of convergent validity and discriminative validity demonstrated a strong correlation of QoL questionnaires with disease activity measures. Fatigue, pain and chest expansion, recognized symptoms of AS, also showed a moderate association with QoL. In addition, BASDAI and BASFI are generally considered validity criteria. Our results showed that the included scales and constructs could reflect the multifaceted features of disease activity and health-related QoL in AS/nr-axSpA patients. Meanwhile, the ASQOL questionnaire was frequently used as an accepted disease-specific scale for evaluating QoL in AS/nr-axSpA patients. Other measurement properties such as responsiveness have also been reported in some publications. Furthermore, it should be noted that the quality of evidence for the included studies ranged from low to high.

It is recommended that the properties of PROMs be evaluated with the COSMIN checklist. In systematic reviews, the COSMIN criteria can serve as a guideline to help select the most appropriate health status measurement tools for research and clinical practice. According to the COSMIN standards, most instruments had adequate methodological quality. The meta-analysis showed that the ASQOL and EASi-QoL questionnaires both had strong reliability and moderate validity. Many factors affect the methodological quality rating of PROM properties. For example, if the sample size was large (more than 200 patients when using the IRT/Rasch analysis model), the time interval was appropriate (usually two weeks), or the enrolled patients were stable, the methodological quality was rated “very good”. If a study only reported the ICC without a clear description, test–retest reliability was rated “adequate”. Some studies did not use classical test theory or IRT/Rasch analysis to assess structural validity, so structural validity was rated “doubtful” or “inadequate”.

In the current studies, high heterogeneity was observed across countries and languages, especially in non-native English-speaking countries. Subgroup analysis was frequently used to explore the heterogeneity in the meta-analysis. Although the ASQOL questionnaire and the SF-36 survey have been validated and used in countries worldwide, it is still hard to overcome cultural and linguistic differences between countries. Given the diversity of expression habits and customs, it is vital that researchers conduct translation and adaptation studies for the various language versions. Few articles have investigated the translation and adaptation of the ASQOL and EASi-QoL questionnaires in Asia using the COSMIN standard. With the increasing attention to the QoL of patients with AS/nr-axSpA, more attention should be paid to the measurement properties of these disease-specific QoL instruments in Asian countries. The SQOL-AS scale was designed and adopted in the Chinese population. However, its reliability and validity should be confirmed with a larger sample size. Statistical analysis was not consistent among the articles: the ICC was calculated to represent reliability in some studies, while others used Spearman’s or Pearson’s correlation coefficients. This might be due to differences in methods and outcomes across studies, including, but not limited to, the heterogeneity of disease activity and disease duration.

Only one previous review [35], which focused on factors associated with QoL, has systematically summarized the instruments for evaluating the QoL of AS patients. Several meta-analyses have been designed to calculate QoL scores or predict disease-related factors.

Despite this, some limitations remain. This systematic review did not focus on content validity because only a few studies reported details about this property. Only published studies could be included in this systematic review. Although we tried to identify all potential studies of QoL questionnaires in AS/nr-axSpA patients, the incompleteness of information cannot be ignored. No grey literature was identified by the electronic searches or by checking the reference lists. Most studies had an unclear risk of selection bias because of the cross-sectional study design, and some studies did not report the randomization strategy. The validity criteria varied among the included articles. Only data for three criteria (the BASDAI, BASFI and ASQOL questionnaire) were pooled in this meta-analysis to represent construct validity. The Fisher method was used to determine the variance for analysis when data were not directly reported. Importantly, previous research has documented a lack of sufficient comparisons of these instruments in the same population; this unmet need should also be addressed by future qualitative research. Because of the limited number of studies or high heterogeneity, the meta-analysis was only performed for two scales. Thus, the conclusions still need to be confirmed by high-quality studies.


This study indicated that the ASQOL and the EASi-QoL questionnaires are validated, reliable disease-specific questionnaires for the assessment of QoL in patients with AS/nr-axSpA. Different questionnaires have different clinical characteristics and measurement properties. Data from QoL studies are conflicting. Cultural and linguistic differences between countries should be considered when adapting a QoL questionnaire. Future qualitative research is also needed to compare the measurement properties of different scales.

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its supplementary information files (Additional files 1 and 2).


  1. Sieper J, Rudwaleit M, Baraliakos X, Brandt J, Braun J, Burgos-Vargas R, et al. The Assessment of SpondyloArthritis international Society (ASAS) handbook: a guide to assess spondyloarthritis. Ann Rheum Dis. 2009;68(Suppl 2):ii1-44.

  2. Dean LE, Jones GT, MacDonald AG, Downham C, Sturrock RD, Macfarlane GJ. Global prevalence of ankylosing spondylitis. Rheumatology (Oxford). 2014;53:650–7.

  3. Wang R, Ward MM. Epidemiology of axial spondyloarthritis: an update. Curr Opin Rheumatol. 2018;30:137–43.

  4. Stolwijk C, van Onna M, Boonen A, van Tubergen A. Global prevalence of spondyloarthritis: a systematic review and meta-regression analysis. Arthritis Care Res (Hoboken). 2016;68:1320–31.

  5. Ward MM, Deodhar A, Gensler LS, Dubreuil M, Yu D, Khan MA, et al. 2019 update of the American College of Rheumatology/Spondylitis Association of America/Spondyloarthritis Research and Treatment Network recommendations for the treatment of ankylosing spondylitis and nonradiographic axial spondyloarthritis. Arthritis Care Res (Hoboken). 2019;71:1285–99.

  6. Sepriano A, Regel A, van der Heijde D, Braun J, Baraliakos X, Landewé R, et al. Efficacy and safety of biological and targeted-synthetic DMARDs: a systematic literature review informing the 2016 update of the ASAS/EULAR recommendations for the management of axial spondyloarthritis. RMD Open. 2017;3: e000396.

  7. WHOQOL Group. The World Health Organization Quality of Life assessment (WHOQOL): position paper from the World Health Organization. Soc Sci Med. 1995;41:1403–9.

  8. Doward LC, Spoorenberg A, Cook SA, Whalley D, Helliwell PS, Kay LJ, et al. Development of the ASQoL: a quality of life instrument specific to ankylosing spondylitis. Ann Rheum Dis. 2003;62:20–6.

  9. Packham JC, Jordan KP, Haywood KL, Garratt AM, Healey EL. Evaluation of Ankylosing Spondylitis Quality of Life questionnaire: responsiveness of a new patient-reported outcome measure. Rheumatology (Oxford). 2012;51:707–14.

  10. Viswanathan M, Ansari MT, Berkman ND, Chang S, Hartling L, McPheeters M, et al. Assessing the risk of bias of individual studies in systematic reviews of health care interventions. In: Methods guide for effectiveness and comparative effectiveness reviews. Rockville (MD): Agency for Healthcare Research and Quality (US); 2012.

  11. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res. 2010;19:539–49.

  12. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008;336:924.

  13. Rodriguez MC, Maeda Y. Meta-analysis of coefficient alpha. Psychol Methods. 2006;11:306–22.

  14. Haywood KL, Garratt AM, Dziedzic K, Dawes PT. Generic measures of health-related quality of life in ankylosing spondylitis: reliability, validity and responsiveness. Rheumatology (Oxford). 2002;41:1380–7.

  15. Jenks K, Treharne GJ, Garcia J, Stebbings S. The ankylosing spondylitis quality of life questionnaire: validation in a New Zealand cohort. Int J Rheum Dis. 2010;13:361–6.

  16. Doward LC, McKenna SP, Meads DM, Twiss J, Revicki D, Wong RL, et al. Translation and validation of non-English versions of the Ankylosing Spondylitis Quality of Life (ASQOL) questionnaire. Health Qual Life Outcomes. 2007;5:7.

  17. Duruöz MT, Doward L, Turan Y, Cerrahoglu L, Yurtkuran M, Calis M, Tas N, et al. Translation and validation of the Turkish version of the Ankylosing Spondylitis Quality of Life (ASQOL) questionnaire. Rheumatol Int. 2013;33:2717–22.

  18. Leung YY, Lee W, Lui NL, Rouse M, McKenna SP, Thumboo J. Adaptation of Chinese and English versions of the Ankylosing Spondylitis quality of life (ASQoL) scale for use in Singapore. BMC Musculoskelet Disord. 2017;18:353.

  19. Ariza-Ariza R, Hernández-Cruz B, López-Antequera G, Toyos FJ, Navarro-Sarabia F. Cross-cultural adaptation and validation of a Spanish version of a specific instrument to measure health-related quality of life in patients with ankylosing spondylitis. Reumatol Clin. 2006;2:64–9.

  20. Haywood KL, Garratt AM, Jordan K, Dziedzic K, Dawes PT. Disease-specific, patient-assessed measures of health outcome in ankylosing spondylitis: reliability, validity and responsiveness. Rheumatology (Oxford). 2002;41:1295–302.

  21. Fallahi S, Jamshidi AR, Bidad K, Qorbani M, Mahmoudi M. Evaluating the reliability of Persian version of ankylosing spondylitis quality of life (ASQOL) questionnaire and related clinical and demographic parameters in patients with ankylosing spondylitis. Rheumatol Int. 2014;34:803–9.

  22. Pham T, van der Heijde DM, Pouchot J, Guillemin F. Development and validation of the French ASQoL questionnaire. Clin Exp Rheumatol. 2010;28:379–85.

  23. Zhao LK, Liao ZT, Li CH, Li TW, Wu J, Lin Q, et al. Evaluation of quality of life using ASQoL questionnaire in patients with ankylosing spondylitis in a Chinese population. Rheumatol Int. 2007;27:605–11.

  24. Hamdi W, Haouel M, Ghannouchi MM, Mansour A, Kchir MM. Validation of the Ankylosing Spondylitis Quality of Life questionnaire in Tunisian language. Tunis Med. 2012;90:564–70.

  25. Hoepken B, Serrano D, Harris K, Hwang MC, Reveille J. Validation of the Ankylosing Spondylitis Quality of Life assessment tool in patients with non-radiographic axial spondyloarthritis. Qual Life Res. 2021;30:945–54.

  26. Kwan YH, Fong WW, Lui NL, Yong ST, Cheung YB, Malhotra R, et al. Validity and reliability of the Short Form 36 Health Surveys (SF-36) among patients with spondyloarthritis in Singapore. Rheumatol Int. 2016;36:1759–65.

  27. Haywood KL, Garratt AM, Dziedzic K, Dawes PT. Patient centered assessment of ankylosing spondylitis-specific health related quality of life: evaluation of the Patient Generated Index. J Rheumatol. 2003;30:764–73.

  28. Öncülokur N, Keskin D, Garip Y, Bodur H, Köse K. Turkish version of evaluation of ankylosing spondylitis quality of life questionnaire in patients with ankylosing spondylitis: a validation and reliability study. Arch Rheumatol. 2018;33:443–54.

  29. Haywood KL, Garratt AM, Jordan KP, Healey EL, Packham JC. Evaluation of ankylosing spondylitis quality of life (EASi-QoL): reliability and validity of a new patient-reported outcome measure. J Rheumatol. 2010;37:2100–9.

  30. El Miedany Y, El Gaafary M, El Aroussy N, Ahmed I, Youssef S, Palmer D. Patient reported outcomes in ankylosing spondylitis: development and validation of a new questionnaire for functional impairment and quality of life assessment. Clin Exp Rheumatol. 2011;29:801–10.

  31. Graham JE, Rouse M, Twiss J, McKenna SP, Vidalis AA. Greek adaptation and validation of the Ankylosing Spondylitis Quality of Life (ASQoL) measure. Hippokratia. 2015;19:119–24.

  32. Liu X, Deng J, Liu F, Chen G, Chen J. Development of an ankylosing spondylitis patient quality of life scale. J Guangzhou Univ Tradit Chin Med. 2005;22:315–9.

  33. Boonen A, van der Heijde D, Landewé R, van Tubergen A, Mielants H, Dougados M, et al. How do the EQ-5D, SF-6D and the well-being rating scale compare in patients with ankylosing spondylitis? Ann Rheum Dis. 2007;66:771–7.

  34. Guillemin F, Challier B, Urlacher F, Vançon G, Pourel J. Quality of life in ankylosing spondylitis: validation of the ankylosing spondylitis Arthritis Impact Measurement Scales 2, a modified Arthritis Impact Measurement Scales Questionnaire. Arthritis Care Res. 1999;12:157–62.

  35. Kotsis K, Voulgari PV, Drosos AA, Carvalho AF, Hyphantis T. Health-related quality of life in patients with ankylosing spondylitis: a comprehensive review. Expert Rev Pharmacoecon Outcomes Res. 2014;14:857–72.


Not applicable.


This work was supported by the Program of National Regional Traditional Chinese Medicine (Specialty) Treatment Center Project (2019-ZX-006), the Elite Medical Professionals project of China-Japan Friendship Hospital (ZRJY2021-QM14), and the National Key Clinical Specialty Capacity Building Project (2011-ZDZK-001).

Author information

QT conceived and designed the study, provided methodological perspectives, verified data extraction and data analyses, and revised the manuscript. QH conceived and designed the study, developed the search strategy, performed data extraction and quality assessment, and drafted the manuscript. JL conceived and designed the study, performed quality assessment, and revised the manuscript. JC performed data extraction, methodological quality assessment and quality assessment. JY performed data extraction and quality assessment. CY performed data extraction. CX performed data extraction. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Qingwen Tao.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Search terms used for the English databases.

Additional file 2.

PROMs properties, GRADE and methodological quality evaluation, effect sizes of meta-analysis.

Additional file 3.

Risk of bias graph and funnel plots.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

He, Q., Luo, J., Chen, J. et al. The validity and reliability of quality of life questionnaires in patients with ankylosing spondylitis and non-radiographic axial spondyloarthritis: a systematic review and meta-analysis. Health Qual Life Outcomes 20, 116 (2022).
