Quality of life in primary sclerosing cholangitis: a systematic review

Background Primary sclerosing cholangitis (PSC) is a rare bile duct and liver disease which can considerably impact quality of life (QoL). As part of a project developing a measure of QoL for people with PSC, we conducted a systematic review with four review questions. The first of these questions overlaps with a recently published systematic review, so this paper reports on the last three of our initial four questions: (A) How does QoL in PSC compare with other groups?, (B) Which attributes/factors are associated with impaired QoL in PSC?, (C) Which interventions are effective in improving QoL in people with PSC?. Methods We systematically searched five databases from inception to 1 November 2020 and assessed the methodological quality of included studies using standard checklists. Results We identified 28 studies: 17 for (A), ten for (B), and nine for (C). Limited evidence was found for all review questions, with few studies included in each comparison, and small sample sizes. The limited evidence available indicated poorer QoL for people with PSC compared with healthy controls, but findings were mixed for comparisons with the general population. QoL outcomes in PSC were comparable to other chronic conditions. Itch, pain, jaundice, severity of inflammatory bowel disease, liver cirrhosis, and large-duct PSC were all associated with impaired QoL. No associations were found between QoL and PSC severity measured with surrogate markers of disease progression or one of three prognostic scoring systems. No interventions were found to improve QoL outcomes. Conclusion The limited findings from included studies suggest that markers of disease progression used in clinical trials may not reflect the experiences of people with PSC. This highlights the importance for clinical research studies to assess QoL alongside clinical and laboratory-based outcomes. A valid and responsive PSC-specific measure of QoL, to adequately capture all issues of importance to people with PSC, would therefore be helpful for clinical research studies. Supplementary Information The online version contains supplementary material available at 10.1186/s12955-021-01739-3.

tend to be rare and approximately 40-50% of people are asymptomatic at diagnosis [6,9]. However, with limited treatment options, people with PSC can live for many years with a number of debilitating symptoms such as fatigue, itch, and pain, as well as the emotional burden of an uncertain future [6,10], all of which can impact on quality of life (QoL) [11,12].
In 2019 PSC was identified as a top 10 research priority in non-alcohol related liver and gallbladder disorders in the UK [13], and the lack of treatments for PSC indicated as a major concern. Defining endpoints for clinical trials in PSC, however, has its challenges due to the unpredictable and prolonged clinical course of the condition [14]. Patient-reported outcome measures (PROMs), including assessments of QoL, are important for use in clinical trials, addressing health-related experiences from the patient perspective [15]. The assessment of patients' experiences in clinical research is particularly warranted for chronic conditions, such as PSC, which can have a long-term impact on functioning and well-being [16,17]. Capturing these experiences in a patient-centred way is necessary to enable holistic assessment of the safety and efficacy of new interventions for people with PSC [18].
As the first stage in a doctoral project developing a measure of QoL for people with PSC in the UK, we searched the literature for what is known about QoL in this population [19]. That search only found few existing reviews, all of which focused narrowly on QoL in PSC. One of these, a Cochrane review of pharmacological interventions for PSC, included QoL as an outcome measure [20]; another explored the impact of itch on QoL for cholestatic liver disease [21]. There was therefore a lack of clarity about how QoL had been measured in this population, which factors were important determinants of QoL in PSC, and how the condition affected QoL. Four more reviews by other groups have since been published, and were identified in later updates to the initial search: a systematic review assessing PROMs used in PSC [22], a systematic review identifying PRO instruments and concepts used in the PSC literature [23], a literature review exploring QoL in cholestatic disease [24], and a scoping review exploring the impact of PSC on psychological well-being [25].
The aim of this part of the doctoral study was to systematically review QoL-related outcomes in PSC. Due to the paucity of published literature, we formulated four separate review questions: (1) 'Which validated questionnaires have been used to assess QoL in people with PSC?' , (2) 'How does QoL in people with PSC compare with QoL in other groups?' , (3) 'What factors are associated with impaired QoL in people with PSC?' , and (4) 'Which interventions are effective in improving QoL in people with PSC?' . A separate systematic review [22] covered similar ground to our Review question [1], but was a broader review, identifying all PROMs used in PSC, and critically appraising included measures. Our review retrieved only validated measures of QoL, so is less broad and does not add to the work conducted by this other group. Our paper therefore reports only the findings from the last three of our review questions, identified as (A), (B) and (C) in the sections below.

Methods
This review was conducted and reported in line with the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). The review was registered with the International Prospective Register of Systematic Reviews (PROSPERO), registration number: CRD42017071729 [26].

Search strategy
We developed a single systematic search strategy to locate relevant evidence across all the review questions: (A) How does QoL in people with PSC compare with QoL in other groups? (B) What factors are associated with impaired QoL in people with PSC? and (C) Which interventions are effective in improving QoL in people with PSC? We searched Embase, MEDLINE, PsycINFO, SCOPUS, and Web of Science databases from inception to 4 June 2019, with subsequent updates on 31 March 2020 [19], and then 1 November 2020. The key search terms used were 'primary sclerosing cholangitis' and 'quality of life' , along with synonyms and related terms. The search strategy was initially developed for MEDLINE (see the Additional file 1 for the full MEDLINE search strategy), and then translated for use in the other databases. In addition to the electronic database searches, we hand-searched the reference lists of included studies and relevant systematic reviews and conducted a forward citation search for included studies in Google Scholar. Authors were contacted where key data were missing from the published report. Authors of identified conference abstracts were contacted for full-text papers.

Inclusion and exclusion criteria
Studies were eligible for inclusion if they were primary studies of adult participants with PSC (≥ 18 years) which assessed QoL and were published from inception to the final database search (Nov 2020). For the purpose of this review we drew on work exploring QoL in relation to cancer, that is, health-related QoL [27], and defined QoL as a multi-dimensional construct comprising the impact of illness or treatment on a person's functioning and well-being in physical, psychological and social domains. Due to the paucity of literature exploring QoL in PSC, we included studies that used multi-dimensional QoL questionnaires, as well as studies that used questionnaires which focused on specific domains of QoL: physical symptoms (e.g. gastro-intestinal symptoms), psychological well-being and social functioning. When describing findings from multi-dimensional QoL questionnaires, we use the term QoL. Where studies report specific domains of QoL (e.g. depression), we explicitly name these. To include studies where participants had a range of conditions, we required at least 50% of the study sample to have PSC. Where < 50% of the sample had PSC, authors were contacted to request disaggregated data. We excluded studies of children and adolescents with PSC. Non-primary studies, such as systematic reviews and editorials, were also excluded. Due to resource constraints we limited publications to English language papers. Specific inclusion criteria for each review question were as follows. Question A: Any primary study, including cohort, cross-sectional and case-control studies, comparing QoL outcomes between PSC participants and any other group. Question B: Any primary study, including cohort, cross-sectional and case-control studies, exploring the association of any factor or attribute with QoL. Question C: Any randomised or non-randomised controlled study comparing any intervention with any comparator. Before-and-after studies were excluded.

Selection of studies
We exported citations from each electronic database search to Endnote (version X7) and removed duplicates. We screened titles and abstracts of identified studies for inclusion against agreed criteria. Two reviewers (EM, AMK) independently screened 10% of references, and, because the inter-rater reliability was good (88% agreement), one reviewer screened the remaining references. All primary-level studies included after the first scan of citations were acquired in full and re-evaluated for eligibility. Two reviewers (EM, AMK) independently screened all full-text papers using the inclusion criteria for reference. The percentage agreement was good (81%), and after discussion all disagreements were resolved.

Data extraction
A single reviewer (EM) extracted the following data using a pre-defined form: study design, date of publication, country of origin, setting, sample size, participant inclusion/exclusion criteria, participant characteristics, name of utilised QoL tool(s), comparator groups (for Question A), factors or attributes correlated with QoL (for Question B), and type/dose of intervention and comparator (for Question C). The following outcome data were extracted where relevant and available: type of outcome, name of outcome measure, direction of scale, mean, standard deviation, number of events, effect size, confidence intervals, p values, and correlation/regression coefficients (for Question B). Where these data were not reported, we extracted narrative descriptions of findings from the published report.

Quality appraisal
We assessed risk of bias at the study level using standard checklists. Different checklists were used depending on the design of the study. We assessed randomised controlled trials (RCTs) with Version 1 of the Cochrane Collaboration's Questionnaire for Assessing Risk of Bias in Randomised Trials [28], cohort and case-control studies with The Newcastle-Ottawa Scale (NOS) [29], and cross-sectional studies and surveys with the Critical Appraisal Questionnaire to Assess the Quality of Crosssectional Studies (AXIS) [30]. To aid the comparison of quality across studies, we assigned each observational study a quality rating following criteria published by Harbour and Miller [31] (Table 1). We did not assign individual study quality ratings to the RCT evidence as this is not recommended [28] and only few RCTs were included (n = 8).

Data analysis
We synthesised data narratively. Combining data in meta-analyses was inappropriate due to differences in outcome measures, participant groups and interventions. In addition, many studies did not report data in a format suitable for meta-analysis (e.g. reporting findings narratively). We report effect sizes, confidence intervals, and/ or p values if these were available in the original reports. To explore heterogeneity of findings for Question A, we conducted two sensitivity analyses post-hoc. The first limited the evidence to studies with a lower risk of bias

Study selection
We identified 2677 records, 1990 after the removal of duplicates, and 107 after screening titles and abstracts ( Fig. 1). Four additional articles were identified through a hand search of reference lists of relevant systematic reviews and included studies. Following a full-text appraisal, 28 studies (reported across 29 individual manuscripts) were included across the three review questions; two papers reported on the same data set [32,33].
Question A: How does QoL in people with PSC compare with QoL in other groups?

Study characteristics
Seventeen studies met the inclusion criteria: 13 crosssectional studies, two case-control studies and two cohort studies (Table 2). Each study compared QoL outcomes between PSC participants and up to seven other comparator groups. These comprised: (1) control groups    to 81% were male (median = 68%), and the proportion of participants with co-occurring IBD ranged from 60 to 100% (median = 77%).

Quality assessment
For the two case-control studies, we rated one as high quality [34], and one as moderate quality [35] due to significant differences in permanent work disability between groups, which may have confounded findings. We rated two cohort studies as moderate quality due to missing QoL outcome data at follow-up: in one study < 80% of postal EQ-5D data returned [36] and in one study < 80% of 15-D data were returned or complete [37]. For the cross-sectional studies, we rated one study as high quality [38], six as moderate quality [11,12,[39][40][41][42], and six as low quality [32,[43][44][45][46][47]. Low ratings were mainly due to a lack of clarity regarding the sample representativeness (e.g. sampling strategy not reported), significant differences between responders and non-responders, and due to the fact that analyses reported in the methods sections were missing from the results sections.

Evidence synthesis
Comparisons with healthy and community controls consistently indicated worse outcomes for people with PSC for: gastrointestinal symptoms [39], autonomic symptoms [41], physical and mental health functioning [11,45,46], and the impact of fatigue [41,43]. There was mixed evidence for the comparison of QoL outcomes between people with PSC and the general population. Three studies reported no significant differences between these groups for QoL [12], fatigue [32], psychological well-being [32], depression [32,39], or anxiety [32]. In contrast, one study suggested poorer mental health functioning for people with PSC [32], and one study found poorer QoL for people with PSC listed for a liver transplant compared with UK population norm values [36]. Another study compared people with PSC on the liver transplant list, who were symptomatic or asymptomatic (but indicated for liver transplant due to suspicious premalignant findings), with the general population in both the pre-transplant and post-transplant phase [37]. The symptomatic group had significantly poorer QoL compared with the general population, both pre-transplant and post-transplant. For the asymptomatic pre-malignant group, no significant difference was found at either timepoint [37]. Unexpectedly, one study found significantly higher levels of fatigue in the general population compared with people with PSC [39]. It should be noted, however, that the PSC sample in this study was small (n = 93) and that the response rate in the general population group was low (44%) which may have biased findings [39]. There was mixed evidence for the comparison of QoL between people with PSC and people with IBD only. One study reported significantly poorer QoL for people with IBD alone [12]. However, 45% of PSC participants in this study were asymptomatic. Two studies suggested no significant difference in QoL between people with PSC and IBD and people with IBD alone [34,35]. However, one of these studies [35] assessed QoL with an IBD-specific measure, which is unlikely to capture PSC specific experiences. Two studies suggested no significant difference between PSC and IBD participants for the impact of fatigue [39,41]. One study indicated no significant difference for psychological well-being [39], depression [39], or gastro-intestinal symptoms [39]. When compared with participants with PBC, PSC participants generally had higher QoL scores, but these differences were rarely significant. Seven studies indicated no significant difference between people with PSC and PBC for QoL [38,44,45,48], fatigue [41,43], or depression [40]. One study found significantly higher levels of daytime somnolence and autonomic symptoms in age-and gender-matched participants with PBC [41], however, the PSC group was small (n = 40). One study found that PSC participants had significantly greater QoL than participants with PBC, however, the PBC group were significantly older (57 vs. 35 years) and had a significantly higher proportion of participants with cirrhosis (46% vs. 26%) which may have biased the findings [47].
Studies comparing people with PSC with people with other liver-related conditions found generally similar levels of QoL [38,48], physical functioning [44] and fatigue [44]. One study however, indicated significantly better mental health functioning in patients with PSC compared to those with hepatitis C [44]. One study comparing people with PSC with people with chronic obstructive pulmonary disorder (COPD), heart disease, and type II diabetes found that people with PSC had significantly greater physical health functioning [45], but their mental health functioning was significantly poorer [45]. Another study reported less severe fatigue in PSC participants compared with people with chronic fatigue syndrome, and more severe fatigue in PSC participants compared with people with vasovagal syncope [43]. However, the authors did not conduct any statistical analyses for these differences.
An initial sensitivity analysis was conducted restricting the evidence to studies judged as being moderate to high quality, however this did not explain the heterogeneity of findings. A further sensitivity analysis was conducted to explore whether any of the contradictory findings for the between-groups comparisons could be explained by differences in the age and gender of PSC participants compared with other groups. Eight of the 17 included studies age-and gender-matched PSC participants to the included comparator groups [12, 32, 35-37, 39, 41, 46], however, the evidence was still mixed when limited to these studies.
In summary, the evidence from this review for Question A suggests that people with PSC have poorer QoL than healthy controls, and generally similar QoL to people with other chronic conditions. When compared with the general population the evidence was mixed; three studies suggested no differences between groups for QoL, fatigue and mental health outcomes, however, one study found poorer mental health functioning in PSC [32] and one study unexpectedly found more severe fatigue in the general population [39].

Question B: What factors are associated with impaired QoL in people with PSC? Study characteristics
Ten studies met the inclusion criteria: nine were crosssectional studies and one was a prospective cohort study, although only cross-sectional data were used in this review (Table 3). Factors associated with QoL comprised: demographic variables, symptoms, co-morbid conditions, clinical features of PSC (e.g. presence of liver cirrhosis), prognostic scoring systems (e.g. Mayo Risk Score [49]), and biochemical and genetic markers. Sample sizes ranged from 29 to 341 participants (median = 107), four of which had sample sizes < 100 participants. Where reported, the mean age of participants ranged from 35 to 55 years (median of means = 43), the proportion of men ranged from 54 to 72% (median = 67%), and the proportion of PSC participants with co-occurring IBD ranged from 61 to 79% (median = 71%).

Quality assessment
Quality appraisal with the AXIS tool indicated four studies as being of moderate quality [11,12,42,47] and six as being of low quality [32,45,46,48,50,51]. Low ratings were mainly due to a lack of information about the selection of participants, the representativeness of the sample, a lack of information about non-responders, a high proportion of non-responders, and significant differences between responders and non-responders which may have biased outcomes.

Evidence synthesis
There was mixed evidence for the association between age, gender and QoL. Three studies suggested that older age was associated with poorer QoL [12,32,46], however two studies found no significant association [11,51]. Two studies suggested women with PSC had significantly poorer mental health functioning [46], physical functioning [51] and limitations on routine activities due to emotional problems [51]. However, three studies indicated no association between gender and physical functioning [11], mental health functioning [11], or overall QoL [12]. One study found a positive association between employment and QoL, and a negative association for marital status [11].
With regards to symptoms, three studies indicated that the experience of itch [11,12,51], pain [12] and jaundice [12] were associated with worse QoL. For fatigue, however, there were contradictory findings: one study suggested a negative association with QoL [12] and one study indicated no significant association [11]. In each of these studies, fatigue was measured in different ways: one study [12] used a single item to assess the presence or absence of fatigue, whereas the other study [11] used the fatigue sub-scale of the PBC-40 [52]. Fragility fractures [50] and the number and severity of co-morbid conditions (of any kind) [32] were both found to be significantly associated with poorer QoL outcomes. Three studies did not find that the presence of co-morbid IBD impacted QoL [12,46,51]. However, another study found that people with PSC with more severe IBD had significantly poorer mental health functioning than those with milder IBD [11].
Two studies found that liver cirrhosis was associated with poorer physical functioning [32,46], but not with mental health functioning [32,46], and that people with large-duct PSC as opposed to small-duct PSC had poorer mental health functioning [32]. In contrast, another study indicated that having a dominant stricture was not associated with impaired QoL [51]. Three studies explored the impact of elevated serum alkaline phosphatase (ALP) on QoL outcomes [11,32,46], however only one of these studies found significantly worse QoL for people with elevated ALP [32]. One study further explored the impact of elevated alanine transaminase, gamma-glutamyl transferase and bilirubin on QoL outcomes, but only found a significant association between elevated bilirubin and one of eight subscales of the medical outcomes study short form (SF-36) (bodily pain) [46]. Four studies stratified disease severity with three prognostic scoring  systems: the Child's-Pugh score [53], the Mayo Risk score [49], and the modified ERC score [54]. Three studies found no association between QoL and disease severity as stratified with the Child's-Pugh score [45,48] or the Mayo Risk score [51]. One study unexpectedly found better QoL among participants with more advanced disease according to the ERC score, although the correlation was weak (β = 0.014; p = 0.045) [12]. One study suggested that people with a genetic polymorphism of the vitamin D receptor had poorer QoL [42]. One further study indicated no association between elevated serum autotaxin and QoL outcomes [47], except for the itch sub-domain of the PBC-40 and PBC-27 tools.
In summary the evidence from this review for Question B suggested that symptoms (e.g. itch and pain), co-morbid conditions, liver cirrhosis and large-duct PSC were associated with impaired QoL. In contrast, prognostic scoring systems and markers of disease progression commonly used as outcomes in clinical trials (e.g. ALP) did not consistently correlate with QoL.

Question C: Which interventions are effective in improving QoL in people with PSC? Study characteristics
Nine studies met the eligibility criteria: eight RCTs and a retrospective case note review (Table 4). Of the RCTs, seven investigated the efficacy of pharmacological interventions: two types of bile acids (ursodeoxycholic acid and nor-ursodeoxycholic acid), an immunosuppressant drug (infliximab), an engineered version of the hormone FGF19 (aldafermin or NGM282), antibiotics (vancomycin and metronidazole), and an antidepressant (fluoxetine). One RCT compared the efficacy of stent dilation with balloon dilation for PSC patients with dominant strictures [55]. The retrospective case note review assessed outcomes in people with PSC and ulcerative colitis who had undergone a restorative proctocolectomy with ileal pouch anal anastomosis compared with those who had not had surgery [56]. Sample sizes ranged from ten to 219 (median = 40). The majority of participants were men (median = 68%), in their early forties (median = 43), and with a diagnosis of IBD (median = 75%). For the RCTs, the length of study follow-up ranged from 10 to 260 weeks.

Quality assessment
For the RCT evidence there was a high risk of attrition bias in five trials [55,[57][58][59][60] due to high level or unequal drop-out, and a high or unclear risk of selective outcome reporting in seven trials [55,[57][58][59][60][61][62] as QoL outcomes were rarely listed in the available study protocols (Table 5). Two trials had a high risk of other bias, due to significant differences at baseline in the proportion of male participants [60,63], significant differences in fatigue scores at baseline [60] and due to a lack of information about how continuous outcomes were dichotomised in the analysis [63]. The retrospective cohort study [56] was judged as having a moderate risk of bias as it is likely that patients with PSC who had surgery had more severe ulcerative colitis than patients who did not receive surgery, and this may have confounded the findings.

Evidence synthesis
For the RCT evidence, five studies indicated no significant differences between intervention groups for QoL [55,57,59,61], the impact of fatigue [59,61], itch [61,62] or cholestatic symptoms [55]. One study [63] reported significant within-group improvements for itch at 12 weeks' follow-up for the vancomycin and placebo group, however, between-group differences were not reported. One study [60] also reported a significant within-group improvement for itch for participants at 12 weeks' follow-up in the high-dose metronidazole group, but not for the other three intervention groups (between-groups differences were not conducted). No within-group improvements were found for fatigue in any of the four intervention groups. In one study [58], QoL data (measured with the Short Form-36) were only available for seven participants and were not analysed.
The retrospective cohort study found poorer QoL for patients who had undergone surgery for ulcerative colitis compared with those who had not had surgery, however, these differences were not significant [56]. There were no differences between groups for male sexual function. The authors were unable to assess female sexual function due to the small number of female participants (n = 7).
In summary, this review for Question C found no studies where interventions improved QoL, the impact of fatigue, or specific symptoms such as itch and pain.

Main findings
Despite the broad scope of this review and our inclusion of studies assessing QoL as well as specific domains of QoL, we identified few studies for each review question, and we were unable to conduct any meta-analyses. Most of the identified studies for Question A compared QoL outcomes in people with PSC with other groups (n = 17). The Question B search found ten studies exploring whether specific factors or attributes were associated with impaired QoL. However, seven of these studies were also included in Question A. For Question C, only eight RCTs were found assessing the effect of pharmacological or surgical interventions on QoL, three of which studies focused only on patient-reported symptoms [60,62,63] as opposed to composite measures of QoL.
In Question A the evidence indicated poorer QoL for people with PSC compared with healthy controls, comparable QoL to people with other chronic conditions, but mixed findings for comparisons with the general population. The evidence in Question B suggested that symptoms, IBD severity, liver cirrhosis, and large-duct PSC were all associated with impaired QoL. No associations were found between QoL and PSC severity measured with surrogate markers of disease progression (e.g. ALP) or one of three prognostic scoring systems. In Question B many of the identified factors were explored in a single study which made it challenging to derive any firm conclusions. In Question C, no interventions were found to improve QoL outcomes. With the exception of five studies [12,32,42,57,59], in each review study sample sizes were small, with no more than 120 participants with PSC. Various QoL measures were used across studies, which may explain some of the heterogeneity of findings.

Strengths and limitations
To our knowledge this is the first paper to systematically review the literature for studies exploring QoL in PSC. We conducted a comprehensive search across five electronic databases, as well as reviewing reference lists of relevant studies and conducting a forward citation search for included studies. Two reviewers (EM, AMK) independently screened 10% of identified references and all full-text papers, with a high-level of agreement. The review was limited in that data extraction and the quality of individual studies was assessed by a single reviewer. Due to resource limitations, we only included English language papers.
In contrast to previous literature reviews [24,25], one of our inclusion criteria was studies where the majority of participants had a PSC diagnosis. This will have limited the available evidence, because a number of studies have explored QoL more broadly with people with cholestatic disease or chronic liver disease, which may include a subset of PSC participants. None of the studies we included had a mixed population, because no identified studies included a majority of PSC participants. We excluded 19 studies in which 3-36% of the sample had PSC, as well as eight studies which did not provide information about the diagnoses of participants. Authors were contacted to request this information and the disaggregated data, but these were only provided for a single study [61]. It is challenging to recruit participants with rare conditions such as PSC [15], however, this inclusion criterion is important because there are key differences between PSC and with other liver conditions. For example, unlike PSC, PBC predominantly affects women, is not associated with IBD, and has a more predictable clinical course [7,64]. Factors such as these are likely to affect QoL, and so extrapolating findings from other liver-related conditions (where participants with PSC are in a minority) may not be appropriate.

Implications and gaps in the literature
It is clear that PSC can have a detrimental impact on QoL. However, findings from the studies identified were mixed for comparisons of people with PSC with the general population, even though participants were age-and gender-matched. These study findings were based on scores from generic questionnaires, such as the SF-36 [65] and the 15-D instrument [66], and it is possible that these measures only have a limited relationship to the experiences of people with PSC. Six of these studies were also small scale: two included < 50 participants, and four included < 100 participants. This reduces our confidence in the findings, with a risk of type II errors (i.e. failing to reject the null hypothesis when it is false). Another possible hypothesis is that, at the group level, QoL for people with PSC is similar to the general population, due to the relatively high proportion of people with PSC who are asymptomatic. Only one study in this review reported on the proportion of asymptomatic participants, which was as high as 45% [12]. These data were missing from the other published reports, and so based on the available evidence this question remains unanswered. A co-morbid diagnosis of IBD in people with PSC was not found to impact on QoL, however the severity of IBD symptoms was found to be independently associated with impaired QoL. Although IBD in PSC tends to be mild [67], this finding suggests it is important to monitor IBD related symptoms and impacts when assessing QoL in PSC [11]. As expected, the experience of symptoms such as itch, pain and jaundice were associated with worse QoL, but there were contradictory findings for the impact of fatigue. Fatigue is a debilitating symptom which is commonly experienced by people with PSC [68]. These findings suggest that generic measures of QoL may not adequately capture the impact of fatigue, indicating a need for a disease-specific measure of QoL for PSC.
Surrogate markers of disease progression are commonly used as primary outcomes in clinical trials in place of "harder" outcomes such as cirrhosis or mortality, which can take a long time to occur making them unsuitable for trial design [69]. Despite the common use of these markers, no single method has been recommended to predict individual patient prognosis in PSC [6]. In this review, disease progression as measured with surrogate markers (e.g. ALP) and with prognostic scoring systems (which include such markers) was not found to correlate with assessments of QoL. In light of these findings we recommend the inclusion of assessments of QoL in all future trials. We only identified five RCTs which used a composite measure of QoL as an outcome, which suggests this integration is currently lacking.
The evidence base was limited across all review questions with few studies included in each comparison, many of which had small samples. In addition, many studies reported QoL data narratively or only provided p values which meant it was not possible to explore the magnitude of significant findings. Key gaps in the literature include the impact of fatigue on QoL, only explored in two studies with conflicting findings, and the impact of IBD severity, only explored in a single study. The available literature only explored the association of clinical or demographic factors with QoL. Studies exploring psychological or social factors, such as self-efficacy or social support, which can also impact on QoL [70], were lacking. We found very limited evidence exploring QoL in people with large-duct PSC compared with small-duct PSC, as well as in people with and without a dominant stricture, features which are associated with the severity of PSC, transplant-free survival and risk of cancer [1,2]. Staging or stratifying PSC is challenging, however, newer methods such as the enhanced liver fibrosis (ELF) test have shown greater sensitivity than existing models in predicting outcomes, particularly in the earlier stages of the condition [71,72]. We did not find any evidence exploring how this method of staging correlates with the lived experiences of people with PSC. Future research studies should address this topic.

Conclusion
This review found few studies exploring QoL for people with PSC. Those included found worse QoL for people with PSC than for healthy controls, and similar QoL to people with other chronic conditions. Comparisons with the general population were mixed. Many studies included small numbers of participants, which is due in large part to the rarity of the condition. Those studies found that symptoms, severity of IBD, liver cirrhosis, and large-duct PSC were all associated with worse QoL. No interventions indicated any evidence of benefit on QoL. Studies assessing disease severity using three prognostic scoring systems and markers of disease progression did not find any correlations with QoL. We recommend that larger-scale clinical research studies are conducted, measuring QoL alongside clinical and laboratory-based outcomes. We also recommend the development of a valid, responsive, and PSC-specific measure of QoL for use in such studies, since generic measures of QoL may not cover all issues of importance to people with PSC, such as fatigue.