Systematic literature review and assessment of patient-reported outcome instruments in sickle cell disease

Background: Sickle cell disease (SCD) is a chronic condition associated with high mortality and morbidity. It is characterized by acute clinical symptoms such as painful vaso-occlusive crises, which can impair health-related quality of life (HRQL). This study was conducted to identify validated patient-reported outcome (PRO) instruments for use in future trials of potential treatments for SCD.
Methods: A systematic literature review (SLR) was performed using MEDLINE and EMBASE to identify United States (US)-based studies published in English between 1997 and 2017 that reported on validated PRO instruments used in randomized controlled trials and real-world settings. The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist was used to assess the quality of PRO instruments.
Results: The SLR included 21 studies assessing the psychometric properties of 24 PRO instruments. Fifteen of those instruments were developed and validated for adults and 10 for children (one instrument was used in both children and young adults aged up to 21 years). Only five of the 15 adult instruments and three of the 10 pediatric instruments were developed specifically for SCD. For most instruments, there were few or no data on validation conducted in SCD development cohorts. Of the 24 PRO instruments identified, 16 had strong internal reliability (Cronbach’s α ≥0.80). There was often insufficient information to assess the content validity, construct validity, responsiveness, or test-retest reliability of the instruments identified for both child and adult populations. No validated PRO instruments measuring caregiver burden in SCD were identified.
Conclusions: The evidence on the psychometric properties of PRO instruments was limited. However, the results of this SLR provide key information on such tools to help inform the design of future clinical trials for patients with SCD in the US.
Electronic supplementary material The online version of this article (10.1186/s12955-018-0930-y) contains supplementary material, which is available to authorized users.


Background
Sickle cell disease (SCD) is a lifelong, multisystem condition characterized by hemoglobin polymerization that leads to erythrocyte rigidity, hemolysis, and vaso-occlusion. Prevalence estimates for the United States (US) in 2016 suggest that between 70,000 and 100,000 people had SCD [1,2]. A further estimated 3.5 million individuals had the sickle cell trait [1,2], meaning they were carriers of one of several autosomal recessive alleles responsible for the disease. The most common SCD genotype is HbSS, and the disease is most prevalent in people of African ancestry [1].
Vaso-occlusive crises (VOC) and the pain associated with them are hallmark symptoms of SCD, and typically first manifest in infants around the age of 5 months. These painful episodes can occur without warning and have been described as sharp, intense, stabbing, or throbbing. The pain can be debilitating, resulting in frequent emergency department (ED) and hospital visits. Furthermore, complications of SCD, such as anemia, infection, and stroke, can have major physiological, cognitive, and emotional effects on patients [2,3].
Current US guidelines for SCD management focus mainly on health maintenance and treatment of acute and chronic complications [4]. Health maintenance recommendations include prophylactic penicillin treatment and pneumococcal vaccination in patients with asplenia [4], and screening or diagnostic tests for SCD-related complications; supportive management includes treatment with antibiotics, pain crisis management, and blood transfusions. Stem cell transplantation is the intervention most likely to be curative, but it carries many risks and is not performed frequently [5][6][7]. Because there is currently no pharmacotherapeutic cure for SCD, and in most cases management of the disease is palliative, a key therapeutic goal is to reduce the occurrence of painful crises. For this, the traditional mainstay treatment has been the antineoplastic agent hydroxyurea. This drug helps prevent crises in both adults and children by increasing the amount of fetal hemoglobin found in patients' red blood cells (RBCs), thus leading to various beneficial effects on RBC structure, content, and function [8,9]. In turn, this reduces the need for transfusion and the likelihood of organ damage. More recently, an alternative therapy, L-glutamine, was also approved by the US Food and Drug Administration (FDA) for the treatment of SCD in children and adults, with the aim of reducing severe SCD-related complications [10].
Despite the availability and use of hydroxyurea and L-glutamine, SCD remains a disease with major unmet needs, with many patients experiencing poor clinical outcomes in both the short and longer term. There is also substantial evidence suggesting that SCD is associated with a considerable impairment of patients' health-related quality of life (HRQL). However, characterizing the nature and extent of this humanistic deficit, and whether or how it differs between patient subgroups or with disease stage, is hampered by a lack of clarity about which (if any) of the patient-reported outcome (PRO) instruments used to date are best able to capture patients' experience of SCD. This lack of clarity has major implications for the investigation of potential new treatments for SCD. In particular, it raises questions about how best to assess whether, or to what extent, such interventions affect humanistic outcomes. Therefore, to inform recommendations of PROs that might be suitable for use in future SCD clinical trials, a systematic literature review (SLR) was conducted to identify, summarize, and evaluate PRO instruments that have been developed and/or validated in previous US trials and observational studies of SCD.

Identification of studies
The SLR was conducted using transparent and reproducible methods, in accordance with standards recommended by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement [11] and the Cochrane Handbook for Systematic Reviews [12]. A single systematic search was conducted in Embase, Embase In-Process, MEDLINE, and MEDLINE In-Process to identify studies of interest on PRO instruments published in English between 1997 and 2017. Specifically, search terms for SCD were combined with terms for psychometric properties of PRO instruments. The search strategy (detailed in Additional file 1) included a combination of free-text search terms and controlled vocabulary terms, as recommended by the Cochrane Collaboration [13]. Grey literature and conference abstracts were not considered in the search because these would have provided inadequate detail for the purposes of the review. Bibliographies of all relevant systematic reviews and/or meta-analyses identified during the search were also reviewed to identify any additional relevant publications.

Study selection
To identify the most relevant studies for inclusion in the review, publications identified from the electronic database searches were screened against predefined inclusion and exclusion criteria (detailed in Additional file 2) in a two-stage selection process. First, the titles and abstracts of all unique citations from the searches were reviewed against the selection criteria. Second, the full-text versions of all the publications that had been considered relevant at the first stage were assessed to determine which studies should be included in the review. All records were reviewed by one researcher, with validation of 50% of records at both screening levels being performed by a second researcher. A third researcher resolved any discrepancies and confirmed inclusion or exclusion where appropriate.

Data extraction
Relevant data from the included publications were entered into a standardized predesigned extraction template by one investigator, and then validated by a senior researcher. A third reviewer was consulted to resolve any disagreements.

Quality assessment
Quality assessment was conducted for the identified PRO instruments by one researcher and validated by a second researcher, using an abbreviated version of the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist [14]. This checklist assesses the methodological quality and performance of PRO instruments across various characteristics as reported in psychometric-evaluation studies. The checklist was abbreviated for this study to focus on characteristics that met the FDA criteria [15] for the evaluation of PRO instruments for use in clinical trials: reliability, validity, and responsiveness.
Reliability is the degree to which the measurement is free from measurement error, as indicated by the extent to which scores for patients who have not changed are the same for repeated measurement under several conditions [14]. Studies were considered to have strong internal consistency reliability when Cronbach's alpha was ≥0.80.
Validity is the degree to which a health-related PRO instrument measures the construct it intends to measure [14]. The FDA also considers whether similar patients to those participating in the clinical trial have confirmed the relevance of items in the PRO instrument [16].
Responsiveness is the ability of a health-related PRO instrument to detect change over time in the construct to be measured [14]. The FDA considers whether responsiveness has been demonstrated in a comparative trial setting [16].
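To make the internal consistency criterion concrete, the following is a minimal sketch of how Cronbach's alpha could be computed from item-level responses and compared with the ≥0.80 threshold used in this review. It uses the standard textbook formula. The function names and the example scores are hypothetical and do not come from any of the included studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent, each a list of k item scores.
    Formula: alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)).
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least two items, and equal-length rows")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def has_strong_internal_consistency(alpha, threshold=0.80):
    """Apply the review's cut-off for 'strong' internal consistency."""
    return alpha >= threshold


# Hypothetical responses: four respondents answering a two-item scale.
example_scores = [[2, 3], [4, 4], [5, 7], [6, 6]]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3), has_strong_internal_consistency(alpha))
```

In practice, alpha is usually reported per subscale, so an instrument with several subscales would need this check applied to each one separately.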

Study inclusion
The electronic database search yielded 504 unique records. After title and abstract screening, 46 citations were considered relevant for full-text review. Following full-text assessment, 19 studies reporting on the psychometric properties of PRO instruments were identified, and two more articles were added from manual searches of bibliographies of published SLRs. Thus, a total of 21 publications met all inclusion criteria. The selection of studies from the initial search yield to the final number of included studies, using the PRISMA guidelines, is presented in Fig. 1.
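The counts reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, and the numbers of records excluded at each stage are derived by subtraction rather than taken from Fig. 1.

```python
def prisma_flow(unique_records, full_text_reviewed, included_from_search,
                added_from_bibliographies):
    """Derive per-stage exclusions and the final study count for a two-stage screen."""
    if not unique_records >= full_text_reviewed >= included_from_search >= 0:
        raise ValueError("counts must not increase from one screening stage to the next")
    if added_from_bibliographies < 0:
        raise ValueError("manually added studies cannot be negative")
    return {
        "excluded_at_title_abstract": unique_records - full_text_reviewed,
        "excluded_at_full_text": full_text_reviewed - included_from_search,
        "total_included": included_from_search + added_from_bibliographies,
    }


# Counts as reported in the text: 504 unique records, 46 full texts reviewed,
# 19 included from the search, and 2 added from bibliographies.
flow = prisma_flow(504, 46, 19, 2)
print(flow)
```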

PRO instruments
The 21 studies included in the SLR reported on a total of 24 PRO instruments that had been developed and/or validated in populations with SCD in the US. Fifteen of the instruments (represented in nine publications) were for use with adults, and 10 instruments (in 12 publications) were for children through age 17 years (one instrument was used in both children and young adults aged up to 21 years, and so was included in both populations). No validated PRO instruments designed for caregivers of children with SCD were identified. For most instruments, there were few or no validation data from studies conducted in SCD development cohorts. All studies evaluating adult PRO instruments were cross-sectional studies. Nine studies evaluating pediatric PRO instruments were cross-sectional, and one each was longitudinal, retrospective, or a medical chart review. The most common outcomes evaluated by instruments were coping, self-esteem, or self-efficacy (n = 8), followed by health-related quality of life (HRQL; n = 5), pain (n = 4), and family impact (n = 2). Depression, functioning, spirituality, stigma, and treatment satisfaction were each evaluated with one instrument.
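Because one instrument is counted in both the adult and pediatric groups, the totals above are easy to misread. The following sketch, with hypothetical variable and function names, reproduces the tally from the counts reported in the text and confirms that the outcome categories sum to the number of unique instruments.

```python
def count_unique_instruments(adult, pediatric, shared):
    """Unique instruments when some are counted in both age groups."""
    return adult + pediatric - shared


# Instruments per outcome category, as reported in the text.
outcome_counts = {
    "coping, self-esteem, or self-efficacy": 8,
    "health-related quality of life": 5,
    "pain": 4,
    "family impact": 2,
    "depression": 1,
    "functioning": 1,
    "spirituality": 1,
    "stigma": 1,
    "treatment satisfaction": 1,
}

# 15 adult instruments, 10 pediatric instruments, 1 counted in both groups.
unique_total = count_unique_instruments(adult=15, pediatric=10, shared=1)
print(unique_total, sum(outcome_counts.values()))
```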

Quality assessment results
Quality assessment of the PRO instruments was conducted using the abbreviated COSMIN checklist. Of the 24 instruments, 16 were rated strong for internal consistency reliability (nine of the 15 adult instruments and seven of the 10 pediatric instruments). Overall, insufficient information was reported in the included studies to assess the content validity, construct validity, responsiveness, and test-retest reliability of most instruments identified in both adult and child populations. Quality assessment results for the adult instruments are presented in Table 1 and for the pediatric instruments in Table 2.

PRO instruments in adults with SCD
The SLR identified 10 publications [17][18][19][20][21][22][23][24][25][26] reporting on psychometric properties of 15 PRO instruments validated in adults with SCD in the US. Of these instruments, six assessed coping, self-efficacy, or self-esteem [18,21,22,25], three assessed patient pain [17,23,26], and one each assessed depression [20], family impact [22], quality of life [24], spirituality [17], stigma [20], and treatment satisfaction [19]. Five instruments had been developed specifically for adults with SCD [18,19,22,24,25,26]. Most of the included studies did not provide sufficient information on the psychometric properties to assess construct or content validity, test-retest reliability, or responsiveness of the instruments concerned. However, most studies reported strong or good internal reliability and consistency. None of the included studies provided information regarding the threshold of minimally important change (sometimes called minimal important difference [MID]) in health status for any of the instruments reviewed. An overview of the three psychometric properties included in this assessment (validity, reliability, and responsiveness) is given below. Additional details about the identified adult PRO instruments are provided in Table 1.

Validity
Content validity Four of the 15 instruments, which measured self-efficacy [25], pain [26], quality of life [24], and treatment satisfaction [19], reported sufficient information to assess content validity. Of these, three instruments [19,25,26] were rated good, while one [24] was rated strong, indicating a higher ability of the instrument to adequately reflect the construct being measured. Three instruments were specifically developed for adults with SCD [19,24,25]; one additional instrument was developed including young adults up to age 21 years [26]. There was not adequate information on the remaining 11 instruments to assess content validity.
Construct validity Two instruments (assessing pain [26] and quality of life [24]) had good construct validity, indicating a higher degree to which the scores of the health-related PRO instrument are consistent with the hypothesis. Both of these instruments were developed for patients with SCD. One instrument [22], assessing family impact and not developed specifically for SCD, had weak construct validity. For the remaining 12 instruments, there was insufficient information available to assess their construct validity.
Test-retest reliability Only one SCD-specific instrument [26] assessing pain had good test-retest reliability, indicating consistency in the test over time. For the remaining instruments, the available information was insufficient to assess their test-retest reliability.

Responsiveness
For none of the identified instruments was there sufficient information available to assess their responsiveness to change in the measured construct.

PRO instruments in children with SCD
Ten PRO instruments that had been validated in children with SCD in the US were identified across 12 studies [26][27][28][29][30][31][32][33][34][35][36][37]. Four of these instruments were related to the assessment of generic HRQL [28,29,33,34,35,36]; two instruments each assessed children's pain [26,31] and coping mechanisms with SCD [27,30]; and one instrument each assessed the functional impact of experiencing pain [37] and the family impact of caring for a child with SCD [32]. Overall, only three of these instruments were developed specifically for children with SCD [26,27,36]. Most of the included studies provided no information on the psychometric properties of the instruments they reported on, in terms of the construct and content validity, test-retest reliability, or responsiveness. Furthermore, the reviewed studies did not assess the threshold of MID in health status for any of the instruments. However, internal consistency reliability was considered to be good for most of the instruments reviewed. An overview of the three psychometric properties (validity, reliability, and responsiveness) is given below. Additional details about the identified pediatric PRO instruments can be found in Table 2.
Table footnote: As this instrument includes young adults up to age 21 years, it was included in both the Adult and Pediatric categories.
Table note: "Weak" indicates poor performance (e.g., evidence of poor reliability) or a weakness that should be considered within the trial design (e.g., requires significant input by research team to administer, or no availability of language translations); "Good" indicates adequate or moderate performance (e.g., adequate reliability) or only mild limitations (e.g., availability of a small number of language translations, absence of evidence in a minority of adult patients (e.g., older adults)); "Strong" indicates excellent performance on all reported indicators (e.g., all subscales report excellent reliability; evidence) or notable advantages for use within a trial (e.g., freely accessible, wide range of language translations); "Unclear" indicates where no or insufficient evidence was reported to assess, or where evidence reported was conflicting (e.g., some subscales show excellent reliability while others did not).
Abbreviations: HRQL, health-related quality of life; SCD, sickle cell disease.

Validity
Content validity Two instruments assessing pain had good content validity, indicating that they were an adequate reflection of the construct to be measured. Of the two, one instrument was developed for children with SCD [26], while the other was not [31]. For the remaining eight instruments, there was insufficient information for assessment of content validity.
Construct validity Two instruments, one measuring generic HRQL (not SCD-specific) [29] and one measuring pain (SCD-specific) [26], had good construct validity. One instrument developed specifically for children with SCD to measure self-efficacy [27] had weak construct validity. For the remaining seven instruments, there was insufficient information to assess construct validity.
Test-retest reliability Only one instrument [26], which assessed pain, was rated as having good test-retest reliability. This instrument was developed specifically for patients with SCD. For the other nine instruments, insufficient information was reported to evaluate this component.

Responsiveness
For none of the identified instruments was sufficient information available to assess their responsiveness to change.

Discussion
SCD represents a major challenge for patients, their families, and health care professionals. As a lifelong debilitating condition punctuated by severe, potentially life-threatening acute crises, it poses multiple threats to health and well-being. Against this background, and to inform the conduct of future randomized controlled trials in patients with SCD, the current study aimed to provide insights into the psychometric properties of validated PRO instruments used to date in the disease. Specifically, it systematically identified and evaluated relevant US-based studies that reported on and critiqued PROs spanning HRQL, key symptoms, and attitudinal responses to SCD in children with the condition, their caregivers, and adult patients.
Treatment cost and the impact of treatment on overall health care use and costs (i.e., budget impact) are primary considerations when making coverage and reimbursement decisions [38]. Traditionally, US payers have not considered PROs to inform decisions on health care in this setting [39]. However, the market access landscape is changing, and PROs may now be poised to play a more important role in payer decisions, as evidenced by the recognition that PROs are important for evaluating symptoms and therapy impact on functioning [40] and increased patient engagement and participation in treatment decision-making. Assessing the patient perspective, in terms of PROs, is considered one of the primary outcomes to focus on and incorporate into all clinical trials proposing novel interventions, devices, or pharmaceuticals that aim for FDA and other regulatory or reimbursement approval [15]. However, a significant challenge in PRO research is demonstrating the measurement value of these tools that best describes the patient's experience and what is considered as "acceptable measurement criteria" by regulatory and reimbursement bodies [40,41].
Use of poorly developed PRO measures with inadequate psychometric evidence, or of measures designed for a purpose that differs from their actual use, can have significant implications and lead to distorted, inaccurate, or equivocal findings [42,43]. Instruments should therefore be chosen based on relevance and their applicability in the context of the proposed disease area to produce reliable estimates of patients' experiences. Although the match between the content coverage and content validity is important, convincing evidence is also needed to confirm the reliability, validity, and responsiveness of PRO measures used in clinical trials. The FDA has displayed an interest in patient-centered drug development in patients with SCD, through the Patient-focused Drug Development initiative [10]. This program aims to gather patient perspectives on SCD, specifically the effects that most impact patients, current available therapies, and participation in clinical trials.
It is also important to note the increase in health technology assessment activities by groups that provide US payers with evaluations related to coverage and reimbursement. Such activities currently focus on clinical efficacy and economic outcomes or budget impact, with limited emphasis on PROs and HRQL. However, it is likely that, in the future, health technology assessment evaluations to inform US payers may directly incorporate patient perspectives and efficacy as assessed through PROs. For that reason, this SLR aimed to include only PRO instruments either being developed for use or being validated in an SCD population. Based on the SLR of evidence in a US-based population, guidance on use of currently available PROs and recommendations for further research are listed below.

Guidance for PRO use in SCD populations based on SLR findings
Consider using the PedsQL™ SCD Module to assess SCD-specific impact in children. This instrument allows for evaluation of various concepts, including pain, fatigue, productivity, activity, and emotion.
Consider using the Sickle Cell Disease Pain Burden Interview-Youth (SCPBI-Y) to assess the impact of painful crises in children aged over 7 years.
Use a short generic pain assessment tool, such as the Brief Pain Inventory, or numerical rating scales for assessing pain intensity for adults.
Use the Adult Sickle Cell Quality of Life Measurement System (ASCQ-Me) to assess patient-reported HRQL in adults. This instrument allows for evaluation of patients' pain, fatigue, productivity, activity, and emotion.
Use the ASCQ-Me Quality of Care (QOC) instrument to assess patient perceptions of accessibility of care and the quality of interactions with health care providers.

Recommendations for further PRO research in SCD populations
There should be validation of a generic preference-based PRO instrument to evaluate general aspects of HRQL.
There should be validation of specific instruments for younger children (ages 5-12 years) (caregiver report version of the SCPBI-Y and Psychosocial Assessment Tool 2.0) and for adults (ASCQ-Me, SCPBI-Y).
For relevant outcomes of interest (function, psychological well-being), there should be validation of the PedsQL™ 4.0 Generic Core Scales for children and the ASCQ-Me for adults.
There should be validation (or, if none exist, development) of instruments that assess other outcomes of interest (e.g., cognitive impairment, school/work performance and attendance, treatment satisfaction).
There should be piloting of administration of identified instruments using electronic devices (e.g., tablets, phone apps).