
LC-PROM: Validation of a patient reported outcomes measure for liver cirrhosis patients

Abstract

Background

The aim of the study is to develop a specific patient-reported scale of liver cirrhosis according to the Patient Reported Outcome guidelines of the Food and Drug Administration (FDA), and to examine its capacity to fill gaps in this field.

Methods

A conceptual framework and a preliminary item pool were developed through literature review and interviews with 10 patients with liver cirrhosis. With the preliminary items, we performed a pilot survey that included cognitive testing with patients and interviews with experts, focusing on the content and language of the scale. In the item selection stage, seven statistical methods (discrete trend method, discrimination analysis, exploratory factor analysis, Cronbach’s α coefficient, correlation coefficient, test-retest reliability, and Item Response Theory) were applied to survey data from 200 subjects (150 liver cirrhosis patients and 50 controls). This produced the preliminary Liver Cirrhosis Patient-reported Outcome Measure (LC-PROM). In the next stage, we surveyed 620 subjects (500 patients and 120 controls) to validate the reliability, validity and acceptability of the scale.

Results

The 55 items and 13 dimensions addressed four domains: physical, psychological, social, and therapeutic. Cronbach’s α coefficient was 0.921 for the total scale; confirmatory factor analysis, t-tests and ANOVA supported scale validity; and the model fit indices, namely Root Mean Square Error of Approximation (RMSEA), Root Mean Square Residual (RMR), Normed Fit Index (NFI), Non-Normed Fit Index (NNFI), Comparative Fit Index (CFI) and Incremental Fit Index (IFI), generally met the criteria. The acceptance ratio and response rate indicated good feasibility.

Conclusions

This study developed an accurate and stable patient-reported outcome scale of liver cirrhosis, which is able to evaluate clinical effects effectively, is helpful to patients in recognizing their health condition, and contributes to clinical decision making both for patients and physicians. Additionally, the LC-PROM can perform as an ultimate assessment of medical and health care effects and can inform clinical trials of new drugs for liver cirrhosis.

Background

Liver cirrhosis (LC) is a potential consequence of the progression of any of various kinds of liver disease, and the high incidence of hepatitis will lead to a large number of patients suffering from liver cirrhosis. LC is characterized by fatigue, digestive disorders, bleeding and anemia, endocrine disorder, hypoproteinemia, portal hypertension and other serious symptoms that cause patients great physical pain and affect their daily social life. As an irreversible, chronic, progressive disease, LC cannot be cured completely at the present stage. Particularly for weak patients, the common treatments used in clinical practice can cause secondary damage in addition to the harm caused by the disease itself.

At present, patients’ health status and treatment effects are evaluated by hepatic function test and serological markers, or reflected by hospital stays and symptom improvement over time. However, with the continued development of a biopsychosocial medical model the use of scales to assess patients’ fitness has been widely accepted and applied internationally; that is, patients’ personally reported data, dubbed patient-reported outcome (PRO), are used to measure clinical results. One of the arguments for using questionnaires to ask patients to judge their own health-related quality of life (HRQoL) is that it has been shown that physicians are generally unable to make accurate judgments of patients’ HRQoL. Physicians’ judgments not only deviate from those of patients, they also differ among one another. This latter variability makes it particularly difficult to obtain ‘objective’ judgments of HRQoL [1].

The PRO Harmonization Group, which consists of the Food and Drug Administration (FDA), International Society For Pharmacoeconomics and Outcomes Research (ISPOR), the European Regulatory Issues on Quality of Life Assessment Group (ERIQA), and the International Society for Quality of Life Studies (ISQOL), proposes that evaluation of clinical curative effects should contain data from physicians’ reports, physiological measures, caregivers’ reports, and PROs, which come solely from the patient. In the course of a disease, there are some symptoms that can only be experienced by patients; i.e., these symptoms cannot be reflected by physical measures. In this case, the normal reference values of medicine do not equal true health; additionally, physician report data are always processed through the subjective consciousness and may only include contents related to the physician’s concerns. What’s more, this report is limited by physicians’ knowledge and experience. Therefore, PROs play an important role in clinical practice, and this method is now generally accepted by experts and patients alike. Since the publication of the draft guide for new drug development and curative effect evaluation in February 2006 [2], PROs are becoming more important in assessment of treatment outcome and in new drug registration.

A PRO instrument specific to LC could provide several benefits: it could help improve the evidence base through research assessing effectiveness of LC therapies; facilitate clinician-patient communication and shared decision making; help prioritize patient problems and preferences; monitor changes or outcomes of treatment; measure the performance of healthcare providers and services; and be incorporated in clinical audits [3–5].

In short, the aim of this study is to develop such a PRO scale that meets the following criteria: (I) specific to liver cirrhosis; (II) addresses all physical symptoms, psychological feelings, daily activities, and therapeutic status related to LC; (III) comprises items that are founded on the patients’ own perspective; (IV) has good internal consistency, a reasonable theoretical framework and can distinguish different severities of the disease; and (V) is of appropriate length and has strong feasibility.

Methods

The Medical Ethics Committee of Shanxi Medical University provided ethics approval, and all participants signed informed consent to participate.

Step 1 item generation

Literature review

We conducted literature searches on databases and network resources for PRO instruments. From the searches, we formed the conceptual framework of the new instrument, called the Liver Cirrhosis Patient Reported Outcome Measure (LC-PROM).

Patient interviews

We conducted semi-structured interviews with ten liver cirrhosis patients (five males and five females; average age 53 years). In the interview, patients were encouraged to talk about their main disease symptoms, physical feelings and symptoms that they most desired to improve, psychological conditions after diagnosis and participation in social activities since diagnosis, adherence to therapy and satisfaction with their status. In addition, patients could speak freely on other relevant topics. Throughout the process, researchers wrote down the interviewees’ original words as far as possible, and audio recordings were made. After the interview, all information was sorted and then an initial list of items was developed.

Cognitive debriefing and discussion with experts

Another ten patients (five males and five females, average age 52 years) were selected to undertake cognitive debriefing. These patients were asked to flag items that were ambiguously worded or difficult to understand, and to suggest items that needed to be added or deleted.

Seven experienced experts including three chief physicians of gastroenterology, one infectious diseases physician, one psychologist, one sociologist, and one ethics expert were invited to discuss whether the initial structural framework was reasonable and whether the items covered all areas of disease evaluation. The correlation of items with their respective dimensions and linguistic issues were considered. We modified the item pool according to the experts’ advice, and the preliminary scales were formed.

Step 2 item selection

Sampling survey

Two hundred subjects were sampled from inpatients of eight different hospitals and from communities in Shanxi Province. There were 150 LC patients and 50 healthy controls.

Patients who were diagnosed with definite LC, who were between 18 and 72 years old, and who were fully able and willing to participate in this study as volunteers were included.

Patients were excluded if they had an uncertain diagnosis, suffered mental illness or disorders of consciousness, were unable to understand questions because of dysgnosia, or were unable to complete the test.

Healthy controls were volunteers from communities who had not been diagnosed with any disease by a physician and had an age distribution similar to that of the LC patients. Healthy controls also provided informed consent and received a reward for participating.

The survey was administered by trained investigators. Before beginning, subjects were informed of the survey objective and signed the informed consent form. Next, the participants independently completed the preliminary scale. During the survey, investigators were present to respond to questions. If participants were elderly or had a low education level, investigators read the items to them and wrote down their answers. After the survey, any incomplete scales were filled in by the subjects under the guidance of the investigators.

Scale scoring

Scores were calculated using a five-point Likert scale to reflect frequency of occurrence over the past 2 weeks of the issue presented in each item. The responses were 0 = never, 1 = occasionally, 2 = about half of the time, 3 = often, and 4 = almost every day. The positively-toned items were scored as the original score plus one, and the negatively-toned items were scored as 5 minus the original score. Thus every item score ranged from 1 to 5, with higher scores denoting more positive outcomes.
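The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic; the function name `score_item` and the flag `positively_toned` are hypothetical and do not come from the study.

```python
def score_item(raw: int, positively_toned: bool) -> int:
    """Convert a raw 0-4 frequency response into a 1-5 item score."""
    if raw not in range(5):
        raise ValueError("raw response must be an integer from 0 to 4")
    # Positively toned items: raw + 1; negatively toned items: 5 - raw.
    return raw + 1 if positively_toned else 5 - raw


print(score_item(4, positively_toned=True))   # 5
print(score_item(4, positively_toned=False))  # 1
```

Under this rule both item types map onto the same 1 to 5 range, so a higher score always denotes a more positive outcome.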

Statistical methods for item selection

Item reduction was based on both Classical Test Theory (CTT) and Item Response Theory (IRT). This study employed six methods of CTT followed by IRT.

Discrete trend

A low discrete degree means subjects were inclined to select the same answer; that is, the items had a low capacity to test for differences. In general, scores obey a normal distribution, so the standard deviation for every item was calculated. The items with a low standard deviation (<1.0) were deleted.
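As a rough sketch of this screening step, the following flags items whose standard deviation falls below 1.0. The item names and responses are invented toy data, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import stdev


def low_variability_items(responses, threshold=1.0):
    """Return the names of items whose sample SD is below the threshold."""
    return [name for name, scores in responses.items()
            if stdev(scores) < threshold]


# Toy data: item_A spreads across the scale, item_B barely varies.
toy = {
    "item_A": [1, 5, 1, 5, 3, 2],
    "item_B": [3, 3, 3, 4, 3, 3],
}
print(low_variability_items(toy))  # ['item_B']
```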

Discrimination analysis

Items that do not reflect different characteristics of subjects should not remain in the scale. We compared the scores on every item using two-independent-samples t-tests (α = 0.05), and items whose scores did not differ significantly were deleted.
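For illustration, the test statistic of a two-independent-samples t-test can be computed as below. This sketch assumes the equal-variance (pooled) form and makes no assumption about how the two comparison groups were formed; the function name is hypothetical. The p-value would then be read from a t distribution with the returned degrees of freedom, for example with a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group1, group2):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    t = (mean(group1) - mean(group2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


t, df = pooled_t_statistic([2, 3, 4], [1, 2, 3])
print(round(t, 4), df)
```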

Exploratory factor analysis (EFA)

Taking the small sample size into consideration, we did EFA in each domain (physical, psychological, social, and therapeutic) separately, then rotated the solution. According to the eigenvalue and the variance contribution ratio, the number of factors was determined. Items with low factor loading (<0.4) and cross-loading on two or more dimensions were removed.

Cronbach’s α if item deleted (CAID)

Internal consistency was evaluated with CAID and the Corrected Item Total Correlation (CITC). If the α coefficient increased greatly when an item was deleted, the item was reducing the internal consistency of its own dimension. CITC < 0.4 indicates an item poorly contributing to the construct of the scale; therefore such items were deleted.
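The two indices can be sketched as follows, using the standard formula for Cronbach's α and a Pearson correlation between each item and the sum of the remaining items. The helper names are hypothetical, the data layout (one list of scores per item) is an assumption, and at least three items with non-constant scores are needed for the diagnostics to be defined.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(items):
    """items: list of equal-length score lists, one list per item."""
    k = len(items)
    totals = [sum(subject) for subject in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def item_diagnostics(items):
    """Return one (alpha if item deleted, corrected item-total r) pair per item."""
    results = []
    for idx, item in enumerate(items):
        rest = items[:idx] + items[idx + 1:]
        rest_totals = [sum(subject) for subject in zip(*rest)]
        results.append((cronbach_alpha(rest), pearson(item, rest_totals)))
    return results
```

An item would be a candidate for deletion when its first value is clearly higher than the α of the full dimension, or when its second value falls below 0.4.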

Correlation coefficient

The representativeness of an item was measured by its correlation coefficient with the dimension to which the item belonged. When the value was less than 0.6, the item was not retained.

Retest reliability

This method considered item stability. Thirty subjects were selected from the sample to take a retest 2 weeks after the first test. Among these, 20 cases whose data were error-free in both tests were used to calculate retest correlation coefficient. The criterion for reliability was 0.7.

Item response theory (IRT)

IRT is part of modern measurement theory and was put forward to overcome defects of CTT [6]. It is also called latent trait theory, and has advantages for item selection and test construction. It claims that there is a functional relationship between subjects’ abilities and their responses to an item. How to define this relationship is the basic idea and the starting point. In brief, IRT can be viewed as a probabilistic method for discussing the relationship between subjects’ potential traits and their responses to items.

If θ represents a subject’s ability, P(θ) is the probability of the subject’s responding to an item correctly; their functional relationship can be reflected by a curve called the item characteristic curve (ICC). Two important parameters on the curve are used in this study: a reflects discriminant degree and b shows item difficulty. On the ICC whose X,Y axes are θ and P(θ), b is the value of θ corresponding to P(θ) = 0.50; this value ranges from −3 to 3. a is the function of the tangent line’s slope at point b; its value ranges between 0.3 and 2, with larger values representing higher degrees of discrimination.

Because a five-point Likert scale was used, a Graded Response model was constructed, which is appropriate for hierarchical and continuous data, extending a unidimensional model to a multidimensional one [7]. The basic idea of the model [8] is as follows. Assume that the full score of item $j$ is $f_j$, so that the item has $f_j + 1$ possible scores, $0, 1, 2, \ldots, f_j$. Let $P^*_{ajt}$ be the probability that the score on item $j$ is at least $t$ when the ability value is $\theta_a$; then $P^*_{aj0} = 1$ and $P^*_{aj,f_j+1} = 0$. If $P_{ajt}$ is the probability that the score on item $j$ is exactly $t$ [9], then

$$P_{ajt} = P^*_{ajt} - P^*_{aj,t+1}, \qquad t = 0, 1, 2, \ldots, f_j,$$

where

$$P^*_{ajt} = \frac{1}{1 + \exp[-D a_j (\theta_a - b_{jt})]},$$

in which $D = 1.7$, $a_j$ is the discriminant degree of item $j$, and $b_{jt}$ is the difficulty when the score of item $j$ is $t$. The difficulty levels of item $j$ are monotonically increasing; that is, $b_{j1} < b_{j2} < \ldots < b_{j,f_j}$. The function $P^*_{ajt}$, which corresponds to an ICC, is called the item-type characteristic function in the Graded Response model.

Five parameters can be estimated in our study, namely $a, b_1, b_2, b_3, b_4$, where $b_1$ is the parameter of difficulty level between answer 1 and answer 2, and so on, and $b_1 < b_2 < b_3 < b_4$. Here $a$ must be > 0.60, and $b$ ranges from −3 to 3.
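To make the Graded Response model concrete, the sketch below evaluates the category probabilities for one item from the logistic boundary function given above, with D = 1.7. It only evaluates the model for given parameters; it does not estimate them (the study used Multilog for estimation). The function names and the example parameter values are hypothetical.

```python
from math import exp

D = 1.7  # scaling constant stated in the text


def boundary_prob(theta, a, b):
    """Probability of scoring at or above the boundary with difficulty b."""
    return 1.0 / (1.0 + exp(-D * a * (theta - b)))


def category_probs(theta, a, thresholds):
    """Return P(score = t) for t = 0..f, given increasing thresholds b_1..b_f."""
    if any(lo >= hi for lo, hi in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities: 1 for the lowest boundary, 0 beyond the highest.
    cum = [1.0] + [boundary_prob(theta, a, b) for b in thresholds] + [0.0]
    return [cum[t] - cum[t + 1] for t in range(len(thresholds) + 1)]


# Five response categories need four thresholds (illustrative values only).
probs = category_probs(theta=0.0, a=1.0, thresholds=[-2.0, -1.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

Because the cumulative probabilities decrease from 1 to 0, the category probabilities are non-negative and sum to 1 for any ability value.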

Items supported by at least five methods were retained in the final LC-PROM.
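The retention rule amounts to a simple vote across the seven methods. A minimal sketch, with invented item labels and votes:

```python
def retained_items(votes, min_support=5):
    """Keep items that at least `min_support` of the methods support."""
    return [item for item, flags in votes.items() if sum(flags) >= min_support]


# One True/False flag per method (seven methods); toy data only.
toy_votes = {
    "ITEM_1": [True, True, True, True, True, False, False],    # 5 supports
    "ITEM_2": [True, True, False, False, True, False, True],   # 4 supports
    "ITEM_3": [True] * 7,                                      # 7 supports
}
print(retained_items(toy_votes))  # ['ITEM_1', 'ITEM_3']
```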

Step 3 validation of the scale

Second Sampling Survey

Six hundred twenty subjects were selected in the second survey, of which 120 were controls. Inclusion and exclusion criteria did not change, nor did the survey process.

Reliability analysis

Reliability reflects the stability and consistency of a scale. In our study, Cronbach’s α coefficients for the total scale and for each domain were calculated to evaluate the average consistency of the items. The higher the value is, the better the reliability, but if α is too high, it suggests that the items are not simply related but overlap considerably. In the extreme case where α = 1, we should consider whether some items are redundant and could be eliminated. Here we chose 0.80 as the critical value; i.e., the measured results can be considered stable when α exceeds 0.80.

Validity analysis

Validity, also called accuracy, is the other arm of validation of a scale, and reflects the extent to which a scale measures what it sets out to measure. Validity includes subtypes of content validity, criterion validity, construct validity, and discriminant validity. In this article, we chose to measure the latter two.

Construct validity

This index shows whether the scale constructs match those in the initial framework. A scale with good construct validity is able to target true potential traits for measurement. Factor analysis is a major method for construct validity analysis and includes Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). When an item collection is not based on theoretical guidance, EFA has the ability to explore the fields and dimensions belonging to a scale. However, before this study, we had reviewed the literature to formulate a scale framework, and EFA had been applied during the process of item selection, so at this stage CFA was suitable. Factor loading for every item and fit index for every domain were calculated.

Discriminant validity

This is an index of a scale’s ability to discriminate populations with different traits through comparing test results of selected subjects. The statistical method was a simple two-independent samples t-test. The total scores on the LC-PROM and on each domain were compared between cases and controls to judge whether the LC-PROM could distinguish these two groups. In addition, we stratified the time that patients had been sick as less than 1 year, 1 to 3 years, 3 to 5 years, and more than 5 years. ANOVA was then applied to infer the relationship between disease course and scale score. The scale we developed had a good discriminant validity when p ≤ 0.05.

Feasibility analysis

When a scale can be understood and completed by subjects easily, the scale is said to have strong feasibility. This property is assessed with reference to acceptance ratio, response rate, and completion time.

Statistical software

The data analysis was conducted with SPSS 16.0, Multilog 7.03 and LISREL 8.70.

The entire study flow diagram is presented in Fig. 1.

Fig. 1 Study flow diagram

Results

Generation of item pool

Literature review and patient interviews

Database searches revealed some liver disease-specific scales, such as the Hepatitis Quality Of Life Questionnaire (HQLQ) [10, 11], the Liver Disease Quality Of Life (LDQOL) [10, 11], the Chronic Liver Disease Questionnaire (CLDQ) [10–13], and several related questionnaires such as the WHOQOL-BREF [11], the SF-36 [10, 11], the SCL-90 [12, 13] and the Hospital Anxiety and Depression Scale (HADS) [12, 13].

The LC-PROM focused on 4 domains: Physical (PHD), Psychological (PSD), Social (SOD), and Therapeutic (TRD). This idea is based on the definition of PRO and all the specific scales for liver disease. Meanwhile, taking the Social Avoidance and Distress Scale (SAD) and the Beck Hopelessness Scale (BHS) into consideration, the LC-PROM was divided into a further 13 dimensions, and the initial item pool included 72 items (see Appendix 1). The instrument’s conceptual framework is shown in Table 1.

Table 1 Preconceived conceptual framework for the LC-PROM

Cognitive debriefing and expert discussion

The LC-PROM was regarded as clear and concise, easy to understand and easy for the patients in the cognitive debriefing to complete. Completion time was 10 min on average. Considering patients’ suggestions, we made some modifications to the instrument. Six items in PHD that described atypical symptoms and overlapped with each other were deleted. Symptoms in deleted items included, for example, oliguria, dry eyes, pale skin and mucosa, among others. We also replaced the words “hepatic region” with “right upper abdomen,” to make this text easier to understand. Similarly, two items were reduced in PSD, one item was reduced in TRD, and one item was added in SOD.

Experts agreed that the LC-PROM was reasonable in its construction framework and item attributions, and that it was comprehensive in its content. However, because this was a self-rating scale, it was determined that the items should be expressed in the first person, so the research group revised the whole scale accordingly. This second draft of the preliminary LC-PROM included 64 items, 13 dimensions and four fields (see Appendix 2).

Item reduction

Participant characteristics

We sampled 200 participants in this survey; 189 responded, for an acceptance rate of 94.50 %. There were 179 subjects, including 132 patients and 47 controls, whose data were available, for a final response rate of 94.71 %. Baseline data of participants are shown in Table 2. The average length of time since liver cirrhosis diagnosis was approximately 3.02 years.
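The two rates reported above can be reproduced from the counts. The definitions used here (acceptance rate = respondents / sampled; response rate = usable questionnaires / respondents) are inferred from the reported figures rather than stated in the text, and the variable names are illustrative.

```python
sampled, responded, usable = 200, 189, 179

acceptance_rate = responded / sampled   # share of sampled subjects who responded
response_rate = usable / responded      # share of respondents with usable data

print(f"{acceptance_rate:.2%}")  # 94.50%
print(f"{response_rate:.2%}")    # 94.71%
```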

Table 2 Baseline data for participants in pilot survey

Item selection based on CTT and IRT

When CAID was used, we calculated the initial Cronbach’s α coefficient with all 64 items retained; this method did not result in the deletion of any items, and the detailed results are not shown here.

In IRT a number of items were suggested for deletion: fourteen in PHD, four in PSD, and seven in TRD; and only one item was retained in SOD according to parameters a and b. Fig. 2 shows the ICC matrix.

Fig. 2 Matrix plot of item characteristic curve

Fifteen items were to be deleted based on statistical results, but considering the value of disease-specific symptom information and the contributions of certain items to each dimension, six items were maintained in the final version of the LC-PROM. The final version comprised 55 items within 13 dimensions belonging to 4 domains (see Appendix 2). The detailed screening process is presented in Table 3, and the final construction frame can be seen in Table 4.
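As a consistency check, the item counts reported at each stage can be traced with simple arithmetic. The reading that six of the fifteen flagged items were kept (so nine were removed) is an interpretation of the text; the per-domain counts are those given in the Discussion.

```python
initial_pool = 72

# Cognitive debriefing: 6 items deleted in PHD, 2 in PSD, 1 in TRD; 1 added in SOD.
second_draft = initial_pool - 6 - 2 - 1 + 1

# Item selection: 15 items flagged for deletion, 6 of them kept on clinical grounds.
final_items = second_draft - (15 - 6)

# Per-domain item counts of the final scale (PHD, PSD, SOD, TRD).
domain_items = {"PHD": 18, "PSD": 16, "SOD": 12, "TRD": 9}

print(second_draft, final_items, sum(domain_items.values()))  # 64 55 55
```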

Table 3 Item selection outcome based on CTT and IRT
Table 4 Construction frame of the final LC-PROM

Validation of LC-PROM

Demographic characteristics

Another 620 subjects (500 cases and 120 controls) were sampled for the validation. Of the 598 who responded, 576 produced valid data for analysis (464 cases and 112 controls). Participant characteristics are presented in Table 5.

Table 5 Demographic characteristics of 464 patients and 112 controls in LC-PROM validation

As Table 5 shows, males were more numerous than females; subjects’ average age was 50–55 years. There were no statistically significant differences in the distributions of gender, age, or height between the two groups. LC patients had a higher proportion of smoking and drinking, and lower weight. These characteristics are consistent with risk factors for LC. Among the patients, 269 had been sick for 1 to 5 years, 97 had had LC for less than 1 year, and 98 had had LC for more than 5 years; the average disease duration was 3.70 years.

Reliability analysis

Cronbach’s α coefficient is one of the indicators for evaluating reliability, with a generally acceptable value of greater than 0.70. Our LC-PROM met this standard, except in the TRD domain (see Table 6).

Table 6 Cronbach’s α coefficient of four domains and total scale

Validity analysis

a. Construct validity: Results of CFA are listed in Tables 7 and 8, and show factor loadings of items and goodness of fit of domains in the final LC-PROM.

Table 7 Maximum Likelihood Estimation of CFA for LC-PROM
Table 8 Goodness of fit statistics of LC-PROM

As the tables show, standard factor loadings of each item were above 0.50, except for SOD3; therefore, the goodness of fit for LC-PROM is satisfactory.

b. Discriminant validity: Discriminant validity analysis was conducted by comparing average scores across different domains as well as total scale scores between patients with various disease courses and the healthy controls.

In Table 9, the scores of patients are lower than those of controls, suggesting that LC severely affected patients’ quality of life. With SOD as the exception, scores were significantly different, as seen in Table 10, and longer clinical courses were associated with lower scores. Perhaps because LC is the final stage of liver disease progression, by the time patients have received a definite diagnosis, they may already have lost the ability to engage in social activity; therefore scores in this domain did not differ. Of course, measurement error cannot be excluded as an explanation, but it had little effect on discriminant validity. In summary, the LC-PROM was well able to differentiate between healthy controls and LC patients with varying clinical courses.

Table 9 Score comparisons between LC patients and healthy controls
Table 10 Scores obtained using the LC-PROM instrument in varying disease courses of LC

Feasibility analysis

The acceptance rate and response rate for the LC-PROM tool were 96.45 % and 92.32 %, respectively. Its average completion time was 10 min.

Discussion

LC is a chronic disease characterized by progressive liver injury, which imposes a heavy burden on medical and health services. Bajaj et al. reported that patients had significant impairment in all domains apart from anger and anxiety compared with caregivers and US norms. Decompensated patients had significantly worse sleep, pain, social and physical function scores than compensated ones [14]. Therefore, objective evaluation of clinical effects and patients’ health conditions is critically important.

We reviewed the literature, then collected the symptoms of greatest concern and with the greatest likelihood of improvement, along with psychological conditions and life states, from the patients’ perspective. From these, we formed the preliminary item pool for the LC-PROM instrument. Cognitive debriefing and discussions with experts were employed to ensure that the conceptual and structural framework was reasonable. Next we applied this scale to two samples (n1 = 200, n2 = 620) that represented different populations. We considered seven statistical methods and clinical relevance when selecting final items for this tool. In the current study, the final version of the LC-PROM comprised 55 items in 4 domains (18 items in PHD, 16 items in PSD, 12 items in SOD, 9 items in TRD) that represent 13 dimensions. Validation of reliability, validity, and feasibility indicated that the LC-PROM was accurate, reliable and easy to use, showing great potential for clinical application.

Our literature search indicated that the LC-PROM instrument is the first scale specific to LC. The existing PROs for liver diseases are adapted from quality of life measurement scales, which are classified as either universal QOL scales or specific HRQOL scales. For example, the WHO Quality of Life-BREF (WHOQOL-BREF), Short Form 36 (SF-36), Nottingham Health Profile (NHP), and Sickness Impact Profile (SIP) are universal scales, and the Chronic Liver Disease Questionnaire (CLDQ), Hepatitis Quality Of Life Questionnaire (HQLQ), and Liver Disease Quality Of Life (LDQOL) are specific HRQOL scales. All of these scales have defects of varying degree and in any case do not apply to LC patients. Some studies have indicated that the WHOQOL is widely used by researchers to study QOL of liver transplant recipients, while the NHP focuses on more severe levels of disability and has thus been known to be less sensitive to changes in conditions where effects are relatively mild [15, 16]. The SIP, in contrast, has a broad coverage of topics, but is therefore very long [17]. The SF-36 is applicable to a broader range of conditions, but has the common disadvantage of generic instruments; namely, they are not designed to identify disease-specific domains that may be important to establish clinical changes [18]. The HQLQ consists of the widely validated generic SF-36 with five added disease-specific subscales, but it excludes patients with a chronic liver disease other than HCV. The CLDQ is a short and therefore feasible questionnaire, but is unable to discriminate between more advanced stages of liver disease. The LDQOL addresses a variety of domains, but is therefore very long (101 items) [10]. The LDSI 2.0 developed by Van der Plas et al. is short and straightforward (only 18 items) and focuses on symptom severity and symptom hindrance, evaluating how patients experience these specific symptoms during daily activities [19]. In this study, however, we intended to measure other aspects in addition to symptoms. The translated CLDQ is also used to measure quality of life of Hepatitis B patients [20], and although its reliability and validity have been evaluated, the cultural gap is difficult to bridge. In addition, the instrument has some inherent defects that make it inapplicable to LC patients.

The above-mentioned instruments are designed for chronic liver disease, but not for LC specifically, and the two disease types differ. Another point worth noting is that research from Japan found no statistically significant differences among different severity levels of liver disease [13]. However, the LC-PROM tool differs from the scale these researchers used, which was translated directly from English. The LC-PROM is designed specifically for LC, and its item pool took shape through in-depth interviews and cognitive testing with patients. Therefore, our instrument may be accepted by respondents more easily and may perform better in measuring patients’ health status.

At present, liver disease questionnaires mainly focus on “physical”, “psychological” and “limitation” dimensions. The CLDQ also includes just six subdomains: abdominal symptoms, fatigue, systemic symptoms, activity, emotional function, and worry [21]. The LC-PROM contains a vital addition: a therapeutic domain that captures information about treatment satisfaction, compliance and drug side effects. Satisfaction with treatment is the major outcome index in new drug clinical trials; this additional field provides information about the effects that the trial drug has on targeted patients’ health (such as appetite symptoms, cognitive ability, independence, anxiety and depression, and confidence) and points out the compliance characteristics of the new drug among patients. These are valuable data for clinical therapeutic drug development. Additionally, optimal therapy can be selected according to these measurement data. In the social domain, the family relationship was emphasized, reminding readers of the important role of family support during patient recovery.

During the item selection process, in addition to using subjective methods like cognitive tests and expert discussions, we combined seven kinds of statistical methods to refine the item pool to ensure that items retained were maximally accurate, objective and reliable. Methods employed to develop related scales are still limited to CTT. The innovation of our study is to put IRT into use in addition to CTT. IRT is able to make up for some disadvantages of CTT, allowing acquisition of items that reflect potential traits of the population more accurately.

The instrument demonstrated excellent discriminant ability among LC patients with varying courses of disease. At a basic level, physicians can judge different stages of disease according to the results of the LC-PROM. This will save time relative to the method of full reliance on laboratory indicators.

In a word, the LC-PROM instrument we developed fills a gap in patient-reported clinical outcomes of LC, and lacks the deficiencies seen with existing liver disease PRO tools. It also has the capacity to discriminate disease course, and to evaluate clinical effects and HRQOL accurately; therefore, it will provide valuable data to new drug development for LC.

However, this study still has several limitations that will be addressed in further research.

To begin, Cronbach’s α coefficient for the therapeutic domain in the LC-PROM was less than 0.70, which suggests that the internal consistency of this domain needs further improvement. As seen in the CFA results, the factor loading for item SOD3 (“I have told my worries to my family”) is only 0.35, but in consideration of its special meaning (support from family during illness) we kept it in the final scale. In fact, in the item selection phase, SOD3 was already suggested for deletion along with SOD1 (“Friends and relatives take care of my disease”), but we maintained this item for the same reason. In addition, there are no items about sexual function in the scale. Participants said that questions of this type were somewhat sensitive and difficult to answer. We were concerned that including them would lower the response rate and harm overall reliability and validity; therefore we did not include this information in the scale. To expand the scope of use, a scale containing such an item will be generated in a revised version.

A second limitation relates to criterion validity. The LC-PROM instrument was designed for LC patients. Although participants at different stages of the clinical course were sampled, LC is the final stage of liver disease progression, and patients are often too weak to complete a lengthy scale. Assessing criterion validity would require administering additional instruments alongside the LC-PROM, and introducing too many tests leads to test fatigue and noncompliance, which increases both survey cost and patient exhaustion and adversely affects survey results. Therefore, we did not conduct criterion validity analysis in this study.

Finally, because of limited resources, our sample was recruited from a restricted set of regions and therefore may not be representative of all patients with LC.

Conclusions

Our study provides strong evidence for excellent reliability and validity of a PRO instrument for LC. We do not suggest that the LC-PROM can replace other related questionnaires on liver disease, but it can obtain valuable information on patients’ health conditions, evaluate clinical effects, and inform the selection of therapies, new drug development, health service deployment and clinical research.

Abbreviations

BHS: Beck Hopelessness Scale

CAID: Cronbach’s α if item deleted

CFA: confirmatory factor analysis

CFI: comparative fit index

CITC: corrected item-total correlation

CLDQ: Chronic Liver Disease Questionnaire

CTT: classical test theory

EFA: exploratory factor analysis

ERIQA: European Regulatory Issues on Quality of Life Assessment group

FDA: Food and Drug Administration

GT: generalizability theory

HADS: Hospital Anxiety and Depression Scale

HQLQ: Hepatitis Quality of Life Questionnaire

HRQoL: health-related quality of life

ICC: item characteristic curve

IFI: incremental fit index

IRT: item response theory

ISPOR: International Society for Pharmacoeconomics and Outcomes Research

ISQOL: International Society for Quality of Life Studies

LC: liver cirrhosis

LC-PROM: Liver Cirrhosis Patient-reported Outcome Measure

LDQOL: Liver Disease Quality of Life

NFI: normed fit index

NHP: Nottingham Health Profile

NNFI: non-normed fit index

PHD: physical domain

PRO: patient-reported outcome

PSD: psychological domain

QOL: quality of life

RMR: root mean square residual

RMSEA: root mean square error of approximation

SAD: Social Avoidance and Distress Scale

SCL-90: Symptom Check List-90

SF-36: Short Form 36

SIP: Sickness Impact Profile

SOD: social domain

TRD: therapeutic domain

WHOQOL-BREF: WHO Quality of Life-BREF

References

  1. Sprangers MA, Aaronson NK. The role of health care providers and significant others in evaluating the quality of life of patients with chronic disease: a review. J Clin Epidemiol. 1992;45(7):743–60.

    Article  CAS  PubMed  Google Scholar 

  2. U.S. Department of Health and Human Service,FDA,Center for Drug Evaluation and Research,Center for Biologics Evaluation and Research,Center for Devices and Radiological Health. Guidance for Industry-Patient Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims [EB/OL]. (2009-03-07) [2015-05-10]. http://hqlo.biomedcentral.com/articles/10.1186/1477-7525-4-79.

  3. Gorecki C, Brown JM, Cano S, et al. Development and validation of a new patient-reported outcome measure for patients with pressure ulcers: the PU-QOL instrument. Health Qual Life Outcomes. 2013;11:95. http://www.hqlo.com/content/11/1/95.

  4. Greenhalgh J. The applications of PROs in clinical practice: what are they, do they work, and why? Qual Life Res. 2009;18:115–23.

    Article  PubMed  Google Scholar 

  5. Velikova G, Booth L, Smith A, Brown P, Lynch P, Brown J. Measuring quality of life in routine oncology practice improves communication and patient well-being: A randomised controlled trial. J Clin Oncol. 2004;22:714–24.

    Article  PubMed  Google Scholar 

  6. Yanbo Z. Latent Variable Analysis[M]. Beijing: Higher Education Press; 2009. p. 1–5.

    Google Scholar 

  7. Hambleton RK, Hariharan S. Item Response Theory: Principles and Applications. Boston: Kluwer Nijhoff Publishing; 1985. p. 1–9.

    Book  Google Scholar 

  8. Dodd BG, De ARJ, Koch WR. Computerized adaptive testing with Polytomous items. Appl Psychol Meas. 1995;19(1):5–22.

    Article  Google Scholar 

  9. Seock-Ho K, Cohen AS. A comparison of linking and concurrent calibration under the graded response model. Appl Psychol Meas. 2002;26(1):25–41.

    Article  Google Scholar 

  10. Gutteling JJ, De Man RA, Busschbach JJ, Darlington AS. Overview of research on health-related quality of life in patients with chronic liver disease. Neth J Med. 2007;65:227–34.

    CAS  PubMed  Google Scholar 

  11. Bao Z, Qiu D, Ma X. Evaluation of QOL Scale of Chronic Liver Disease[J].Chinese. Hepatology. 2008;13(4):332–3.

    Google Scholar 

  12. Haimiao Z, Jingping Z, Yongai Z. Research on the Influencing Factors of the Quality of Life in Hospitalized Patients with Hepatic Cirrhosis [J]. J Med Res. 2013;42(10):110–2.

    Google Scholar 

  13. Atsushi Tanaka, Kentaro Kikuchi, Ryo Miura etal. Validation of the Japanese version of the Chronic Liver Disease Questionnaire (CLDQ) for the assessment of health-related quality of life in patients with chronic viral hepatitis[J]. doi:10.1111/hepr.12524.

  14. Bajaj JS, Thacker LR, Wade JB, et al. PROMIS computerised adaptive tests are dynamic instruments to measure health-related quality of life in patients with cirrhosis. Aliment Pharmacol Ther. 2011;34(9):1123–32.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Hunt SM, McEwen J, McKenna SP. Measuring health status: a new tool for clinicians and epidemiologists. J R Coll Gen Pract. 1985;35(273):185–8.

    CAS  PubMed  PubMed Central  Google Scholar 

  16. Hunt SM, McKenna SP, McEwen J, Backett EM, Williams J, Papp E. A quantitative approach to perceived health status: a validation study. J Epidemiol Community Health. 1980;34:281–6.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Bergner M, Bobbitt RA, Carter WB, Gilson BS. The Sickness Impact Profile: development and final revision of a health status measure. Med Care. 1981;19(8):787–805.

    Article  CAS  PubMed  Google Scholar 

  18. Jenney ME, Campbell S. Measuring quality of life. Arch Dis Child. 1997;77(4):347–50.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Simone M, Simone M, Der Plas V, Hansen BE, De Boer JB, et al. The Liver Disease Symptom Index 2.0; Validation of a disease-specific questionnaire. Qual Life Res. 2004;13:1469–81.

    Article  Google Scholar 

  20. Chuanghong W, Qiwen D, Xiaoshu J, et al. Preliminary Use of the CLDQ in Chronic Hepatitis B Patients [J]. Chin J Clin Psychol. 2003;11:60–2.

    Google Scholar 

  21. Younossi ZM, Guyatt G, Kiwi M, et al. Development of a disease specific questionnaire to measure health related quality of life in patients with chronic liver disease. Gut. 1999;45:295–300.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

This study was supported by the grant from the National Natural Science Foundation of China (grant no: 81273180).

Author information

Corresponding author

Correspondence to Yanbo Zhang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors participated in the design of the study; YZ participated in data analysis and drafted the article; YYY and JL collected and analyzed data; YBZ put forward the original concept for this study, supervised the data analysis and revised the paper. All authors read and approved the final manuscript for this study.

Appendices

Appendix 1

Table 11 Formation of LC-PROM item pool of 72

Appendix 2

Table 12 Final version of LC-PROM

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Cite this article

Zhang, Y., Yang, Y., Lv, J. et al. LC-PROM: Validation of a patient reported outcomes measure for liver cirrhosis patients. Health Qual Life Outcomes 14, 75 (2016). https://doi.org/10.1186/s12955-016-0482-y
