Subjective assessments of comorbidity correlate with quality of life health outcomes: Initial validation of a comorbidity assessment instrument
© Bayliss et al. 2005
Received: 08 July 2005
Accepted: 01 September 2005
Published: 01 September 2005
Interventions to improve care for persons with chronic medical conditions often use quality of life (QOL) outcomes. These outcomes may be affected by coexisting (comorbid) chronic conditions as well as the index condition of interest. A subjective measure of comorbidity that incorporates an assessment of disease severity may be particularly useful for assessing comorbidity for these investigations.
A survey including a list of 25 common chronic conditions was administered to a population of HMO members age 65 or older. Disease burden (comorbidity) was defined as the number of self-identified comorbid conditions weighted by the degree (from 1 to 5) to which each interfered with the respondent's daily activities. We calculated sensitivities and specificities relative to chart review for each condition. We then compared how strongly self-reported disease burden, two other well-known comorbidity measures (the Charlson Comorbidity Index and the RxRisk score), and chart review each correlated with our primary and secondary QOL outcomes of interest: general health status, physical functioning, depression screen and self-efficacy.
156 respondents reported an average of 5.9 chronic conditions. Median sensitivity and specificity relative to chart review were 75% and 92% respectively. QOL outcomes correlated most strongly with disease burden, followed by number of conditions by chart review, the Charlson Comorbidity Index and the RxRisk score.
Self-report appears to provide a reasonable estimate of comorbidity. For certain QOL assessments, self-reported disease burden may provide a more accurate estimate of comorbidity than existing measures that use different methodologies, and that were originally validated against other outcomes. Investigators adjusting for comorbidity in studies using QOL outcomes may wish to consider using subjective comorbidity measures that incorporate disease severity.
The goal of caring for persons with chronic medical conditions is frequently to maximize quality of life (QOL) rather than to 'cure' illness. Therefore interventions to improve processes of care for this population often assess QOL outcomes such as physical functioning, overall health status, and emotional well being. These outcomes are, by definition, subjective. The values assigned to these outcomes are most meaningful to the patients themselves. However, these subjective outcomes have been shown to correlate with mortality, health care utilization, job loss, and many other more 'quantifiable' outcomes [1–3].
The outcomes of a chronic condition may be affected by coexisting (comorbid) chronic conditions as well as the index condition of interest and analyses must adjust for this effect of comorbidity. Multiple instruments have been developed and validated to quantify comorbidity for purposes of statistical adjustment and clinical decision making. The majority of these use medical record review or administrative data as sources of information; observation during clinical encounters and self report have also been used for this purpose. These instruments have primarily been validated against 'objective' health outcomes such as mortality, length of stay, and cost of care [4–13]. We are aware of two such instruments that have been validated against QOL outcomes [5, 14]. In addition, many of these instruments were designed for use in hospitalized patients or populations characterized by specific illnesses.
Self-reported information about comorbidity and the burden it imposes can provide information about the concurrent impact of multiple disease states on QOL outcomes. Self-reported comorbidity information is also efficient in studies in which other information, such as QOL outcomes, is collected by survey. Instruments designed to assess comorbidity by self-report have reported significant correlations between comorbidity score and utilization, QOL, mortality and hospitalization [15–20].
It is important to incorporate assessment of disease severity into comorbidity measurement. Some self-report instruments incorporate various weighting systems for this purpose and two of these have been validated in hospitalized populations [15, 18]. We have developed a self-report instrument that incorporates disease severity by quantifying the respondent's subjective 'disease burden', which we define as the number of self-identified comorbid conditions weighted by the degree to which each condition limits daily activity. We hypothesized that a subjective measure of comorbidity such as this may be more strongly correlated with QOL outcomes than measures of comorbidity previously validated against other, more objective, health outcomes.
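The disease burden definition above can be illustrated with a minimal sketch. It assumes that "weighted by" means summing the 1 to 5 interference ratings over the conditions a respondent reports; the text does not spell out the arithmetic, so this is one plausible reading. The function name and the example respondent are hypothetical.

```python
def disease_burden(ratings):
    """Sum the interference ratings (1-5) over self-reported conditions.

    `ratings` maps condition name -> interference rating. Conditions the
    respondent does not report are simply absent from the mapping.
    ASSUMPTION: the score is the plain sum of ratings.
    """
    for condition, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(
                f"rating for {condition!r} must be between 1 and 5, got {rating}"
            )
    return sum(ratings.values())


# Hypothetical respondent reporting three conditions.
respondent = {"osteoarthritis": 4, "hypertension": 1, "hard of hearing": 2}
condition_count = len(respondent)    # simple disease count
burden = disease_burden(respondent)  # severity-weighted count
```

Under this reading, two respondents with the same number of conditions can have quite different burden scores, which is the distinction the instrument is meant to capture.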
Our goals in this investigation were to validate this newly-developed instrument against a presumed 'gold standard' of chart review, and to conduct an initial comparison of this instrument with other well known measures of comorbidity (chart review of number of conditions, the Charlson Comorbidity Index and the RxRisk score) by correlating these measures with selected QOL outcomes.
Study setting and sample selection
The study setting was a Health Maintenance Organization (HMO) in the United States that provides primary, specialty and hospital care for persons of all ages. Due to the use of an electronic medical record, both primary and specialty providers can enter diagnoses and assessments into a single patient record. Participants were selected from a stratified random sample of HMO members age 65 or older with 0 (8%), 1 (10%), 2 (12%), or 3 or more (69%) chronic medical conditions. We sampled this age group based on the high prevalence of comorbid conditions in older adults. The stratification was performed with a modified version of the RxRisk comorbidity assessment instrument that uses administrative pharmacy data to determine an estimated disease count. As one of the goals of our investigation was to assess issues of importance to persons with multiple comorbidities, we oversampled members with a greater number of chronic conditions. Due to the pilot nature of the study, we used consecutive random sampling in increments of single mailings until we had sufficient sample size to evaluate the instrument. We calculated that we would need a sample size of 139 for an expected proportion (sensitivity and specificity) of 0.90 to have a 95% confidence interval with a total width of 0.10.
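The sample size figure can be checked with a short calculation. The sketch below assumes the usual normal-approximation formula for the confidence interval of a proportion, n = z² p(1 − p) / d², where d is the half-width of the interval; the text does not state which formula was used, and the function name is hypothetical.

```python
import math


def sample_size_for_proportion(p, total_width, z=1.96):
    """Sample size so that a normal-approximation CI for a proportion `p`
    has the requested total width (z = 1.96 for a 95% interval).

    ASSUMPTION: Wald (normal-approximation) interval.
    """
    half_width = total_width / 2
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


# Expected sensitivity/specificity of 0.90, total CI width of 0.10.
n_required = sample_size_for_proportion(0.90, 0.10)
```

With these inputs the formula gives about 138.3, which rounds up to the 139 reported above.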
Sensitivity and Specificity of Self-Report Relative to Chart Review (N = 151)
Columns recovered: Medical condition; Mean Self-Report Disease Burden; Self-Report n (%); Chart Review n (%)
Conditions recovered: Angina/coronary artery disease; Cancer (within the past 5 yrs); Colon problem (e.g., diverticulitis, irritable bowel); Congestive heart failure; Hard of hearing; Poor circulation (e.g., peripheral vascular disease); Rheumatic disease, other; Stomach problem (e.g., gastritis, peptic disease)
We pre-tested the instrument for clarity and ease of completion with volunteers who were age 65 or older and had more than one chronic medical condition. Pre-testing was conducted in one-on-one interviews in which the volunteer completed the survey and then provided detailed feedback to the interviewer on the content and comprehension of the measure. Any recommended changes were incorporated into the subsequent version of the instrument. It was then mailed to respondents as a component of another pilot survey that assessed potential barriers to the medical self-care process. The complete questionnaire included validated questions that assessed physical functioning and general health status, a depression screen, and an adapted and concurrently validated assessment of general self-efficacy. We used the physical functioning measure and the general health status single question from the Short-Form 36®, the depression screen from the Behavioral Risk Factor Surveillance System, and a concurrently validated adaptation of the general self-efficacy scale (our coefficient alpha = 0.76) [1, 27–29]. We used these assessments as our primary and secondary QOL outcomes of interest for the current investigation. The investigation was approved by the Institutional Review Board of the participating HMO and informed consent was obtained from all participants.
Comparison with chart review
We compared each participant's responses with diagnoses listed in their electronic medical record. We reviewed assessments from all outpatient encounters over the two years preceding the survey and accepted at least two chart-documented assessments of a chronic condition as an active diagnosis. Requiring two rather than one chart diagnosis may reduce the sensitivity of self-report. However, we based our decision on the assumption that a recurrence of a chronic diagnosis would reasonably have been communicated to the patient, and therefore he or she might be expected to list that diagnosis in their response to our survey. Either two recorded outpatient diagnoses or one inpatient diagnosis has been suggested as a reasonable standard for a confirmed diagnosis. In our chart review, we also counted previously documented chronic conditions that were likely to persist (e.g. hearing loss). We did not count diagnoses of chronic problems that had been surgically corrected and required no further management (e.g. cataract surgery).
Comparison with other measures of comorbidity
In addition to calculating comorbidity with our instrument, for each respondent we quantified level of comorbidity using two other validated comorbidity measurement tools. These two methods were the RxRisk score and the Charlson comorbidity index [4, 7]. We chose these based on both their common use and the contrast they provided in methodologies since they use different methods of data collection and have been validated against different outcomes. The RxRisk score is a measure of comorbidity that incorporates age, gender, health insurance benefit status and an RxRisk category based on diagnoses derived from administrative pharmacy data. It was originally developed and validated to identify chronic conditions and to predict cost of health care, and subsequently revised to assess disease burden in certain populations [4, 32]. We used administrative pharmacy data to apply the RxRisk tool to our study population. The Charlson comorbidity index is a widely used comorbidity measure that was originally developed to predict one-year mortality following hospitalization. The score is based on chart review for specified diagnostic criteria. It has been subsequently adapted and revalidated to assess longer term mortality, disability, hospital readmission and length of stay and has been revised into formats that utilize either ICD-9 diagnosis codes or questionnaire [6–8, 25]. We calculated the Charlson comorbidity score using chart review.
We calculated sensitivity and specificity for each condition using the chart report as the 'gold standard.' We also calculated sensitivity and specificity for each participant to indicate the percent of positive and negative conditions on which the respondent and chart agree relative to the total positive or negative conditions in the chart. Thus specificity and sensitivity by condition reflect respondents' overall tendency to accurately report a given condition relative to chart report, and sensitivity and specificity by participant reflect respondents' overall tendency to accurately report on all of their conditions in comparison to the gold standard of chart review. (Note that sensitivity and specificity analyses used self-reported presence or absence of conditions for comparison rather than the weighted disease burden score.) In order to further compare self-reported disease burden with our 'gold standard' of chart review, for each condition we entered self-reported disease burden followed by chart report of that condition into limited logistic regression models to assess the relative contributions of each of these independent variables to the predictive accuracy of the model for each of our outcome measures.
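The two views of agreement described above (by condition and by participant) use the same calculation applied to different slices of the data. The following sketch assumes each observation is a pair of booleans (self-report, chart), with the chart treated as the reference; the data layout and all names are hypothetical.

```python
def sensitivity_specificity(pairs):
    """Return (sensitivity, specificity) of self-report against chart review.

    `pairs` is an iterable of (self_report, chart) booleans. A value of None
    means the quantity is undefined (no chart-positive or no chart-negative
    items in the slice).
    """
    tp = fn = tn = fp = 0
    for self_report, chart in pairs:
        if chart:
            if self_report:
                tp += 1
            else:
                fn += 1
        else:
            if self_report:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical data: respondent -> condition -> (self_report, chart).
data = {
    "r1": {"asthma": (True, True), "cancer": (False, True), "diabetes": (False, False)},
    "r2": {"asthma": (True, False), "cancer": (True, True), "diabetes": (False, False)},
}

# By condition: one condition across all respondents.
asthma = sensitivity_specificity(resp["asthma"] for resp in data.values())

# By participant: all conditions for one respondent.
r1 = sensitivity_specificity(data["r1"].values())
```

Slicing by condition shows how reliably a given diagnosis is reported, while slicing by participant shows how reliably a given person reports across the whole list.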
We calculated Spearman correlations between disease burden from the new instrument, disease count by chart review, the Charlson index and the RxRisk score, with our QOL outcomes of interest: measures of overall health status, physical functioning, positive depression screen, and level of self-efficacy.
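For readers unfamiliar with the statistic, a Spearman correlation is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a dependency-free illustration of that definition, not the software used in the study; the function names and the example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for six respondents.
burden = [2, 7, 4, 12, 9, 4]
physical_functioning = [90, 60, 70, 30, 65, 80]
rho = spearman(burden, physical_functioning)
```

Because only ranks enter the calculation, the statistic suits ordinal survey scores such as these, and a negative value indicates that higher burden goes with lower functioning.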
Characteristics of study population (N = 156)
Row labels recovered (values not available in this text): Age (mean, range); Did not graduate high school; High school graduate; Household income (mean category), including Less than $15,000 and More than $90,000; Level of Comorbidity (mean, range of each): Number of Self-Reported Conditions, Self-Reported Disease Burden*. Most items also include a row for "Missing, chose not to answer".
One hundred fifty-one respondents reported at least one of the conditions and 6 reported none. In analyses by condition, median sensitivity of patient report of a condition relative to a 'gold standard' of chart review was 75% (range 35% to 100%) and median specificity was 92% (range 61% to 100%). In analyses by respondent, sensitivities (agreement on number of conditions positive relative to chart review) ranged from 14% (n = 1) to 100% (n = 53); the median was 83%. Sensitivities were not calculated for the ten respondents who did not agree with the chart on any conditions, including those who agreed with their medical record that they had none of the conditions (n = 2). Specificities by respondent ranged from 59% (n = 1) to 100% (n = 34); the median was 91%. Sensitivity and specificity of self-report of each condition relative to chart review are reported in Table 2. (Not included on the table are results for 2 of the original 25 conditions: liver disease and alcoholism. Two respondents and one separate chart reported alcohol abuse, and no respondents or charts reported liver disease.)
In order to assess the relative contributions of self-reported diseases and disease count by chart review to the outcomes of general health status and physical functioning, we entered these two variables into limited logistic regression models. In these models containing only these two variables, the predictive accuracy of the model (as measured by the c-statistic) was not significantly different using each of the two variables, implying comparable contributions of either measure. C-statistics for overall health status ranged from 0.521 ("other rheumatic disease") to 0.669 ("osteoarthritis"); and for physical functioning ranged from 0.515 ("other rheumatic disease") to 0.679 ("overweight").
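As background for interpreting these values, the c-statistic is the probability that a randomly chosen respondent with the outcome receives a higher predicted score than a randomly chosen respondent without it, with ties counted as one half. A value of 0.5 corresponds to chance and 1.0 to perfect discrimination. The sketch below computes it by direct pairwise comparison; the function name and the scores are hypothetical and stand in for fitted model predictions, which are not reproduced here.

```python
def c_statistic(scores, outcomes):
    """Concordance (c-statistic) of predicted scores for a binary outcome.

    `scores` are predicted probabilities or any monotone risk score;
    `outcomes` are truthy for respondents with the outcome.
    """
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative outcome")
    concordant = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(positives) * len(negatives))


# Hypothetical predicted scores and observed outcomes for four respondents.
c = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

On this scale, the values between roughly 0.52 and 0.68 reported above indicate modest discrimination from either variable alone.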
Correlations Between Measures of Comorbidity and QOL Outcomes (N = 156)

| QOL outcome | Self-reported disease burden | Chart review number of conditions | Self-reported number of conditions | Charlson comorbidity score | RxRisk score |
| --- | --- | --- | --- | --- | --- |
| Overall health status* (n = 150) | 0.60 (p < 0.001) | 0.56 (p < 0.001) | 0.477 (p < 0.001) | 0.48 (p < 0.001) | 0.17 (p = 0.037) |
| Physical functioning* (n = 137) | -0.63 (p < 0.001) | -0.52 (p < 0.001) | -0.482 (p < 0.001) | -0.41 (p < 0.001) | -0.18 (p = 0.035) |
| Depression screen* (n = 153) | -0.29 (p < 0.001) | -0.25 (p = 0.002) | -0.240 (p = 0.003) | -0.12 (p = 0.140) | -0.05 (p = 0.559) |
| Self-efficacy* (n = 145) | -0.32 (p < 0.001) | -0.22 (p = 0.008) | -0.305 (p < 0.001) | -0.14 (p = 0.096) | 0.10 (p = 0.234) |
It is important to incorporate assessment of comorbidity into studies involving QOL outcomes for persons with chronic medical conditions, as coexisting conditions may substantially affect outcomes of interest such as physical functioning, overall health status, depression and self-efficacy. In our study population, patients with multiple chronic medical conditions accurately reported a majority of common comorbid conditions relative to chart review. In addition, they were aware of most of their own diagnoses. Furthermore, self-reported disease burden correlated well with QOL outcomes, and correlated more strongly than did the two other measures of comorbidity that we used for comparison. This is consistent with our hypothesis that, for investigations using QOL outcomes, it is most appropriate to adjust for comorbidity using a subjective measure of comorbidity.
Previous investigations that have compared self-report with administrative data reported 59–79%, 72–73%, and 78–83% agreement on diagnoses of hypercholesterolemia, diabetes, and hypertension respectively; and 56% and 69% agreement on stroke and myocardial infarction [30, 34]. In our investigation we expanded the number of conditions for comparison to 23 and additionally assessed respondents' tendencies to accurately report all of their own conditions. Certain diagnoses were reported with high levels of sensitivity and specificity, while others were not.
A sensitivity greater than specificity may be due to either 'over-reporting' by participants or 'under-reporting' in the chart. Examples from our list included asthma, back pain, overweight and hard-of-hearing. We suspect that, for the first case, some participants reported COPD as asthma. For the remaining cases, we suspect that the conditions were under-reported in the chart - either because they had not been brought to medical attention or because they had not been assessed as isolated problems in the context of medical visits during the period covered by the chart review.
Sensitivity was substantially less than specificity for angina, nerve conditions, cancer and kidney disease. Although there may be a tendency to under-report chronic conditions, and respondents are more likely to report conditions with more severe symptoms [17, 35], we re-reviewed charts of persons with these diagnoses to see if we could determine the cause of the discrepancies. From these repeat chart reviews, we concluded that these discrepancies were due to wording based more on symptoms than diagnosis (angina), under-reporting of conditions with stable or few symptoms (renal and neurological), and possible perceptions of cure or remission after acute treatment (cancer). In addition, we analyzed the demographic and health characteristics (from Table 1) of respondents for each of these four conditions to see if any demographic or disease characteristics were likely to predict a low agreement with chart review and found no patterns.
In our assessments of sensitivity and specificity, we assumed that the presence of a diagnosis in the chart was a 'gold standard' - an assumption that may not be entirely accurate. We suspect that diagnoses for which there are obvious medical treatments - especially medications - are more likely to be recorded in the chart. Chart diagnoses may be less accurate for conditions for which a person is less likely to seek (or for which a provider is less likely to offer) specifically biomedical solutions.
We found a high correlation between our measure of disease burden and our QOL outcomes of interest, as compared to lower correlations between two other comorbidity indices and these same outcomes. However, the correlations between the other comorbidity indices and health status and physical functioning were also significant and have been noted previously. The correlations between the Charlson and RxRisk scores and our secondary outcomes of interest (depression screen and self-efficacy) were not significant. Based on the pattern of these associations, we suggest that assessment of comorbidity is a function of the outcome of interest, the population studied, and the different (subjective versus objective) aspects of comorbidity measured by each instrument. The effect of comorbidities on QOL outcomes may be most accurately assessed when subjective measures are used to adjust for comorbidity. In contrast, for situations in which mortality, for example, is the outcome of interest, comorbidity should be assessed using instruments that have been developed for that purpose. These suggestions are consistent with the notion that 'complete' measurement of all health states requires both self-reported and objectively reported measures.
It is certainly possible that one comorbidity measure may work for many situations. Other self-report instruments have been shown to predict mortality and hospitalization in addition to QOL [15, 16, 18]. We are also aware of at least two investigations in which comorbidity measured by chart review correlated with QOL outcomes [5, 14]. The two instruments with which we compared our own instrument use different methodologies and were originally developed to assess comorbidity in studies investigating the objective outcomes of mortality and cost of care respectively [4, 7]. The Charlson index has been subsequently validated against length of stay, post operative complications, discharge to nursing home, disability, hospital readmission and hospital charges [6, 8, 38–40]. The RxRisk score has subsequently been adapted and validated against administrative data on diagnoses and disease burden in certain populations [4, 32]. Our investigation adds to the growing body of knowledge on measuring comorbidity by highlighting the different results that may be obtained when using different methodologies to adjust for comorbidity in studies assessing QOL outcomes.
We did not incorporate additional measures of comorbidity, such as those that use administrative data into our analysis [8, 12, 13]. Previous comparative studies suggest that chart-review-based measures may be slightly more accurate than administrative data-based comorbidity measures in predicting objective outcomes such as mortality and length of hospital stay [6, 38, 41]. Further investigation is necessary to assess association of comorbidity measured by administrative data with QOL outcomes.
As with any initial validation effort, the generalizability of our conclusions is limited by the characteristics of the population studied - a relatively small HMO population aged 65 years or older. It is possible that this population is relatively 'well-educated' regarding the number and type of their medical conditions. If so, some of the sensitivities we report may be at the upper end of the spectrum that may be anticipated from self-report. In addition, we terminated the sampling process when we attained a sample size sufficient to test our primary hypothesis, without maximizing response rate. Thus, the findings in this sample may not represent the associations of a broader population. Although respondents did not differ significantly from non-respondents on RxRisk comorbidity score, more motivated or knowledgeable participants may have been more likely to respond promptly to our survey. Correlations and sensitivities could be lower when examined in a less motivated population or those with a lower knowledge base. Specifically, self-report may be less reliable in the geriatric sub-population that may suffer from cognitive impairment. Additional validation studies will be required in order to assess the usefulness of this instrument in other populations and for different QOL and other outcomes. We anticipate that these changes will strengthen our results for sensitivity in comparison to chart review and that they will not change the overall correlations with our outcomes of interest.
Disease burden (as we defined it) may in itself constitute a substantial portion of any patient's assessment of health status and physical functioning. Our incorporation of perceived limitation into a disease count may be similar to other investigations that have coupled a simple disease count with a health status measure such as the SF-36® and found that doing so strengthened the relationship between comorbidity and utilization and mortality [16, 19]. However, models that attempt to explain the relationship between symptom burden, overall quality of life and physical functioning note that these outcomes are also affected by environmental characteristics, individual personality, expectations, values, and social and psychological supports [42, 43]. What we refer to as disease burden explains part, but not all, of our QOL outcomes as is illustrated by the values of our c-statistics. To the extent that investigations that use QOL outcomes concentrate on participants with one index condition and need to adjust for comorbidities, a subjective measure of disease burden using self-report may be an accurate way to account for the effect of other coexisting conditions with regard to that outcome.
Finally, depression is both an important potential comorbidity for anyone with chronic illness and an equally important component of the QOL outcome of emotional well being. We chose to treat it as the latter. As depression severity independently contributes to general QOL over and above other coexisting chronic illness, we suspect that including depression on our list of conditions would have increased the strength of correlations between self-reported disease burden and general health status [44, 45].
Assessing comorbidity is relevant to investigations of populations with multiple medical conditions and should be incorporated into the associated analyses. Self-report is likely to give a reasonable estimate of comorbidity. Moreover, for investigations using QOL outcomes, self-reported disease burden (or other subjective assessments of comorbidity) may provide a more accurate comorbidity adjustment than measures that have been validated against other outcomes. If this finding is confirmed by additional investigation, subjective measures of comorbidity that incorporate disease severity should be added to QOL assessments for populations with high rates of comorbidity.
This project was funded by an internal research grant from Kaiser Permanente, Colorado.
Portions of this material were previously presented in poster format at the annual HMO Research Network Conference, Santa Fe, NM. April 2004.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.