Assessing response bias from missing quality of life data: The Heckman method
Health and Quality of Life Outcomes volume 2, Article number: 49 (2004)
The objective of this study was to demonstrate the use of the Heckman two-step method to assess and correct for bias due to missing health related quality of life (HRQL) surveys in a clinical study of acute coronary syndrome (ACS) patients.
We analyzed data from 2,733 veterans with a confirmed diagnosis of acute coronary syndromes (ACS), including either acute myocardial infarction or unstable angina. HRQL outcomes were assessed by the Short-Form 36 (SF-36) health status survey which was mailed to all patients who were alive 7 months following ACS discharge. We created multivariable models of 7-month post-ACS physical and mental health status using data only from the 1,660 survey respondents. Then, using the Heckman method, we modeled survey non-response and incorporated this into our initial models to assess and correct for potential bias. We used logistic and ordinary least squares regression to estimate the multivariable selection models.
We found that our model of 7-month mental health status was biased due to survey non-response, while the model for physical health status was not. A history of alcohol or substance abuse was no longer significantly associated with mental health status after controlling for bias due to non-response. Furthermore, the magnitude of the parameter estimates for several of the other predictor variables in the MCS model changed after accounting for bias due to survey non-response.
Recognition and correction of bias due to survey non-response changed the factors that we concluded were associated with HRQL seven months following hospital admission for ACS as well as the magnitude of some associations. We conclude that the Heckman two-step method may be a valuable tool in the assessment and correction of selection bias in clinical studies of HRQL.
The potential impact of missing survey responses is often ignored in health-related quality of life (HRQL) studies [1, 2]. Missing data from study participants can cause bias in parameter estimates of models predicting HRQL outcomes [2–4]. Unfortunately, regression models are frequently interpreted with the assumption that available data are representative of the entire study population. Researchers may compare clinical characteristics of respondents with and without missing surveys, but rarely attempt to assess the impact of these differences on the regression model parameter estimates and ultimately on the results of the study. The assumption that there are minimal or no effects on parameter estimates is only reasonable if one can demonstrate that data missing from a study are truly missing at random, making them ignorable, which is rarely the case [2, 4].
Newer statistical techniques have been developed to assess and correct for bias resulting from missing HRQL surveys [2, 3]. One technique which has received little attention in the medical literature to date is the Heckman two-step method [5–8]. The Heckman method was developed by an economist, James Heckman, to address problems of self-selection among women participating in the labor force. This method makes it possible to assess whether selection bias is present, identify factors contributing to the selection bias, and to control for this bias in estimating the outcomes of interest. The Heckman method attempts to control for the effect of non-random selection by incorporating both the observed and unobserved factors that affect non-response.
The objective of this study was to demonstrate the use of the Heckman two-step method to assess and correct for bias due to missing HRQL surveys. To accomplish this goal, we evaluated HRQL outcomes in a cohort of patients with acute coronary syndromes (ACS).
We analyzed data from the VA Access To Cardiology study, which was a multi-center prospective cohort study of 2,733 veterans with a confirmed diagnosis of acute coronary syndromes (ACS), including either acute myocardial infarction or unstable angina. Baseline patient characteristics (demographic, cardiac history, non-cardiac history and hospitalization variables) were collected at the time of ACS hospitalization. HRQL outcomes were then assessed by the Short-Form 36 (SF-36) health status survey which was mailed to all patients who were alive 7 months following ACS discharge. A second mailed survey was sent to non-respondents. If no response was obtained from the mailed surveys, attempts were made to contact the patients by phone. Of the 2,733 patients in the study, 1,660 (61%) completed the survey, 306 (11%) died, and 767 (28%) were alive and did not complete the survey. Of those completing the survey, most responded to the first mailing with much smaller numbers responding to the second mailing or to phone calls.
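The cohort breakdown above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable names are ours, and the response rate among patients alive at 7 months is a derived figure rather than one stated by the authors.

```python
# Counts reported in the text for the 2,733-patient cohort.
total = 2733
completed = 1660        # returned the 7-month SF-36
died = 306              # died before the 7-month survey
alive_no_survey = 767   # alive but did not complete the survey

# The three groups should partition the cohort.
assert completed + died + alive_no_survey == total

# Percentages of the full cohort, rounded as in the text.
pct = {name: round(100 * n / total)
       for name, n in [("completed", completed),
                       ("died", died),
                       ("alive_no_survey", alive_no_survey)]}
print(pct)

# Patients eligible to respond (alive at 7 months) and their response rate.
eligible = total - died
response_rate_eligible = completed / eligible
print(eligible, round(100 * response_rate_eligible, 1))
```

Because the analysis described below is restricted to patients who survived to 7 months, the eligible group (rather than the full cohort) is the relevant denominator for non-response.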
The outcome variables were the Physical Component Summary (PCS) and Mental Component Summary (MCS) scores from the 7-month SF-36 health status survey. The PCS and MCS scores reflect a patient's overall physical and mental health status, respectively [10, 11]. The PCS and MCS scores are continuous variables with a range of 0–100, where higher scores indicate better health status. We constructed a dichotomous variable to indicate whether the patient responded to the SF-36 or not. It is important to note that models of HRQL outcomes may be biased either because subjects died before survey administration (survivor bias) or because of survey non-response among subjects who were alive at the time of survey administration. However, the best way to handle patients who die prior to administration of the HRQL survey remains controversial. Since the focus of this paper was to demonstrate the use of the Heckman model rather than methods of dealing with death in HRQL studies, we included only those patients who survived 7 months and were therefore eligible to complete the survey. Candidate predictor variables included a wide array of demographic, cardiac, and non-cardiac variables from the index hospitalization, and selected variables from the interim period between discharge and the 7-month SF-36 health status survey (Table 1). These variables were derived from the established literature on risk factors for adverse post-MI outcomes (mortality, functional status, and HRQL) [13–20]. Patient demographic and clinical data from the index hospitalization and 7-month follow-up period were abstracted from the electronic medical record and/or from national VA patient care databases.
Baseline characteristics of the patients who did and did not complete an SF-36 were compared using analysis of variance for continuous variables and chi-square for categorical variables (Table 1). Then, a series of regression models was developed, including 1) initial PCS and MCS models which did not account for potential bias due to missing surveys, 2) Heckman selection models (modeling response to the SF-36), and finally 3) final PCS and MCS models (accounting for potential bias due to missing surveys). We used robust regression for all equations (Stata version 8.0 SE), controlling for cluster sampling by VA medical center. In prior analyses, we established that the intra-class correlation, the measure of the effect of clustering by medical center in this case, was significant. As a result, it was necessary to control for bias due to autocorrelation, or similarity among patients within a medical center, compared to patients at a different medical center. Stata uses the Huber-White estimator to control for the bias due to clustering. This technique adjusts the standard errors of the parameter estimates, in this case the coefficients, for within-center correlation, correcting the inference statistics.
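For readers unfamiliar with the clustered Huber-White estimator, its general form for a linear model can be sketched as follows. The notation is ours rather than the article's: G is the number of medical centers, X_g holds the predictor rows for patients at center g, and û_g holds their regression residuals. Finite-sample correction factors, which software typically applies, are omitted.

```latex
\widehat{V}_{\text{cluster}}(\hat{\beta})
  = (X^{\top}X)^{-1}
    \left( \sum_{g=1}^{G} X_g^{\top}\,\hat{u}_g\,\hat{u}_g^{\top}\,X_g \right)
    (X^{\top}X)^{-1}
```

The coefficient estimates themselves are unchanged; only their estimated variance, and therefore the test statistics, depends on the clustering adjustment.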
Overview of the Heckman method
There are two steps in the Heckman method. The first step is the development of a selection equation (i.e. a model of factors associated with survey non-response). This step includes derivation of a variable from the selection equation called the Inverse Mills Ratio (IMR). The second step of the Heckman method is the insertion of the IMR variable into the initial regression models (i.e. those not accounting for potential bias due to missing surveys) from a given study in order to assess for, and attempt to control for, selection bias.
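In the usual textbook notation (ours, not the article's), the two steps correspond to the following pair of equations. Here z_i are the selection-equation predictors, x_i are the outcome-equation predictors, and the error terms (u_i, ε_i) are assumed to be bivariate normal with Var(u_i) = 1, Var(ε_i) = σ_ε² and correlation ρ.

```latex
\begin{aligned}
s_i^{*} &= z_i^{\top}\gamma + u_i, \qquad s_i = 1 \text{ if } s_i^{*} > 0 \text{ (survey returned)}, \\
y_i &= x_i^{\top}\beta + \varepsilon_i \quad \text{(observed only when } s_i = 1\text{)}, \\
E\left[\, y_i \mid x_i, z_i, s_i = 1 \,\right]
  &= x_i^{\top}\beta + \rho\,\sigma_{\varepsilon}\,\lambda\!\left(z_i^{\top}\gamma\right),
\qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)} .
\end{aligned}
```

The last line shows why the IMR, λ, is added as a regressor: among respondents the outcome error no longer has mean zero unless ρ = 0, and the coefficient on λ estimates ρσ_ε.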
Heckman method: Step one
The first step in the Heckman method is to create the selection model, which estimates whether or not the quality of life survey was completed. The Heckman selection equation is usually estimated using a probit estimator [5, 21]. The probit estimator requires a binary outcome variable, in this case whether the patient responded to the SF-36 or not (coded 1 for responder, 0 for non-responder). The candidate predictor variables for the selection model were those listed in Table 1. Although the Heckman selection equation will usually have multiple variables, some of which will be the same variables that enter the multivariable models of HRQL outcomes, it is important that the Heckman selection equation contain at least one variable that can legitimately be excluded from the initial models to safeguard against collinearity between the Heckman selection equation and the initial regression models. This means that this variable (or set of variables) is, in theory, a factor influencing whether someone responded to the questionnaire, but not a factor in predicting their component scores on the SF-36. This variable or set of variables is called an instrument in econometrics, and should be a strong predictor of response in the selection equation. We therefore stress that it is essential that the candidate variables considered for the Heckman selection equation be as comprehensive as possible, not omitting any variables that may contribute to whether a person responds to the survey.
Once the Heckman selection equation is estimated, the residuals (error term) from this equation are used to form a new variable called the Inverse Mills Ratio (IMR). The formula to create the IMR variable depends on the distributional assumption in the outcome equation. In most HRQL applications, the quality of life score is the outcome of interest and is usually estimated using multivariable linear regression. In this case, the distributional assumption of the error term is the standard normal distribution, so that the ratio of the standard normal probability density function (pdf) and cumulative distribution function (cdf) applied to the residuals for each individual in the data set is created. The ratio of pdf/cdf is the IMR.
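The pdf/cdf ratio is simple to compute. The following is a minimal sketch, assuming the common textbook formulation in which the ratio is evaluated at each respondent's fitted probit index (the linear predictor from the selection equation); the function name and the example index values are illustrative only.

```python
import math

def inverse_mills_ratio(index):
    """Return pdf(index) / cdf(index) for the standard normal distribution."""
    pdf = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))
    return pdf / cdf

# Illustrative index values: a low index means a low predicted
# probability of returning the survey.
for index in (-1.0, 0.0, 1.0, 2.0):
    prob = 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))
    print(f"index={index:+.1f}  P(respond)={prob:.3f}  "
          f"IMR={inverse_mills_ratio(index):.3f}")
```

The ratio falls as the predicted probability of response rises, so respondents who were unlikely to respond carry the largest IMR values.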
Each individual in the study sample receives an individual value of the IMR based on the residual observed for that individual in the selection equation. In this study, the value of the IMR for each individual reflected the predicted probability that they completed the 7-month SF-36 survey: the IMR is a decreasing function of that probability rather than the probability itself, so patients who were less likely to respond received larger IMR values. It is important to note that the IMR is a function not only of observed or measured variables that are included in the selection equation, but also of unobserved or unmeasured variables. These are captured through the error term or residual in the selection equation, and included through the non-linear function used to estimate the IMR. As a result, adding the IMR into the outcome equation introduces a term that attempts to capture both observed and unobserved variables that affect selection, or non-response.
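A small simulation can illustrate how the IMR stands in for unobserved factors. The sketch below assumes a probit-style selection rule (a patient responds when index + u > 0, with u standard normal and unobserved) and checks that the average of u among responders is close to the IMR evaluated at the index. The setup and names are illustrative and are not taken from the study data.

```python
import math
import random

def inverse_mills_ratio(index):
    """Standard normal pdf divided by cdf, evaluated at the index."""
    pdf = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))
    return pdf / cdf

def mean_error_among_responders(index, n=200_000, seed=0):
    """Simulate the unobserved selection error u and average it over responders."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        if index + u > 0:          # the patient responds
            total += u
            count += 1
    return total / count

for index in (-0.5, 0.0, 1.0):
    simulated = mean_error_among_responders(index)
    print(f"index={index:+.1f}  simulated mean of u={simulated:.3f}  "
          f"IMR={inverse_mills_ratio(index):.3f}")
```

If the outcome error is correlated with u, this non-zero mean carries over into the outcome equation for respondents, which is the bias the IMR term is meant to absorb.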
We estimated the Heckman model using the maximum likelihood estimation method in Stata version 8.0. In this approach, the outcome and selection models are estimated jointly, which can result in slightly different selection models for different outcomes, in this case the PCS and MCS scores from the SF-36. However, for clarity of presentation of the Heckman process, we present only one table of selection equation results (the probit estimation of the probability of returning the SF-36), assuming that the patient survived to the 7-month survey point.
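For reference, the log-likelihood that is maximized in the joint (maximum likelihood) version of the model is usually written as below. This is the standard textbook form, in our notation rather than the article's: s_i indicates survey response, z_i and x_i are the selection and outcome predictors, σ is the standard deviation of the outcome error, and ρ is its correlation with the selection error.

```latex
\ell(\beta,\gamma,\sigma,\rho)
  = \sum_{i:\,s_i = 0} \ln \Phi\!\left(-z_i^{\top}\gamma\right)
  + \sum_{i:\,s_i = 1} \left[
      \ln \Phi\!\left(
        \frac{z_i^{\top}\gamma + \rho\,(y_i - x_i^{\top}\beta)/\sigma}{\sqrt{1-\rho^{2}}}
      \right)
      - \frac{1}{2}\left(\frac{y_i - x_i^{\top}\beta}{\sigma}\right)^{2}
      - \ln\!\left(\sigma\sqrt{2\pi}\right)
    \right]
```

Because γ appears in both sums, the selection coefficients are re-estimated for each outcome, which is why the fitted selection models can differ slightly between PCS and MCS.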
Heckman method: Step two
The second step of the Heckman method is to include the IMR as a separate predictor variable in the initial regression models. In this study, the IMR variable derived from our Heckman selection model was inserted into the initial PCS and MCS models. Once this variable is inserted, two factors can be evaluated to help determine whether there is significant bias from missing responses in the initial models. First, one can examine the significance of the IMR variable itself. If significant, it suggests there was significant bias in the initial model. However, one potential limitation of the Heckman method is that if the Heckman selection model is not well-specified, and the variables in the selection model do not predict response/non-response well, the IMR may be weaker than expected and the Heckman method may have limited power to detect bias. Therefore, a second factor to examine following the addition of the IMR variable into the initial outcome models is whether or not there have been significant changes in any of the parameter estimates of the other predictor variables in the model. While somewhat arbitrary, changes in parameter estimates of >10% may indicate that these estimates were biased due to missing surveys. Where possible, one should apply clinical judgment about whether changes in parameter estimates are 'biologically important'.
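The >10% rule of thumb can be applied mechanically once both sets of estimates are available. The sketch below is one way to do so; the function name is ours, and the coefficient values are hypothetical placeholders, not results from this study.

```python
def flag_changed_estimates(initial, corrected, threshold=0.10):
    """Return predictors whose estimate changed by more than `threshold`
    (as a fraction of the initial estimate) after the IMR was added."""
    flagged = {}
    for name, before in initial.items():
        if before == 0:
            continue  # relative change is undefined for a zero estimate
        relative_change = abs(corrected[name] - before) / abs(before)
        if relative_change > threshold:
            flagged[name] = relative_change
    return flagged

# Hypothetical estimates, for illustration only.
initial_model = {"age": 0.10, "copd": -3.0}
model_with_imr = {"age": 0.105, "copd": -2.4}

flagged = flag_changed_estimates(initial_model, model_with_imr)
print(flagged)
```

A flag from this check is only a prompt for closer inspection; as the text notes, whether a change matters remains a matter of clinical judgment.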
With these factors taken together, the insertion of the IMR variable into the initial risk models allows the assessment of whether or not there was bias in the initial models, and suggests which initial predictors may have been most associated with this bias. Furthermore, by including the effect of unmeasured as well as measured variables from the selection equation, bias due to selection is controlled.
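To make the full procedure concrete, the following is a minimal, self-contained sketch of the two-step version on simulated data. Everything in it is an assumption for illustration: the data-generating values, the variable names (`x` enters both equations, `w` is an instrument that affects response only), and the use of plain Fisher scoring for the probit. It is not the Stata maximum likelihood routine used in this study.

```python
import math
import random

def phi(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Ordinary least squares coefficients via the normal equations."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def probit(Z, s, iters=12):
    """Probit coefficients by Fisher scoring, starting from zero."""
    k = len(Z[0])
    g = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for z, si in zip(Z, s):
            idx = sum(a * b for a, b in zip(z, g))
            p = min(max(Phi(idx), 1e-10), 1.0 - 1e-10)
            d = phi(idx)
            score = d * (si - p) / (p * (1.0 - p))
            weight = d * d / (p * (1.0 - p))
            for i in range(k):
                grad[i] += score * z[i]
                for j in range(k):
                    info[i][j] += weight * z[i] * z[j]
        g = [a + b for a, b in zip(g, solve(info, grad))]
    return g

# Simulated cohort (assumed values): true outcome model y = 1 + 2x + e,
# response if 0.5 + 0.5x + w + u > 0, with corr(u, e) = 0.8.
rng = random.Random(1)
rows = []
for _ in range(6000):
    x, w, u, v = (rng.gauss(0.0, 1.0) for _ in range(4))
    e = 0.8 * u + math.sqrt(1.0 - 0.8 ** 2) * v
    responded = 1 if 0.5 + 0.5 * x + w + u > 0 else 0
    rows.append((x, w, responded, 1.0 + 2.0 * x + e))

# Step one: probit selection equation on the whole cohort.
gamma = probit([[1.0, x, w] for x, w, _, _ in rows], [r[2] for r in rows])

def imr(x, w):
    """Inverse Mills Ratio at the fitted selection index."""
    idx = gamma[0] + gamma[1] * x + gamma[2] * w
    return phi(idx) / Phi(idx)

# Step two: outcome regression among respondents, without and with the IMR.
resp = [(x, w, y) for x, w, responded, y in rows if responded == 1]
y_resp = [y for _, _, y in resp]
naive = ols([[1.0, x] for x, _, _ in resp], y_resp)
heck = ols([[1.0, x, imr(x, w)] for x, w, _ in resp], y_resp)

print("probit coefficients:", [round(c, 3) for c in gamma])
print("initial slope on x: ", round(naive[1], 3))
print("slope on x with IMR:", round(heck[1], 3), " IMR coefficient:", round(heck[2], 3))
```

In this setup the initial (respondents-only) slope is pulled away from its true value of 2, and adding the IMR moves it back; the IMR coefficient estimates the correlation between the two error terms scaled by the outcome error's standard deviation. Standard errors from a plain second-step regression are not valid without a further correction, which is one reason joint maximum likelihood estimation is often preferred in practice.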
Compared to patients that completed the 7-month SF-36 survey, patients who did not respond to the survey were older, more likely to be current smokers and more likely to have a history of alcohol or substance abuse, dementia, stroke, or depression (Table 1). Furthermore, the non-responders were more likely to have had ST-segment elevation on their ECG, more likely to have been admitted to a tertiary care VA hospital, and were more likely to have had a do not resuscitate order during their index hospitalization. Survey non-responders were less likely to have a history of prior coronary artery bypass graft (CABG) surgery, prior percutaneous coronary intervention (PCI), or chronic obstructive pulmonary disease (COPD), and were less likely to receive coronary revascularization during index hospitalization.
Initial PCS and MCS models
Variables significantly associated with better 7-month physical health status included ST-segment elevation MI on electrocardiogram and coronary revascularization during the index ACS hospital admission. Variables significantly associated with worse 7-month physical health status included older age, history of prior CABG surgery, chronic heart failure, arthritis, COPD, stroke, depression, diabetes, and elevated serum creatinine during index ACS admission.
Variables significantly associated with better 7-month mental health status included older age and ST-segment elevation MI on electrocardiogram. Variables significantly associated with worse 7-month mental health status included a history of prior MI, alcohol and/or substance abuse, COPD, and arthritis.
Heckman selection model
The Heckman selection model (modeling response to the SF-36) is presented in Table 2. Older age, prior PCI, and history of COPD were associated with an increased likelihood of survey response, whereas a history of alcohol or substance abuse, depression, and a do-not-resuscitate order during the index hospitalization were associated with a decreased likelihood of survey response.
Final PCS and MCS models
The final multivariable models for PCS and MCS (after addition of the IMR variables from the Heckman selection model) are presented in Tables 4 and 6. There was little evidence of selection bias for the PCS model. None of the results of inference testing for significance changed between the initial model and the model with the IMR variable added. Furthermore, the changes in magnitude of parameter estimates were not large, and the parameter estimate on the IMR variable itself was not significant.
By contrast, when the IMR variable was inserted into the initial MCS model, we found evidence of selection bias. In this case, the parameter estimate for history of alcohol or substance abuse changed from significant to insignificant with the introduction of the IMR variable. Therefore, it appears that a history of alcohol or substance abuse was associated with lower likelihood of responding to the survey, but not directly associated with mental health status. In addition, a number of parameter estimates changed quantitatively, with larger changes than those observed in the PCS findings. Finally, the coefficient on the IMR variable itself was significant, and was negatively associated with MCS, implying that unobserved variables in the selection equation appear to be associated with worse MCS scores.
The goal of this study was to demonstrate the use of the Heckman two-step method to assess and correct for bias due to missing HRQL surveys in a clinical study of acute coronary syndrome patients. We created initial multivariable models of 7-month post-ACS physical and mental health status using data only from survey respondents. Then, using the Heckman method, we modeled survey non-response, derived an Inverse Mills Ratio variable for each patient that captured the likelihood of survey response, and incorporated this variable into our initial models to assess and correct for potential bias from survey non-response.
We found that our initial model of 7-month physical health status was not biased due to survey non-response. In contrast, our initial model of 7-month mental health status was biased. After controlling for bias due to non-response, a history of alcohol or substance abuse was no longer associated with mental health status. Furthermore, the magnitude of the parameter estimates for several of the other predictor variables in the MCS model changed after accounting for bias due to survey non-response.
Given these results, biased parameter estimates of the association between these variables and mental health status would have been reported if we had used the standard approach to evaluating the predictors of HRQL outcomes in this population. Furthermore, we might have concluded that alcohol/substance abuse was significantly associated with mental health status outcomes following ACS, and may therefore be an important target for interventions to improve post-ACS HRQL (e.g. improving alcohol screening and treatment). While alcohol/substance abuse may be important for other reasons, it would have been incorrect to conclude that it was associated with HRQL in our study population. Rather, it was a marker for survey non-response. This analysis demonstrates the utility of the Heckman method in its application for assessing and correcting survey response bias in clinical studies of HRQL.
Health-related quality of life data are usually not missing at random, and failure to account for missing HRQL assessments can bias estimates of associations and may lead to inappropriate conclusions about the determinants of HRQL outcomes [1–3]. Often, HRQL data are missing in systematic ways that can be estimated and controlled for. This study demonstrates the use of one technique to accomplish this, the Heckman two-step method [5–8]. To date, the Heckman method has rarely been utilized in studies reported in the medical literature, although it was previously used in one study assessing the impact of selection on medication use among older patients.
There are other statistical techniques, or approaches, to assess and correct for bias resulting from missing HRQL surveys, including index function models, propensity scores, instrumental variables, and multiple imputation methods [2–4, 23]. The Heckman method is one example of an index function model. Generally, the index function approach to missing HRQL data is to model whether or not HRQL surveys were completed (i.e. the dependent variable is survey completion). This allows an estimation of the 'likelihood' that a given patient would complete a survey based on their clinical characteristics and/or other process or structure of care variables. This information, in turn, is used to assess and correct for bias in the primary model of interest (i.e. the model of quality of life outcome). Therefore, a primary strength of the Heckman method is that it not only permits the assessment of selection bias, it corrects for the bias, and does so in an informative way that may yield new insights into the association between patient characteristics or processes of care and outcomes of interest such as HRQL. In the Heckman method, the assumption is made that the error term in the outcome equation is standard normal, the distribution assumed in classical linear regression. Other index functions allow other distributional assumptions to be made for the error term in the outcome equation, such as logistic.
Propensity score approaches can be analogous to the Heckman method in that a multivariable model of survey non-response is developed and the probability of survey non-response is used to stratify the study population and/or the propensity score is used as an independent variable in the primary HRQL models. In other words, propensity scores are similar to the Heckman method in that the predicted probability of non-response is used as the basis for assessing the impact of missing data and controlling for it. Unlike a propensity score, however, which is often entered directly into the outcome equation as a predictor, the non-linear transformation from the prediction into the IMR variable in the Heckman method is one of the safeguards against collinearity in the outcome equation.
Instrumental variable approaches are also used to address similar questions to those addressed by the Heckman method. In the full instrumental variable approach, a single exogenous variable (called the instrument) is used to stratify the full sample, removing the effect of the correlated error terms that lead to biased estimates. An instrumental variable approach can be a very powerful approach to controlling selection bias. However, it can be very difficult to find an appropriate and adequate instrumental variable. The Heckman approach offers a more flexible, if less powerful, approach, and adds information about the underlying processes by which selection arose. It should be noted that propensity scores can be used as instrumental variables, when a suitable instrument is found.
Finally, multiple imputation methods can be employed to address missing HRQL data. In contrast to the Heckman and other approaches described thus far, multiple imputation methods derive missing values from existing data in the dataset, thereby creating a 'complete' dataset and eliminating the need to drop patients from analysis. Imputation thereby eliminates bias from missing data per se (i.e. there is no longer missing data), but is highly dependent on the validity of deriving the missing HRQL survey data from the existing dataset. It is important to note that in this paper, we are focused on missing surveys rather than incomplete surveys (i.e. missing data elements within a survey). In this regard, multiple imputation will most often be employed in studies with serial measurements of HRQL over time, such that HRQL data before and/or after the time point of interest can inform the missing data imputation. The Heckman method can be used even for a single point in time, cross-sectional assessment of HRQL, as in our analysis in which we measured HRQL at only one time point.
The Heckman method has several limitations. First, the selection equation must have at least one variable that is associated with survey response but not the outcome of the study (i.e. HRQL). In some clinical applications, the inability to identify such a variable may make it difficult to use the full Heckman method to control bias. However, it is still possible to use the first step of estimating a selection equation to assess the degree to which selection bias may affect the parameter estimates in an outcome equation. If there are variables that are significant in both the selection equation and the outcomes equation, it is likely that there is bias due to selection effects in the outcome equation. Acknowledging this and commenting on the likely magnitude of effect may provide helpful guidance to clinicians and other researchers. Another limitation of the Heckman method is that this technique depends heavily on the quality of the data available for the selection equation. If the amount of variance explained is relatively low, then there is a possibility that selection bias in the outcomes equation may not be detected. In other words, the Heckman method can be under-powered for the detection of bias in some cases.
Finally, the Heckman method is very sensitive to how the model is specified; in other words, omitting variables that are associated with either non-response or with the outcome of interest (in this case, health related quality of life measures) can lead to inaccurate findings and biased estimates of the parameters in the final models. Careful attention to specifying the models, and ensuring that model specification follows what is known in the literature to be associated with the outcomes of interest, is essential.
This study demonstrated the use of the Heckman two-step method to assess and control for bias from missing HRQL surveys in a clinical study. We found that our mental health status model was significantly biased due to missing HRQL assessments. Recognition and correction of this bias changed the parameter estimates of association and the factors that we concluded were associated with HRQL seven months following hospital admission for an acute coronary syndrome. We conclude that the Heckman two-step method may be a valuable tool in the assessment and correction of selection bias in clinical studies of HRQL.
1. Fairclough D, Peterson H, Chang V: Why are missing quality of life data a problem in clinical trials of cancer therapy? Stat Med 1998, 17: 667–677. 10.1002/(SICI)1097-0258(19980315/15)17:5/7<667::AID-SIM813>3.3.CO;2-Y
2. Fairclough D: Design and Analysis of Quality of Life Studies in Clinical Trials. Boca Raton, FL: Chapman and Hall/CRC Press 2002.
3. Fairclough D, Peterson H, Cella D, Bonomi P: Comparison of several model-based methods for analysing incomplete quality of life data in cancer clinical trials. Stat Med 1998, 17: 781–796. 10.1002/(SICI)1097-0258(19980315/15)17:5/7<781::AID-SIM821>3.3.CO;2-F
4. Little R, Rubin D: Statistical analysis with missing data. 2002.
5. Heckman J, MaCurdy T: New methods for estimating labor supply functions: A survey. Research in labor economics 1981, 4: 65–102.
6. Heckman J, MaCurdy T: Labor econometrics. In: Handbook of Econometrics (Edited by: Griliches ZIM). New York: Elsevier 1986, 1918–1975.
7. Grotzinger KSBC, Ahern F: Assessment and control of nonresponse bias in a survey of medicine use by the elderly. Med Care 1994, 32: 989–1003.
8. Heckman J: Sample selection bias as a specification error. Econometrica 1979, 47: 153–161.
9. Rumsfeld JS, Magid DJ, Plomondon ME, O'Brien MM, Spertus JA, Every NR, Sales AE: Predictors of quality of life following acute coronary syndromes. Am J Cardiol 2001, 88: 781–784. 10.1016/S0002-9149(01)01852-5
10. Ware J, Kosinski M, Keller S: SF-36 Physical and Mental Health Summary Scales: A User's Manual. 2 Edition Boston, MA: The Health Institute, New England Medical Center 1994.
11. Ware JJ, Snow K, Kosinski M, Gandek B: SF-36 Health Survey: Manual and interpretation guide. Boston, MA: The Health Institute, New England Medical Center 1993.
12. Diehr P, Patrick D, Hedrick S, Rothman M, Grembowski D, Raghunathan TE, Beresford S: Including deaths when measuring health status over time. Med Care 1995, 33: AS164–172.
13. Braunwald E, Antman EM, Beasley JW, Califf RM, Cheitlin MD, Hochman JS, Jones RH, Kereiakes D, Kupersmith J, Levin TN, Pepine CJ, Schaeffer JW, Smith EE III, Steward DE, Theroux P: ACC/AHA Guidelines for the Management of Patients with Unstable Angina and Non-ST-segment Elevation Myocardial Infarction: A Report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. J Am Coll Cardiol 2000, 36: 970–1062. 10.1016/S0735-1097(00)00889-5
14. Ryan TJ, Antman EM, Brooks NH, Califf RM, Hillis LD, Hiratzka LF, Rapaport E, Riegel B, Russell RO, Smith EE III, Weaver WD: ACC/AHA guidelines for the management of patients with acute myocardial infarction: 1999 update: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. [http://www.acc.org]
15. Lee KL, Woodlief LH, Topol EJ, Weaver D, Betriu A, Col J, Simoons M, Aylward P, Van de Werf F, Califf RM, for the GUSTO-I Investigators: Predictors of 30-day mortality in the era of reperfusion for acute myocardial infarction; results from an international trial of 41,021 patients. Circulation 1995, 91: 1659–1668.
16. Mark DB, Naylor CD, Hlatky MA, Califf RM, Topol EJ, Granger CB, Knight JD, Nelson CL, Lee KL, Clapp-Channing NE, Sutherland W, Pilote L, Armstrong PW: Use of medical resources and quality of life after acute myocardial infarction in Canada and the United States. N Engl J Med 1994, 331: 1130–1135. 10.1056/NEJM199410273311706
17. Westin L, Carlsson R, Israelsson B, Willenheimer R, Cline C, McNeil TF: Quality of life in patients with ischaemic heart disease: a prospective controlled study. J Intern Med 1997, 242: 239–247. 10.1046/j.1365-2796.1997.00203.x
18. Rawles J, Light J, Watt M: Quality of life in the first 100 days after suspected acute myocardial infarction–a suitable trial endpoint? J Epidemiol Community Health 1992, 46: 612–666.
19. Maeland JG, Havik OE: Self-assessment of health before and after a myocardial infarction. Soc Sci Med 1988, 27: 597–605. 10.1016/0277-9536(88)90007-X
20. Wiklund I, Herlitz J, Hjalmarson A: Quality of life five years after myocardial infarction. Eur Heart J 1989, 10: 464–472.
21. Maddala G: Limited-dependent and qualitative variables in econometrics. Cambridge: Cambridge University Press 1983.
22. Mickey RGS: The impact of confounder selection criteria on effect estimation. American Journal of Epidemiology 1989, 129: 125–137.
23. Rubin D: Estimating causal effects from large data sets using propensity scores. Ann Intern Med 1997, 127(8 Pt.2): 757–763.
24. McClellan M, Newhouse JP: Overview of the special issue. Health Serv Res 2000, 35: 1061–1069.
This study was supported by a grant from the Health Services Research and Development Service, Department of Veterans Affairs, ACC 97-079. The views expressed in this article are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs. Dr Rumsfeld is supported by a VA Health Services Research and Development Advanced Research Career Development Award (ARCD 98341-2).
AES conceived of the study, participated in design and coordination, conducted analyses, and drafted the manuscript; MEP conducted analyses and contributed to the manuscript; DJM and JAS reviewed and contributed to the manuscript; JSR participated in the design and coordination of the study and participated in the drafting and revision of the manuscript. All authors read and approved the final manuscript.