Mapping analysis to estimate EQ-5D utility values using the COPD assessment test in Korea

Background: No previous study has developed mapping algorithms between the EQ-5D and the COPD assessment test (CAT) in Korea. The purpose of this study was to develop mapping algorithms that predict EQ-5D-3L utility from the CAT in patients with COPD.
Methods: Survey data of 300 COPD patients were collected from three tertiary teaching hospitals in Korea. To predict EQ-5D-3L utility from the CAT, various models were assessed. Models were developed using randomly split training samples. Subsequently, the models were validated based on root mean square error (RMSE) and mean absolute error (MAE) in validation samples. The models were also validated using the bootstrap method, which involves iterative splitting, training, and validating of the sample data at least 10,000 times. Average RMSEs and MAEs were used as criteria for model selection.
Results: The recommended mapping algorithms were based on ordinary least squares (OLS) regression models, in which five CAT items (chest tightness, breathlessness, activity, leaving home, and energy) had statistically significant effects on the EQ-5D-3L utility. The mapping models estimated the overall mean of EQ-5D-3L utilities effectively, but utilities for patients in severe health states (observed utility < 0.6) were overestimated, because most observed EQ-5D-3L utilities were above 0.6.
Conclusion: Mapping algorithms can be used to predict EQ-5D-3L utilities from the CAT. However, they should be used cautiously when applied to groups with greater disease severity.
Electronic supplementary material: The online version of this article (10.1186/s12955-019-1148-3) contains supplementary material, which is available to authorized users.


Introduction
Chronic obstructive pulmonary disease (COPD) has a high global prevalence and mortality, and its socioeconomic burden continues to rise. From 2007 to 2012, the COPD prevalence in Korea was 13.1% in people aged over 40 years. Currently, 20.5% of men and 7.3% of women suffer from COPD [1]. The symptoms of COPD include bronchial closure and shortness of breath, which are known to reduce patients' mobility, to negatively affect mental health, and, as a consequence, to reduce quality of life (QoL) [2]. In a study comparing the quality of life of patients with chronic diseases, COPD patients scored the lowest [3].
The COPD assessment test (CAT) was designed to provide a simple and reliable measure of health status in COPD patients. The CAT questionnaire consists of eight simple items about coughing, phlegm, chest tightness, breathlessness, home activities, leaving home, sleep problems, and energy. Each item is scored on a scale of 0-5, such that the total CAT score will fall within a range from 0 to 40. A score of 0 represents the best possible health status, and higher scores indicate worsening health. The St. George's Respiratory Questionnaire (SGRQ) shares several similarities with the CAT, but it is a more exhaustive screening assessment because it contains 50 questions [4]. Recently, the COPD guidelines recommended using the CAT to assess COPD patients. Hence, the use of the CAT is being expanded in many clinical practices and studies [5].
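The scoring rule described above (eight items, each scored 0-5, summed to a total of 0-40) can be sketched in a few lines. This is only an illustration: the function name `cat_total_score` and the example respondent are hypothetical and do not come from the study.

```python
def cat_total_score(item_scores):
    """Sum the eight CAT item scores (each 0-5) into a total of 0-40."""
    if len(item_scores) != 8:
        raise ValueError("the CAT has exactly eight items")
    for score in item_scores:
        if not isinstance(score, int) or not 0 <= score <= 5:
            raise ValueError("each CAT item must be an integer from 0 to 5")
    return sum(item_scores)


# Hypothetical respondent (not study data), items in questionnaire order:
# cough, phlegm, chest tightness, breathlessness,
# home activities, leaving home, sleep, energy
example = [2, 1, 3, 4, 2, 1, 0, 3]
total = cat_total_score(example)
```

A total of 0 corresponds to the best possible health status, and higher totals indicate worse health.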
EQ-5D-3L (developed by the EuroQoL Group) is a measure of health status comprising five dimensions. Each dimension has three levels: no problems (1), some problems (2), and extreme problems (3). EQ-5D-3L can therefore describe 243 health states, from 11111 (perfect health) to 33333 (worst health). Preference-based utilities can be obtained from the EQ-5D-3L health state, allowing calculation of quality-adjusted life years (QALYs) for cost-utility analyses. Valuation studies for the 243 EQ-5D-3L health states were launched in the UK in 1995; studies targeting Koreans began in 2006 [6,7].
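The count of 243 follows from five dimensions with three levels each (3^5 = 243). The short sketch below enumerates the state codes to make this concrete; the dimension names are the standard EQ-5D ones, and the variable names are illustrative.

```python
from itertools import product

# Standard EQ-5D dimensions; each takes level 1, 2, or 3.
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = (1, 2, 3)

# Each health state is written as a five-digit code, one digit per dimension.
states = [
    "".join(str(level) for level in combo)
    for combo in product(LEVELS, repeat=len(DIMENSIONS))
]
```

The first code in the list is 11111 (no problems on any dimension) and the last is 33333 (extreme problems on every dimension).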
The CAT does not show the respondents' quantitative utilities or preferences because it measures health status related to lung disease. However, measures of quantitative utility are required for economic evaluations, such as cost-utility analysis (CUA). In cases where only CAT data are available for CUA, a mapping algorithm that can estimate quantitative utility measures from the CAT, such as the EQ-5D utility index, can be useful. Previous mapping studies to estimate EQ-5D-3L utilities for COPD patients have developed algorithms from the SGRQ [8][9][10], the clinical COPD questionnaire (CCQ) [11], and the CAT [12].
In this report, we describe the first mapping study of Korean patients with COPD. The study used the CAT and EQ-5D-3L data of 299 patients to develop mapping algorithms. In Korea, an economic evaluation is required to approve new drug reimbursements, and guidelines recommend using a Korean-based utility index. Therefore, these mapping algorithms may be useful for economic evaluations of Korean COPD patients and similar patient populations.

Data
A total of 300 COPD patients from three tertiary teaching hospitals were surveyed from July to December 2014. All three hospitals are located in Seoul, a metropolitan city, and therefore most of the patients are urban residents.
The survey was approved by the Institutional Review Board (IRB) of each institute. Patients were eligible for the study if they fulfilled the following criteria: 1) were over the age of 40, 2) had been diagnosed with COPD before 2013, 3) had continuously received outpatient care since January 2013, 4) had an FEV1 (forced expiratory volume in 1 s)/FVC (forced vital capacity) ratio of less than 0.7 immediately after using bronchodilators, and 5) had more than 10 pack-years of smoking history. Patients were excluded if they suffered an acute exacerbation in the six weeks prior to the survey or a cardiovascular event (myocardial infarction or arrhythmias) in the three months prior to the survey.
The survey was conducted by nurses from the respective hospitals using the Korean versions of the CAT and EQ-5D-3L questionnaires. The nurses explained the questionnaires and conducted in-person interviews with the patients. After the interview, the patients' characteristics (sex, age, duration of COPD, lung function measurement, complications, prescription drugs, resource usage, etc.) were recorded by reviewing their medical records. One of the 300 patients withdrew consent, so the results from 299 patients were included in this study. The Korean version of the CAT used in this survey had been evaluated for validity by Lee et al. [13] and Hwang et al. [14]. Both evaluations concluded that the Korean version of the CAT had good internal consistency and could be used to assess the impact of COPD on patient health.
Table 1 shows descriptive statistics of the patient characteristics: sex, age, BMI, smoking history, and FEV1% predicted. The mean age of the patients was 69.2 years, and more than 74% of the study population was over 65 years. The proportion of males was 86.3%. The mean BMI was 22.85, and 8.7% of the patients had a BMI of less than 18.5 (underweight). Mean smoking history was 36.9 pack-years, and 85% of the patients were current or ex-smokers. The mean duration of [...]
The results of the EQ-5D-3L and CAT questionnaire survey of the 299 patients are presented in Table 2. The most frequent response for every EQ-5D-3L item was 1 (42.1-72.6%), and very few respondents (0.7-2.3%) selected option 3. Eighty-two respondents (27.4%) chose option 1 for all five items. Subsequently, the EQ-5D-3L utilities were calculated using the method for the Korean population developed by Lee et al. [7].

Model development
The mapping models were developed using the EQ-5D-3L utility as the dependent variable and either the total CAT score or the eight individual CAT item scores as the explanatory variables. Models 1 and 2 used the total CAT score, age, and sex as explanatory variables. In contrast, Model 3 used the eight individual CAT item scores instead of the total CAT score. Backward stepwise selection of explanatory variables was used, with significance defined as α = 0.05. The following estimation methods were used: ordinary least squares (OLS), generalized linear models (GLM), Tobit models, and beta regression. Because EQ-5D-3L utilities are skewed and censored, we compared GLM, Tobit, and beta regression with OLS. The GLM distribution and link function combinations that we investigated were Gaussian-log, Poisson-log, gamma-inverse, and quasi-identity.
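One way to write the three specifications, consistent with the description in this section and with the later remark that Model 2 includes the square of the total CAT score, is sketched below. The symbols (U for the EQ-5D-3L utility, CAT_i for the total score, CAT_ik for item k, and the coefficient names) are illustrative notation rather than the article's own, and the equations show the linear predictor before backward selection removes non-significant terms.

```latex
\begin{align}
\text{Model 1:}\quad U_i &= \beta_0 + \beta_1\,\mathrm{CAT}_i
    + \beta_2\,\mathrm{Age}_i + \beta_3\,\mathrm{Sex}_i + \varepsilon_i \\
\text{Model 2:}\quad U_i &= \beta_0 + \beta_1\,\mathrm{CAT}_i + \beta_4\,\mathrm{CAT}_i^{2}
    + \beta_2\,\mathrm{Age}_i + \beta_3\,\mathrm{Sex}_i + \varepsilon_i \\
\text{Model 3:}\quad U_i &= \beta_0 + \sum_{k=1}^{8} \gamma_k\,\mathrm{CAT}_{ik}
    + \beta_2\,\mathrm{Age}_i + \beta_3\,\mathrm{Sex}_i + \varepsilon_i
\end{align}
```

For the GLMs with a non-identity link, the right-hand side would describe the linked mean of U rather than U itself.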
A two-part model was also considered as an alternative estimation method to analyze skewed data [8]. In this study, a large proportion (27%) of observed EQ-5D-3L utilities had a value of 1, which indicated perfect health status. The first part of the two-part model, logistic regression, would determine the probability of having a perfect health status. The second part would use the previous OLS, GLM, Tobit, and beta regression estimations to predict EQ-5D-3L utilities. The EQ-5D-3L utility score was calculated using the following equation:
$$\text{EQ-5D-3L utility} = P(\text{perfect health}) + \left[1 - P(\text{perfect health})\right] \times \text{Predicted-EQ-5D-3L-Utility}$$
P(perfect health) is the probability of perfect health obtained from logistic regression. The Predicted-EQ-5D-3L-Utility value is derived from the previous OLS, GLM, Tobit, and beta regression estimations.
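The combination step of the two-part model is a probability-weighted average of a utility of 1 (perfect health) and the second-part prediction. A minimal sketch follows; the function name and the example inputs are hypothetical, and the two inputs are assumed to come from models fitted elsewhere.

```python
def two_part_prediction(p_perfect, predicted_utility):
    """Combine the two parts of the model.

    p_perfect: probability of perfect health from the first-part
        logistic regression (must lie in [0, 1]).
    predicted_utility: utility predicted by the second-part model
        (OLS, GLM, Tobit, or beta regression).
    """
    if not 0.0 <= p_perfect <= 1.0:
        raise ValueError("p_perfect must be a probability in [0, 1]")
    # Perfect health has utility 1, so its contribution is p_perfect * 1.
    return p_perfect + (1.0 - p_perfect) * predicted_utility


# Hypothetical inputs, not study results.
combined = two_part_prediction(p_perfect=0.27, predicted_utility=0.80)
```

With these illustrative inputs the combined prediction is 0.27 + 0.73 × 0.80 = 0.854.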
All statistical analyses were conducted using R (ver 3.3.3; R Foundation for Statistical Computing).
The R code used for this analysis is provided as Additional file 2.

Model validation
The dataset of 299 COPD patients was randomly split into a training set of 150 patients (50%) and a validation set of 149 patients (50%). The training set was used to develop the models. The validation set was used to validate the models by calculating and comparing root mean square errors (RMSE) and mean absolute errors (MAE). The bootstrap method was used to generate robust estimates of RMSE and MAE given the limited sample size. The framework described above, which consists of random splitting, training, and validation, was iterated 10,000 times to collect 10,000 RMSEs and MAEs. The means of the collected RMSEs and MAEs were used as criteria for model selection: the model with the lowest mean RMSE or MAE was selected as the most suitable.
Table 3 presents the results of the mapping models using OLS and Gaussian log-link GLM. All RMSE and MAE values represent the means of 10,000 bootstrap values. Among the four types of GLM (Gaussian-log, Poisson-log, gamma-inverse, and quasi-identity), the Gaussian-log model generated the best (lowest RMSE and MAE) values. Both the Tobit and beta regression models had higher RMSE and MAE than OLS or GLM. Additionally, the two-part models performed worse (higher RMSE and MAE) than the corresponding single-equation models. The results of the Tobit and beta regression models, which are not presented in Table 3, are provided as Additional file 1.
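The split, train, and validate loop described in this section can be sketched as follows. This is a simplified illustration and not the authors' R code (provided as Additional file 2): it uses a one-predictor OLS fit, synthetic data generated inside the example, and 200 iterations instead of 10,000 to keep it quick. All function and variable names are assumptions.

```python
import math
import random


def fit_simple_ols(x, y):
    """Closed-form OLS for one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(observed, predicted):
    """Root mean square error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def repeated_split_validation(x, y, n_train, n_iter, seed=0):
    """Repeat random split -> train -> validate; return mean RMSE and MAE."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    rmses, maes = [], []
    for _ in range(n_iter):
        rng.shuffle(indices)
        train, valid = indices[:n_train], indices[n_train:]
        intercept, slope = fit_simple_ols(
            [x[i] for i in train], [y[i] for i in train]
        )
        predicted = [intercept + slope * x[i] for i in valid]
        observed = [y[i] for i in valid]
        rmses.append(rmse(observed, predicted))
        maes.append(mae(observed, predicted))
    return sum(rmses) / n_iter, sum(maes) / n_iter


# Synthetic data for illustration only (NOT the study data):
# 299 total CAT scores and utilities capped at 1.
data_rng = random.Random(42)
cat_total = [data_rng.randint(0, 40) for _ in range(299)]
utility = [min(1.0, 0.95 - 0.01 * c + data_rng.gauss(0, 0.1)) for c in cat_total]

mean_rmse, mean_mae = repeated_split_validation(
    cat_total, utility, n_train=150, n_iter=200
)
```

Running the same loop for each candidate model and comparing the resulting mean RMSE and mean MAE mirrors the selection rule described above.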

Model development
Among the total CAT score models (Models 1 and 2), OLS1 resulted in the lowest RMSE and GLM1 resulted in the lowest MAE. Model 2, which included the square of the total CAT score as an explanatory variable, performed worse than Model 1, which included only the total CAT score. For the selected items model (Model 3), OLS3 resulted in the lower RMSE and GLM3 in the lower MAE. Given their simplicity of use and the accuracy of their estimates, we recommend the OLS1 and OLS3 models rather than the GLMs. Of the two, the selected CAT items model (OLS3) provided more accurate estimates than the total CAT score model (OLS1). However, the OLS1 model must be used to estimate EQ-5D-3L utilities when the total CAT score is the only known value.
Table 4 presents the recommended mapping models, OLS1 and OLS3. To create best-fit mapping algorithms for EQ-5D-3L utility predictions, the OLS equations were estimated using the full dataset of 299 patients. In both models, the coefficient of the age variable was negative. The sex variable was not statistically significant and was excluded. In the selected items model, the third, fourth, fifth, sixth, and eighth CAT items (chest tightness, breathlessness, home activities, leaving home, and energy) had significantly negative effects on the EQ-5D-3L utilities, whereas the other items did not. The significant effects of the third, fifth, sixth, and eighth CAT items were consistent with results from a previous mapping study by Hoyle et al. [12]. In this study, one additional item (breathlessness) was included in the model. The equations of the two recommended mapping models are given by the coefficients in Table 4.
Scatter plots of predicted and observed EQ-5D-3L utilities are shown in Fig. 1. As illustrated by the scatter plots, the mapping models overestimated EQ-5D-3L utilities for patients in severe health states (observed EQ-5D-3L utilities < 0.6). The range of observed EQ-5D-3L utilities (0.095 to 1) differed from the range of predicted EQ-5D-3L utilities (0.6 to 1). Table 5 presents observed and predicted EQ-5D-3L utilities categorized by COPD severity. Both OLS1 and OLS3 produced utilities similar to the observed EQ-5D-3L utilities for moderate and severe COPD; however, utility was underestimated for mild COPD and overestimated for very severe COPD.
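Because the recommended models are linear, applying one of them to a patient amounts to an intercept plus a weighted sum of the retained variables. The sketch below shows only that mechanic. The coefficient values in it are arbitrary placeholders chosen for illustration, NOT the published estimates; the actual coefficients must be taken from Table 4. The function and variable names are hypothetical.

```python
def predict_utility(intercept, coefficients, values):
    """Linear prediction: intercept + sum(coefficient * value)."""
    missing = set(coefficients) - set(values)
    if missing:
        raise KeyError(f"missing values for: {sorted(missing)}")
    return intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )


# PLACEHOLDER numbers for illustration only, not the estimates in Table 4.
placeholder_intercept = 1.0
placeholder_coefficients = {"cat_total": -0.01, "age": -0.001}

# Hypothetical patient: total CAT score 20, age 70.
predicted = predict_utility(
    placeholder_intercept,
    placeholder_coefficients,
    {"cat_total": 20, "age": 70},
)
```

For the selected items model, the dictionary would instead hold one entry per retained CAT item plus age.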

Discussion
This study developed mapping algorithms to predict EQ-5D-3L utility from CAT responses using survey data from 299 Korean COPD patients. Unlike previous mapping studies that used data collected from two or more preceding randomized clinical trials, this study used survey data collected at three tertiary teaching hospitals. Under the Korean healthcare system, patients are not obligated to visit a general practitioner to get a referral and are free to visit any hospital. Under these circumstances, patients without serious symptoms can also visit tertiary hospitals. For this reason, the severity of COPD symptoms can vary even when the data are collected from tertiary hospitals.
The mapping model can estimate the effects of lung health on QoL, and it can be used for economic evaluations that require a quantitative utility measure. The RMSE and MAE results, which serve as indicators of algorithm performance, were comparable to those of previous mapping studies in COPD [8][9][10][11][12]. To identify appropriate models, we investigated a wide range of mapping algorithms, including OLS, GLM, Tobit, beta regression, and two-part models. In addition, we used the bootstrap method to generate robust estimates of RMSE and MAE given the limited sample size of 299 patients. Based on the results, OLS regression models are the recommended mapping algorithms, as more complex models failed to improve the results.
The recommended OLS models estimated the overall mean of EQ-5D-3L utilities accurately. However, the recommended models and the other mapping algorithms used in this study overestimated EQ-5D-3L utilities for low utility patients (< 0.6). The predicted EQ-5D-3L utilities have a floor effect at 0.6. One cause of this floor effect could be the limited conceptual overlap between the CAT and the EQ-5D. The three CAT items about cough (Q1), phlegm (Q2), and sleep (Q7) were only weakly correlated with each of the five EQ-5D-3L dimensions, both conceptually and in practice. Likewise, the pain/discomfort dimension of the EQ-5D-3L was only weakly correlated with each of the eight CAT items. These differences between the CAT and the EQ-5D limit how well the EQ-5D can be predicted from the CAT. Additional causes of the floor effect could be (i) the lack of women in the sample (Table 1) and (ii) the lack of severe patients in the sample, in which only 11 patients had utilities below 0.6 (Fig. 1). This overestimation for patients with severe health states has been reported in previous mapping studies; as such, it is recognized as a general problem with mapping studies [8][9][10][11][12]. Therefore, when the mapping models are used to predict EQ-5D-3L utilities by severity, the predictions are likely to be biased.
This study is the first mapping study of Korean patients with COPD. We conducted iterated estimation using the bootstrap method. The general procedure for model development consists of sample splitting, training, and validating. The validation step is a weak point because the ranking of model performance based on goodness-of-fit can change if the split sample changes. Therefore, selecting a model based on a single split risks a flawed choice (a model that fits the specific validation set well but does not fit well in general). The iterated estimation process using the bootstrap method limits the impact of any single sample split and reduces the probability of poor model selection.
From the bootstrap method, we ranked the mapping models based on their mean RMSEs and MAEs. These rankings were the same as rankings based on RMSE and MAE estimations using the full data set. From this finding, we questioned whether the sample split is necessary in the model development process.
It is known that EQ-5D-3L utility does not follow a Gaussian distribution, for which OLS is suitable. Many previous mapping studies considered other algorithms as more suitable alternatives for the censored distribution of EQ-5D-3L utility. However, OLS-based algorithms were recommended by most (80%) of the previous mapping studies, as in this study [15]. Because the OLS estimator minimizes the sum of squared errors (and therefore RMSE), OLS shows the lowest RMSE and would be selected as the best model when RMSE is used as a criterion. For other algorithms to be considered fairly, model selection criteria other than RMSE are needed.
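The link between OLS and RMSE mentioned above can be made explicit. The notation below (y_i for the observed utility, x_i for the covariate vector, β for the coefficients) is illustrative. Strictly, the argument applies to the sample used for estimation and to linear predictors with the same covariates; on a separate validation sample the OLS fit is not guaranteed to have the lowest RMSE, although it is favored by this criterion.

```latex
\begin{align}
\hat{\beta}_{\mathrm{OLS}}
  &= \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2}, \\
\mathrm{RMSE}(\beta)
  &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2}} .
\end{align}
```

Because dividing by n and taking the square root are both increasing transformations, the β that minimizes the sum of squared errors also minimizes RMSE(β), whereas MAE is minimized by a different criterion (the sum of absolute errors).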
The study had some limitations, including a relatively small sample size (299 patients), the inclusion of few patients in severe health states, and access to only one survey dataset, which precluded external validation. The skewed distribution of the sample might have introduced bias in the estimates for the severe patient group. Despite these limitations, the mapping algorithms in this study can be used to predict the mean level of EQ-5D-3L utility in similar COPD patient groups.