Longitudinal assessment of utilities in patients with migraine: an analysis of erenumab randomized controlled trials

Background: Cost-effectiveness analyses in patients with migraine require estimates of patients’ utility values and how these relate to monthly migraine days (MMDs). This analysis examined four different modelling approaches to assess utility values as a function of MMDs.
Methods: Disease-specific patient-reported outcomes from three erenumab clinical studies (two in episodic migraine [NCT02456740 and NCT02483585] and one in chronic migraine [NCT02066415]) were mapped to the 5-dimension EuroQol questionnaire (EQ-5D) as a function of the Migraine-Specific Quality of Life Questionnaire (MSQ) and the Headache Impact Test (HIT-6™) using published algorithms. The mapped utility values were used to estimate generic, preference-based utility values suitable for use in economic models. Four models were assessed to explain utility values as a function of MMDs: a linear mixed effects model with restricted maximum likelihood (REML), a fractional response model with logit link, a fractional response model with probit link and a beta regression model.
Results: All models tested showed very similar fits. When mapped from MSQ, root mean squared errors were similar in the four models assessed (0.115, 0.114, 0.114 and 0.114 for the linear mixed effects model with REML, fractional response model with logit link, fractional response model with probit link and beta regression model, respectively). Mean absolute errors for the four models tested were also similar when mapped from MSQ (0.085, 0.086, 0.085 and 0.085) and from HIT-6 (0.087, 0.088, 0.088 and 0.089) for the linear mixed effects model with REML, fractional response model with logit link, fractional response model with probit link and beta regression model, respectively.
Conclusions: This analysis describes the assessment of longitudinal approaches in modelling utility values, and the four models proposed fitted the observed data well. Mapped utility values for patients treated with erenumab were generally higher than those for individuals treated with placebo with an equivalent number of MMDs. Linking patient utility values to MMDs allows utility estimates for different levels of MMD to be predicted, for use in economic evaluations of preventive therapies.
Trial registration: ClinicalTrials.gov numbers of the trials used in this study: STRIVE, NCT02456740 (registered May 14, 2015), ARISE, NCT02483585 (registered June 12, 2015) and NCT02066415 (registered Feb 17, 2014).


Background
Cost-effectiveness analyses are often used by reimbursement agencies to make decisions on whether to reimburse new healthcare interventions. Health-related quality of life (HRQoL) values can be expressed as utility scores, which capture social preferences for different health states [1]. Often, studies with HRQoL outcomes have repeated assessment over time and are longitudinal in nature. Analysing longitudinal data can present various challenges, such as missing data or the variations in patient HRQoL over time [2]. Therefore, it is important to consider the appropriate model when analysing longitudinal HRQoL data.
Regression models have been used to estimate treatment-effect impact on HRQoL [3]. Simple linear regression models, however, may not be optimal because health utility measures, including measures of HRQoL, may be multimodal and have ceiling effects, floor effects or skewed distributions [3][4][5]. In these circumstances, multivariable analyses, such as linear mixed models, may be more appropriate to estimate the changes in HRQoL over time. Linear mixed models, which are an extension of simple linear models and contain both fixed and random effects, can overcome the limitations associated with a longitudinal data set [2]. Recent analyses have demonstrated the suitability of the use of linear mixed models to measure HRQoL in longitudinal cohorts in a range of disease areas [2,6]. Anink et al. applied linear mixed models to examine HRQoL data in patients with juvenile idiopathic arthritis [6]. Wailoo et al. demonstrated the use of bespoke mixed models to model the 5-dimension EuroQol questionnaire (EQ-5D) in patients with ankylosing spondylitis [4]. Griffiths et al. estimated utility values from mixed regression models using EQ-5D data in patients with chronic heart failure [2].
According to the National Institute for Health and Care Excellence 'Guide to the Methods of Technology Appraisal', EQ-5D is the preferred method for measuring utilities [7,8]. Utilities can be estimated from individual patient-level data collected as part of clinical studies (or extrapolated in the absence of long-term data) [9]. However, the collection of EQ-5D utilities is not always appropriate or possible in every disease state, so other methods may be used [7,8]. Limited guidance exists on approaches to extrapolating outcomes such as utilities [10]. Applying existing algorithms is one of the options to derive utilities for health-state estimates when they are not available from the original data set [10,11]. Extrapolation methods should, however, consider processes that influence utilities that may not be linked to clinical events (e.g. past medical history of a patient or changes in clinical practice over time that may affect current practice) [10]. These considerations are particularly relevant to migraine, a chronic neurological disorder with episodic attacks of headache and an array of other symptoms [12]. Migraine is a debilitating disease in which utilities are typically measured via the Health Utilities Index (HUI) or the EQ-5D [13][14][15][16]. Migraine has considerable negative effects on a person's HRQoL, in addition to a high economic burden due to high direct costs (physician visits, emergency department visits, etc.) and indirect costs (lost work days, decreased productivity at work, etc.) [17]. Migraine can be divided into two categories based on the number of days on which patients have a headache in a 28-day month. Chronic migraine (CM) is defined as experiencing ≥15 monthly headache days (MHD) for ≥3 months, 8 of which meet the criteria for migraine and/or respond to migraine-specific treatments [12]. Episodic migraine (EM) is defined as experiencing ≤14 MHD [18][19][20].
Reduction in the frequency of monthly migraine days (MMD) is an important measure of the efficacy of migraine prophylaxis; however, there are limited data on the relationship between migraine frequency and health status [15]. Furthermore, patient-level data collected within the time frame of a clinical study often cover too short a duration to assess the likely costs and benefits that may accrue over an individual's entire lifetime [10]. Preventive treatment can reduce the burden and disability associated with migraine [21]. Erenumab is a fully human monoclonal antibody that specifically blocks the calcitonin gene-related peptide receptor complex [22] and has been shown to have a favourable safety and efficacy profile in phase 2 and phase 3 clinical studies [23][24][25]. In 2018, erenumab was approved by the US Food and Drug Administration for the prevention of migraine in adults [26].
The pivotal erenumab clinical studies included endpoints that recorded HRQoL data. This study aimed to leverage the HRQoL data from these studies to estimate patient utility values associated with specific levels of MMD. Various models for utilities in the longitudinal framework were compared using the observed utility data. Quantifying how the primary outcomes of the clinical studies, that is, MMDs, relate to utility values is important to inform cost-effectiveness analyses of preventive therapies such as erenumab [27].

Data source
The populations assessed in the models are the populations of three pivotal erenumab clinical studies [23,24,28]. In the phase 3 STRIVE study (NCT02456740; Study to Evaluate the Efficacy and Safety of Erenumab in Migraine Prevention), 955 patients with EM were enrolled. In the phase 3 ARISE study (NCT02483585; A phase 3, Randomized, double-blind, placebo-controlled Study to Evaluate the efficacy and safety of AMG 334 in migraine prevention), 577 patients with EM were enrolled, and in the phase 2 study, 667 patients with CM were enrolled. The EM studies recruited individuals with ≤14 MHD and 4-14 MMDs per 28 days, and the CM study recruited individuals with ≥15 headache days and >8 migraine days per 28 days. To generalize the influence that MMD frequency has on patient utility values, the patient-reported outcomes for the placebo and erenumab (70 mg and 140 mg) arms of the three studies were combined to produce a complete migraine data set. Patient-level data were obtained for the participants in each study, with the following variables extracted for use in the analysis as the covariate set: participant identification (ID), study ID, age (continuous), sex (categorical), race (categorical), MMD at baseline (count), MMD (count) and treatment status (categorical). Covariates were selected based on known associations and clinical advice from experts in the field [29,30]. Study-level effects were originally included in the hierarchical models, but as they demonstrated a negligible amount of variability between the studies, this layer was removed. As the objective of the analysis was to estimate patient utility based on MMD across the full migraine spectrum, combined models based on both EM and CM were fitted. Furthermore, the trials were comparable in terms of patient characteristics [31]; therefore, only patient-level data were retained in the multilevel models presented here.

Data description
Patient utilities in the model were estimated as a function of MMD. For this analysis, MMD refers to the number of migraine days during a 28-day period. In the three studies, patients' HRQoL and daily functioning were collected in a monthly assessment, using the Headache Impact Test (HIT-6™) [32] and the Migraine-Specific Quality of Life Questionnaire (MSQ) [33]. The HIT-6 is designed to provide a global measure of adverse headache impact. Via a HRQoL questionnaire, the HIT-6 evaluates six content areas: pain, role functioning, social functioning, energy/fatigue, cognition and emotional distress [34]. The MSQ is a 14-item HRQoL questionnaire that measures three dimensions of functional status (role prevention, role restrictive and emotional function) specific to migraine [33,34]. Both the MSQ and the HIT-6 have been shown to be valid and reliable tools for measuring the adverse impact of headache [32,35]. Disease-specific patient-reported outcomes from the studies were mapped to the EQ-5D using published algorithms. The mapping algorithms applied here have been previously published by Gillard et al., and these algorithms have been validated to support the analysis of onabotulinum toxin A (Botox®) trial data (see Online Resource: Additional file 2: Table S1) [34]. The size of the prediction error of the validated models was assessed using root mean squared error (RMSE) and mean absolute error (MAE).
In addition to the complete case analysis, a multivariate imputation by chained equations (MICE; fully conditional specification [FCS] algorithm) was performed with the assumption that data were missing at random. The MICE-FCS technique is a standard methodology for dealing with missing data and is also appropriate in the context of longitudinal data. The variables used in the imputation model were mapped MSQ, mapped HIT-6, treatment, baseline MMD, MMD, visit, age, sex and race. This imputation assessed the robustness of the results according to the presence of missing data and was constructed on a FCS [36] and based on 15 multiple imputed data sets [37].
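As an illustration of how estimates from several imputed data sets are commonly combined, the sketch below pools a single coefficient across imputations using Rubin's rules. The text does not state which pooling rule was applied, so the use of Rubin's rules here is an assumption, and the function name `pool_rubin` and the numbers in the usage note are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets (Rubin's rules).

    estimates: point estimates of the coefficient, one per imputed data set
    variances: squared standard errors of those estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

For example, `pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` averages the three estimates and inflates the within-imputation variance by the spread between imputations; with 15 imputed data sets the same call would take 15 estimates and 15 variances.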

Utility regression models
For this analysis, four models were assessed: (1) a linear mixed effects model with REML, (2) a fractional response model with logit link, (3) a fractional response model with probit link and (4) a beta regression model. Multilevel modelling approaches were chosen in order to take account of the longitudinal framework of the three trials, which included measurements collected from the same participants at repeated intervals over the course of the studies. These multilevel modelling approaches were used to enable the clustering of observations at the patient level.
In all models, the full covariate set was examined. Multilevel modelling techniques estimate the differences between individuals, acknowledging that measurements from the same person over time are much more likely to be correlated than measurements from different individuals [39].
In all four models the covariates included were as follows: treatment status (erenumab 70 mg or 140 mg vs placebo), age, sex (female vs male), race (black, Asian, other vs white), MMD at baseline, MMD at each visit and visit. The mean predicted utilities by MMD (and by treatment status) were estimated with standard errors using the delta method [38].
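The delta method mentioned above can be sketched for a single predicted utility. The example assumes a logit link, a coefficient vector `beta` and its covariance matrix `cov` taken from a fitted model; all names and inputs are hypothetical, and the sketch is not a reproduction of the software used in the analysis.

```python
import math


def predict_with_se(x, beta, cov):
    """Predicted mean utility and delta-method standard error (logit link).

    x:    covariate values for one profile (including 1.0 for the intercept)
    beta: estimated coefficients, same length as x
    cov:  covariance matrix of beta, as a list of rows
    """
    eta = sum(xi * bi for xi, bi in zip(x, beta))      # linear predictor
    p = 1.0 / (1.0 + math.exp(-eta))                   # inverse logit
    # Gradient of p with respect to beta: p * (1 - p) * x
    grad = [p * (1.0 - p) * xi for xi in x]
    # First-order (delta method) variance: grad' * cov * grad
    k = len(x)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(k) for j in range(k))
    return p, math.sqrt(var)
```

With an intercept-only profile, `predict_with_se([1.0], [0.0], [[0.04]])` gives a predicted utility of 0.5 with a standard error of about 0.05, because the gradient at 0.5 is 0.25 and the coefficient's standard error is 0.2.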

Linear mixed effects model with REML
A linear mixed effects model with random effects at the patient level was estimated using the REML method, to estimate subject-specific effects and to provide estimates of the specified covariates (the fixed component of the model) and of the random variation between individuals [2,40]. A standard linear regression model (even if hierarchical) is, however, not well suited to an outcome bounded on the unit interval such as utility values, which are typically characterized by truncated support at both ends of the distribution (usually ranging between 0 and 1) and by heteroscedasticity (i.e. the variance of the residuals is not constant) as an integral part of such limited dependent variables [41]. Models fitted under the generalized linear model (GLM) framework have been shown to produce better estimates than those estimated by the linear model [42].
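For orientation, a random-intercept specification of this kind is usually written as follows. The model equation is not shown in the text, so the notation is an assumption: U_ti denotes the utility of patient i at visit t, X_ti the covariate set, α_i the patient-level random intercept and ε_ti the residual.

```latex
U_{ti} = X_{ti}\beta + \alpha_i + \varepsilon_{ti},
\qquad \alpha_i \sim N(0, \sigma_\alpha^2),
\qquad \varepsilon_{ti} \sim N(0, \sigma_\varepsilon^2)
```

The term X_ti β is the linear predictor of the fixed effects, and the random intercept α_i captures the correlation between repeated measurements from the same patient.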

Fractional response models with a logit link function or a probit link function
Another valid strategy for handling proportions data in which zeros and ones may appear (as well as intermediate values) [43] is the fractional response model [44]. This model can be estimated via the GLM suite using the logit link function (i.e. the logit transformation of the response variable) or the probit link function [45]. Robust standard errors have been estimated allowing for clustering at individual participant level.
In these models, the conditional mean of the utility is obtained by applying a function G(.) to the linear predictor of the covariate set, where G(.) is a probit or logit function.
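The two choices of G(.) can be sketched as follows: both map any value of the linear predictor into the unit interval, which is what keeps predicted utilities between 0 and 1. The function names are illustrative only.

```python
import math


def logit_mean(eta):
    """Logit link: G(eta) is the logistic function."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit_mean(eta):
    """Probit link: G(eta) is the standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


# Both links return 0.5 at eta = 0 and stay strictly inside (0, 1)
# for moderate values of the linear predictor.
for eta in (-2.0, 0.0, 2.0):
    print(eta, round(logit_mean(eta), 4), round(probit_mean(eta), 4))
```

The two curves differ mainly in their tails (the probit function approaches 0 and 1 more quickly), which is consistent with the two fractional response models giving very similar fits when most utilities lie away from the boundaries.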

Beta regression model
The fourth model fitted is a beta regression, which is useful for modelling continuous, 0-1 bounded and beta-distributed outcomes. In the data set for this analysis, outcomes were constrained to have values higher than 0 and less than 1. Because some patients had a mapped utility (EQ-5D) value of 1, these values were decreased by 1.110 × 10⁻¹⁶, a marginal decrease to ensure minimal difference from the original values. As for the fractional response models, robust standard errors were estimated. The density of the beta-distributed dependent variable U conditional on covariates X is parameterized by μ_X and φ_X, where μ_X = E(U_ti | X) is linked to the covariate set by g(μ_X) (a logit function of the linear predictor described above) and φ_X is the scale parameter of the conditional variance of U.
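A minimal sketch of the two ingredients described above: the marginal shift applied to utilities equal to 1, and a beta density written in terms of a mean and a scale parameter. The text does not show the density, so the mean-precision form used here (shape parameters μφ and (1 − μ)φ) is an assumption, as are the function names.

```python
import math

BOUNDARY_SHIFT = 1.110e-16  # decrement quoted in the text


def shift_from_one(u):
    """Move a utility of exactly 1 just inside the open interval (0, 1)."""
    return u - BOUNDARY_SHIFT if u >= 1.0 else u


def beta_density(u, mu, phi):
    """Beta density with mean mu and scale phi (assumed parameterization).

    Shape parameters are a = mu * phi and b = (1 - mu) * phi.
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    log_norm = math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(u) + (b - 1.0) * math.log(1.0 - u))
```

In double-precision arithmetic the shift moves 1.0 to the nearest representable value below 1, so the adjusted utility is strictly less than 1 while remaining practically indistinguishable from the original value.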
Goodness of fit of the regression models was assessed by RMSE, MAE and visual assessments.
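The two error measures can be computed as in the following sketch, where `observed` holds mapped utility values and `predicted` holds the corresponding model predictions. Both names and the values in the usage note are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted utilities."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted utilities."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)
```

For instance, with `observed = [0.8, 0.6]` and `predicted = [0.7, 0.7]`, both errors have magnitude 0.1, so RMSE and MAE are both approximately 0.1. RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of fit.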

Baseline characteristics
The analysis sample included data from 2199 patients. Characteristics of the patients from the three studies are presented in Table 1. Baseline characteristics were similar across the three studies. For example, the average age was in the range 40.4-42.9 years across the three studies. The majority of patients in all studies were white and female, as is typical in migraine.

Validated mapping algorithms
In episodic migraine, the HIT-6 and MSQ algorithms explained 8% and 14% of the variance, respectively, as measured by adjusted R², and had similar prediction errors (RMSE of 0.32). In chronic migraine, the HIT-6 and MSQ algorithms explained 19% and 30% of the variance, respectively, and had similar prediction errors (RMSE of 0.33 and 0.32).

Comparison of regression outputs and utility values
Four regression models were fitted using mapped utility values, MMD and treatment group (erenumab 70 mg, 140 mg and placebo), adjusting for age, sex, race and baseline MMD in the various time periods considered. Mapped utility values are described in Table 2.
The predicted mean utility values by number of MMD are shown in Figs 1 and 2 after mapping from MSQ and HIT-6, respectively. Mapped utility values for erenumab patients were consistently higher than for placebo patients with the same number of MMD. All models tested showed similar fits and fitted the observed data well (Figs 1 and 2). Because of the different likelihood functions used for the four regression models proposed in this analysis, the fits could not be compared via Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC). The regression outputs for the utility values after mapping from MSQ and HIT-6 are shown in Tables 3 and 4, respectively. In all models tested, utility values with erenumab 140 mg were significantly higher than with placebo when mapped from HIT-6 (Table 4). Utility values with erenumab 140 mg were also significantly higher than with placebo in all models tested when mapped from MSQ (Table 3). The difference between erenumab 70 mg and placebo was not significant when mapped from MSQ or HIT-6 (Tables 3 and 4). Baseline MMD was significant in all models tested apart from the linear

Multiple imputation analyses
The analyses based on the multiple imputed data sets were substantially similar to the complete case analyses for MSQ and HIT-6 (see Online Resource: Additional file 3: Table S2 and Additional file 4). Not all patients completed all the scheduled visits: 25 (3.7%), 88 (9.2%) and 30 (5.2%) patients did not complete the planned assessments in the phase 2 study (which was planned for four visits), STRIVE (planned for seven visits) and ARISE (planned for four visits), respectively.

Discussion
Our analysis describes the assessment of longitudinal approaches in modelling utility values that go beyond simple linear models. In all cases, utility values decreased as the number of MMD increased, and these associations were non-linear with potential ceiling effects. The improvement in average utility values over time in the placebo groups is consistent with the placebo effect on mean MMD frequency observed in the clinical studies [23,24,28]. Mapped utility values for patients treated with erenumab 70 mg and 140 mg were consistently higher than those for participants treated with placebo with the same number of MMD, although the difference was significant only for the 140 mg dose. This finding is consistent with utility values applied in a previous economic model for onabotulinumtoxin A, which assumed an additional treatment effect of active treatment compared with placebo [46]. This additional treatment effect is most likely driven by improvements in migraine duration and severity, which may not be fully captured by the primary clinical endpoint, MMD.
All models tested showed very similar fits, although the beta regression model may be considered the optimal candidate for longitudinal and bounded data, because it has the flexibility of the beta distribution and has previously been used to model quality-adjusted life-years in health economic studies [41,47]. To determine the generalizability of this model, it would be necessary to examine mapped utility values from other study data.
Some limitations of the analysis have to be acknowledged. Firstly, the HIT-6 and MSQ scores were captured only monthly in the three clinical studies. It may be beneficial to capture HRQoL data more frequently to accurately capture patients' experiences within the 1-month time periods [34,48]. This is particularly relevant, because time and other factors can influence how individuals with migraine report their HRQoL [34]. Secondly, likelihood-based statistics such as AIC and BIC could not be used to compare the models, because they have different likelihood functions. The analysis is further limited by the duration of the erenumab clinical studies: longer studies may allow more robust models to be fitted. Finally, because there were very similar RMSEs between the models, it was important to assess the non-linear associations between utilities and MMD. In future studies, it would be useful to assess any longer-term time trends, introducing a specific fixed-effect covariate and assessing the potential interaction between treatment and MMD. Exploration of such models as response mappings to predict the levels of utilities would be of interest. Future studies that examine mapping from a measure such as the Migraine Physical Function Impact Diary, which is a daily, migraine-specific measurement of patient-reported outcomes, would also be worth considering [48]. The analysis described here has applications for economic evaluations. Cost-utility analysis is widely recognized as a useful approach for measuring and comparing the efficiency of different health interventions [49]. Furthermore, longitudinal approaches for modelling utilities can be appropriate when considering economic evaluations because they can capture changes in health utility over time.
By providing utility values that are useful for decision-making bodies, the robust findings of this analysis, consistent across the models fitted, demonstrate the value of these data for health economic evaluations of migraine prevention and treatment.

Conclusions
Our analysis showed that all models fitted the observed data well. Mapped utility values for patients receiving erenumab were higher than those for patients with the same number of MMD receiving placebo, indicating that treating migraine may have benefit beyond simply reducing the number of migraines a patient experiences and may translate into improvements in HRQoL. Linking patient utility values to the number of MMD allows utility estimates for different levels of MMD to be predicted, for use in economic evaluations of preventive therapies. More broadly, the analysis demonstrates the application of different models for fitting utilities from study data.