Estimating health related quality of life effects in vitiligo. Mapping EQ-5D-5L utilities from vitiligo specific scales: VNS, VitiQoL and re-pigmentation measures using data from the HI-Light trial

Background: Vitiligo is reported to affect 2% of the world's population and has a significant impact on health related quality of life (HRQoL). The relationship between HRQoL and the clinical outcomes used in vitiligo requires further examination. Mapping of condition specific measures, namely the vitiligo specific quality of life instrument (VitiQoL), the vitiligo noticeability scale (VNS) and vitiligo re-pigmentation scores (RPS), to the EQ-5D has not yet been reported.
Methods: Data collected from a randomised clinical trial in vitiligo (HI-Light) were used to develop mapping algorithms for the EQ-5D-5L, and the relationship between HRQoL, clinical outcomes and EQ-5D was investigated. Two EQ-5D-5L value sets (Van Hout and Alava) were considered, using linear and non-linear models. Logistic regression models were used to model the probability of vitiligo noticeability (VNS) in terms of RPS, EQ-5D and VitiQoL scores.
Results: Mapping from RPS appeared to perform best, followed by VNS, for the Alava crosswalk using polynomial models. Mean observed vs. predicted utilities were 0.9008 (0.005) vs. 0.8984 (0.0004) for RPS, 0.9008 (0.005) vs. 0.8939 (0.0003) for VNS, and 0.9008 (0.005) vs. 0.8912 (0.0002) for VitiQoL. For patients with the least re-pigmentation (RPS < 25%), a Total VitiQoL score of about 20 points gives around an 18% chance of vitiligo being no longer or a lot less noticeable.
Conclusion: The algorithm based on RPS performed best, followed by that based on VNS. The relationship between effects from vitiligo specific HRQoL instruments and clinical RPS was established, allowing plausible clinically relevant differences to be identified, although further work is required in this area.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12955-023-02172-4.


Introduction
Vitiligo affects 2% of the world's population, with a significant impact on health related quality of life (HRQoL) [1]. Responses from the EQ-5D (EQ-5D-5L or EQ-5D-3L) are converted to 'health utilities', which are required by various health technology assessment (HTA) bodies to assess the value of health technologies [2,3]. In particular, the EQ-5D is used to estimate the gain (or loss) in quality-adjusted life years (QALYs) when determining cost-effectiveness (cost per QALY gained). The EQ-5D, however, is not always collected. This can happen where condition specific HRQoL measures (CSMs) are considered more important, or when the EQ-5D is considered irrelevant to (payer) decision making [3,4]. Moreover, effects from some HRQoL measures have not been evaluated in detail when anchored against clinical outcomes such as re-pigmentation scores (RPS).
When EQ-5D utilities are not collected, alternative methods such as 'mapping' may be used [4-6]. Mapping involves the development of an algorithm, through statistical methods, that is used to predict utilities from CSMs collected in a different trial [7,8]. In practice, an algorithm is published so that patient level utilities can be computed and used to generate QALYs. Mapped utilities can therefore be superior to those determined from different populations. The benefits and limitations of mapping have been reported elsewhere [8-11].
There are several CSMs used for vitiligo patients. The dermatology life quality index (DLQI) is a 10-item validated questionnaire that measures how much a skin problem has affected patients [12]. The European Academy of Dermatology and Venereology Task Force (EADV) evaluated the use of HRQoL instruments in vitiligo, noting the DLQI as the most 'frequently' used instrument [13]. However, the EADV also noted that some items in the DLQI are irrelevant for most patients with vitiligo (e.g., itching). Consequently, other vitiligo specific instruments may be more relevant: the Vitiligo Impact Scale (VIS) [14], the Vitiligo Area Scoring Index (VASI) [15], the Vitiligo specific quality of life instrument (VitiQoL) [16] and the Vitiligo Noticeability Scale (VNS) [17].
The VIS is a validated instrument whose key limitation is a lack of generalizability, because responses to its questions revolve around ethno-centric aspects of vitiligo in India [14]. The VitiQoL is a 16-item instrument in which participants rate their vitiligo over the past month using a 7-point scale ("Not at all" to "All of the time") [16]. The VNS measures treatment success on a 5-point scale [17]. In practice, an image (photograph) is shown to a patient prior to treatment (baseline) and at several points post treatment. VNS scores of between 3 and 5 suggest good outcomes.
Currently, no mapping algorithm exists in a vitiligo specific population, although several are published in psoriasis or atopic dermatitis [18-24]. These algorithms used conversions (crosswalks) between EQ-5D-3L and EQ-5D-5L based on methods (Van Hout) [25] that are no longer fully supported by the NICE DSU [26]. None of these algorithms used data from randomized controlled trials (RCTs), and none relates utilities to clinical outcomes (such as percent re-pigmentation). Consequently, a vitiligo specific mapping algorithm that reflects current methods is needed. There is also a need to identify the relationship between effects from vitiligo specific HRQoL, generic HRQoL and clinical outcomes.
We use data from the HI-Light trial [1], an RCT in adults and children aged ≥ 5 years with vitiligo affecting < 10% of skin, to develop new mapping algorithms from VNS, VitiQoL and re-pigmentation scores (RPS). The HI-Light trial took place in the United Kingdom (UK). The HI-Light protocol was approved by the East Midlands-Derby Research Ethics Committee (14/EM/1173) and the MHRA (EudraCT 2014-003473-42), and the trial was registered as ISRCTN17160087.
Utilities from two currently used crosswalks, Van Hout et al. (2012) [25] and Alava et al. (2022) [27], are compared. The findings from this research aim to fill an important knowledge gap by allowing health utilities to be derived from vitiligo specific HRQoL instruments and by examining the relationship between clinical and HRQoL effects.

Data collection
Data were collected from 517 participants (398 adults; 119 children) in the HI-Light trial [1]: 173 were randomised to topical corticosteroid (TCS), 169 to handheld narrowband ultraviolet B (NB-UVB, a form of phototherapy) and 175 to a combination of potent TCS + NB-UVB (1:1:1 allocation). Participants (aged ≥ 5 years) had non-segmental vitiligo (≤ 10% body surface area) and at least one vitiligo patch that had been active in the last 12 months. Data collected at baseline (screening) and at 9 and 21 months post randomization were used in this analysis. The primary outcome was participant-reported treatment success ('a lot less noticeable' or 'no longer noticeable') at the target patch of vitiligo after 9 months of treatment, using the VNS. The full results of the trial have been reported elsewhere [1]; only the combination treatment (TCS + NB-UVB) was statistically superior to TCS (p = 0.032).

HRQoL instruments

EQ-5D-5L
The EQ-5D-5L is a generic HRQoL measure used in economic evaluations. Each of the following domains is measured on a 5-point scale: Mobility, Self-Care, Usual Activities, Pain/Discomfort and Anxiety/Depression. Scoring the EQ-5D-5L is well established [2]: the domain scores together generate a health state between '11111' (best possible health state) and '55555' (worst possible health state), which is converted to a score between -0.594 (worst possible health state) and 1.00 (best possible health state) using the UK (or a country specific) tariff [25]. However, since the scoring (tariff) for the 5L version for the UK remains to be finalised [28], the Van Hout (VH) crosswalk between the EQ-5D-3L (3L) and 5L is used, which 'converts' 5L utilities to 3L utilities as an intermediate step [25]. Reporting of EQ-5D in the HI-Light trial was based on the VH crosswalk [1]. The Alava crosswalk is also used in this analysis for the purpose of 'converting' 5L utilities to 3L utilities [27].

VNS
The VNS is a patient-reported measure (5-point scale) of vitiligo treatment success [17]. Patients provide responses, in relation to their vitiligo, as: (1) More noticeable, (2) As noticeable, (3) Slightly less noticeable, (4) A lot less noticeable, and (5) No longer noticeable. Scores of 4 or 5 represent treatment success. VNS scores of 4/5 have been used as primary/secondary outcomes in trials [17]. The relationship between a score of 4 or 5 and clinical outcomes such as RPS has not previously been fully explored.

VitiQoL
The VitiQoL is a 16-item HRQoL assessment in which patients rate (on a 7-point scale) various aspects of their vitiligo from "Not at all" to "All of the time" [16]. The total score (range 0 to 90) is derived, with high scores indicating worse HRQoL. No clear clinically relevant effect size has been reported.

Re-pigmentation measures
Percentage re-pigmentation was assessed using blinded clinician assessment of digital images. Re-pigmentation scores measured in the HI-Light trial were computed for each lesion on the hands/feet, head/neck (i.e., face) and rest of the body. In practice, the total area per lesion is computed at baseline and post baseline as a percentage of the body surface area (BSA), and the difference between the baseline and post-baseline measures is expressed as a percentage change. Each percentage change falls into one of 4 classification categories: 0-24%, 25-49%, 50-74% and 75-100%. This is equivalent to the Vitiligo Area Scoring Index (VASI) [15]. Mapping was based primarily on combining the data across hands/feet and face, as this is where the impact on HRQoL was considered to be greatest.
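The four-category classification described above can be written as a small helper. This is a minimal sketch for illustration only: the function name `rps_category`, the integer coding 1 to 4, and the treatment of non-integer values between categories (for example 24.5%) are assumptions, not details taken from the trial.

```python
def rps_category(percent_repigmentation: float) -> int:
    """Return an RPS category (1-4) for a percentage re-pigmentation value.

    Assumed coding (hypothetical, for illustration only):
      1 = 0-24%, 2 = 25-49%, 3 = 50-74%, 4 = 75-100%.
    """
    if not 0 <= percent_repigmentation <= 100:
        # Negative values (de-pigmentation) would need their own category;
        # this sketch deliberately does not handle them.
        raise ValueError("expected a percentage between 0 and 100")
    if percent_repigmentation < 25:
        return 1
    if percent_repigmentation < 50:
        return 2
    if percent_repigmentation < 75:
        return 3
    return 4
```

A value of 30% would fall in category 2 under this coding; how worsening (negative) responses are categorised is left open here.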

Mapping model specification
Mapping was undertaken (by instrument and crosswalk) following commonly accepted methods [7-9, 18-24]. Firstly, observed EQ-5D-5L responses were converted to utilities (Alava and VH). Secondly, after plotting utilities, several models were considered: linear and non-linear, in frequentist and Bayesian frameworks. Thirdly, model performance was assessed using metrics such as observed vs. predicted mean utilities and QALY estimates, mean absolute error (MAE), root mean square error (RMSE), the Akaike and Bayesian Information Criteria (AIC/BIC) and, for Bayesian models, the Deviance Information Criterion (DIC). The number of health states observed was reported, and plots of predicted vs. observed utilities were generated for each model. Covariates such as age were considered in early model selection; however, age and gender were found to have no statistical significance when included in the final VitiQoL and VNS models. In addition, a covariate for the location of the lesion (face vs. hands/feet) was included but subsequently dropped as it was not statistically significant.
Cross validation used a split-sample approach: the data were randomly split in half, models were built on one half of the sample, and their predictions were compared with the observed values in the other half of the sample.
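The two error metrics named above have standard definitions, sketched below for paired observed and predicted utilities. The function names and the example values in the test are illustrative assumptions and are not trial data.

```python
import math


def mean_absolute_error(observed, predicted):
    """Mean of the absolute differences between paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_squared_error(observed, predicted):
    """Square root of the mean squared difference between paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))
```

Because RMSE squares each error before averaging, it penalises a few large prediction errors more heavily than MAE does, which is one reason both are usually reported together.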
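The split-sample idea can be sketched as follows. This is not the authors' SAS code: the data are synthetic, the single-predictor linear model stands in for whichever mapping model is being validated, and all names (`split_half`, `fit_simple_linear`) are hypothetical.

```python
import random


def split_half(records, seed=0):
    """Randomly split records into two halves (training, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def fit_simple_linear(pairs):
    """Ordinary least squares for utility = a + b * score; returns (a, b)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic (score, utility) pairs, invented purely for illustration.
rng = random.Random(1)
data = []
for _ in range(200):
    score = rng.uniform(0, 90)
    data.append((score, 0.95 - 0.002 * score + rng.gauss(0, 0.05)))

# Build the model on one half, then predict the held-out half.
train, valid = split_half(data, seed=42)
intercept, slope = fit_simple_linear(train)
observed_mean = sum(y for _, y in valid) / len(valid)
predicted_mean = sum(intercept + slope * x for x, _ in valid) / len(valid)
```

Comparing `observed_mean` with `predicted_mean` on the held-out half mirrors the observed vs. predicted comparison used for the final models.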

Model structure: VitiQoL
Linear models were fitted in a frequentist framework for the Total VitiQoL score (TVS; M1) and for each of the 16 VitiQoL component scores (M2), and in a Bayesian framework for TVS only (M3). For M3, the Bayesian Linear Model (BLM), we assume a linear mixed model in which µ is the overall mean, β_i is the fixed effect for TVS, γ_j ~ N(0, σ_γ²) is the random subject effect and ϵ_j ~ N(0, σ²) is the random error. We further assume a non-informative normal prior for β_i, with a large variance reflecting little or no knowledge about the regression parameters. The priors on the variance terms σ² and σ_γ² are assumed to be inverse gamma (IG), with ω and θ as the shape and scale parameters respectively, reflecting lack of knowledge about the variance terms. The DIC was used to evaluate model fit as an approximation to the AIC [29]. Hence, three models (M1, M2 and M3) were fitted for VitiQoL.
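One way to write a model consistent with the symbol definitions above is given below. It is a reconstruction under stated assumptions rather than the authors' exact specification: the subscripting of the utility U and of TVS, and the symbol τ² for the large prior variance, are introduced here for illustration, and the numerical values of τ², ω and θ are not given in the text.

```latex
\begin{align*}
U_{ij} &= \mu + \beta_i\,\mathrm{TVS}_{ij} + \gamma_j + \epsilon_j,
  & \gamma_j &\sim N(0, \sigma_\gamma^{2}), \quad \epsilon_j \sim N(0, \sigma^{2}) \\
\beta_i &\sim N(0, \tau^{2}), \quad \tau^{2} \text{ large (assumed symbol)}
  & & \\
\sigma^{2} &\sim \mathrm{IG}(\omega, \theta),
  & \sigma_\gamma^{2} &\sim \mathrm{IG}(\omega, \theta)
\end{align*}
```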

Model structure: VNS and RPS
For VNS and RPS, given the discrete nature of the categories and the initial plots (Supplementary Figure 2), Linear (M4), Non-Linear (M5) and Polynomial (M6) regressions were fitted. The non-linear form (M5) belongs to the family of 4-parameter models described in Ratkowsky (1990) [30], which allows greater flexibility for curve fitting of conditional mean models. Its parameters α, β, γ and δ refer, respectively, to a general intercept, scale, shape and asymptote (e.g., to ensure estimates do not exceed 1). For M6, since VNS has 5 categories and RPS has 4, polynomial regressions of orders 4 and 3, respectively, were fitted to ensure adequate degrees of freedom to estimate all parameters. For RPS, a covariate for vitiligo location was included to compare utility predictions between hands/feet and head/face. This covariate was not statistically significant and was not included in the final model; however, results for its inclusion are reported in Supplementary Table 5. The results for the inclusion of age in the RPS polynomial model are also reported in Supplementary Table 5.
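The choice of polynomial order can be illustrated with a short sketch. With k distinct category values, a polynomial of order k - 1 has k coefficients and is the highest order that can be estimated; its least-squares fit passes exactly through the k category means. The code below builds that polynomial directly from the category means in Lagrange form. It is not the authors' SAS code, and the function names and data are hypothetical.

```python
from collections import defaultdict


def category_means(categories, utilities):
    """Mean utility within each category, keyed by category value."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, utility in zip(categories, utilities):
        sums[category] += utility
        counts[category] += 1
    return {c: sums[c] / counts[c] for c in sorted(sums)}


def saturated_polynomial(means):
    """Return f(x), the unique polynomial of order len(means) - 1
    passing through every (category, mean) point (Lagrange form)."""
    points = sorted(means.items())

    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            weight = 1.0
            for j, (xj, _) in enumerate(points):
                if i != j:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total

    return predict


# Hypothetical VNS scores (1-5) and utilities, invented for illustration.
vns = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
utility = [0.85, 0.87, 0.88, 0.90, 0.89, 0.91, 0.92, 0.94, 0.95, 0.97]
means = category_means(vns, utility)
predict_utility = saturated_polynomial(means)  # order 4 for 5 categories
```

Because such a model reproduces each category mean, patient level predictions from VNS or RPS take only as many distinct values as there are categories.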

HRQoL effects and relevance to re-pigmentation
The relationship between responses based on RPS, TVS, EQ-5D and VNS was investigated using model-based estimates. Logistic regression models were used to determine whether any viable cut-off scores for TVS or EQ-5D utilities could be associated with a VNS score of 4 or 5. All analyses were conducted using SAS® v9.4 [31]. Where RPS scores suggested de-pigmentation (i.e., worsening/negative scores), these were taken into account.
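How a fitted logistic model translates into a cut-off score can be sketched as follows, assuming a single predictor (for example TVS or EQ-5D utility). The coefficients are hypothetical placeholders chosen only so the example runs; they are not estimates from the trial, and the function names are illustrative.

```python
import math


def success_probability(intercept, slope, score):
    """Pr(VNS score of 4 or 5) under a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))


def cutoff_score(intercept, slope, target_probability):
    """Score at which the modelled probability equals the target."""
    logit = math.log(target_probability / (1.0 - target_probability))
    return (logit - intercept) / slope


# Hypothetical coefficients (NOT trial estimates).
a, b = -1.0, -0.03
p_at_20 = success_probability(a, b, 20)
score_back = cutoff_score(a, b, p_at_20)  # recovers a score of 20
```

Inverting the model in this way is what allows a target chance of treatment success to be expressed as a threshold on the HRQoL scale.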

Mapping model performance
For all models, the performance metrics are reported in Tables 3 and 4. Final model estimates (coefficients) are reported in Supplementary Tables 3a and 3b, and final mapping algorithms are presented in Supplementary Table 4. In general, results from the Alava crosswalk provided a 'better' mapping algorithm.

Mapping from VitiQoL
The results for VitiQoL were broadly similar between models; however, the 'best' fitting model was considered to be the BLM (M3) using the TVS (Alava) (Table 3, Figs. 1 and 2): DIC = -1399; observed vs. predicted mean (SE) utilities of 0.9008 (0.005) vs. 0.8912 (0.002); difference in mean QALY of 0.0233; RMSE of 0.149 and MAE of 0.096. The predicted EQ-5D utility scores based on VitiQoL are therefore best estimated using the M3 algorithm reported in Supplementary Table 4.

Mapping from VNS
Results from each of the models for VNS (Table 4, Figs. 1 and 2) show better performance and fit with non-linear models. In particular, the polynomial models (M6) were best: AIC = -2899; observed vs. predicted mean utility was 0.9008 vs. 0.8939; and the RMSE was lowest for M6 (RMSE = 0.0022). The fit at each specific VNS score category was also good (Supplementary Fig. 2). Figure 2 shows the distribution of differences in observed vs. predicted utilities. The predicted EQ-5D utility scores based on VNS can therefore be estimated using the M6 algorithm (Supplementary Table 4).

Mapping models for RPS
For RPS (Table 4, Figs. 1 and 2), a slightly better fit was observed with M6, again with the Alava crosswalk: AIC = -2888; observed vs. predicted mean utilities were 0.9008 vs. 0.8984. In particular, predicted mean utilities at the RPS categories (Supplementary Fig. 2) were closest to the observed for M6 (observed vs. predicted): 0.895 vs. 0.891 for RPS I (< 25%); 0.924 vs. 0.917 for RPS II (25-49%); 0.898 vs. 0.896 for RPS III (50-74%); and 0.936 vs. 0.939 for RPS IV (75-100%). Mean QALY differences were also small for M6: 1.5887 vs. 1.5699 for observed vs. predicted QALYs, respectively. Consequently, the predicted EQ-5D utility scores based on RPS can be estimated for VH and Alava using the algorithms in Supplementary Table 4. Although a covariate for vitiligo location was considered, it had no statistically significant impact on model predictions when included in M6 Alava (mean difference of 0.0004, p = 0.420) and was therefore not included in the final model. In general, vitiligo location was not statistically significant in any of the models.
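The text does not state how QALYs were computed from the utilities at baseline, month 9 and month 21. A common approach is the area under the utility curve using the trapezoidal rule, and the sketch below assumes that approach; the function name and example values are hypothetical.

```python
def qaly_area_under_curve(times_in_months, utilities):
    """Trapezoidal area under the utility curve, expressed in QALYs.

    Assumes times are in months and in increasing order; each interval
    contributes (duration in years) * (mean of the two end utilities).
    """
    if len(times_in_months) != len(utilities) or len(utilities) < 2:
        raise ValueError("need at least two paired time points")
    points = list(zip(times_in_months, utilities))
    total = 0.0
    for (t0, u0), (t1, u1) in zip(points, points[1:]):
        total += (t1 - t0) / 12.0 * (u0 + u1) / 2.0
    return total
```

Under this assumption, a constant utility of 0.9 over the 21 months of follow-up gives 0.9 x 1.75 = 1.575 QALYs, the same order of magnitude as the QALY estimates reported above.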

Testing the final models using independent data (Cross Validation)
The non-linear VNS and RPS models (M6) performed best in out-of-sample prediction. Model M6 (VNS) yielded mean (SE) observed vs. predicted values of 0.894 (0.007) vs. 0.894 (0.0004), a mean difference of 0.0008 (Supplementary Table 6); for RPS, these were 0.901 (0.007) vs. 0.899 (0.0006), a mean difference of 0.002. Mean QALY differences were also broadly similar between Alava and VH (Supplementary Table 6).
Figure 4 shows the chance of vitiligo being no longer or a lot less noticeable in terms of TVS and utility for varying RPS responses. For patients with the least re-pigmentation (RPS < 25%), a TVS of about 20 points gives around an 18% chance of vitiligo being no longer or a lot less noticeable; for patients with higher re-pigmentation (RPS ≥ 75%), an EQ-5D utility of around 0.80 gives around a 19% chance of vitiligo being no longer or a lot less noticeable. For RPS

Discussion
We have provided a way of estimating EQ-5D utilities from two vitiligo specific HRQoL instruments and shown a relationship between CSMs of HRQoL and clinical outcomes, such that plausible clinical cut-off scores for RPS can be associated with HRQoL. Several useful mapping algorithms have been presented, with adequate performance characteristics when compared to available mapping algorithms in the literature [18,19,21]. The performance of these vitiligo specific algorithms appears more favourable than that reported for published DLQI algorithms [18,21]: for example, MAEs for DLQI ranged between 0.1873 and 0.2009 [18], somewhat higher than those reported in this analysis. We have also, for the first time, compared the algorithms between two crosswalks (VH and Alava) and shown these to provide broadly similar results. In addition, we demonstrated a coherent relationship between VitiQoL, VNS, RPS and EQ-5D, which may be useful for designing future research in vitiligo; and finally, we offer an approach that could provide a way of relating HRQoL benefits in vitiligo to clinical effects, yielding tangible cut-off scores. It is important to note that appropriate adjustments to the RPS classification may be required before implementation of the RPS mapping algorithms, depending on the type of re-pigmentation response recorded. For example, if de-pigmentation occurs, this can be incorporated by creating one or more extra categories reflecting a worsening condition.
There are several limitations to this research. Firstly, all published mapping algorithms, when applied to independent data, will reflect differences in factors such as patient characteristics, trial conditions and assessment points. Secondly, by 9 months only around 60% of the data in the HI-Light trial were available for the mapping model, and the data did not include a complete range of EQ-5D-5L health states. Thirdly, scales such as the VNS and RPS are discrete, and as such patient level modelling tends to classify predictions into a limited number of possibilities. Fourthly, in an attempt to anchor clinical outcomes with HRQoL, the chance of no or little noticeability did not exceed much more than 20%, despite high HRQoL benefits in measures such as EQ-5D and Total VitiQoL score. Finally, there is a concern that a mapped utility has no known immediate health state (a type of double mapping): for example, a utility value of 0.193 has a health state of '11153', whereas a predicted (mapped) utility of 0.866 has no known health state profile (UK tariff). However, this is also true for any mean utility computed from known health states and therefore would not seem to be a valid objection.
Despite these limitations, the mapping algorithms presented are a 'first' in vitiligo, comparing two different crosswalks, with performance metrics similar to or better than those reported in the mapping literature. In addition, we provide a basis for further research in delineating cut-off scores that can anchor clinical outcomes such as RPS with HRQoL.

Conclusion
We have shown that mapping the EQ-5D from each of VitiQoL, VNS and clinical measures such as RPS is plausible. Mapping from VNS and RPS appears to perform better than mapping from VitiQoL, and the relationship between EQ-5D-5L utility, VNS, VitiQoL and RPS can be considered as a basis for defining clinically meaningful HRQoL differences.

Table 1
Baseline characteristics. SD: standard deviation; TCS: topical corticosteroid; NB-UVB: narrowband ultraviolet B (a form of phototherapy)

Table 2
Summary of health states and EQ-5D-5L utility scores. HS: number of health states; VH: Van Hout crosswalk; SE: standard error; N = 914

Table 3
Model performance: VitiQoL mapping algorithms. M1: linear model; M2: linear multivariate model; M3: Bayesian linear model; MAE: mean absolute error; RMSE: root mean squared error; QALY: quality adjusted life year; AIC: Akaike Information Criterion; SD: standard deviation; SE: standard error. Difference (mean): observed - predicted. QALY estimates derived from baseline, month 9 and month 21 data

Table 4
Model performance: VNS and RPS mapping algorithms. MAE: mean absolute error; RMSE: root mean squared error; QALY: quality adjusted life year; AIC: Akaike Information Criterion; SD: standard deviation; SE: standard error; VNS: Vitiligo Noticeability Scale; RPS: re-pigmentation score. Difference (mean): observed - predicted. QALY estimates derived from baseline, month 9 and month 21 data