Multivariate multilevel modeling of quality of life dynamics of HIV infected patients

Background Longitudinal quality of life (QoL) is an important outcome in many chronic illness studies aiming to evaluate the efficiency of care both at the patient and health system level. Although many QoL studies involve multiple correlated hierarchical outcome measures, very few of them use multivariate modeling. In this work, we modeled the long-term dynamics of QoL scores accounting for the correlation between the QoL scores in a multilevel multivariate framework and to compare the effects of covariates across the outcomes. Methods The data is from an ongoing prospective cohort study conducted amongst adult women who were HIV-infected and on the treatment in Kwazulu-Natal, South Africa. Independent and related QoL outcome multivariate multilevel models were presented and compared. Results The analysis showed that related outcome multivariate multilevel models fit better for our data used. Our analyses also revealed that higher educational levels, middle age, stable sex partners and higher weights had a significant effect on better improvements in the rate of change of QoL scores of HIV infected patients. Similarly, patients without TB co-infection, without thrombocytopenia, with lower viral load, with higher CD4 cell count levels, with higher electrolytes component score, with higher red blood cell (RBC) component score and with lower liver abnormality component score, were associated with significantly improved the rate of change of QoL, amongst HIV infected patients. Conclusion It is hoped that the article will help applied researchers to familiarize themselves with the models and including interpretation of results. Furthermore, three issues are highlighted: model building of multivariate multilevel outcomes, how this model can be used to assess multivariate assumptions, involving fixed effects (for example, to examine the size of the covariate effect varying across QoL domain scores) and random effects (for example, to examine the rate of change in one response variable associated to changes in the other).


Background
Acquired immune deficiency syndrome (AIDS) affects not only the patients' physical condition but also their mental health, financial aspects and social relationships [1]. However, the arrival of ART and its widespread availability in many settings has improved quality of life (QoL), suppressed viral loads and reduced the mortality rate among people living with HIV/AIDS [2]. Notably, QoL is predictive of good adherence to life-long therapy [3,4] and is also an important marker of the well-being of patients living with a chronic disease for which there is currently no cure [5]. As the longevity of people living with HIV (PLWH) improves as a result of ART, assessment of QoL of the patient has become an important issue for policymakers and clinicians to plan for accessible, comprehensive and effective HIV care services [6]. It is also important to identify the factors affecting a patient's QoL that are most amenable to intervention in this context.
QoL is a multi-dimensional and dynamic conception [7], including significant personal examinations on comprehensive aspects of patients well beings, physical health status, functional capacity, social-relationship, emotional wellbeing [8] and even spiritual well-being over time [9]. Herein, we define QoL as per world health organization (WHO) definitions as an individual's awareness of their position in life in the context of culture and value systems in which they live and in relation to their expectations, goals, standards, and concerns [10]. According to WHO, QoL involves a wide-ranging assessment of the perception of an individual regarding the psychological, physical health, social-relationships, level of independence, environment and personal belief domains. Based on this, WHO QoL Brief version instrument was developed to capture multi-dimensional perspectives of QoL [11].
Several studies have employed linear regression for cross-sectional data [12][13][14][15][16][17] and linear mixed effects for longitudinal data [18][19][20][21][22] to evaluate the QoL of patients. These studies have assumed independence between QoL domain scores, estimating a separate model for each QoL domain score. However, independence is an unlikely assumption since health conditions may concurrently affect multiple aspects of life. For instance, physical health scores (i.e severe pain, fatigue and lack of energy) may be affected by psychological well-being scores (i.e. anxiety and depression). Furthermore, as noted [23], the residual correlation from such a univariate regression or linear mixed model shows the violation the conditional independence outcome. Consequently, the violation of this independence assumption can result in biased p-values, incorrect confidence intervals, and inflated effect sizes [24]. The multivariate multilevel methods play an important role to get rid of these kinds of restrictions by allowing variation at different levels and accounting for multivariate correlations [25].
Few researchers employ multivariate techniques to evaluate cross-sectional QoL outcomes. In particular, Conigliani et al. [26] and Hernández-Alava and Pudney [27] model EuroQoL 5-dimension (EQ-5D) domain scores using multivariate ordered probit models and Copula ordinal regression, respectively. However, those papers were confined to the use of multivariate method for cross-sectional, discontinuous, non-normal and truncated data at both ends (i.e. the domain scores range from − 0.594 to 1). In our study, we extended the multivariate method in such a way that the correlations among the observations within a cluster (subject) and the dependence between the QoL domains scores are well accounted for. This is done by modelling the longterm trend of QoL domain scores based on multivariate multilevel models. In addition to that, the effect of several prognostic factors on the baseline and rate of change of QoL domain scores of HIV infected patients was investigated including white blood cell, red blood cell, and blood chemistry parameters. Another important case that we examined is whether the predictor variables have different relationships across QoL domain scores over time.

Study population
The data is from CAPRISA 002 Acute HIV infection study which is an ongoing prospective cohort study conducted among HIV-infected women. Patients were recruited at two sites in KwaZulu-Natal-South Africa, a rural site in Vulindlela and an urban site in the city of Durban. The original study, which started in 2004, enrolled a cohort of HIV uninfected women whose age was greater than 18 years with the aim to describe immunologic, clinical and virologic characteristics of HIV-1 disease [28]. Patients who seroconverted during the HIV uninfected stage were enrolled in the Acute HIV infection phase, and then followed-up during chronic infection, ART initiation, and up to 6 years on ART. Participants without well documented estimated date of HIV infection, and those who did not have at least two follow-up clinical attribute measurements, were excluded. Finally, two hundred and nineteen (219) participants were included in the study. Further information about the ongoing prospective HIV cohort study (CAPRISA-002), including women eligibility criteria and the enrollment procedure, were reported in the past studies [28,29].
CAPRISA 002 trails initially enrolled HIV-negative (phase I) women into different study cohorts. The women who seroconverted were enrolled into the acute infection phase (i.e. phase II: weekly visits up to 3 months post-infection), then into early infection (i.e. phase III: monthly visits from 3 to 12 months) subsequently into established infection (i.e. phase IV: quarterly visits for more than 12 months) and afterward on cART (phase V). For the purpose of this study, samples for immunologic, virologic, QoL domain scores and clinical parameters were measured at each visit (i.e. from phase II to V). These longitudinal QoL outcomes, immunologic, virologic, and clinical measurements were measured for several followed-up visits. Of all the 9256 follow-up visits, we dropped 496 visits (5.4%) as patients did not have complete responses for QoL domain scores and clinical parameters, resulting a total of 8760 followed-up visits from 219 HIV infected women.

Variables and measurements
Once enrolled in the study, face-to-face interviews were conducted in either isiZulu or English by trained counselors, nurses or research clinicians and study data were obtained at various stages depending on the types of measures being assessed. QoL was assessed using the World Health Organization quality of life (WHO-QoL) HIV BREF instrument [10]. All items are on a 5-point Likert scale, in which 1 indicates low and 5 indicates high perceptions. The average scores of all items in each domain were multiplied by 4 to convert domain scores to the range of 4 to 20 to make it comparable with the scoring pattern of WHO-QoL-100 [11]. For the purpose of this study, we used the following domains score. Firstly, the physical health scores, that measure the impact of the disease on the activities of daily living, dependence on therapeutic substances, presence of pain, fatigue, lack of energy and initiative and perceived working capacity. The second is about the psychological score domain that assesses the patient's thoughts about body appearance, positive and negative feelings, self-esteem and personal beliefs, higher cognitive functions, spirituality, anxiety, suicide, and depression. The third domain is about social relationships which assesses personal relationships, social contacts, social support, and sexual activity. The fourth domain is devoted to the level of independence and assesses areas such as mobility, activities of daily living, dependence on treatments and work capacity. The QoL of each patient was assessed repeatedly over the study period.
The effect of several prognostic factors on long-term dynamics of QoL was investigated including (1) demographics: date of the clinical visit, age, gender, marital status, and educational status, (2) risk variables: sex under the influence of alcohol, contraceptive use, and substance use, (3) past opportunistic illness: tuberculosis and hypertension (4) clinical attributes: white blood cell parameters (lymphocyte count, neutrophil, leucocyte count, monocytes and eosinophils), RBC parameters (Hb, hematocrit, MCHC, MCV, MCH, RDW), physical examination parameters (blood pressures (BP), pulse rate (PR), weight and height), protein and other parameters, etc. (Fig. 1).

Factor analysis
Since our data have a large number of clinical variables, we used exploratory factor analysis in order to group and minimize the number of variables. Factor analysis was done by creating the principal components of the original variables. By using the Kaiser-criterions [30] and visual scree plot analysis [31], principal components (PC) with eigenvalues greater than 1 and PC that do not belong to the scree, were kept. A maximum likelihood extraction method with varimax rotation was used. Factor loadings describe the relationship of each clinical variable with each factor. The Factor loading is considered strong if it is greater than 0.6, moderate if it is in the range 0.4-0.6, and weak if less than 0.4 [32]. Each observation was assigned a score for each rotated factor on the basis of the loadings of the subject's original variable levels. Accordingly, from the 24 clinical variables in the study, we managed to group them in order to create 9 PC, defined as granulocytes components, mononuclear components, eosinophils component, Hb and haematocrit component, red blood cell indices, liver abnormality component, electrolyte component, lipid component, and protein component. (See Table 1).

Multivariate multilevel model
The multilevel framework for medical data is important for both practical and methodological reasons. Practically, we are frequently interested in variability amongst advanced level factors, such as individual-to-individual variations. Methodologically, the multilevel framework leads to the relationship amongst the observations within a group. The multilevel framework can accommodate a cohort data and the correlation among observations [25,33,34]. However, the multilevel multivariate model, which extends the multilevel model to more than two response variables, has not been widely adopted within the medical and epidemiological studies [33,35]. This work presented multivariate multilevel models for the long-term dynamics of QoL and illustrate how to interpret and fit models and compare the effects of covariates across the outcomes. The separate multilevel models for each QoL domain score can be stated as: where y ktj represents the k th QoL domain scores at time t for patient j. The term u kj represents a random effect of patient-specific differences for the kth QoL domain score at baseline, while v kj represents a random effect of patient-specific differences for the kth QoL domain score in the rate of change and e kjt represents random error. Since the data in this work are longitudinal cohort data, the observations for each individual may be correlated. The random-effects stated above to allow us to capture this correlation. Specifically, in Eq. (1) suppose that the QoL score variables are independent, and the random effects for each outcome are normally distributed. Assuming the residual errors for each of the outcomes are also normally distributed [25] it follows that, By fitting the above independent models in Eq. (1-3), we implicitly assume that the error terms are independent. However, the four domain scores from the same individuals are almost surely correlated. For example, the longitudinal change in QoL domain scores are more likely correlated with each other; thus, we can estimate the correlation amongst the random slopes. Moreover, because the QoL domain scores are repeated measures on the same individual, there will likely be a correlation between the residual errors [36]. A multivariate multilevel structure will address the above-mentioned limitations because it allows us to model the association between the outcome variables via correlations amongst the random effects and amongst the errors [33,35].
In the separate analysis, we assume that the random effects are drawn from different four normal distributions. In a multilevel multivariate model, the random effects are drawn from a single multivariate normal distribution [37]. The specification of the covariance among random effects and the residual errors is as follow: where Ω G represents the variance-covariance matrix of all eight random effects and Ω R represents the variancecovariance matrix of all four residuals. Furthermore, the multivariate multilevel framework allows us to estimate the variance-covariance matrix of patients intercepts and slope for each of the QoL domain scores. These variance-covariance structures can be interpreted substantively and give estimation improvement accuracy for the regression parameter and test power for the detection of covariates effect. In this study, the normality assumption of longitudinal QoL domain scores was checked via the Q-Q plot using the transformed residuals errors based on the fitted model (as discussed by [38]). Secondly, we fit multivariate multilevel models for QoL outcome data. Thirdly, we considered the significant covariates effects for physical health, physiological score, level of independence and social relationship score in multivariate multilevel results. Finally, we compared the effects of covariates across the outcomes using a post-analysis contrasts.
All of the analyses were implemented using R-3.6.2. Statistical analysis was conducted assuming a two-sided 5% level of significance. Table 2 shows the distribution of the socio-demographic and variables at the beginning of patient follow-up. All participants were black females (n = 219), with a median age of 25 years (Interquartile range, IQR, [22][23][24][25][26][27][28][29][30]. The majority of participants were married or with a stable sexual partner 174 (80%), not co-infected with TB 201 (91.8%), not with anemia 208 (95.0%) and overweight or obese 137 (62.8%) based on their body mass index (BMI) measurements. Over half 153 (69.9%) reported having completed Grades 11 of schooling. With regard to the immunological state, 40.6 and 36.5% of patients an initial CD4 count of 350-499 and ≥ 500 cells/mm3, respectively. The log 10 copies/ml VL count of the participants ranged from 1.47 to 6.81 with the first quartile of 3.84, a median of 4.46 and the third quartile of 5.06. (Table 2).

Longitudinal change of QoL scores of HIV-infected patients
The line plot in Fig. 2a, displays the trends of QoL of all HIV patients, over time. From these plots, we can see that the overall QoL scores among HIV-infected patients increased throughout the duration of follow-up periods. The longer a patient stayed under follow up, the more she experienced an increase of her QoL scores. From Fig. 3(a-d), we note that the overall long time trends of QoL scores of patients with higher education levels had increased with greater rates as compared to those patients with lower education (< 8 grade). The trend of QoL (mainly physical health and social relationship) of younger and older adult women increased at slower rates compared to those who were middle-aged (Fig. 3eh). Long term QoL trends were evaluated for patients with different categories of clinical factors. Figure 4(a-d) depicts, increased QoL scores rate of change over time for patients with higher CD4 cell count levels as compared to those patients with lower CD4 cell counts. Figure 4(e-h) shows that the trends of QoL scores increased with greater rates for patients without TB as compared to those who were TB co-infected.

Examining the relationship between QoL domain scores
The multivariate multilevel framework allows us to estimate the correlation matrix between the patient's slopes for each of the QOL domain scores ( Table 3) Table 4 presents the fit statistics for the independent outcome and multivariate (related outcome) model for QoL domain scores of HIV infected patients. Both models include a linear effect of time (month) and the covariates. However, comparing the log-likelihood and AIC values shows that the multivariate multilevel structure was significantly better than separate multilevel models and this implies that there is a strong association among the four markers. Furthermore, model diagnostics have been performed and the residual and influence diagnostics affirmed no violation of implicit and explicit assumptions in our model.

Multivariate multilevel model results
The results of the multivariate multilevel model are presented in Table 5. Patients with normal disease stages were significantly associated with a higher baseline physical health (β = 3.33; 95%CI:2.11, 4.55) and psychological Table 2 Baseline characteristics of ART cohort in the CAPRISA 002 study scores (β = 1.43; 95%CI:1.13, 3.18) as compared to those with severe diseasing stage. Similarly, patients with mild disease stage were significantly associated with a higher baseline physical health and psychological score as compared to those patients with severe diseasing stage. Furthermore, patients with thrombocytopenia were significantly associated with a smaller baseline level of independence. Patients without anemia and TB coinfection had tended to have a better average baseline level of independence as compared to those with anemia and TB co-infection.
Patients with higher education levels tended to have a better baseline physical health scores (β = 0.22; 95%CI: 0.01-0.99) and psychological scores (β = 1.26; 95%CI: 1.18, 2.89), as compared to those patients with lower education. Patients who reported stable sexual partners (β = 2.14; 95%CI:1.01, 3.87) were significantly associated with higher baseline psychological scores as compared to those who report many sexual partners. The result further showed that patients within an age group 21-39 had a smaller baseline physical health, level of independence and social relationship scores as compared to those patients within age group ≤20. Moreover, increased weight is associated with increased baseline physical health score and psychological score. With regard to clinical parameters, we noted that patients having high liver enzyme abnormality were significantly associated with lower baseline physical health and psychological scores. The result further showed that an increased mononuclear and electrolytes components are associated with increased physical health and level of independence, respectively.
We noted that an increase Hb and haematocrit component increases the rate of change of physical health (β = 0.02; 95%CI:0.01, 0.03), psychological scores (β = 0.02; 95%CI:0.01, 0.03), level of independence (β = 0.02; 95%CI:0.01, 0.03) and social relationship (β = 0.01; 95%CI:0.001, 0.02) through the follow-up time. As the latent variable related to electrolytes increases, the rate of change in physical health scores increases with time.  The results further showed that patients with thrombocytopenia had a smaller rate of change of level of independence score as compared to those without thrombocytopenia. Patients with normal and mild disease stage tended to have a better rate of change of physical health score and level of independence score through the follow-up time. Moreover, an increase in viral load decreases the average rate of change in physical health score and psychological score. Similarly, an increase in liver enzyme abnormality decreases the rate of change in physical health through the follow-up time.
Patients with higher education levels tended to have a better rate of change of level of independence score with time, as compared to those with a low level of education (< 8 grade). An increase in weight increases the rate of change in physical health and psychological scores through the follow-up time. Patients within the age group 21-39, tended to have a better rate of change of physical health, level of independence and social relationship scores as compared to those patients with age ≤ 20. Moreover, patients having sex whilst under the influence of alcohol had a smaller rate of change of level of independence and social relationship scores with time, as compared to those who don't have sex whilst under the influence of alcohol.
Comparing the effects of covariates across the outcomes Table 6 presents the results of the effect of covariates across the outcomes. We considered the significant covariate effects for physical health, physiological score, level of independence and social relationship score in Table 5. CD4 count (normal versus severe) has a differential effect across categories but, in general, the direction of the effect is the same for all "baseline physical health score", "baseline social relationship scores" and "baseline psychological scores". Similarly, the effects of the mild diseasing stage are significantly larger for baseline physical health score than for baseline social relationship scores. The results further showed that education (higher education versus lower education), age (21-39 age category versus < 21 age category) and viral load have a significant differential effect across QoL domain scores.
The effect of time slope covariates across the outcomes were also compared. As seen from Table 6, we noted that the effects of viral load are significantly larger for the rate of change of physical health score than for psychological scores. Likewise, the effects of Hb and haematocrit are significantly larger for the rate of change of physical health score and psychological scores than for the rate of change of social relationship score. Moreover, age (21-39 age category versus < 21 age category) have a significant differential effect across QoL domain scores.

Discussion
In this paper, we presented the multivariate multilevel model which accounts for both the multivariate nature of the outcome and the hierarchical structure of the data. A multivariate multilevel model offers advantages over a separate model for each outcome. The multivariate multilevel approach allows us to gain clinically    meaningful adjusted association parameters and more efficient parameter estimates. Moreover, it provides the correlations among the outcomes at the baseline and follow-up time levels. As Tate and Pituch [39] also noted, the multivariate multilevel model provides more accurate standard errors and powerful tests of the covariates as compared to a separate modelling approach. Griffiths et al. [23] also showed that the multivariate modelling assumptions are not violated and the parameter estimates are stable, which we did not find with the separate modelling approach. Furthermore, as noted [39], the multivariate multilevel approach allows the researchers to take into account any missing data for the outcome variables, eliminating the need to delete observations in separate analyses. A further advantage of the multivariate multilevel model is that it facilitates tests of the equivalence of covariate effects across the outcome variables and could also be used to test the effect of covariates at different time points in a longitudinal design. However, since the dimension of the random effects is often high and the density functions of the multivariate multilevel model can be highly complicated, evaluation of the likelihood (integral) can be a major challenge and very intensive.
Our results showed that patients with higher educational level had a significant effect on the overall longterm trends of QoL scores of patients. Tomita et al. [40], Gaspar et al. [41] and Belak Kovačević et al. [42] also found that higher education promotes better QoL,  Key:-Statistical significance: (*)P < 0.05; (**)P < 0.01; (***)P < 0.001 possibly due to better knowledge about their treatment and disease, access to health services, or functional status. QoL scores (mainly physical health and level of independence score) of patients with higher CD4 cell count levels, increased at a greater rate as compared to those with lower CD4 cell counts. These findings are in line with findings reported in previous studies [22,40,[43][44][45][46], where it was reported that increased CD4 cell count over time, led to better QoL outcomes. The trends of QoL scores of patients without TB, increased with greater rates as compared to those who were coinfected, a finding that is in accordance with previously reported research [47], who found that co-infected patients had a lower QoL scores as compared to those living with HIV without TB. Hemoglobin abnormality was also a factor of long-term QoL trend scores of patients, a finding that is in accordance with previous reports [48], where it was found that an increase of hemoglobin level was associated with greater improvement in QoL scores. Patients without thrombocytopenia, was associated with a better level of independence. Similarly, higher scores of latent variables related to electrolytes latent and RBC parameters, lower scores of latent variables related to liver abnormality and lower viral loads significantly improve the rate of change of QoL (mainly physical health score) HIV infected patients.
This study has some limitations, including the missing data, which are expected for a long term follow-up study conducted on data collected from patients' files involving many variables. Another limitation of this study is that we did not include religion domains scores to our model because more than 40% of observations for those domains score are missing. Moreover, some sociodemographic variables that influence long term QoL trends, may not have been included, for example, cites (rural/urban cites). Furthermore, the study findings were limited to adult females.

Conclusions
Overall, from a methodological perspective, this article presents applications for the analysis of hierarchical multivariate response data. As we have shown, these hierarchical multivariate assumptions can be addressed with a multivariate multilevel model in ways that are not possible with univariate (separate) models. This approach provides wide-ranging information about the overall QoL for HIV patients. Though this research presented the usefulness of the multivariate multilevel model for analyzing HIV-infected patients QoL, the approach is applicable to a wide variety of chronic diseases. There is a need for increased research in terms of methods, so hopefully, this article will be helpful to applied researchers (for medical research) to familiarize with the method and interpretation of the results therefrom. From a clinical perspective, modelling of longterm QoL of HIV-infected patients is very useful for improving patient outcomes and formulating an ART management plan.