Multimorbidity and health-related quality of life (HRQoL) in a nationally representative population sample: implications of count versus cluster method for defining multimorbidity on HRQoL

Background No universally accepted definition of multimorbidity (MM) exists, and the implications of different definitions have not been explored. This study examined the performance of the count and cluster definitions of multimorbidity on the sociodemographic profile and health-related quality of life (HRQoL) in a general population. Methods Data were derived from the nationally representative 2007 Australian National Survey of Mental Health and Wellbeing (n = 8841). HRQoL scores were measured using the Assessment of Quality of Life (AQoL-4D) instrument. The simple count (2+ and 3+ conditions) and hierarchical cluster methods were used to define multimorbidity and identify clusters of conditions. Linear regression was used to assess the associations between HRQoL and multimorbidity as defined by the different methods. Results Defining multimorbidity using the count method resulted in a prevalence of 26% (MM2+) and 10.1% (MM3+). Statistically significant clusters identified through hierarchical cluster analysis included heart or circulatory conditions (CVD)/arthritis (cluster-1, 9%) and major depressive disorder (MDD)/anxiety (cluster-2, 4%). A sensitivity analysis supported the stability of the clusters resulting from hierarchical clustering. The sociodemographic profiles were similar between MM2+, MM3+ and cluster-1, but differed from that of cluster-2. HRQoL was negatively associated with MM2+ (β: −0.18, SE: 0.01, p < 0.001), MM3+ (β: −0.23, SE: 0.02, p < 0.001), cluster-1 (β: −0.10, SE: 0.01, p < 0.001) and cluster-2 (β: −0.36, SE: 0.01, p < 0.001). Conclusions Our findings confirm the existence of an inverse relationship between multimorbidity and HRQoL in the Australian population and indicate that, in this head-to-head comparison, the hierarchical clustering approach is validated when the outcome of interest is HRQoL. Moreover, a simple count fails to identify whether specific conditions of interest are driving poorer HRQoL. 
Researchers should exercise caution when selecting a definition of multimorbidity because it may significantly influence the study outcomes.


Background
The presence of multiple chronic conditions, also known as multimorbidity, is in the health care spotlight due to its increasing prevalence, complex management and large economic disease burden [1,2]. Approximately 25% of adults have at least two chronic conditions, and more than half of the elderly have three or more conditions simultaneously [3]. Although the prevalence of multimorbidity is higher among adults aged 65 years and over, more than half of individuals with multimorbidity are younger than 65 years [4,5], which makes multimorbidity an issue across the lifespan.
Health-related quality of life (HRQoL) is a holistic concept that aims to capture a range of health status indices. To date, the impact of multimorbidity on HRQoL has been investigated based on two general categories of multimorbidity: i) the number of chronic conditions (count definition) and ii) the cluster of chronic conditions (cluster definition) [6,7]. Although HRQoL scores decrease with an increasing number of co-occurring chronic conditions [8], the full impact of multimorbidity on HRQoL is unlikely to be captured by the simple count method [9]. Meanwhile, some specific clusters of multimorbidity, such as the combination of mental and physical conditions [10], have been shown to have a notable effect on HRQoL. However, the impact of the different definitions of multimorbidity on HRQoL in a primary care setting is still unclear [8].
Comparing how the aforementioned categorizations of multimorbidity affect the sociodemographic profile and health status (HRQoL) will help to provide a conclusive definition of multimorbidity and, consequently, improve health care planning in the context of multimorbidity so that healthcare services match patients' needs. Therefore, using a large, nationally representative dataset, this study examined the performance of the count and cluster definitions of multimorbidity in determining the sociodemographic profile and HRQoL in a general population.

Study design and participants
Our study was a cross-sectional analysis of a nationally representative dataset, the 2007 National Survey of Mental Health and Wellbeing (NSMHWB), which consisted of a series of face-to-face interviews conducted by the Australian Bureau of Statistics (ABS) from August to December 2007. Respondents were randomly selected from a stratified, multistage area probability sample of homes. More methodological information can be found elsewhere [11]. Of an initial sample of 17,352 dwellings, 14,805 were eligible; the remainder were excluded because all household members were out of scope or the dwelling was vacant. The final data set consisted of 8,841 respondents (60% response rate) aged 16 to 85 years and living in private dwellings [12]. No missing data strategy was used because the rate of missing data was low (2.6%): 21 values were missing due to no HRQoL score, 34 due to log transformation of the HRQoL score, 180 due to BMI and 6 due to exercise level.

Multimorbidity
Multimorbidity was identified from a pre-specified list including the following self-reported conditions that significantly contribute to the global burden of illness and injury [12][13][14]: asthma, cancer, stroke, heart or circulatory conditions (CVD), gout, rheumatism or arthritis, diabetes or high sugar levels, major depressive disorder (MDD) and anxiety disorder (including agoraphobia, with or without panic disorder, generalized anxiety disorder (GAD) and social phobia). Each chronic condition was coded as present or absent [12]. The diagnosis of mental disorders was established using the World Mental Health Survey Initiative version of the World Health Organization's Composite International Diagnostic Interview, version 3.0 (WMH-CIDI 3.0) [14], which is a comprehensive and fully structured diagnostic interview. The timeframe was a diagnosis in the 12 months prior to the interview. Diagnosis of physical chronic conditions was determined from a pre-specified list by whether the respondent had ever been told by a doctor or nurse that they had these conditions, and stroke was assessed using self-reported stroke symptoms [11].
In the count method, multimorbidity was defined as "two or more" chronic conditions occurring at the same time. To test the validity of the cut-off in the count-based method, the definition of multimorbidity as "having 3+ chronic conditions at the same time in one individual" (known as complex multimorbidity) [15] was used as well. In the cluster-based method, hierarchical clustering was used to identify the common clusters of multimorbidity, as chronic health conditions can co-occur via shared underlying genetic, environmental or behavioural risk factors [16][17][18]. Assuming N variables, the hierarchical approach initially treated each variable as its own cluster before merging the two closest clusters into a new cluster. This step was repeated until all variables were merged into one cluster of size N. Jaccard's coefficient was used to calculate the distance between the binary variables (absence or presence of conditions) [16,19]. The results may vary depending on the method used to calculate the distance between clusters. Both Ward's and the average linkage methods have been widely used, with the former considered more appropriate for clusters with equal numbers of observations [19] and the latter recommended to avoid the large or tight compact clusters that result from the single linkage and the complete linkage methods [19]. Therefore, we used the average linkage method in this study and used the cluster stopping rule to aid in selecting partitions [20].
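The clustering procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Stata code: the toy data, the variable names and the helper functions `jaccard_distance` and `average_linkage` are hypothetical, and the cluster stopping rule used to choose the number of clusters is not reproduced.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two 0/1 vectors: 1 - |both present| / |either present|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 if either == 0 else 1.0 - both / either


def average_linkage(columns):
    """Agglomerate variables (conditions) and return the merge history.

    `columns` maps a condition name to its 0/1 vector over respondents.
    Each history entry is (cluster_a, cluster_b, distance_at_merge).
    """
    names = list(columns)
    pair_dist = {
        frozenset((i, j)): jaccard_distance(columns[i], columns[j])
        for i, j in combinations(names, 2)
    }

    def between(c1, c2):
        # Average linkage: mean of all pairwise distances across the two clusters.
        total = sum(pair_dist[frozenset((i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]  # start with every variable on its own
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical toy data: 8 respondents, 4 conditions (1 = present, 0 = absent).
toy = {
    "cvd":       [1, 1, 1, 0, 0, 0, 1, 0],
    "arthritis": [1, 1, 0, 0, 0, 0, 1, 0],
    "mdd":       [0, 0, 0, 1, 1, 0, 0, 0],
    "anxiety":   [0, 0, 0, 1, 1, 1, 0, 0],
}
merges = average_linkage(toy)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

In this toy example the two closest conditions are merged first and the final step joins everything into a single cluster of size N, mirroring the description above. Where the partition is cut is a separate decision that depends on the stopping rule.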

HRQoL
The Assessment of Quality of Life (AQoL-4D) instrument was used to measure quality of life due to its brevity [21], sensitivity and robustness [22]. Four dimensions (independent living, mental health, relationships and senses) consisting of three items each were included for scoring. Five new variables were then created: four dimension scores and one overall instrument score, which ranged from −0.04 to 1. A score of 1.00 indicated the best quality of life (equal to perfect health), a score of 0.00 indicated quality of life equal to death, and negative values (0 to −0.04) indicated quality of life worse than death [23].

Covariates
Univariate analyses with a 0.25 p-value cut-off were performed to screen the covariates before the second round of screening, involving multivariate analyses. A cut-off of a 10% change in the exposure variable's coefficient estimate in the multivariate model was adopted to identify the "important" variables influencing the association between outcome and exposure. Covariates that remained after these procedures were utilized throughout all subsequent analyses conducted in this study.
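The two-stage screen described above can be written down as a pair of simple rules. The sketch below is illustrative only: the function names are hypothetical, the coefficients in the example are made up, and it assumes that the 10% change is measured relative to the exposure coefficient estimated without the candidate covariate, which the text does not state explicitly.

```python
def passes_univariate_screen(p_value, cutoff=0.25):
    """Stage 1: keep a candidate covariate if its univariate p-value is below the cut-off."""
    return p_value < cutoff


def percent_change(beta_without, beta_with):
    """Relative change (%) in the exposure coefficient when the covariate is added."""
    return abs(beta_with - beta_without) / abs(beta_without) * 100.0


def passes_change_in_estimate(beta_without, beta_with, threshold=10.0):
    """Stage 2: keep the covariate if it shifts the exposure coefficient by >= threshold %."""
    return percent_change(beta_without, beta_with) >= threshold


# Hypothetical example: adding a covariate moves the exposure coefficient
# from -0.20 to -0.17 (a 15% change), so the covariate would be retained.
print(passes_univariate_screen(0.08))           # True
print(passes_change_in_estimate(-0.20, -0.17))  # True
print(passes_change_in_estimate(-0.20, -0.195)) # False (2.5% change)
```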
The covariates screened in this study included sex, age, registered marital status (married, unmarried), labour force status (employed, unemployed, not in the labour force), area of relative socioeconomic disadvantage (decile 1 = most disadvantaged, decile 10 = least disadvantaged), body mass index (BMI = self-reported weight/self-reported height²), smoking status (current, ex-smoker, never) and level of exercise (low: <1600 min; moderate: 1600-3200 min, or >3200 min but <2 h of vigorous exercise; high: >3200 min, including ≥2 h of vigorous exercise). The level of exercise also reflected exercise intensity, because exercise intensity scores were multiplied by minutes per fortnight [11].
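The BMI formula and the exercise categories above translate directly into code. The following sketch is an illustration under stated assumptions: the function names are hypothetical, weight is assumed to be in kilograms and height in metres, and `score` is assumed to be the intensity-weighted minutes per fortnight, with `vigorous_hours` counted over the same period.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2


def exercise_level(score, vigorous_hours):
    """Classify level of exercise using the thresholds given in the text.

    low:      score < 1600
    moderate: 1600-3200, or > 3200 with fewer than 2 h of vigorous exercise
    high:     > 3200 with at least 2 h of vigorous exercise
    """
    if score < 1600:
        return "low"
    if score > 3200 and vigorous_hours >= 2:
        return "high"
    return "moderate"


# Hypothetical respondents.
print(round(bmi(70, 1.75), 1))   # 22.9
print(exercise_level(1000, 0))   # low
print(exercise_level(4000, 1))   # moderate
print(exercise_level(4000, 2))   # high
```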

Statistical analyses
Due to the complex survey design used in the NSMHWB2007, a weighting strategy was applied to infer results for the total in-scope population by allocating a 'weight' to each sample unit. The weight was an indication of how many population units were represented by the sample unit [11]. Accordingly, jackknife delete-a-group replication methods were used to calculate the standard errors (SEs) [24]. This process accounted for the stratified multistage sampling framework used in the NSMHWB2007 and adjusted for nonresponse, which may cause some groups to be over- or under-represented [25]. The theory behind the jackknife delete-a-group replication method is that the sampling variability between repeated samples can be estimated by repeatedly taking random but unbiased subsamples from the achieved sample and then computing the variance of the subsample estimates (after taking the smaller sample size into account). Jackknife replicates are created by deleting one group at a time and then weighting the other groups from the same stratum to adjust for the removal. Each replicate therefore provides an unbiased estimate of the population mean, and the variance of those estimates provides an estimate of the variance of the full-sample estimate. In short, application of these methods ensures the sample is representative of the Australian population, so that subsequent findings are generalizable to Australian adults (n = 16,015,000) in 2007 [11]. Frequencies and percentages calculated with jackknife SEs were used for the descriptive analysis. Hierarchical clustering analysis was used to identify common clusters of multiple chronic conditions. Linear regression models were used to examine the associations between the HRQoL scores and the multimorbidity clusters. 
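The delete-a-group jackknife described above can be illustrated with a small sketch. It is a simplification under stated assumptions: the function names and toy data are hypothetical, stratification is ignored (all groups are treated as belonging to one stratum), and the variance uses the standard delete-a-group form, (G − 1)/G times the sum of squared deviations of the G replicate estimates from the full-sample estimate.

```python
from math import sqrt


def weighted_mean(values, weights):
    """Weighted mean of `values` using survey weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_se(values, weights, groups):
    """Full-sample weighted mean and its delete-a-group jackknife standard error."""
    labels = sorted(set(groups))
    n_groups = len(labels)
    full = weighted_mean(values, weights)
    replicates = []
    for deleted in labels:
        # Replicate weights: zero for the deleted group, the rest inflated
        # by G / (G - 1) to compensate for the removal.
        rep_w = [
            0.0 if g == deleted else w * n_groups / (n_groups - 1)
            for w, g in zip(weights, groups)
        ]
        replicates.append(weighted_mean(values, rep_w))
    variance = (n_groups - 1) / n_groups * sum((r - full) ** 2 for r in replicates)
    return full, sqrt(variance)


# Hypothetical toy data: a 0/1 indicator for 6 respondents in 3 replicate groups.
values = [1, 0, 1, 1, 0, 0]
weights = [100.0] * 6
groups = [0, 0, 1, 1, 2, 2]
estimate, se = jackknife_se(values, weights, groups)
print(estimate, round(se, 4))
```

For a weighted mean the inflation factor cancels out, but it matters for totals, which is why it is kept in the sketch. Survey files often ship precomputed replicate weights, in which case only the final variance step is needed.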
In each regression model, the dependent variable was the HRQoL score, and the independent variable was one cluster (present or absent), for example, "whether presenting 2+ chronic conditions" in model-1. The p value for the trend of continuous variables in the linear regression models was given. Because the HRQoL score had a negatively skewed distribution, a log-transformed HRQoL score was used, which had 55 missing values. A two-tailed p-value of <0.05 was considered statistically significant. To test the validity of the clusters obtained from hierarchical clustering, sensitivity analyses were performed using factor analysis [26], principal component analysis [27] and K-means clustering [28], which have been used in previous studies. All analyses were performed using Stata/SE Version 12.1 (StataCorp, College Station, TX, USA).
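The exposure and outcome variables used in these models can be sketched as follows. This is an illustration rather than the authors' code: the function names and example record are hypothetical, cluster membership is assumed to mean that both conditions of the pair are present, and the sketch assumes that the log transformation yields a missing value whenever the score is missing or not positive (the AQoL-4D score can be as low as −0.04).

```python
import math


def count_flags(conditions):
    """Count-based definitions from a dict of 0/1 condition indicators."""
    n = sum(conditions.values())
    return {"n_conditions": n, "mm2plus": int(n >= 2), "mm3plus": int(n >= 3)}


def in_cluster(conditions, members):
    """1 if every condition in `members` is present (assumed membership rule)."""
    return int(all(conditions.get(name, 0) for name in members))


def log_hrqol(score):
    """Natural log of the HRQoL score; None (missing) if undefined."""
    if score is None or score <= 0:
        return None
    return math.log(score)


# Hypothetical respondent.
person = {"cvd": 1, "arthritis": 1, "mdd": 0, "anxiety": 0}
flags = count_flags(person)
print(flags)                                    # 2 conditions: MM2+ yes, MM3+ no
print(in_cluster(person, ("cvd", "arthritis"))) # 1
print(in_cluster(person, ("mdd", "anxiety")))   # 0
print(log_hrqol(-0.04))                         # None
```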

Results
We analysed data from 8841 respondents, which can be generalized to 16,015,000 Australian adults. The mean age of the study participants was 44 years (SE = 0.04). A total of 20.5% (SE = 0.6) of the population was obese (BMI >30), 65.2% (SE = 0.2) were employed, 72.7% (SE = 0.9) reported low levels of exercise, 53% (SE = 0.7) were married and 22.3% (SE = 0.7) were current smokers. More than half of the respondents (56.7%, SE = 0.7) had at least one chronic condition, and 46% of these had two or more chronic conditions (Table 1). Table 2 presents the prevalence of each chronic condition and the percentage of coexistence with other chronic conditions. CVD (21.2%, SE = 0.7), arthritis (19.9%, SE = 0.6) and asthma (19.6%, SE = 0.5) were the three most prevalent conditions. All chronic conditions coexisted with other chronic conditions to various degrees (range from 49 to 91%). Table 3 presents the two common clusters obtained using hierarchical clustering, CVD/arthritis (cluster-1, prevalence = 9.2%, SE = 0.5) and MDD/anxiety (cluster-2, 4.3%, SE = 0.3). In contrast, the prevalence of multimorbidity as defined by the MM2+ and MM3+ count methods was 26% (SE = 0.6) and 10.1% (SE = 0.5), respectively. The mean ages of the population with MM2+, MM3+, cluster-1 and cluster-2 were 54.6 (SE = 0.3), 57.5 (SE = 0.6), 63.8 (SE = 0.7) and 41.7 (SE = 0.7) years, respectively. As expected, the prevalence of MM3+ was lower than that of MM2+ (Table 3), but mean HRQoL was poorer. Interestingly, both count methods identified groups with similar sociodemographic characteristics: female, older, higher BMI, lower education level, less exercise, lower socioeconomic status and not in the labour force.
Table 3 Prevalence of common clusters using the count method and hierarchical clustering, weighted (N = 8,820). CVD: heart or circulatory condition; MDD: major depressive disorder. Sample sizes (n) are shown based on the raw data; proportions (%) are estimated with standard errors based on the weighting strategy.
Individuals with MDD/anxiety (cluster-2), a cluster identified by hierarchical clustering, had the lowest HRQoL scores and different sociodemographic characteristics compared with the other hierarchical cluster and the count-based definitions of multimorbidity: they were younger and more likely to be unemployed and unmarried (Tables 4-5).
Individuals with MDD/anxiety (cluster-2) had HRQoL scores that were 0.38 points (SE = 0.02; p < 0.01) lower than those without cluster-2. Individuals with any two or more chronic conditions (MM2+) had HRQoL scores that were 0.21 points (SE = 0.01; p < 0.01) lower than those with no more than one chronic condition. Individuals with any three or more chronic conditions (MM3+) had HRQoL scores that were 0.26 points (SE = 0.02; p < 0.01) lower than those with no more than two chronic conditions. Individuals with CVD/arthritis (cluster-1) had HRQoL scores that were 0.14 points lower than those without CVD/arthritis. After adjusting for sex, age, BMI, labour force status, level of exercise (not in the model of cluster-2), registered marital status, smoking status and socio-economic disadvantage index, multivariate analyses revealed that the associations between the HRQoL scores and each cluster remained significant; the reductions associated with the MM2+ cluster (coef: −0.18, SE = 0.01; p < 0.01) and the MM3+ cluster (coef: −0.23, SE = 0.02; p < 0.01) were larger than that for the CVD/arthritis cluster (coef: −0.10, SE = 0.01; p < 0.001) but smaller than that for the MDD/anxiety cluster (coef: −0.36, SE = 0.01; p < 0.001) (Tables 6-7).
Table note: MM3+ means having any 3 or more chronic conditions out of asthma, cancer, stroke, CVD, gout, rheumatism or arthritis, diabetes or high sugar levels, MDD and anxiety disorder.

Discussion
Consistent with the findings of the 2004 systematic review by Fortin et al. [8], our analysis of a large, nationally representative dataset showed that multimorbidity is common and significantly associated with lower HRQoL. Although the different definitions of multimorbidity did not change this association, the sociodemographic profiles and HRQoL scores varied depending on the definition of multimorbidity. In the present study, the HRQoL scores were lowest in the participants characterized by cluster-2 (MDD/anxiety disorders), followed by MM2+, which defined multimorbidity as 2+ condition entities, and cluster-1 (CVD/arthritis).
Although this study failed to identify a specific cluster of comorbid mental and physical disorders, previous research has demonstrated that the co-occurrence of mental and physical disorders is strongly associated with poorer HRQoL [29]. The findings of this study suggest that the count method does not take the type of chronic condition into account: it can detect the overall influence of multimorbidity on HRQoL, but it does not capture the specific diseases that contribute to the lower HRQoL. Individual disease-based treatments can help relieve associated discomfort, slow the course of disease and increase the HRQoL for people with a single chronic condition. However, for individuals with multimorbidity, traditional, individual, disease-focused treatment does not perform well due to interactions between the diseases and treatments. Moreover, reducing the number of conditions does not provide health professionals with an effective therapeutic plan. Furthermore, when calculating the burden of multimorbidity, the condition needs to be treated in its entirety if it is to inform accurate health care planning.
Different cut-off values of the number-based count definition of multimorbidity have been used in previous HRQoL studies [30]. Some of them used both two or more (2+) and three or more (3+) chronic conditions at the same time [15,31,32,33] as the cut-off value. Harrison et al. [15] reported that the 2+ definition was more appropriate in a population with a broader age range, whereas 3+ was more specific for an elderly study population. However, no cut-off can be used without caution, particularly because the number of conditions included in existing studies ranged from 4 to 102 [33,34]. Furthermore, the 2+ cut-off is recommended when a limited number of chronic conditions are included in the definition of multimorbidity, whereas the 3+ cut-off requires the inclusion of more chronic conditions [15]. Despite these issues, the 2+ cut-off was deemed most appropriate for our study based on the data used, i.e., eight chronic conditions and a population-based sample.
In addition to hierarchical clustering, other approaches to identifying the common clusters of multimorbidity exist, including factor analysis, principal component analysis and K-means clustering [16]. This study used hierarchical clustering with Jaccard's coefficient due to the shared risk factors among the chronic conditions and the binary nature of chronic diseases [19]. However, the other three approaches were tested in a sensitivity analysis using the same sample (results not shown). The same clusters were produced by the factor analysis and principal component analysis: CVD/arthritis (cluster-1) and cancer/stroke/CVD/arthritis/diabetes. K-means analysis produced clusters including cancer/stroke/CVD/arthritis/diabetes and cancer/stroke/CVD/diabetes/MDD/anxiety. Cancer/stroke/CVD/arthritis/diabetes and cancer/stroke/CVD/diabetes/MDD/anxiety were not examined further due to their extremely low prevalence, with only five and two cases, respectively. These differences may be due to the different mechanisms the methods use to detect clusters, i.e., the cluster analysis processes used distance measures, whereas the factor analysis and principal component analysis processes used correlations. In addition, the individuals within these distance-based clusters have more characteristics in common. Furthermore, because the results of the hierarchical cluster approach may be sensitive to the distance scales and linking methods, we performed sensitivity analyses using four additional linkage methods: Ward's linkage, weighted-average (waverage) linkage, single linkage and complete linkage. These analyses supported the stability of the clusters obtained from hierarchical clustering.
The results showed that although the prevalence of multimorbidity and the mean HRQoL scores of the people considered multimorbid varied with the cut-off of multimorbidity used, multivariate analyses revealed similar patterns in the variation of the HRQoL estimates within each of the subgroups of individuals considered. 
In relation to the cluster definitions of multimorbidity, the method does not prespecify the number of conditions but is statistically derived; thus, these analyses remain unchanged.
This study has several notable limitations. First, the findings of multimorbidity studies must be considered with reference to the list of conditions included in the definition of multimorbidity, as the prevalence of multimorbidity depends on the definition used [34]. However, the health conditions used in this study were chosen because they contribute significantly to the burden of disease in the Australian community. Moreover, the present study excluded acute conditions, which some previous studies have included [6]. Although including more conditions in the definition of multimorbidity may potentially provide a more comprehensive understanding of an individual's health status, acute conditions were not considered in this study as they may only influence health status temporarily [15] and may not be relevant to long-term health care planning.
Second, the data used in this study were derived from a survey focused on mental health and wellbeing. As a result, the assessments of physical chronic conditions were relatively brief, self-reported and not verified using medical records [11]. Physical conditions were assessed by self-report in the past 12 months, so their prevalence may be underestimated or overestimated due to recall bias. However, the validity of self-reported chronic conditions has been indicated in different contexts [35][36][37][38][39]. Moreover, self-reported data offer cost-effectiveness and convenience for gathering information in population-based surveys [40]. Finally, despite being encouraged [32], the severity of chronic conditions was not included in this study because it was not measured in the NSMHWB2007, and the additional time burden on respondents may have reduced the response rate. Although severity is unlikely to change whether a condition is counted as present when defining multimorbidity, it may have influenced the HRQoL scores.
To our knowledge, this is the first study to compare number-based and cluster-based definitions of multimorbidity using nationally representative data. This large population-based database, using the delete-1 group jack-knife technique to generate the replicate weights, increases the generalizability of the study's findings and could inform the investigation of multimorbidity-related HRQoL in Australia and similar economies worldwide.

Conclusions
Our findings confirm the existence of an inverse relationship between multimorbidity and HRQoL in the Australian population and indicate that the sociodemographic profile and HRQoL vary depending on the method used to define multimorbidity. We conclude that from this head-to-head comparison, the hierarchical clustering approach has been validated when the outcome of interest is HRQoL. Moreover, a simple count fails to identify if there are specific conditions of interest that are driving lower HRQoL. From this perspective, the cluster-based methods, resulting in clusters with the same shared health conditions, may be more useful and informative. Finally, we recommend that researchers exercise caution when selecting a definition of multimorbidity because it may significantly influence the study outcomes.

Availability of data and materials
The data supporting the findings of this study are available from Australian Bureau of Statistics, but some restrictions apply regarding the availability of these data, which were used under license for the current study and are thus not publicly available. However, the data are available from the authors upon reasonable request and with permission from the Australian Bureau of Statistics.

Authors' contributions
Kristy Sanderson conceived of and designed the study. Lili Wang and Kristy Sanderson analysed the data. All authors interpreted the data. Lili Wang drafted the manuscript, and Kristy Sanderson, Andrew J Palmer and Fiona Cocker critically revised the paper for important intellectual content. All authors gave final approval of the version to be published.

Competing interests
The authors, Lili Wang, Andrew J Palmer, Fiona Cocker and Kristy Sanderson, declare that they have no competing interests.

Consent for publication
The Australian Bureau of Statistics obtained informed consent from all individual participants included in the study.

Ethics approval and consent to participate
The original data were collected by the Australian Bureau of Statistics. All procedures performed in the study involving human participants were in accordance with the ethical standards of the Australian National University Ethics Committee and with the 1964 Helsinki declaration and its later amendments. Consultations with the Privacy Commissioner were continued throughout the development of the survey. A de-identified dataset was provided to the authors.