The ten-item multidimensional measure revealed clear patterns of well-being across 21 countries and across various groups within them. Whether the items are used individually or combined into a composite score, this approach produces more insight into well-being and its components than a single-item measure such as happiness or life satisfaction. Fundamentally, a single item cannot be unpacked to gain further insight, whereas the composite score can be used as a macro-indicator for efficient overviews and can also be deconstructed to look for strengths and weaknesses within a population, as depicted in Figs. 6 and 7. Such deconstruction makes it possible to target interventions more appropriately. This brings measurement of well-being in policy contexts in line with approaches like GDP or national ageing indexes [7], which are composite indicators of many critical dimensions. The comparison with GDP is discussed at length in the following sections.
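As a minimal illustration of this point, the sketch below shows how two populations can share the same composite score while differing in their weakest dimensions. It rests on several assumptions that are not taken from the analysis: the scores are hypothetical standardized values, only six of the dimensions named in this discussion are included for brevity, and the composite is a simple unweighted mean rather than the factor-score approach used for MPWB.

```python
from statistics import mean

# Subset of dimensions named in this discussion (six of ten, for brevity).
DIMENSIONS = ["vitality", "positive relationships", "engagement",
              "self-esteem", "emotional stability", "positive emotion"]

# Hypothetical standardized scores for two made-up countries.
# These are NOT values from the ESS data.
profiles = {
    "Country A": [0.6, 0.5, -0.2, -0.3, 0.4, 0.5],
    "Country B": [-0.2, 0.4, 0.5, 0.6, -0.3, 0.5],
}

def composite(scores):
    """Unweighted mean of dimension scores (an assumption, not the MPWB method)."""
    return mean(scores)

def weakest(scores, n=2):
    """Names of the n lowest-scoring dimensions, lowest first."""
    ranked = sorted(zip(DIMENSIONS, scores), key=lambda pair: pair[1])
    return [name for name, _ in ranked[:n]]

for country, scores in profiles.items():
    print(country, round(composite(scores), 2), weakest(scores))
```

Both hypothetical countries reach the same composite, yet the dimensions flagged as weakest differ, which is the information a single item cannot provide.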
Patterns within and between populations
Overall, the patterns and profiles presented indicate a number of general and more nuanced insights. The most consistent among these is that the general trend in national well-being is usually matched within each of the primary indicators assessed, such as lower well-being within unemployed groups in countries with lower overall scores than in those with higher overall scores. While there are certainly exceptions, this general pattern is visible across most indicators.
The other general trend is that groups with lower MPWB scores consistently demonstrate greater variability and wider confidence intervals than groups with higher scores. This is a particularly relevant message for policymakers given that it is an indication of the complexity of inequalities: improvements for those doing well may be more similar in nature than improvements for those doing poorly. This is particularly true for employment versus unemployment, yet reversed for educational attainment. Within each dimension, the most critical pattern is the lack of consistency in how each country ranks, as discussed further in other sections.
Examining individual dimensions of well-being makes it possible to develop a more nuanced understanding of how well-being is impacted by societal indicators, such as inequality or education. For example, it is possible that spending more money on education improves well-being on some dimensions but not others. Such an understanding is crucial for the implementation of targeted policy interventions that aim at weaker dimensions of well-being and may help avoid the development of ineffective policy programs. It is also important to note that the patterns across sociodemographic variables may differ when all groups are combined, compared to results within countries. Some effects may be larger when all are combined, whereas others may have cancelling effects.
Using these insights, one group that may be particularly important to consider is unemployed adults, who consistently have lower well-being than employed individuals. Previous research on unemployment and well-being has often focused on mental health problems among the unemployed [46] but there are also numerous studies of differences in positive aspects of well-being, mainly life satisfaction and happiness [22]. A large population-based study has demonstrated that unemployment is more strongly associated with the absence of positive well-being than with the presence of symptoms of psychological distress [28], suggesting that programs that aim to increase well-being among unemployed people may be more effective than programs that seek to reduce psychological distress.
Certainly, it is well known that higher income is related to higher subjective well-being and better health and life expectancy [1, 42], so reduced income following unemployment is likely to lead to increased inequalities. Further work would be particularly insightful if it included links to specific dimensions of well-being, not only the comprehensive scores or overall life satisfaction for unemployed populations. As such, effective responses would involve implementation of interventions known to increase well-being in these groups in times of (or in spite of) low access to work, targeting dimensions most responsible for low overall well-being. Further work on this subject will be presented in forthcoming papers with extended use of these data.
This thinking also applies to older and retired populations in highly deprived regions where access to social services and pensions is limited. A key example of this is the absence in our data of a U-shaped curve for age, which is commonly found in studies using life satisfaction or happiness [5]. In our results, older individuals typically score lower than would be expected in a U-shaped distribution, and in some cases, the oldest populations have the lowest MPWB scores. While previous studies have shown some decline in well-being beyond the age of 75 [20], our analysis demonstrates quite a severe fall in MPWB in most countries. What makes this insight useful, as opposed to merely unexpected, is the inclusion of individual dimensions such as vitality and positive relationships. Older respondents are clearly much more likely to score lower on these dimensions than younger age groups. For example, ageing beyond 75 is often associated with increased loneliness and isolation [33, 43], and with a reduction in safe, independent mobility [31], which may therefore correspond with lower scores on positive relationships, engagement, and vitality, and ultimately lower MPWB scores than in younger populations. Unpacking the dimensions associated with the age-related decline in well-being should be the subject of future research. The moderate positive relationship of MPWB scores with life satisfaction is clear but also not absolute, indicating greater insights through multidimensional approaches without any obvious loss of information. Based on the findings presented here, it is clearly important to consider ensuring the well-being of such groups, the most vulnerable in society, during periods of major social spending limitations.
Policy implications
Critically, Fig. 6 illustrates the diversity in how countries reach an overall MPWB score. While countries with overall high well-being typically have higher ranks on individual items, there are clearly weak dimensions for individual countries. Conversely, even countries with overall low well-being have positive scores on some dimensions. As such, the lower items can be seen as potential policy levers, that is, areas of concern to be targeted through evidence-based interventions that should improve them. Similarly, stronger areas can be seen as learning opportunities to understand what may be driving results, which can be used both to sustain those levels and potentially to transfer lessons to individuals or groups not performing as well in that dimension. Collectively, we can view this insight as a message that there are specific areas to target for improvement, even in countries doing well, and that even countries doing poorly may offer strengths that can be enhanced or maintained, and could be further studied for potential applications to address deficits. We sound a note of caution, however, in that these patterns are based on ranks rather than actual values, and that those ranks are based on single measures.
Figure 7 complements those insights more specifically by showing how Finland and Norway, with a number of social, demographic, and economic similarities, plus identical life satisfaction scores (8.1), arrive at similar single MPWB scores with very different profiles for individual dimensions. By understanding the levers that are specific to each country (i.e. dimensions with the lowest well-being scores), policymakers can respond with appropriate interventions, thereby maximizing the potential for impact on entire populations. Had we restricted well-being measurement to a single question about happiness, as is commonly done, we would have seen that both countries had similar and extremely high means for happiness. This might have led to the conclusion that there was minimal need for interventions to improve well-being. Thus, in isolation, using happiness as the single indicator would have masked the considerable variability on several other dimensions, especially those dimensions where one or both countries had means among the lowest of the 21 countries. This would have resulted in similar policy recommendations for both, when in fact Norway may have been best served by, for example, targeting lower dimensions such as Engagement and Self-Esteem, and Finland by targeting Vitality and Emotional Stability.
Targeting specific groups and relevant dimensions as opposed to comparing overall national outcomes between countries is perhaps best exemplified by Portugal, which has one of the lowest educational attainment rates in OECD countries, exceeded only by Mexico and Turkey [36]. The low-education group thus skews the national MPWB score, which is above average for middle and high education groups, but much lower for those with low education. Though this pattern is not atypical for the 21 countries presented here, the size of the low education group relative to Portugal’s population clearly reduces the national MPWB score. This implies that the greatest potential for improvement is likely to be through addressing the well-being of those with low education as a near-term strategy, and improving access to education as a longer-term strategy. It will be important to analyze this in the near future, given reports that educational attainment in Portugal has increased considerably in recent years (though it remains one of the lowest in OECD countries) [36].
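The arithmetic behind this composition effect can be sketched as a population-share-weighted mean. The example below is illustrative only: the group shares and group scores are hypothetical, not estimates for Portugal or any other country, and the function name `national_score` is introduced here for the example. It holds the group-level scores fixed and changes only the share of the low-education group.

```python
def national_score(groups):
    """Population-share-weighted mean of group scores.

    `groups` maps a group name to a (population share, mean score) pair.
    """
    total_share = sum(share for share, _ in groups.values())
    weighted_sum = sum(share * score for share, score in groups.values())
    return weighted_sum / total_share

# Hypothetical shares and standardized scores; NOT values from the ESS data.
# Group scores are identical in both scenarios; only the shares differ.
large_low_group = {"low": (0.60, -0.40), "middle": (0.25, 0.20), "high": (0.15, 0.30)}
small_low_group = {"low": (0.20, -0.40), "middle": (0.45, 0.20), "high": (0.35, 0.30)}

print(round(national_score(large_low_group), 3))
print(round(national_score(small_low_group), 3))
```

Under these assumed values, the national score falls below zero when the low-education group is large and rises above zero when it is small, even though no group's own well-being has changed.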
One topic that could not be addressed directly is whether these measures offer value as indicators of well-being beyond the 21 countries included here, or even beyond the countries included in ESS generally. In other words, are these measures relevant only to a European population or is our approach to well-being measurement translatable to other regions and purposes? Broadly speaking, the development of these measures being based on DSM and ICD criteria should make them relevant beyond just the 21 countries, as those systems are generally intended to be global. However, it can certainly be argued that these methods for designing measures are heavily influenced by North American and European medical frameworks, which may limit their appropriateness if applied in other regions. Further research on these measures should consider this by adding potential further measures deemed culturally appropriate and seeing if comparable models appear as a result.
A single well-being score
One potential weakness remains the inconsistency of scaling between the ESS well-being items used for calculating MPWB. However, this also presents an opportunity to consider the relative weighting of each item within the current scales, and to allow for the development of a more consistent and reliable measure. These scales could be aligned in separate studies, with new weights generated either generically for all populations or stratified to account for various cultural or other influences. Using these insights, scales could alternatively be produced to allow for simple scoring on a more universally accessible structure (e.g. 1–100) but with appropriate values for each item representing the dimensions, if this results in more effective communication with the general public than a standardized score with weights. Additionally, common scales would improve on attempts to use rankings for presenting national variability within and between dimensions. Researchers should be aware that factor scores are sample-dependent (being based on specific factor model parameters such as factor loadings). Accordingly, future research investigating differential item functioning (by means of multidimensional item response functioning or similar techniques) of these items across situations (i.e., rounds) and samples (i.e., rounds and countries) should be conducted in order to gain a more nuanced understanding of how this scale functions.
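One simple way to place items with different response ranges on a common 1–100 structure is linear (min-max) rescaling followed by a weighted average. The sketch below is an assumption-laden illustration rather than the scoring procedure used for MPWB: the item names, response ranges, and equal weights are hypothetical placeholders, and appropriate weights would need to be estimated in separate studies.

```python
def rescale(value, low, high, new_low=1.0, new_high=100.0):
    """Linearly map a response from [low, high] onto [new_low, new_high]."""
    if not low <= value <= high:
        raise ValueError(f"response {value} is outside the range {low}-{high}")
    return new_low + (value - low) * (new_high - new_low) / (high - low)

def weighted_score(responses, weights):
    """Weighted mean of rescaled items.

    `responses` maps an item name to (value, scale minimum, scale maximum);
    `weights` maps the same item names to non-negative weights.
    """
    total_weight = sum(weights[name] for name in responses)
    weighted_sum = sum(weights[name] * rescale(value, low, high)
                       for name, (value, low, high) in responses.items())
    return weighted_sum / total_weight

# Hypothetical items on different response scales (placeholders, not ESS items).
responses = {"item_a": (3, 1, 5), "item_b": (5, 0, 10)}
weights = {"item_a": 1.0, "item_b": 1.0}  # equal weights are an assumption

print(weighted_score(responses, weights))
```

Here the midpoint of a 1–5 scale and the midpoint of a 0–10 scale map to the same value, so items become directly comparable before any weighting is applied.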
What makes this discussion highly relevant is the value of a more informed measure to replace traditional indicators of well-being, predominantly life satisfaction. While life satisfaction may have an extensive history and present a useful metric for comparisons between major populations of interest, it is at best a corollary, or natural consequence, of other indicators. It is not in itself useful for informing interventions, in the same way that a single item for any specific dimension of well-being should not alone inform interventions.
By contrast, a validated and standardized multidimensional measure is exceptionally well suited to identifying those at risk, as well as to identifying areas of strength and weakness within the at-risk population. This can considerably improve the efficiency and appropriateness of interventions. It identifies well-understood dimensions (e.g. vitality, positive emotion) for direct application of evidence-based approaches that would improve areas of concern and thus overall well-being. Given these points, we strongly argue for the use of multidimensional approaches to measurement of well-being for setting local and national policy agenda.
There are other existing single-score approaches for well-being addressing its multidimensional nature. These include the Warwick-Edinburgh Mental Well-Being Scale [44] and the Flourishing Scale [11]. In these measures, although the single score is derived from items that clearly tap a number of dimensions, the dimensions have not been systematically derived and no attempt is made to measure the underlying dimensions individually. In contrast, the development approach used here – taking established dimensions from DSM and ICD – is based on years of international expertise in the field of mental illness. In other words, there have long been adequate measures for identifying and understanding illness, but there is room for improvement to better identify and understand health. With increasing support for the idea of these being a more central focus of primary outcomes within economic policies, such approaches are exceptionally useful [13].
Better measures, better insights
Naturally, it is not a compelling argument to simply state that more measures present greater information than fewer or single measures, and this is not the primary argument of this manuscript. In many instances, national measures of well-being are mandated to be restricted to a limited set of items. What is instead being argued is that well-being itself is a multidimensional construct, and if it is deemed a critical insight for establishing policy agenda or evaluating outcomes, measurements must follow suit and not treat happiness and life satisfaction values as universally indicative. The items included in ESS present a very useful step to that end, even in a context where the number of items is limited.
As has been argued by many, greater consistency in measurement of well-being is also needed [26]. This may come in the form of more consistency regarding dimensions included, the way items are scored, the number of items representing each dimension, and changes in items over time. While inconsistency may be prevalent in the literature to date for definitions and measurement, the significant number of converging findings indicates increasingly robust insights for well-being relevant to scientists and policymakers. Improvements to this end would support more systematic study of (and interventions for) population well-being, even in cases where data collection may be limited to a small number of items.
The added value of MPWB as a composite measure
While there are many published arguments (which we echo) that measures of well-being must go beyond objective features, particularly related to economic indicators such as GDP, this is not to say one replaces the other. More practically, subjective and objective approaches will covary to some degree but remain largely distinct. For example, GDP presents a useful composite of a substantial number of dimensions, such as consumption, imports, exports, specific market outcomes, and incomes. If measurement is restricted to a macro-level indicator such as GDP, we cannot be confident in selecting appropriate policies to implement. Policies are most effective when they target a specific component (of GDP, in this instance), and then are directly evaluated in terms of changes in that component. The composite can then be useful for comprehensive understanding of change over time and variation in circumstances. Specific dimensions are necessary for identifying strengths and weaknesses to guide policy, and examining direct impacts on those dimensions. In this way, a composite measure in the form of MPWB for aggregate well-being is also useful, so long as the individual dimensions are used in the development and evaluation of policies. Similar arguments for other multidimensional constructs have been made recently, such as national indexes of ageing [7].
In the specific instance of MPWB in relation to existing measures of well-being, there are several critical reasons to ensure a robust approach to measurement through systematic validation of psychometric properties. The first is that these measures are already part of the ESS, meaning they are being used to study a very large sample across a number of social challenges and are not a new measure designed specifically for well-being. The ESS has a significant influence on policy discussions, which means it is critical to present the best approaches to utilizing the data systematically, as we have attempted to do here. This approach goes beyond existing measures such as Gallup or the World Happiness Index to broadly cover psychological well-being, not individual features such as happiness or life satisfaction (though we reiterate: as we demonstrate in Fig. 7a and b, these individual measures can and should still covary broadly with any multidimensional measure of well-being, even if not useful for predicting all dimensions). While often referred to as ‘comprehensive’ measurement, this merely describes a broad range of dimensions; more items for each dimension, and potentially more dimensions, would certainly be preferable in an ideal scenario.
These dimensions were identified following extensive study for flourishing measures by Huppert & So [27], meaning they are not simply a mix of dimensions, but were established systematically as the key features of well-being (the opposite of ill-being). Furthermore, the development of the items is in line with widely validated and practiced measures for the identification of illness. The primary adjustment has simply been the emphasis on health; the approach otherwise maintains the same principles of assessment. Therefore, the overall approach offers greater value than assessing only negative features and inferring that their absence equates to the presence of positive features, or assuming that individual measures such as happiness can sufficiently represent a multidimensional construct like well-being. Collectively, we feel the approach presented in this work is therefore a preferable method for assessing well-being, particularly on a population level, and similar approaches should replace single items used in isolation.