1. Introduction
Composite indicators are one-dimensional representations resulting from the weighted aggregation of different factors associated with a multidimensional phenomenon [
1,
2]. The ability of composite indicators to provide a simultaneous and easy-to-understand visualization of the different factors associated with a multidimensional phenomenon makes them a popular approach among researchers from diverse areas [
3,
4].
The construction of composite indicators involves many aspects, from defining the conceptual framework to validating and visualizing the results [
3,
5]. Specifically, composite indicator scores are derived from the normalization, weighting, and aggregation of factors [
6]. These steps are critical in constructing composite indicators due to their direct relationship with the composite indicator score [
7,
8].
Weighting, in particular, can be carried out based on data or expert opinion [
9]. Weighting based on data is carried out using statistics [
10]. For example, principal component analysis (PCA) defines weights that maximize the variance extracted from input factors [
11]. Entropy weighting assigns weights that account for the informational diversity of input factors [
12,
13]. Factor analysis (FA) defines weights that maximize the correlations between input factors [
7]. The concept of benefit of the doubt (BoD) defines weights that maximize the scores of the composite indicator [
14]. Weighting based on expert opinion involves the application of multicriteria methods such as the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), the analytic hierarchy process (AHP), and Elimination and Choice Expressing Reality (ELECTRE) [
15].
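As an illustration of data-based weighting, the following minimal sketch shows one common formulation of entropy weighting, in which factors whose values are spread more unevenly across units receive larger weights. The function name `entropy_weights` and the data matrix are hypothetical and do not come from the study.

```python
import math

def entropy_weights(matrix):
    """Return one weight per column of a row-major matrix of non-negative values."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    k = 1.0 / math.log(n_rows)  # scales each column entropy to the [0, 1] range
    diversities = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [value / total for value in column]
        # Zero shares are skipped, following the convention 0 * log(0) = 0.
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        diversities.append(1.0 - entropy)  # degree of diversification
    total_diversity = sum(diversities)
    return [d / total_diversity for d in diversities]

# Hypothetical data: four units (rows) described by three factors (columns).
data = [
    [0.2, 0.5, 0.9],
    [0.4, 0.5, 0.1],
    [0.6, 0.5, 0.8],
    [0.8, 0.5, 0.2],
]
weights = entropy_weights(data)
```

In this toy example, the second factor is constant across units, so it carries no informational diversity and its weight is effectively zero.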
These statistical and multicriteria methods apply to different situations, offering solutions to represent multidimensional phenomena of a social [
16], business [
17,
18], economic [
19,
20], environmental [
21,
22], governmental [
23,
24], and industrial [
25,
26] nature. However, both classes of methods consider all factors selected in the conceptual framework in the composite indicator. Although this approach appears appropriate, it overlooks the fact that not all input factors are significantly correlated with the multidimensional phenomenon.
This research examines this flaw and proposes a new approach to constructing composite indicators, focusing only on factors that explain the multidimensional phenomenon. The proposed approach leverages two valuable properties of discriminant analysis for the construction of composite indicators, preventing factors that are not highly relevant to the multidimensional phenomenon from being included in the composite indicator. This ensures that the composite indicator scores have a significant correlation with the dependent variable.
Despite its advantages, the adoption of discriminant analysis as a method of constructing composite indicators is rare. Studies that review methods for the construction of composite indicators rarely mention discriminant analysis [
10,
15]. Research using this method has focused on finding variables that indicate the quality of surface waters [
27], identifying the human labor variables that best classify countries’ quality of life levels [
28], or determining factors that can be used to discriminate between diseases [
29,
30], rather than on the construction of the composite indicator itself. This method’s advantages have also been overlooked in research that employs discriminant analysis to validate a composite indicator constructed by other methods [
31].
The applicability of discriminant analysis to represent multidimensional phenomena is demonstrated through the construction of the so-called multidimensional depression diagnostic index. This application example is especially timely for two reasons. First, depression impairs the ability of people worldwide to carry out routine activities, resulting in social and economic consequences [32,33,34,35]. Second, the multidimensional nature of depression is poorly explored in the literature. Studies exploring multidimensional phenomena and depression are limited to correlating depression with composite indicators [
36], considering depression as a factor associated with composite indicators of mental health [
37,
38], and diagnosing an individual based on factors linked to depression [
39,
40]. Therefore, to the best of the authors’ knowledge, this study is the first to construct a composite indicator that captures the various factors of multidimensional depression.
This context enables us to assert that a depression diagnosis composite indicator offers methodological and phenomenological contributions. From a methodological perspective, this research demonstrates that discriminant analysis objectively selects and weights sub-indicators, providing a more comprehensive explanation of the multidimensional phenomenon based on a categorical dependent variable. This objective selection of sub-indicators is a unique property of discriminant analysis, preventing sub-indicators that are not highly relevant from being aggregated into the composite indicator. From a phenomenological perspective, this research offers a novel composite indicator to represent the multidimensional nature of depression diagnosis. The depression diagnosis composite indicator provides valuable information for public authorities to plan investments in public health in Brazilian states and prioritize actions in states with populations with a self-reported positive diagnosis of depression. Furthermore, the research reveals which factors contribute most to discriminating between positive and negative diagnoses of depression in Brazilian states, providing public managers with information to plan specific actions to offer support to individuals with depression, according to the characteristics of each Brazilian state.
These developments are relevant to researchers and academics because they can be applied to other countries, regions, and cities and extended to the representation of multidimensional social, environmental, and managerial phenomena. However, it is essential to note that the determinants of multidimensional depression are context-sensitive, which limits direct comparisons. On the one hand, comparing the determinants of depression across different contexts provides valuable insights into local similarities and differences. On the other hand, because different factors and weights enter the multidimensional depression diagnostic index in each context, the resulting scores cannot indicate which locations are more or less vulnerable to multidimensional depression across contexts, which hinders the development of policies spanning different contexts. This limitation also applies to the use of discriminant analysis in constructing composite social, environmental, and managerial indicators.
This research is organized into six sections. In addition to this Introduction, Section 2 presents limitations, gaps, and opportunities in understanding mental health, especially depression, from a multidimensional perspective. Section 3 illustrates the application of discriminant analysis in creating a multidimensional depression diagnostic index for the 27 states of Brazil. Section 4 presents the results, while Section 5 discusses the index for the diagnosis of multidimensional depression and the study’s limitations and outlines future research directions. Finally, the research conclusions are presented in Section 6.
2. Mental Health: Limitations, Gaps, and Research Avenues
Mental health is a multidimensional phenomenon that involves the interaction of various factors, including biological, psychological, and social aspects [
41,
42,
43]. In this sense, researchers agree on the need to measure mental health through composite indicators [
42,
44,
45,
46,
47].
Composite indicators of mental health, as found in the literature, are constructed using factor analysis methods, such as principal component analysis [
45,
48,
49]. These methods ensure that the information contained in the factors is maximally captured in the composite indicator [
11,
50]. However, these methods do not guarantee that the composite indicator has a significant correlation with the explained variable of the multidimensional phenomenon. Guaranteeing this correlation is a recommendation given in the “Handbook on Constructing Composite Indicators” [
3], which has served as a basis for researchers in constructing composite indicators. Although there are methods that maximize the correlation of the composite indicator with the explained variable [
51,
52], they assume that all factors correlate with the explained variable. When this assumption does not hold, the composite indicator score is distorted, resulting in misleading conclusions about the relative importance of each factor in the multidimensional phenomenon. In addition to the gap in research that addresses these methodological problems, there is a gap in research on a specific multidimensional mental health phenomenon: depression.
2.1. Composite Indicators and Depression
Composite indicators and depression have been studied on three research fronts. The first of these is composed of research that correlates depression with composite socioeconomic indicators [
36], social connectedness [
53], social participation [
54], sleep [
55], nutrition [
56], glucose variability [
57], and vocal acoustic properties [
58]. The second type of research uses depression to construct composite indicators of health deficits [
37], psychological distress [
38], and clinical responses [
59]. The third type of research considers the emotional, cognitive, somatic, and interpersonal dimensions involved in defining a depression rating scale [
60,
61,
62,
63].
On the one hand, the first type of research offers valuable information about the importance of diagnosing and treating depression, the second demonstrates the importance of considering the dimension of depression in understanding other diseases, and the third seeks to improve the means of diagnosing depression. On the other hand, the gap in composite indicators that represent the multiple dimensions of depression is evident.
2.2. Dimensions of Multidimensional Depression
The multidimensionality of depression diagnosis is addressed in this research through a framework composed of eleven dimensions: chronic diseases, economics, education, gender, social habits, age, place of residence, professional, psychological, physical–cognitive health, and mental health.
Table 1 presents a summary of these dimensions.
There are three dimensions of a socioeconomic nature. The economic dimension is particularly relevant, as social disparities are directly linked to depression [
64,
69]. Individuals with higher incomes have greater access to healthcare and, consequently, greater access to mental health services [
65] and other services that promote health and well-being, such as assistance with healthy eating, which reduces the risk of severe depression [
82]. The social habits dimension, comprising the use of substances such as alcohol and tobacco, includes important factors in understanding depression. Excessive alcohol consumption increases negative feelings such as sadness, hopelessness, and social isolation, increasing an individual’s risk of developing symptoms of depression [
66]. In turn, individuals who consume excessive nicotine are more likely to develop symptoms of depression as this substance influences the chemical balance in the brain [
66]. Finally, the likelihood of developing symptoms of depression is lower for individuals who practice sports [
67,
83] or own pets [
68,
84]. Social determinants, including one’s place of residence (geography), are indirectly associated with depression through socioeconomic and cultural factors, providing important information in terms of defining and implementing public policies and allocating resources to care for the population diagnosed with depression [
69,
70].
There are four demographic dimensions. The educational dimension is crucial in understanding depression, given its direct relationship with other dimensions, such as economic status and location of residence, which impact an individual’s ability to access mental health services [
71,
72]. The gender dimension is also directly related to depression [
74]. Research reveals that the prevalence of depression in women is higher than in men of the same age group [
73]. The age dimension is directly related to depression. Neurobiological changes, loss of loved ones, and social isolation due to age are associated with increased vulnerability to depression [
75]. This condition places age as a significant dimension in measuring depression. The employment dimension includes important factors in understanding depression. Professional evaluation, unhealthy work (based on occupational hazards), a hostile work environment [
85], and even abrupt retirement [
76] have been associated with the development or worsening of depression as they lead to a loss of purpose, social isolation, and stress [
77].
There are also four dimensions related to health. The chronic disease dimension, also referred to as comorbidities, brings together a set of important factors in understanding depression. Individuals with chronic diseases are subjected to greater physical and emotional burdens, functional limitations, and the need to adapt to new lifestyles [
80]. These conditions impact quality of life and increase the individual’s chances of developing depressive symptoms [
80]. The psychological dimension involves trauma and stress caused by violence, insults, and threats that lead to helplessness, social isolation, and low self-esteem, impacting mental health and emotional well-being [
78]. The physical–cognitive health dimension is related to two types of factors. The first type includes increased emotional anxiety, limited participation in social activities, frustration, and isolation [
79]. The second type of factor is associated with the quality of life of individuals who require assistance in performing simple tasks, such as dressing, feeding, and cleaning [
79]. Finally, the mental health dimension involves factors directly associated with depression symptoms. Sleep deprivation impairs an individual’s emotional regulation, reducing their capacity to cope with stressful situations and negative emotions [
81]. Depressive feelings, persistent fatigue, and a lack of motivation can significantly impair an individual’s ability to face daily challenges, ultimately impacting their quality of life and emotional well-being [
79,
86].
4. Results
Once the internal consistency of the discriminant analysis has been verified, it is possible to reformulate Equation (1) with the coefficients obtained for the five factors considered: (α) the percentage of the state’s population with a life-limiting chronic disease, (β) the average education level of the state’s population (on a scale from one, basic, to five, higher education), (γ) the percentage of the state’s population that has support from someone, (δ) the percentage of the state’s population that performs unhealthy work, and (ε) the percentage of the male population. By applying the coefficients of importance of each of these factors, it is possible to construct the multidimensional depression diagnostic index through the following discriminant function:
where the resulting value is the composite indicator score for each state, obtained from the products of the discriminant function coefficients α, β, γ, δ, and ε and the values of their respective factors in that state.
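To make the structure of the score concrete, the sketch below evaluates a discriminant function for two hypothetical states, assuming the usual linear form in which each coefficient multiplies its factor value and the products are summed. All coefficient and factor values are illustrative placeholders, not the values estimated in the study; only the negative sign of the male coefficient follows the text.

```python
def discriminant_score(coefficients, factors, constant=0.0):
    """Linear discriminant score: constant plus the sum of coefficient * factor value."""
    return constant + sum(coefficients[name] * factors[name] for name in coefficients)

# Hypothetical coefficients (placeholders, NOT the estimates reported in the study).
coefficients = {
    "chronic_disease": 0.8,   # alpha
    "education": 0.3,         # beta
    "support": 0.2,           # gamma
    "unhealthy_work": 0.25,   # delta
    "male": -0.6,             # epsilon (negative, as stated in the text)
}

# Hypothetical standardized factor values for two fictitious states.
state_a = {"chronic_disease": 1.0, "education": 0.5, "support": -0.5,
           "unhealthy_work": 0.0, "male": -1.0}
state_b = {"chronic_disease": -1.0, "education": 0.0, "support": 0.5,
           "unhealthy_work": 1.0, "male": 1.0}

score_a = discriminant_score(coefficients, state_a)
score_b = discriminant_score(coefficients, state_b)
```

With these placeholder values, the first state receives a positive score and the second a negative score, which is the kind of separation that the index is intended to express.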
The negative sign of ε in Equation (2) confirms the current literature indicating that a positive diagnosis of depression is more common in women than in men [
73,
74]. The normalized coefficients of Equation (2) also show that having a limiting chronic illness and being male are the most determining factors when separating individuals with a positive diagnosis of depression from individuals with a negative diagnosis.
By applying Equation (2), it is possible to determine the multidimensional depression diagnosis composite indicator score, as represented by the two histograms in
Figure 1. These histograms show that the multidimensional depression diagnostic index scores of individuals with a self-reported positive depression diagnosis are well separated from the scores of individuals with a self-reported negative depression diagnosis. This separation reveals that the composite indicator scores are compatible with the dependent categorical variable.
The compatibility of the composite indicator scores with the dependent categorical variable is confirmed by the cross-classification analysis shown in
Table 4. In total, 100% of cases with scores above zero were associated with a positive diagnosis of multidimensional depression, and 100% of cases with scores below zero were associated with a negative diagnosis of multidimensional depression.
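The cross-classification logic can be sketched as follows, assuming a cutoff of zero on the score as described above. The scores and observed labels are hypothetical, and the helper names `cross_classify` and `hit_rate` are introduced only for illustration.

```python
from collections import Counter

def cross_classify(scores, observed, cutoff=0.0):
    """Count cases by (observed diagnosis, diagnosis predicted from the score sign)."""
    table = Counter()
    for score, label in zip(scores, observed):
        predicted = "positive" if score > cutoff else "negative"
        table[(label, predicted)] += 1
    return table

def hit_rate(table):
    """Share of cases whose predicted diagnosis matches the observed one."""
    hits = sum(count for (obs, pred), count in table.items() if obs == pred)
    return hits / sum(table.values())

# Hypothetical scores and self-reported diagnoses.
scores = [1.45, 0.8, 2.1, -1.05, -0.4, -1.9]
observed = ["positive", "positive", "positive", "negative", "negative", "negative"]

table = cross_classify(scores, observed)
rate = hit_rate(table)
```

When every positive case has a score above zero and every negative case a score below zero, as in this toy data, the off-diagonal cells of the table are empty and the hit rate equals one.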
At this point, it is worth clarifying that this study employs discriminant analysis for a purpose that differs from that of other studies in the area of mental health. On the one hand, studies that apply discriminant analysis in mental health aim to classify individuals concerning their mental health, specifically in the context of diagnosis [
99,
100,
101]. On the other hand, this study aims to represent multidimensional depression through an index and to identify its determining factors. For this reason, the data are not split into training and test sets: such a split would provide only a partial picture, limited to the states used for training.
Thus, cross-validation was used for all cases in the analysis to verify the model’s reliability. In the cross-validation shown in
Table 5, each case is classified according to the functions derived from all cases other than the examined case.
In SPSS, cross-validation in discriminant analysis follows the leave-one-out procedure: the discriminant functions are re-estimated with one case withheld at a time, and the withheld case is then classified by those functions. This helps to assess the model’s generalizability by indicating how it performs on data not used to estimate the model.
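The leave-one-out idea can be sketched in a few lines. To keep the example self-contained, a deliberately simplified one-variable rule (the midpoint between the two group means, which corresponds to a univariate linear discriminant with equal priors) stands in for the five-factor discriminant function; the data and function names are hypothetical.

```python
from statistics import mean

def fit_midpoint_rule(values, labels):
    """Fit a one-variable rule: cutoff at the midpoint between the two group means."""
    positives = [v for v, l in zip(values, labels) if l == 1]
    negatives = [v for v, l in zip(values, labels) if l == 0]
    mu_pos, mu_neg = mean(positives), mean(negatives)
    cutoff = (mu_pos + mu_neg) / 2
    direction = 1 if mu_pos >= mu_neg else -1
    return cutoff, direction

def predict(rule, value):
    """Classify one value as 1 (positive) or 0 (negative)."""
    cutoff, direction = rule
    return 1 if direction * (value - cutoff) > 0 else 0

def leave_one_out_accuracy(values, labels):
    """Classify each case with a rule fitted on all other cases."""
    hits = 0
    for i in range(len(values)):
        train_values = values[:i] + values[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        rule = fit_midpoint_rule(train_values, train_labels)
        hits += int(predict(rule, values[i]) == labels[i])
    return hits / len(values)

# Hypothetical, well-separated scores for eight cases.
values = [0.9, 1.1, 1.3, 0.8, -1.0, -0.7, -1.2, -0.9]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
accuracy = leave_one_out_accuracy(values, labels)
```

Because each case is classified by a rule that never saw it, the resulting accuracy is a less optimistic estimate than the one obtained by classifying the same cases used for fitting.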
The ANOVA results also confirm that individuals with self-declared depression in a state present significantly different characteristics and behaviors with respect to the five factors extracted from the discriminant analysis. All p-values were below the 0.05 significance level, and all F-statistics exceeded the threshold of 3.89.
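For reference, the F-statistic of a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it for two hypothetical groups; the data are illustrative only, and the 3.89 threshold is simply the value quoted in the text (the applicable critical value depends on the degrees of freedom).

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of groups of observations."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((value - m) ** 2
                    for g, m in zip(groups, group_means) for value in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical factor values for cases with negative and positive diagnoses.
negative_group = [1.0, 2.0, 3.0]
positive_group = [4.0, 5.0, 6.0]
f_statistic = one_way_anova_f([negative_group, positive_group])
exceeds_threshold = f_statistic > 3.89  # threshold quoted in the text
```

A large F-statistic indicates that the difference between the group means is large relative to the variability within the groups.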
Figure 2 illustrates the discriminating factors of the multidimensional phenomenon by state in Brazil. It shows that the percentage of individuals who have a limiting chronic illness is 3.41 times higher in individuals who self-declare as having depression. Seventy-three percent of individuals who self-declare as having depression have a limiting chronic illness, compared with 24% of individuals who self-declare as not having depression. A positive diagnosis of depression is 2.17 times less likely to be self-reported by men than by women. Individuals who self-declared as having depression had an average education level of 3.97 on the one-to-five scale, while individuals who self-declared as not having depression had an average level of 3.82.
The results presented in the maps reveal that 97% of individuals with a positive diagnosis of depression have someone’s support, compared with over 98% of individuals with a negative diagnosis. The percentage of individuals with a positive diagnosis of depression who engage in unhealthy work is 21%, whereas this percentage is higher, at 29%, for individuals with a negative diagnosis. In brief, these two factors help to separate individuals with positive and negative diagnoses of depression, but they are not characteristics of individuals with depression. In the case of unhealthy work, the observed effect is the opposite of the expected one, since it is less common among individuals with a positive diagnosis.
The analysis of the maps illustrated in
Figure 2 highlights the difficulty in analyzing multiple sub-indicators simultaneously and understanding the multidimensional phenomenon [
15,
52]. This highlights the importance of developing a composite indicator to represent the multidimensional diagnosis of depression.
In this sense, understanding multidimensional depression through a composite indicator requires the analysis of its factors and respective weights.
Table 6 shows the directions of the correlations between the factors and the multidimensional depression diagnostic index. Two factors, having a limiting chronic illness and being male, are significantly correlated with the index: the former positively and the latter negatively. In other words, individuals who have a limiting chronic illness and who are female are more likely to have a self-reported positive diagnosis of multidimensional depression.
Education, having someone’s support, and performing unhealthy work contribute to building a multidimensional depression diagnosis composite indicator that separates individuals with positive and negative diagnoses of depression. However, these factors do not significantly correlate with the multidimensional depression diagnostic index individually. It is worth noting that the total weight of these three factors in the multidimensional depression diagnostic index accounts for 43%. According to
Table 6, the weights assigned to having someone’s support, performing unhealthy work, and having an education are 0.12, 0.15, and 0.16, respectively.
From these factors, it is possible to represent the diagnosis of multidimensional depression in Brazilian states using the map in
Figure 3.
The map indicates that states in the northeastern region have higher scores on the multidimensional depression diagnostic index, while states in the southern and southeastern regions have lower scores. These results are consistent with the per capita incomes of these states and align with the supporting literature that correlates income and depression [
64,
65]. At this point, it is important to highlight that a relationship between income and depression occurs in Brazil, although the discriminant function did not capture this effect.
Table 7 shows the rankings of the states regarding each of the determining factors of multidimensional depression. Alagoas, in the northeastern region, is the worst-positioned state in the ranking of the multidimensional depression index. This state has the third-lowest level of education, the second-lowest percentage of people who have someone’s support and perform unhealthy work, and the highest percentage of males. Meanwhile, the state of Rio Grande do Sul is the best-positioned in the multidimensional depression ranking. This state stands out mainly in terms of the factors of the level of education, having someone’s support, performing unhealthy work, and being male.
The results in
Table 8 suggest that public policies aimed at preventing and treating depression should be directed towards the states of the northeastern region, especially in terms of actions that promote education, as well as the prevention and treatment of chronic diseases.
In general, the multidimensional depression diagnostic index provides policymakers with information to guide decisions about which states should receive investments in care and assistance networks, as well as for educational campaigns. In this context, the index is more concerned with providing information for public policies than with classifying or indicating an individual’s depression diagnosis.
5. Discussion of the Diagnosis of Multidimensional Depression
This research reveals that female Brazilians with more years of education are more likely to have a positive diagnosis of depression. These results align with the existing literature, reinforcing the importance of education in understanding depression [
71]. Furthermore, individuals with more years of education tend to have a better socioeconomic status and, consequently, greater access to the healthcare system and mental health services [
65]. The research results also corroborate the findings that the prevalence of depression is higher in women than in men of the same age group [
73,
102,
103], and comorbidities are more common in individuals with depression [
103,
104].
By contrast, factors related to depression in other studies, such as age and psychological conditions, were not confirmed in the present study. This divergence can be attributed to the type of population studied or the scope of analysis. In particular, the present study analyzes the entire population, without distinction regarding age or health conditions, whereas studies found in the literature analyze exclusively children and adolescents [
102] or older adults with functional disabilities [
103]. Regarding the scope of analysis, the present study analyzes individuals in an aggregated manner, according to their unit in the Brazilian Federation, whereas studies found in the literature analyze individuals in a disaggregated manner [
105,
106].
The results indicate that the determining factors of depression and the composite indicator of depression depend on the population investigated and the scope of analysis. In the case of the Brazilian population, regarding its states, performing unhealthy work and having someone’s support are not characteristics of people with depression. However, these factors are essential to separate individuals with a positive diagnosis of depression from individuals with a negative diagnosis of depression. For public policymakers, the joint analysis of this information with the multidimensional depression diagnostic index enables more effective actions to combat depression, especially by identifying factors that significantly affect the results.
5.1. Study Contributions
This study’s results demonstrate the clear separation between individuals with and without a diagnosis of depression, evidencing the accuracy of the multidimensional depression diagnostic index. Ensuring accuracy and clarity in the representation of multidimensional depression diagnoses offers a new approach to understanding and interpreting this phenomenon. Visualizing and identifying the factors underlying the diagnosis of multidimensional depression facilitates a deeper understanding of the phenomenon, thereby assisting in the development of public policies related to mental health.
Another contribution of this study involves the application of discriminant analysis to construct a multidimensional depression diagnostic index. Multifactorial and machine learning methods [
105,
106], including discriminant analysis [
99,
100,
101], have been widely applied to classify and diagnose depression and other mental health conditions. These studies play an important role in understanding depression. However, they do not provide comparative information for public managers to direct care policies according to the characteristics of each population. This is, therefore, an innovative application of discriminant analysis, which offers information to aid in understanding the problem and formulating effective public policies, such as directing investments to populations that are more susceptible or vulnerable to depression.
5.2. Study Limitations
This study’s results are susceptible to biases, which reduce the accuracy of the representation of depression in Brazilian states. In particular, the data are based on the self-reported diagnosis of depression, which presupposes knowledge of depression symptoms. Additionally, a significant portion of the Brazilian population consists of individuals with low incomes, limited education, and restricted access to the healthcare system. These conditions may be more associated with difficulties in identifying the symptoms of depression than with depression itself.
Furthermore, Brazilian women have higher levels of education than Brazilian men. In this sense, the higher prevalence of positive diagnoses of depression among women may be associated with knowledge about depression and its symptoms, rather than with depression itself. Another limitation regarding the data used is that a self-reported diagnosis of depression, whether positive or negative, is not necessarily equivalent to a clinical diagnosis.
Limited generalizability, limited data, methodological complexity, unconsidered variables, and the need for external validation are other limitations that should be taken into account. This study focuses on the Brazilian context, and the determining factors when separating individuals with a positive diagnosis of multidimensional depression may differ from those in other countries. The analysis relies on available data from the National Health Survey, which may not capture all relevant factors in diagnosing multidimensional depression. While discriminant analysis selects relevant factors, other important variables may be overlooked.
Finally, discriminant analysis can be complex and requires specialized knowledge, which may limit its application among researchers without adequate training. In particular, the method requires a categorical dependent variable for its operationalization. This study does not include the external validation of the composite indicator, which is important to confirm its applicability in other contexts.
5.3. Future Research
Future work includes the application of other methods capable of identifying relevant factors and generating scores that could serve as a composite indicator, such as the least absolute shrinkage and selection operator (LASSO) [
102,
103]. This line of research may also involve comparisons and in-depth analyses between different methods for the construction of composite indicators based on categorical variables, including discriminant analysis, logistic regression, and LASSO.
Furthermore, the proposed approach can be applied in different contexts, providing valuable information for the formulation of targeted public health policies and interventions, e.g., for nurturing care [
107], sleep health [
108], and children’s health [
109]. Finally, comparisons between the determinants of depression are sensitive to the data. This particularity offers an opportunity for researchers to conduct comparative analyses between countries and identify differences in the determinants of depression in each country.