3.1. Methodology: Interval-Based Composite Indicators
In this context, the construction of composite indicators can rest on subjective choices (Becker et al. 2017; Nardo et al. 2005). These choices, for instance the weighting scheme (see Nardo et al. 2005), can lead to different results. The recent literature therefore aims to construct composite indicators in a way that avoids the subjectivity of adopting one assumption rather than another. The target is to measure the impact of each of the suitable choices for constructing the indicator (see Paruolo et al. 2013).
Uncertainty techniques are useful in this respect because they can measure the uncertainty involved in constructing the composite indicator (for instance, using probabilistic rankings); for a discussion, see Nardo et al. (2005) and Saisana et al. (2005). The idea is to run robustness checks and sensitivity analyses under different assumptions in order to evaluate how robust the indicator is. This work aims to internalize this robustness by considering the interval of possible results that can be obtained by varying the composite indicator’s assumptions (see Drago 2017, 2018; Gatto and Drago 2020; Drago and Gatto 2018). To do so, it is essential first to define the “model” for the composite indicator and then to declare the factors that drive its variability. From the model, it is possible to identify the internal sources of variability in the construction of the composite indicator, which lead to the uncertainty of the outcome.
At this point, it is possible to compute several replications of the composite indicator, each based on a different combination of the given assumptions. At every replication, a different combination of assumptions is sampled, and a different outcome is computed. We then explicitly consider the interval of all the composite indicators obtained under these different combinations of assumptions. Finally, the results are collected and represented as interval-valued data (Billard and Diday 2003; Billard 2008). Such data can be used to represent uncertainty and also inaccuracy (Qi et al. 2020; Barclay et al. 2019) and, in general, composite phenomena (for instance, Mlodak 2014, Fura et al. 2017, and Schang et al. 2016 discuss statistical units characterized by different statistical features).
We propose internalizing the uncertainty analysis, which is relevant in constructing a composite indicator (Saisana et al. 2005), by using interval-valued data. Our approach allows us to directly measure, represent, and compare the variability induced by the different assumptions used to construct the composite indicators, using random weights and simulating different indicator structures. In this respect, the subjectivity of the weightings can be addressed. It is also possible to obtain more consistent public policies that take the different composite indicator choices into account.
Furthermore, intervals make it possible to design policies better because they measure more accurately the uncertainty attached to a given situation, for instance one measured by a composite indicator. The results of this work are useful for all who are interested in the construction and use of composite indicators, including analysts, policy analysts, economic and social researchers, and, of course, policymakers. The entire approach is described and visualized in Table 1 and Figure 1.
It is essential to note that our final results are interval data and not scalar data. The interval allows us to measure the uncertainty explicitly while still yielding a single measure of the composite indicator (Sunaga 1958). Moreover, interval data have their own algebra, which allows computations between intervals (Moore 1979; Sunaga 1958) and statistical analyses (Gioia and Lauro 2005; Lauro and Palumbo 2000).
Therefore, the process starts by considering n different composite indicators $X_i$, with $i = 1, \dots, n$ (they contribute to creating the interval), computed by random combinations of factors (Saltelli 2016; Saltelli et al. 2008). Each interval-based composite indicator is then built as:
$$X_c = [\underline{X}_c, \overline{X}_c]$$
where c is the phenomenon measured with the indicator X, for $\underline{X}_c \le \overline{X}_c$ (Palumbo and Lauro 2003).
From the composite interval indicator obtained, it is possible to compute the center:
$$X_c^{C} = \frac{\underline{X}_c + \overline{X}_c}{2}$$
Furthermore, the range or the width is obtained:
$$X_c^{W} = \overline{X}_c - \underline{X}_c$$
and finally, the radius:
$$X_c^{R} = \frac{\overline{X}_c - \underline{X}_c}{2}$$
The range (or width) represents the variability of the interval composite indicator considered (Gioia and Lauro 2005). The parameters on which the ranking analysis is computed for the different intervals are the center, the minimum, the maximum, and the range (Mballo and Diday 2005; Song et al. 2012). To measure the uncertainty, it is possible to consider the difference between the upper and the lower bound of the computed interval (see also Grzegorzewski 2018).
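The center, width (range), and radius of an interval can be sketched in a few lines of Python. This is only an illustrative sketch: the class name `Interval` and the example bounds are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lower, upper] summarising simulated indicator values."""
    lower: float
    upper: float

    def __post_init__(self):
        # An interval is only well defined when lower <= upper.
        if self.lower > self.upper:
            raise ValueError("lower bound must not exceed upper bound")

    @property
    def center(self) -> float:
        # Mid-point of the interval.
        return (self.lower + self.upper) / 2.0

    @property
    def width(self) -> float:
        # Range: difference between the upper and the lower bound.
        return self.upper - self.lower

    @property
    def radius(self) -> float:
        # Half of the width.
        return self.width / 2.0


# Hypothetical bounds for one region (illustrative numbers only).
example = Interval(lower=-0.5, upper=1.5)
print(example.center, example.width, example.radius)  # prints: 0.5 2.0 1.0
```

The width is the quantity used above as the measure of uncertainty, while the center is the scalar summary that can be compared with a classical composite indicator.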
Finally, interval arithmetic makes it possible to analyze the prototype (an average interval) at the same time. Handling these composite indicators as intervals brings several advantages. First, they represent a more robust version of a classical composite indicator (based on a single value) because they take the internal variability into account. This variability is determined by the different performances of the various composite indicators built on the same conceptual “model” (Nardo et al. 2005). Second, they can be used in comparisons either as scalars (for instance, using the center) or genuinely as intervals (considering center, minima, and maxima). In the latter case, analytical approaches such as interval arithmetic can be used to evaluate, for instance, a prototype (the statistical average of the different interval-based composite indicators). Furthermore, these interval-based composite indicators carry more information, so that decisions can be based on a more precise evaluation.
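As a minimal sketch of how a prototype could be computed with interval arithmetic, the Python below adds intervals bound by bound and divides the sum by the number of intervals. The function names and the three example intervals are hypothetical, and the source does not specify the exact computation used, so this should be read as one plausible reading rather than the author's implementation.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # (lower, upper)


def interval_add(a: Interval, b: Interval) -> Interval:
    # Interval addition: [a1, a2] + [b1, b2] = [a1 + b1, a2 + b2].
    return (a[0] + b[0], a[1] + b[1])


def interval_prototype(intervals: Sequence[Interval]) -> Interval:
    # Average interval: sum all intervals, then divide both bounds by n.
    if not intervals:
        raise ValueError("at least one interval is required")
    total: Interval = (0.0, 0.0)
    for iv in intervals:
        total = interval_add(total, iv)
    n = len(intervals)
    return (total[0] / n, total[1] / n)


# Hypothetical intervals for three regions (illustrative numbers only).
regional = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
proto = interval_prototype(regional)
print(proto)
```

Each regional interval can then be compared with `proto`, for example by checking whether it lies above, below, or across the benchmark.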
3.2. Methodology and Data
In the first step, we have to define the composite indicator model. The model is specified by the following choices:
- (1) The essential variables to be included in the composite indicator;
- (2) The number of variables, out of the total, to be included;
- (3) The relevant aggregation function;
- (4) The weights applied in the composite indicator.
All the data come from the ASVIS database, which is used as the single source. The reference date for each variable is 31 December 2016. The indicators, with their original names and the names used here, are defined in Table 1. Each indicator is observed for the Italian regions, which are the statistical units (for the year 2016).
Table 2 reports the descriptive statistics for each variable, which we use to check the data for outliers.
We chose the 2016 dataset because it is the most recent set of data jointly available for all the variables with high reliability of the observations. Data reliability is a significant issue (see, for instance, Kamanou et al. 2005). The six variables are those we consider most relevant for our model within the framework adopted here. We can therefore proceed with the data analysis to evaluate our initial indicators and the structure of the indicators used as components or factors of the interval-based composite indicator. All variables in this study are destimulants, although this is not usually the case in other studies. Stimulants and destimulants (Kuc-Czarnecka et al. 2020), that is, factors that positively or negatively affect the phenomenon considered, were introduced by Hellwig (1972). These definitions can also be found in Walesiak (2018). Other authors (Mazziotta and Pareto 2016, 2018) use the terms ‘positive polarity’ and ‘negative polarity’ instead of stimulant and destimulant.
We first carry out some descriptive analyses of the data. We explore the variables to see whether any situation requires special attention (for instance, significant outliers). To this end, we compute the descriptive statistics for the variables and examine the structure of the data. We then consider the correlation matrix of the variables, which can usefully be visualized as a network by retaining only the correlations above a specific threshold. This is relevant in practice because specific weighting schemes can be devised for indicators that show a high correlation. In extreme cases, the choice can be made not to use these indicators.
It is therefore necessary to evaluate our choices beforehand. We examined the correlation matrix of the different variables to avoid selecting variables that showed serious correlation problems.
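The correlation check can be sketched as follows: compute the correlation matrix and keep, as network edges, only the pairs whose absolute correlation reaches a threshold. The threshold of 0.7, the variable names `v1` to `v3`, and the data are assumptions made for illustration; the source does not report the threshold it used.

```python
import numpy as np


def correlation_edges(data, names, threshold=0.7):
    """Return (name_i, name_j, corr) for pairs with |corr| >= threshold."""
    # Columns are variables, rows are regions.
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                edges.append((names[i], names[j], float(corr[i, j])))
    return edges


# Hypothetical data: 5 regions by 3 variables (illustrative numbers only).
data = np.array([
    [1.0, 3.0, 2.0],
    [2.0, 5.0, -1.0],
    [3.0, 7.0, 0.0],
    [4.0, 9.0, 1.0],
    [5.0, 11.0, -2.0],
])
names = ["v1", "v2", "v3"]  # hypothetical variable names
edges = correlation_edges(data, names, threshold=0.7)
print(edges)
```

Pairs that appear as edges are the candidates for special weighting or, in extreme cases, for exclusion.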
Therefore, it is possible at this point to define our model of the composite interval indicator by considering these specific factors (for the terminology of composite indicators, see Nardo et al. 2005):
- (1) The choice of indicators;
- (2) The number of indicators chosen out of the total number of indicators considered (in this respect, we can explore alternative configurations of the composite indicator);
- (3) The different weightings.
At the same time, we normalize each indicator by standardizing it. Following Nardo et al. (2005), we used a simple standardization for each component considered in the simulations. Given $\bar{r}$ as the reference region, the standardized component (or indicator) $q$ for region $r$ is
$$I_{qr} = \frac{x_{qr} - \mu_{q\bar{r}}}{\sigma_{q\bar{r}}}$$
where $x_{qr}$ is the raw value, $\mu_{q\bar{r}}$ is the mean for the component, and $\sigma_{q\bar{r}}$ is the standard deviation (Nardo et al. 2005).
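A minimal sketch of this standardization in Python is given below. It assumes that the mean and the standard deviation of each component are computed across all regions and that the population standard deviation (`ddof=0`) is used; the source does not state either detail, and the data are hypothetical.

```python
import numpy as np


def standardize(values):
    """Column-wise z-scores: (x - mean) / standard deviation.

    Rows are regions, columns are components. Assumes no column is constant,
    otherwise the standard deviation is zero and the division is undefined.
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean(axis=0)
    std = values.std(axis=0, ddof=0)  # assumption: population standard deviation
    return (values - mean) / std


# Hypothetical raw data: 4 regions by 2 components (illustrative numbers only).
raw = np.array([
    [10.0, 0.2],
    [12.0, 0.4],
    [14.0, 0.1],
    [20.0, 0.9],
])
z = standardize(raw)
print(z.round(3))
```

After this step every component has zero mean and unit standard deviation, so components measured in different units can be aggregated.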
Then, we aggregated the different indicators to obtain the outcome. The algorithm is described in Figure 1, which shows the construction of the interval-based composite indicator. First, it is necessary to choose the full set of variables to be considered (the set of feasible indicators for constructing the composite indicator). Then, to account for the uncertainty related to the construction of the composite indicator, a set of different random specifications is considered. We simulated 2000 different composite indicators, each with a different combination of variables and weights. A set of different composite indicators is thus obtained, from which the final interval follows. The intervals are estimated using 2000 simulations, a number defined a priori as sufficient to estimate the intervals for each region. Interval-based composite indicators were also calculated with different numbers of simulations.
Additionally, the tables provide further evidence that supports this assertion. Two thousand runs seem to be sufficient for producing results that are consistent and stable.
Appendix A presents the results of the interval-based composite indicator from 1000 and 20,000 simulations (Table A1 and Table A2). The findings are not significantly different from the results obtained after running 2000 simulations (see Figure 1). Furthermore, the ranks are fairly robust.
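One way to quantify how stable the ranks are across runs with different numbers of simulations is a rank correlation. The sketch below uses Spearman's coefficient for rankings without ties; this measure is an assumption made for illustration, since the source does not say how rank robustness was assessed, and the two rankings are hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings without ties.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid for n >= 2.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rankings must have the same length")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("at least two ranked units are required")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Hypothetical rankings of five regions from two runs (illustrative only).
ranks_run_1 = [1, 2, 3, 4, 5]
ranks_run_2 = [1, 3, 2, 4, 5]
rho = spearman_rho(ranks_run_1, ranks_run_2)
print(rho)
```

A value close to 1 indicates that the two runs order the regions in nearly the same way.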
The number of variables entering the composite indicator is also treated as a choice. Different subsets of the full set of indicators are used in order to evaluate different measurement approaches in constructing the poverty measure. Different results are therefore obtained because of the variability of the different measures. The association between the different indicators can be weak, so that some regions perform better on some indicators than on others.
These characteristics can vary during the construction of the interval-based composite indicator. However, other elements of the construction do not vary. For instance, the standardization of the variables does not vary. Likewise, no outlier detection or missing-data imputation is performed (in our case, no missing data were detected in the analysis).
Different parameters are computed for the interval composite indicators. We obtain four measures: the minimum, the maximum, the center, and the radius. The composite indicator’s outcome thus comprises a ranking for the minimum, the maximum, the center, and the radius. Therefore, the composite indicator can be interpreted as continuous. Furthermore, interval arithmetic makes it possible to compute the prototype (the interval average, which can be helpful as a benchmark).
At this point, it is possible to compute a different composite indicator by randomly selecting a particular combination from the feasible initial indicators. Our Monte Carlo simulation considers a maximum of four factors out of six, each with a random weight (we obtained different composite indicators by varying both the components and their weights). The preliminary weights were sampled using random number generation. To arrive at the final weights, we divided each preliminary weight by their total, so that, for each simulated composite indicator structure, the final weights sum to 1.
In total, 2000 unique composite indicators were obtained with the method above, and, finally, an interval was quantified. Using these results and the defined model, we computed the interval data representing the different poverty measurements. We took the 0.10 quantile as the minimum and the 0.90 quantile as the maximum in order to avoid outliers and provide a robust version of the interval.
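The simulation described above can be sketched in Python as follows. Several details are assumptions, because the source does not specify them: the number of components in each run is drawn uniformly between 1 and 4, the preliminary weights are uniform random draws, and the aggregation is a weighted arithmetic mean. The function name, the seed, and the input data are hypothetical.

```python
import numpy as np


def simulate_intervals(z, n_sims=2000, max_components=4,
                       q_low=0.10, q_high=0.90, seed=0):
    """Monte Carlo interval-based composite indicator (illustrative sketch).

    z: standardized data, rows are regions and columns are components.
    Returns the lower bounds, the upper bounds, and all simulated values.
    """
    rng = np.random.default_rng(seed)
    n_regions, n_components = z.shape
    sims = np.empty((n_sims, n_regions))
    for s in range(n_sims):
        # Assumption: number of components uniform on 1..max_components.
        k = rng.integers(1, max_components + 1)
        chosen = rng.choice(n_components, size=k, replace=False)
        # Preliminary random weights, divided by their total so they sum to 1.
        preliminary = rng.random(k)
        weights = preliminary / preliminary.sum()
        # Assumption: linear (weighted mean) aggregation.
        sims[s] = z[:, chosen] @ weights
    # Robust bounds: quantiles instead of the raw minimum and maximum.
    lower = np.quantile(sims, q_low, axis=0)
    upper = np.quantile(sims, q_high, axis=0)
    return lower, upper, sims


# Hypothetical standardized data: 20 regions by 6 components.
z_demo = np.random.default_rng(42).standard_normal((20, 6))
lower, upper, sims = simulate_intervals(z_demo)
center = (lower + upper) / 2.0
width = upper - lower
print(lower[:3].round(3), upper[:3].round(3))
```

Changing `q_low` and `q_high` (for example to 0.05 and 0.95) gives alternative scenarios for the interval bounds, and changing `n_sims` allows the stability of the intervals to be checked.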
The different rankings were obtained from the different characteristics of the interval data: the minimum, the center, the maximum, and the range. In the end, a ranking that takes the alternative scenarios into account was obtained. The findings seem to be robust when various quantiles are considered. However, the scenarios with quantiles 0.05 and 0.95 and with quantiles 0.01 and 0.99 produced the most significant shift in the ranking, with Campania taking first place rather than Calabria (Table A3 and Table A4). This finding shows that the situation in Campania, depending on the factors, may be critical.
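The four rankings can be sketched as below. The region labels and bounds are hypothetical, and the convention that rank 1 goes to the highest value (the most critical situation when all variables are destimulants) is an assumption made for illustration.

```python
import numpy as np


def rank_descending(values):
    """Rank 1 for the largest value, rank n for the smallest (no tie handling)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(-values, kind="stable")
    ranks = np.empty(len(values), dtype=int)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


# Hypothetical interval bounds for three regions (illustrative numbers only).
regions = ["A", "B", "C"]
lower_b = np.array([0.1, 0.5, -0.2])
upper_b = np.array([0.8, 0.7, 1.2])

rankings = {
    "minimum": rank_descending(lower_b),
    "center": rank_descending((lower_b + upper_b) / 2.0),
    "maximum": rank_descending(upper_b),
    "range": rank_descending(upper_b - lower_b),
}
for name, ranks in rankings.items():
    print(name, dict(zip(regions, ranks.tolist())))
```

In this toy example the region with the widest interval ranks first on the maximum but last on the minimum, which is the kind of shift that a single scalar indicator would hide.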
The interpretation of the center (or mid-point) and the range (or width) is essential. The center can be interpreted as the “result” of the composite interval indicator, comparable to the most probable scenario (it is thus the value that could be used in a comparison with a classical composite indicator analysis). As a term of comparison for the center, the same composite indicator is computed using the equal weights scenario (Table 3). The interval range is equally essential because it shows how much the results differ between different composite indicators. It is also possible to observe scenarios producing notable results when significant differences exist between the indicators used to construct the composite indicator.