#### **2. Theoretical Background**

Several tendencies can currently be observed in the application of bankruptcy prediction models. Sun et al. (2014) specifically point to three trends: the transition from one-dimensional analysis of variables to multidimensional prediction, a shift from classical statistical methods to machine learning methods based on artificial intelligence, and more intensive involvement of hybrid and ensemble classifiers. Aziz and Dar (2006) divided prediction models into statistical prediction models, models that use artificial intelligence, and theoretical models. Among individual methods, multiple discriminant analysis (MDA) and Logit models dominate the research (Altman and Saunders 1997; In: Csikosova et al. 2019).

Fitzpatrick (1931) was the first to deal with bankruptcy prediction in his study of solvent and insolvent businesses. In the following years, research on this topic was carried out by Merwin (1942), Chudson (1945), Jackendoff (1962), and Beaver (1966) (In: Delina and Packová 2013). Beaver (1966) demonstrated that financial ratios can be useful in predicting an individual firm's failure. He also showed that not all financial indicators can be used to predict business difficulties. However, the use of simple financial indicators was questioned in practice because of their possible manipulation by managers. Univariate analysis was later followed by multivariate analysis. The earliest multivariate prediction models applied discriminant analysis (DA). In 1968, Altman developed a multiple discriminant analysis (MDA) model called the Z-Score Model. Since Altman's study, the number and complexity of these models have increased dramatically. DA was explored by Blum (1974), Elam (1975), Altman et al. (1977), Norton and Smith (1979), and Taffler (1983). Altman's original model required the fulfilment of multinormality, homoskedasticity, and linearity assumptions. Financial indicators often did not meet these prerequisites. The main drawback of DA, however, is that although it can identify businesses that are likely to go bankrupt, it cannot estimate the probability of bankruptcy. Given these shortcomings, the next step in the theory of bankruptcy prediction was to develop methods and models that would be able to provide such information (Mihalovič 2015). For this reason, logistic regression began to be preferred, as this method does not have to meet these conditions. Compared to methods based on multiple discriminant analysis, logistic regression has several advantages. It has a higher predictive ability and its application does not require compliance with assumptions that could limit its usability. The method was first used to predict the bankruptcy of banks by Martin (1977). Ohlson (1980) was the first to use it to assess companies. Ohlson, as a pioneer in the application of Logit analysis, did not agree with the application of discriminant analysis to predict bankruptcy due to its requirement for a variance-covariance matrix (Klieštik et al. 2014). However, even Logit models have a weakness: their sensitivity to outliers (remote observations).

Another method used in the area of bankruptcy prediction is DEA (Horváthová and Mokrišová 2018). Compared to statistical methods, DEA is a relatively new, non-parametric method, which represents one of the main possible approaches to assessing the financial health of a business and its risk of bankruptcy (Štefko et al. 2018). This method was first applied in Charnes et al. (1978). It is based on the idea presented by Farrell in his 1957 article "The Measurement of Productive Efficiency". His work was based on the works of Debreu (1951) and Koopmans (1951). Farrell (1957) proposed a new approach to efficiency measurement based on a linear convex envelopment curve and the use of distance functions between the enterprise of interest and its projected point on the efficiency frontier. In this way, he proposed a new measure of efficiency based on the calculation of two components of overall business efficiency: technical efficiency and resource allocation efficiency. Farrell's approach measures the ability of the business to transform inputs into outputs. Therefore, it is also called the input-oriented approach. Charnes et al. (1978) applied a multiplicative input-output model to measure business efficiency. The approach of these authors represents a two-stage efficiency calculation. The first step is to identify the production frontier; businesses that lie on this frontier are among the best businesses. In the second step, the efficiency score is calculated for the analysed enterprises and their distance from the production frontier is determined. From the point of view of their input, DEA models can be divided into DEA CCR (Charnes et al. 1978) and DEA BCC (Banker et al. 1984). This method was further developed by Färe et al. (1985). The DEA method was also used by the following authors: Tone (2001); Wang et al. (2007); Kao and Hwang (2008); Sadjadi and Omrani (2008); Zhu (2015); Oanh and Ngoc (2016); Ghomi et al. (2019); Dumitrescu et al. (2020); and many others.

The idea of using the DEA method to predict bankruptcy was first put forward by Simak (1997), who was also the first to compare its results with the results of Altman's Z-score. Other authors dealing with DEA bankruptcy prediction included Cielen et al. (2004). The authors used the DEA radial model to predict bankruptcy and compared the results with DA results. In the same year, Paradi et al. (2004) applied an additive and radial model along with the peeling technique. The model achieved 100% success in predicting the bankruptcy of businesses. In 2009, Premachandra et al. used an ADD model and compared its results with the results of logistic regression. This research achieved a satisfactory level of correct prediction of business bankruptcy, whereas the prediction for financially sound businesses was less accurate. Sueyoshi and Goto (2009) applied an ADD model to create a line under which businesses go bankrupt. The results were then compared with the DEA-DA approach. In 2011, Premachandra et al. combined the radial and ADD model and created the DEA ranking index. Shetty et al. (2012) applied the DEA model to determine the bankruptcy likelihood for their analysed business sample. Their study resulted in a set of indicators that should be applied as predictors of bankruptcy.

Other methods suitable for bankruptcy prediction include neural networks. Odom and Sharda (1990) developed a neural network to investigate business bankruptcy using selected financial indicators. Gherghina (2015) made a significant contribution to the application of neural networks in this area. Neural networks were also applied to bankruptcy prediction by Altman et al. (1994). Other methods include decision trees (Breiman et al. 1984; Frydman et al. 1985). Nevertheless, the most commonly used methods today remain discriminant analysis and logistic regression.

In line with the above-mentioned text, we identified the following research problem: Is the DEA method a suitable alternative in predicting failure of businesses from the analysed sample? In relation to the research problem, the aim of the paper was formulated: To predict business failure with the use of the ADD model and to compare its results with the results of the Logit model. The aim was also to analyse classification and estimation accuracy of the ADD DEA model and to compare it with the classification and estimation accuracy of the Logit model.

#### **3. Methodology and Data**

DEA models are designed to assess the technical efficiency of production units based on the size of inputs and outputs. There are two possible approaches to creating DEA models: multiplicative and dual. The dual model is the linear programming dual of the multiplicative one. A significant problem in DEA analysis is posed by production externalities (negative outputs) and desirable inputs. Generally, the basic prerequisite of DEA models is data positivity. However, situations in which negative inputs and outputs occur are not uncommon. In the analysed sample of companies, negative outputs occurred in the case of profitability. There are several ways to deal with this problem. Some software programmes attach zero weight to negative inputs and outputs. Another frequently used option is to treat the negative outputs as inputs (thus minimising them) and the desired inputs as outputs (thus maximising them). However, this procedure is not universally applicable. One of the simpler options is to use an additive model in which the positive and negative inputs and outputs are evaluated separately (Premachandra et al. 2009; Mendelová and Stachová 2016).

The ADD model is one of the non-oriented models. This model was formulated by Charnes et al. (1985). A Decision Making Unit (DMU) is the unit whose efficiency is evaluated; the term describes any entity that transforms inputs into outputs. Determining DMU efficiency with an additive model for variable returns to scale means solving the following linear programming model:

$$\begin{aligned} \max_{\boldsymbol{\lambda},\, \mathbf{s}^x,\, \mathbf{s}^y} \quad & A_o = \mathbf{e}_m^T \mathbf{s}^x + \mathbf{e}_s^T \mathbf{s}^y \\ \text{s.t.} \quad & \sum_{j=1}^n \mathbf{x}_j \lambda_j + \mathbf{s}^x = \mathbf{x}_o, \quad \mathbf{s}^x \ge \mathbf{0}, \\ & \sum_{j=1}^n \mathbf{y}_j \lambda_j - \mathbf{s}^y = \mathbf{y}_o, \quad \mathbf{s}^y \ge \mathbf{0}, \\ & \sum_{j=1}^n \lambda_j = 1, \qquad \lambda_j \ge 0, \end{aligned} \tag{1}$$

where *e<sub>m</sub>* and *e<sub>s</sub>* are unit vectors (vectors of ones) of appropriate length and *s<sup>x</sup>*, *s<sup>y</sup>* are additional variables (slacks). DMU<sub>o</sub>, *o* ∈ {1, ..., *n*}, is efficient when *s<sup>x</sup>* = 0 and *s<sup>y</sup>* = 0, in other words, when the objective function and all slacks equal zero. Otherwise, DMU<sub>o</sub> is inefficient.
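Model (1) is an ordinary linear programme, so it can be sketched with a general-purpose solver. The following is a minimal illustration only, assuming SciPy's `linprog` is available; the function name `additive_dea` and the three toy DMUs are hypothetical and are not taken from the analysed sample or from the EMS software. It uses the standard formulation (1), without the exchange of inputs and outputs applied later in this paper.

```python
import numpy as np
from scipy.optimize import linprog


def additive_dea(X, Y, o):
    """Additive (ADD) model (1) under variable returns to scale for DMU `o`.

    X has shape (m, n) and Y has shape (s, n): one column per DMU.
    Returns the slack sum A_o, the lambdas, and the input/output slacks.
    """
    m, n = X.shape
    s = Y.shape[0]
    # Decision vector: [lambda_1..lambda_n, s^x (m values), s^y (s values)].
    # linprog minimises, so the slack sum is negated to maximise it.
    c = np.concatenate([np.zeros(n), -np.ones(m + s)])
    A_eq = np.zeros((m + s + 1, n + m + s))
    A_eq[:m, :n] = X                      # sum_j x_j lambda_j ...
    A_eq[:m, n:n + m] = np.eye(m)         # ... + s^x = x_o
    A_eq[m:m + s, :n] = Y                 # sum_j y_j lambda_j ...
    A_eq[m:m + s, n + m:] = -np.eye(s)    # ... - s^y = y_o
    A_eq[-1, :n] = 1.0                    # convexity: sum_j lambda_j = 1
    b_eq = np.concatenate([X[:, o], Y[:, o], [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun, res.x[:n], res.x[n:n + m], res.x[n + m:]


# Toy data (illustrative only): one input and one output for DMUs A, B, C.
X = np.array([[1.0, 2.0, 2.0]])
Y = np.array([[1.0, 2.0, 1.0]])
slack_sums = [additive_dea(X, Y, o)[0] for o in range(X.shape[1])]
# A and B lie on the frontier (zero slack sum); C is dominated by both.
```

A DMU with a zero slack sum satisfies the efficiency condition stated above; any positive value marks it as inefficient.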

Since our paper does not address the efficiency of the analysed sample, but rather covers bankruptcy, the input vectors *x<sub>o</sub>* were replaced by output vectors *y<sub>o</sub>*. The efficiency condition in this case served as a condition for the assumed bankruptcy of the company. In our research, we used 9 financial indicators. We selected this group of indicators in such a way that it contains indicators from all areas of financial health evaluation (liquidity, profitability, activity, indebtedness) and there is no strong correlation between the indicators. As output variables, we applied the indicator LLTA (long-term liabilities/total assets), used as a leverage measure which indicates long-term financial obligation, and the indicator CLTA (current liabilities/total assets), which indicates a lack of cash flow to fund business operations. As input variables, we applied 7 indicators: TRTA (total revenue/total assets), CR (current ratio, calculated as (financial assets + short-term receivables)/current liabilities), WCTA (working capital/total assets), CATA (current assets/total assets), EBTA (earnings before interest and taxes/total assets), EBIE (earnings before interest and taxes/interest expense), and ETD (equity/total debt). For the creation of the ADD model, we used the Efficiency Measurement System (EMS) software. We divided the results of the DEA model into 6 zones (3 zones for businesses in financial distress and 3 zones for financially healthy businesses) according to Mendelová and Bieliková (2017).
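The nine indicators defined above can be computed directly from financial statement items. The sketch below is illustrative only: the dictionary keys and the example figures are hypothetical and do not come from the analysed database, and working capital and total debt are taken as given items rather than derived.

```python
def financial_indicators(fs):
    """Compute the nine indicators from a dict of financial statement items.

    All key names are hypothetical; adapt them to the data source at hand.
    """
    ta = fs["total_assets"]
    ebit = fs["ebit"]
    return {
        # Output variables of the ADD model
        "LLTA": fs["long_term_liabilities"] / ta,
        "CLTA": fs["current_liabilities"] / ta,
        # Input variables of the ADD model
        "TRTA": fs["total_revenue"] / ta,
        "CR": (fs["financial_assets"] + fs["short_term_receivables"])
        / fs["current_liabilities"],
        "WCTA": fs["working_capital"] / ta,
        "CATA": fs["current_assets"] / ta,
        "EBTA": ebit / ta,
        "EBIE": ebit / fs["interest_expense"],
        "ETD": fs["equity"] / fs["total_debt"],
    }


# Made-up figures for a single company, used only to exercise the function.
example = {
    "total_assets": 1000.0,
    "long_term_liabilities": 200.0,
    "current_liabilities": 150.0,
    "total_revenue": 800.0,
    "financial_assets": 50.0,
    "short_term_receivables": 100.0,
    "working_capital": 100.0,
    "current_assets": 250.0,
    "ebit": 60.0,
    "interest_expense": 12.0,
    "equity": 650.0,
    "total_debt": 350.0,
}
ratios = financial_indicators(example)
```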

#### *3.1. Logit Model*

The Logit regression model was applied to compare the results obtained with the DEA model. The Logit model is a widespread model that has been used by several authors to predict the default/no default probability of a company (Premachandra et al. 2009; Kováčová and Klieštik 2017; Mendelová and Stachová 2016). This model is a type of multivariate statistical model. It captures the relationship between the dependent variable Y and the independent variable X.

Logistic regression works very similarly to linear regression, but with a binomial response variable (Sperandei 2014). The dependent variable *y<sub>i</sub>* can take only two values: *y<sub>i</sub>* = 1 if bankruptcy occurs and *y<sub>i</sub>* = 0 if it does not. Therefore, we can assume that the probability of *y<sub>i</sub>* = 1 is given by *P<sub>i</sub>* and the probability of *y<sub>i</sub>* = 0 by 1 − *P<sub>i</sub>*. By using the logistic transformation, we can specify the probability *P<sub>i</sub>* with the following model: *P<sub>i</sub>* = *f*(α + β*x<sub>i</sub>*), where *x<sub>i</sub>* are the chosen financial indicators, while α and β are estimated parameters. *P<sub>i</sub>* is then calculated using the logistic function:

$$P_i = \frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)} = \frac{1}{1 + \exp(-\alpha - \beta x_i)}. \tag{2}$$

A logistic regression models the chance of an outcome based on individual characteristics (Sperandei 2014). According to Kováčová and Klieštik (2017), the Logit can be defined as:

$$\text{Logit} = \ln\left(\frac{P_i}{1 - P_i}\right) = f\left(\alpha + \beta x_i\right). \tag{3}$$

The above represents the logarithm of the odds ratio of the two possible alternatives (*P<sub>i</sub>*, 1 − *P<sub>i</sub>*). It is called the Logit. The goal of logistic regression is to calculate the odds ratio *P<sub>i</sub>*/(1 − *P<sub>i</sub>*); ln in this relationship represents the Logit transformation.
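The relationship between Equations (2) and (3) can be checked numerically: applying the Logit transformation to the probability from Equation (2) returns the linear predictor α + β*x<sub>i</sub>*. The short sketch below is illustrative; the function names and the coefficient values are hypothetical and are not the estimates reported in this paper.

```python
import math


def bankruptcy_probability(alpha, beta, x):
    """Equation (2): logistic function of the linear predictor alpha + beta.x."""
    z = alpha + sum(b * v for b, v in zip(beta, x))
    # Evaluate in a numerically stable way for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p):
    """Equation (3): natural logarithm of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients and indicator values, for illustration only.
alpha, beta, x = -1.0, [2.0, -0.5], [0.3, 1.2]
p = bankruptcy_probability(alpha, beta, x)
linear_predictor = alpha + sum(b * v for b, v in zip(beta, x))
recovered = logit(p)  # equals the linear predictor up to rounding error
```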

To create the Logit model, it was necessary to divide businesses into bankrupt and non-bankrupt. When choosing the appropriate conditions for evaluating bankruptcy occurrence, we studied the papers of various authors. Some of them assume that a company goes bankrupt if it does not make a profit (Beaver 1966; Altman 1968; Altman et al. 1977; Geng et al. 2014) or reaches negative cash flow (Ding et al. 2008). Based on the bankruptcy definition stated in the Introduction, we chose the value of indebtedness as the bankruptcy condition. We then detected 50 bankrupt businesses.

When creating the Logit model, we started with the same 9 indicators which we used for the ADD model. However, there was a strong correlation between ETD and the indicator of indebtedness which we used as a bankruptcy criterion. Therefore, we did not use the indicator ETD in the Logit model. We also excluded the indicator CATA because, when it was included, the Logit model could not estimate any coefficients. We assumed that the indicators CATA and WCTA are related indicators which evaluate the same financial area and express the same reality. For the creation of the Logit model, we used the Statistica 13.1 software.
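A screening step of this kind can be sketched as a pairwise correlation check. The example below is a minimal illustration, not the procedure run in Statistica: the threshold of 0.8, the column name `INDEBTEDNESS`, and all values are assumptions chosen only to show the mechanics.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def strongly_correlated(columns, threshold=0.8):
    """Return (name_1, name_2, r) for every pair with |r| >= threshold."""
    flagged = []
    for p, q in combinations(columns, 2):
        r = pearson(columns[p], columns[q])
        if abs(r) >= threshold:
            flagged.append((p, q, r))
    return flagged


# Toy values for four hypothetical companies (not the analysed sample).
toy = {
    "ETD": [4.0, 1.5, 0.67, 0.25],
    "INDEBTEDNESS": [0.2, 0.4, 0.6, 0.8],
    "TRTA": [0.9, 0.3, 0.8, 0.5],
}
flagged = strongly_correlated(toy)
```

In this toy data only the pair of ETD and the indebtedness measure is flagged, which mirrors the reason given above for dropping ETD from the Logit model.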

Using the results of the Logit model, it is possible to determine whether a company is about to go bankrupt or not. This classification may use a cut-off score (usually 0.5): businesses above this value are classified as likely to go bankrupt and businesses below this value as facing a lower (or no) probability of going bankrupt. Two types of misclassification can occur when evaluating business failure. The type I error (false negative rate) arises when a bankrupt company is classified as non-bankrupt, and the type II error (false positive rate) arises when a non-bankrupt company is classified as bankrupt (Kováčová and Klieštik 2017).
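The cut-off classification and the two error types described above can be sketched as follows. The labels and probabilities are toy values for illustration, not results of the estimated model; the function names are hypothetical.

```python
def classify(probabilities, cutoff=0.5):
    """Label a business as bankrupt (1) when its probability exceeds the cut-off."""
    return [1 if p > cutoff else 0 for p in probabilities]


def error_rates(actual, predicted):
    """Type I: bankrupt classified as non-bankrupt. Type II: the reverse."""
    pairs = list(zip(actual, predicted))
    n_bankrupt = sum(1 for a in actual if a == 1)
    n_healthy = len(actual) - n_bankrupt
    type_1 = sum(1 for a, p in pairs if a == 1 and p == 0)
    type_2 = sum(1 for a, p in pairs if a == 0 and p == 1)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        "type_I": type_1 / n_bankrupt,
        "type_II": type_2 / n_healthy,
        "accuracy": correct / len(actual),
    }


# Toy example: 1 = bankrupt, 0 = non-bankrupt.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
predicted = classify(probabilities)
rates = error_rates(actual, predicted)
```

Lowering the cut-off reduces the type I error at the cost of a higher type II error, which is why the choice of cut-off matters when the two errors have different costs.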

The prediction ability of the Logit model can be verified by using the Area Under Curve (AUC) method, which measures the area under the Receiver Operating Characteristic curve (ROC curve). This analysis represents a statistical procedure for evaluating correct and false positives as well as correct and false negatives. ROC curve analysis describes the relationship of sensitivity and specificity at different discriminatory levels. AUC measures overall performance of the model. It can take on any value between 0 and 1. The closer AUC is to 1, the better is the overall performance of the model (Park et al. 2004).
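AUC can also be read as the probability that a randomly chosen bankrupt business receives a higher predicted probability than a randomly chosen non-bankrupt one, with ties counted as one half. The sketch below computes it in this pairwise form; the data are toy values and the function name `auc` is hypothetical.

```python
def auc(actual, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (bankrupt, non-bankrupt) pairs, how often the bankrupt
    business has the higher score; ties contribute one half.
    """
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = bankrupt, 0 = non-bankrupt.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
area = auc(actual, scores)  # 12 of the 15 pairs are ordered correctly
```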

One of the important tests used in verifying the Logit model is the Wald test, which confirms the significance of variables in the model. Based on the results of the Likelihood ratio test, the model includes those variables which increase its maximum likelihood. This test is suitable not only to assess the significance of the model, but also to assess the contribution of individual predictors to the model. The higher the Chi-square test statistic, the better the model reflects the situation of a business. In addition to the above tests, the results of the Hosmer–Lemeshow test should be mentioned. This test indicates how well the model fits the applied data. Nagelkerke's R Square indicates the share of variance explained, that is, how successful the model is in explaining the variability of the dependent variable.
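Two of these measures follow directly from the log-likelihoods of the fitted model and of an intercept-only model. The sketch below computes the Likelihood ratio statistic and Nagelkerke's R Square from predicted probabilities; it is an illustration under stated assumptions, with toy probabilities standing in for fitted values and hypothetical function names, and it does not reproduce the Statistica output.

```python
import math


def log_likelihood(actual, probs):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(
        math.log(p) if a == 1 else math.log(1.0 - p)
        for a, p in zip(actual, probs)
    )


def fit_summary(actual, probs):
    """Return (LR statistic, Cox-Snell R Square, Nagelkerke R Square)."""
    n = len(actual)
    base_rate = sum(actual) / n
    # The intercept-only model predicts the sample bankruptcy rate for everyone.
    ll_null = log_likelihood(actual, [base_rate] * n)
    ll_model = log_likelihood(actual, probs)
    lr = 2.0 * (ll_model - ll_null)
    cox_snell = 1.0 - math.exp(-lr / n)
    # Nagelkerke rescales Cox-Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
    return lr, cox_snell, nagelkerke


# Toy example: 1 = bankrupt, 0 = non-bankrupt.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
lr, cox_snell, nagelkerke = fit_summary(actual, probs)
```

Under the usual asymptotic theory, the Likelihood ratio statistic is compared with a Chi-square distribution whose degrees of freedom equal the number of predictors added to the intercept-only model.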

#### *3.2. Description of the Sample of Companies*

The input database of this empirical study was created from data obtained for 497 companies operating in Slovakia in the heat supply industry. The database of the data from financial statements of these companies for the year 2016 was obtained from the Slovak analytical agency CRIF—Slovak Credit Bureau, s.r.o. According to SK NACE Rev. 2, the sample of enterprises analysed falls under section D: "Supply of electricity, gas, steam and cold air". Sources and distribution of heat of these businesses were built along with the development of urban agglomerations. Their systems allow the effective use of various sources of energy generated in a city, including renewable sources, waste heat, and so on. These systems are an energy infrastructure integrator which can efficiently link production and consumption and enable the storing of energy (in the form of heat) at the time of its surplus. As part of independent heat production, today about 54% of the heat is produced in combined production (Janiš 2018). The European Commission's winter energy package sets new targets for energy efficiency. These goals and new trends in energy bring new opportunities and challenges for the heating industry. These facts are a precondition for the occurrence of risk factors which affect the performance and competitiveness of analysed businesses from outside. A more detailed analysis of this sample excluded 154 companies due to a negative value of equity or deficiencies in the database. The resulting analysed sample consisted of 343 companies. In terms of each business's legal status, 15% of the companies are joint stock companies and the remaining 85% are limited liability companies. The results of the financial analysis show that the analysed companies do not have a liquidity problem. The average value of current ratio found is 3.92. However, we also obtained a median of current ratio of 0.951. This was also reflected in the negative value of net working capital. 
The analysed sample of companies reported a high creditors payment period, which results in a negative value of cash conversion cycle. The assets of these companies change on average once a year. The average value of the return on assets is 5%. The capital structure of these companies is 35:65 in favour of equity. The performance of companies active in the heating industry was not found to reach the required value to avoid bankruptcy.

Table 1 shows the descriptive statistics of the indicators applied in the DEA model. The values of the indicators are divided into two groups. The first group consists of bankrupt businesses and the second group consists of non-bankrupt businesses. Of these values, the negative values for WCTA, EBTA, ETD, and EBIE should be pointed out, as these negative values are one of the signs of bankruptcy. The analysed businesses also have high indebtedness.


**Table 1.** Descriptive statistics for bankrupt and non-bankrupt businesses.
