Article

Private Firm Valuation Using Multiples: Can Artificial Intelligence Algorithms Learn Better Peer Groups?

Timotej Jagrič, Dušan Fister, Stefan Otto Grbenic and Aljaž Herman

1 Institute of Finance and Artificial Intelligence, Faculty of Economics and Business, University of Maribor, 2000 Maribor, Slovenia
2 Faculty of Electrical Engineering and Computer Science, University of Maribor, 2000 Maribor, Slovenia
3 Institute of Business Economics and Industrial Sociology, Graz University of Technology, 8010 Graz, Austria
4 Business and Management, Webster Vienna Private University, 1020 Vienna, Austria
* Author to whom correspondence should be addressed.
Information 2024, 15(6), 305; https://doi.org/10.3390/info15060305
Submission received: 8 April 2024 / Revised: 7 May 2024 / Accepted: 23 May 2024 / Published: 24 May 2024
(This article belongs to the Section Artificial Intelligence)

Abstract
Forming optimal peer groups is a crucial step in multiplier valuation. Among others, the traditional regression methodology requires the optimal set of peer selection criteria and the optimal size of the peer group to be defined a priori. No universally applicable set of closed and complementary rules on selection criteria exists, owing to the complexity and diverse nature of firms; moreover, this research exclusively examines unlisted companies, rendering direct comparisons with existing studies impractical. To address this, we developed a bespoke benchmark model through rigorous regression analysis and juxtaposed its outcomes with our own approach, enriching the understanding of unlisted company transaction dynamics. To stretch the performance of the linear regression method to the maximum, various datasets of selection criteria (the full set as well as F-test- and NCA-optimized sets) were employed. Using a sample of over 20,000 private firm transactions, model performance was evaluated employing multiplier prediction error measures (emphasizing bias and accuracy) as well as direct prediction superiority. Emphasizing five enterprise and equity value multiples, the results allow for the overall conclusion that the self-organizing map algorithm outperforms the traditional linear regression model both in minimizing the valuation error, as measured by the multiplier prediction error measures, and in direct prediction superiority. Consequently, the machine learning methodology offers a promising way to improve peer selection in private firm multiplier valuation.

1. Introduction

Multiplier valuation is based on the principle of substitution; that is, the company valued is substituted by comparable companies (peers). Companies with identical idiosyncratic risks should have at least comparable ratios for profit or success. Consequently, forming optimal peer groups is at the heart of any multiplier valuation. After identifying a group of peers, the value of a company can simply be reconstructed by using the multiples of the peer group. Hence, the choice of an optimal peer group is one of the most crucial steps involved in company valuation with multiples. In defining peers to ensure the maximum performance of the multiples, valuators must simultaneously define the optimal peer pool [1], the optimal size of the peer group (e.g., [2,3]), the aggregation function for the peer-related multiples, the time horizon of the related firm transactions regarded [4], and the optimal set of peer selection criteria employed. Regarding the number of peer selection criteria employed, two schools on peer selection/filtering strategies are promoted, emphasizing the trade-off between a narrow and a broad market materializing in a small or large number of selection criteria employed (e.g., [5,6]). Emphasizing the trade-off between additional information captured and the impact of less valuable information added, the total market strategy (broad market) captures all information but at the cost of adding more noise, while the best-fit strategy (narrow market) captures only high-valuable information (avoiding noise) but at the cost of disregarding information. Although no theoretical solution on superiority exists, theory generally prefers the best-fit strategy (e.g., [7]).
Since no universally applicable set of closed and complementary rules exists, owing to the complexity and diverse nature of firms, the market (multiplier) approach as well as further common valuation approaches (income approach, asset approach, etc.) lack sufficiency in assessing all drivers of company value [8,9]. In identifying peers, research based on traditional methodology lacks direction, since no economic theory and no mathematical/statistical model exists that captures all relevant factors of comparability (in the sense of "economic attractiveness") between companies. With numerous questions left open, employing the machine learning methodology might offer a promising way to improve peer selection. Consequently, this study uses the linear regression model as a benchmark against which the self-organizing map machine learning algorithm is compared. While traditional linear regression modelling requires all model parameters (explanatory variables, selection criteria) as well as the peer pool, peer group size, and the aggregation function to be defined a priori, the self-organizing map algorithm automatically excludes redundant and non-contributing parameters and optimizes the relevant peer pool, peer group size, and time horizon of the firm transactions regarded. To stretch the performance of the linear regression model to the maximum, three datasets were employed: (i) a full dataset capturing all appropriate selection criteria, (ii) an F-test-optimized set of selection criteria, and (iii) a dataset optimized employing Neighbourhood Component Analysis.
The aim is to predict values of five valuation multiples commonly used in valuation practice. Therefore, a pool of private firm transactions was selected, and their previously known multiples were estimated employing synthetic peer group multiples. Using a sample size of over 20,000 private firm transactions, the performance of the synthetic multiples was evaluated employing three distinct performance indicators (error measures) emphasizing bias as well as valuation accuracy, and statistical analyses of the resulting valuation errors were conducted.
The results allow for the overall conclusion that the self-organizing map algorithm outperforms the traditional linear regression model both in minimizing the valuation error, as measured by the various error measures, and in the direct comparison of winners. Consequently, the machine learning methodology offers a promising way to improve peer selection.
The remainder of this article is organized as follows: Section 2 reviews the related literature on both peer selection criteria employed in traditional regression modelling and machine learning applications. Section 3 reports descriptive statistics on the sample data and describes the methods. Section 4 outlines the research methodology. Section 5 reports the results, Section 6 discusses them, and Section 7 concludes.

2. Literature Review

In traditional peer selection research emphasizing the optimal number of peer selection criteria and following the total market strategy (employing no or only a limited number of selection criteria), the selection power of the industry criterion was examined, assuming that companies within an industry show similar characteristics and are exposed to similar expectations concerning the core value drivers. Industry serves as a surrogate for companies with similar profitability, growth potential and cycles, operating risks, revenue and earnings trends, capital structure, accounting methods, etc. For Initial Public Offerings (IPOs) [10,11,12], where a company's ownership transitions from private to public, and for leveraged buyouts (LBOs) [13,14], where a company is acquired using a significant amount of borrowed money, the authors conclude that selecting peers based on the industry criterion is superior to using pure total market multiples. Furthermore, the industry criterion was found superior to a random selection of peers [15] and to the selection criteria risk and earnings growth [16]. Combining the industry criterion with additional filters, adding a company size filter (as a duplicating indicator of risk) was concluded to improve selection power [17]. Similar improvements were found by adding a regional filter, which sorts companies by location, from continents to specific regions, helping investors align with their strategy or risk tolerance, diversify portfolios, and manage geopolitical risks [18]; by adding filters for profitability and intangible assets to assess a company's ability to generate profits, typically measured through metrics like net income, operating income, or profit margins [19]; by adding a growth filter to identify companies with high growth potential in terms of revenue, earnings, or market share, for which investors often look at indicators such as historical revenue growth rates, projected earnings growth, or penetration into new markets [20]; and by adding some combination of profitability (e.g., selecting appropriate peer companies for comparison [21,22]), growth, risk, and further financial ratio filters. The authors of [23] conclude to combine an industry filter with profitability and risk filters split into an industry component and a company-specific component, and the author of [24] concludes to combine the industry criterion with some macroeconomic indicators to improve the selection power of the industry filter. Finally, the isolated use of a company size/risk, growth, or debt filter was found to be inferior to the industry filter [20,25,26]. More specifically, firms were chosen for comparison based on industry, risk (measured by firm size), and earnings growth, both individually and in pairs [20], and differences between companies should be recognized when applying the multiple valuation method [26].
Following the best-fit strategy employing numerous selection criteria, research has employed various peer selection criteria in different settings; for use with transaction multiples, these may be categorized into three groups. (i) Transaction (acquisition) characteristics comprise synergy filters assessing potential synergies (e.g., businesses operating within the same industry [27]), control filters relating to the level of control the acquiring company seeks over the target company (e.g., [28]), diversification filters considering how the acquisition fits into the acquiring company's overall portfolio of businesses (e.g., checking whether intra-industry mergers typically entail higher premiums [29]), regional filters focusing on geographical considerations related to the transaction (e.g., [30]), and a filter indicating the method of payment (e.g., [31]). (ii) Market characteristics comprise market condition/activity filters pertaining to the broader market conditions or activities at the time of the transaction, lending support to the managerial herding hypothesis that managers may be more likely to engage in acquisitions during periods of high market activity due to the perceived safety of following their peers [32]; examinations of the relation between acquisition premiums and deal size (e.g., a strong negative relationship between offer premiums and target size, indicating that acquirers tend to pay less for large firms, not more [29]); examinations of valuation differences of IPOs in boom and crash market periods relative to stable periods [33]; and comparisons of the effectiveness of various industry classification codes (e.g., [34]). (iii) Company characteristics comprise profitability (ROA, return on assets, and ROS, return on sales; e.g., [35]); the performance of various industry groupings, since industries can vary significantly in growth prospects, competitive dynamics, regulatory environments, and other performance-relevant factors [36]; the performance of industry-related as compared to cross-sectoral multiples [25]; the accuracy of multiples and the drivers of their variation [37]; sell-side analysts' choice of peers [38]; optimal peer selection formulae [39]; CEO compensation (e.g., [40]); risk (the size of the target and acquirer firm, e.g., [41]; business risk, e.g., [34]); the legal form of the target firm [42]; the size of the acquired stake [32]; the size ratio of the target firm to the acquirer firm (e.g., [29]); growth (e.g., [43]); and further criteria such as the type of accounting [44].
Evaluating the performance of machine learning (hereinafter: ML) algorithms in peer selection, an approach yielding flexible groupings of firms based on financial ratios was proposed applying k-medians clustering [45], an unsupervised ML algorithm that divides a set of observations into clusters such that observations within a cluster are similar to each other and dissimilar from observations in the remaining clusters. K-medians clustering partitions a given dataset by minimizing intra-group heterogeneity and maximizing inter-group heterogeneity, both defining the optimal number of groups and deciding the optimal set of clustering variables employed; a minimal sketch is given after this paragraph. This meets the primary objective of peer selection, namely identifying firms that share similar characteristics as indicated by similar financial ratios. The authors compare the performance of the k-medians model to grouping methods based on industry classification and firm size. The results indicate that k-medians clustering provides an improvement over industry- and company size-based peer selection methods. Furthermore, they conclude that the performance of those traditional methods is enhanced if information about k-medians clustering-based peers is included in the peer selection process.
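To make the clustering step concrete, the following minimal Python sketch implements a plain k-medians partition on standardized financial ratios. The data, the number of clusters, and the function name are hypothetical illustrations of the technique described in [45], not the cited authors' implementation.

```python
import numpy as np

def k_medians(X, k, n_iter=100, seed=0):
    """Partition rows of X into k clusters using L1 distances and median centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each firm to the nearest center by Manhattan (L1) distance.
        d = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update each center to the coordinate-wise median of its cluster.
        new_centers = np.array([
            np.median(X[labels == j], axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Example: cluster 500 firms on three standardized ratios (synthetic data).
ratios = np.random.default_rng(1).normal(size=(500, 3))
labels, centers = k_medians(ratios, k=5)
```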
In [46], the authors argued that the widely used industry criterion does not meet the requirements for proper peer classification in company valuation with multiples, and they demonstrated in an out-of-sample study that applying ML algorithms improves peer comparability as compared to standard classification approaches. They showed that the data-driven dimensionality reduction and visualization algorithm t-distributed Stochastic Neighbour Embedding [47], together with the Spectral Clustering algorithm, can effectively visualize and classify company data: it handles high-dimensional and less compact data efficiently, detects nonlinear data structures, and reduces data dimensionality, increasing interpretability while minimizing information loss and providing a dataset that can be used to detect groups (clusters) of comparable (similar) companies.
How companies differ from their competitors was examined employing time-varying measures of product similarity based on a text-based analysis of firm product descriptions, as compared to traditional industry grouping [48]. While the latter places firms within predefined industry groups, product similarity identifies peers from the information that firms themselves provide about whom they compete against, based on the products sold, which arise from underlying consumer preferences and demand. A year-by-year set of product similarity measures allows for the generation of sets of industries in which each firm can have its own distinct set of competitors. The authors employ a clustering algorithm using word vectors and firm pairwise cosine similarity scores based on the words used by each firm. Both firms and entire industries can move in the product space over time as technologies and product tastes evolve; new firms can appear in the sample, and each firm can have its own distinct set of competitors. The results indicate that text-based network industry measures explain peer firms better than existing classification systems do. The algorithm has a higher ability to capture product and industry change as well as cross-industry relatedness. Furthermore, it generates higher levels of cross-industry variation in various measures of profitability, sales growth, and stock market risk.
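The following minimal sketch illustrates the core mechanic of such a text-based peer measure: pairwise cosine similarity over TF-IDF vectors of product descriptions. The descriptions are hypothetical, and the cited study [48] uses a considerably richer word-vector procedure; this is only a simplified stand-in.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

descriptions = [
    "cloud software for enterprise accounting",
    "enterprise accounting and payroll software",
    "industrial pumps and fluid handling equipment",
]
# Vectorize descriptions and compute pairwise cosine similarity scores.
tfidf = TfidfVectorizer(stop_words="english").fit_transform(descriptions)
sim = cosine_similarity(tfidf)  # sim[i, j] close to 1 indicates close product peers
print(sim.round(2))
```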
Gradient Boosting Decision Trees were proposed for relative valuation and peer firm selection tasks to answer the question of whether stocks are overvalued or undervalued relative to their peers [49]. The authors argue that the choice of peers is often highly subjective, that peer selection is loosely based on the industry criterion or only a limited number of selection criteria (variables), and that empirical evidence suggests practitioners strategically select peers to achieve desired valuation results. Furthermore, they argue that traditional regression models assume an (actually implausible) linear relationship between the selection criteria and a valuation multiple, whereas ML models do not require a predefined theoretical model. They predict valuation multiples and express them as weighted averages of the multiples of comparable companies. These weights are a measure of peer-firm comparability and can simultaneously be employed for selecting peers based on similar company fundamentals: companies with high weights are closely comparable firms, since they are allocated to the same leaves. Their results indicate that the ML model substantially outperforms traditional valuation models, generating more accurate out-of-sample predictions. This outperformance holds across different multiples and different types of firms and persists over time.
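A hedged sketch of this idea follows: a gradient-boosted tree ensemble predicts a valuation multiple from firm fundamentals, so that firms allocated to the same leaves act as implicit peers. The features and data below are synthetic placeholders; the cited study's actual variables and weighting scheme [49] are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))   # stand-ins for profitability, growth, leverage, size
y = 8 + 2 * X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=2000)  # EV/EBITDA-like target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(n_estimators=300, max_depth=3).fit(X_tr, y_tr)
print("out-of-sample R^2:", round(model.score(X_te, y_te), 3))
```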
Finally, a conceptual framework for constructing clustering and classification ML models was developed to identify peer companies when evaluating the market value of private companies [50].

3. Materials and Methods

3.1. Sample Data

This analysis is based on the transaction data of private firms located in OECD countries for the period 2000 until 2019. Data on transactions, pricing, and company information were collected from the ZEPHYR M&A database, supplemented with company data from the ORBIS database to complete otherwise incomplete records. The database providers operate within the private sector; hence, we are unable to disclose the database itself. However, access to the data can be facilitated through these providers upon request. The transactions had to meet three additional requirements: First, the completion of the transaction had to be confirmed or at least be assumed confirmed. Second, the transaction had to be a private market transaction indicating direct sales of common stock, as such sales can be considered arm's length transactions representing fair market value. Consequently, all transactions of convertible preferred stock, stock options, or warrants were dropped, as they usually do not involve actual arm's length negotiations and may cause a significant bias in the pricing data. Third, following general practice (e.g., [11]), only positive enterprise and equity values as well as positive multiple-related value drivers were included to ensure positive multiples. There is considerable debate on the exclusion of negative value indications and/or negative value drivers in forming peer groups. From a technical point of view, proponents of the elimination approach argue that, since the multiples are aggregated into a single synthetic peer group multiple, the most common aggregation methods (i.e., arithmetic mean, median, and harmonic mean) do not work properly in the presence of negative multiples. Furthermore, in [51], the authors empirically conclude that the elimination of negative multiples improves valuation accuracy. In contrast, opponents of the elimination approach (e.g., [52]) argue that including negative multiples better reflects the "true" aggregate peer group performance, avoids the overestimation of peer group profitability (and, ceteris paribus, the underestimation of company value), and that the exclusion of negative multiples might in its entirety be regarded as a systematic peer selection strategy.
Input data (variables) were divided into four sections: the (i) fundamental characteristics of the transaction (real and dummy values), (ii) industry sector of the target company (dummy values only), (iii) country and continent of the target company (dummy values only), and (iv) period the transaction took place (dummy values only). Table 1 reports input data.
Important fundamental characteristics of the transactions are the acquired stake in percent, the type of acquisition (control/minority acquisitions, management/institutional buyouts, mergers/demergers, joint ventures), acquirer company size (natural logarithm of operating sales, indexed), and the (absolute) size of the acquisition (indexed). Additionally, macroeconomic data concerning the rate of inflation and GDP were included. Most transactions were recorded in the telecommunications sector, followed by transportation machinery and the chemical sector. Transactions were grouped into three periods: (i) 2000 until 2007, (ii) 2008 until 2012, and (iii) 2013 until 2019.
Output data were organized into five variables, each representing a robust financial multiplier (hereinafter: FM) of the target company (real values) popular in the valuation literature: (i) EPV (enterprise value) to sales, (ii) EPV to EBITDA (earnings before interest, taxes, depreciation, and amortization), (iii) EPV to EBIT (earnings before interest and taxes), (iv) EPV to total assets, and (v) EQV (equity value) to EBT (earnings before taxes). The EPV to sales multiple and the EPV to total assets multiple are denoted by [1,41]
$\lambda_i(\delta_{i,j}) = \dfrac{E_i + T_i - I_i + D_i - C_i + L_i + P_i + A_i}{\delta_{i,j}}$

where $i$ is the index on firm transactions, $\delta_{i,j}$ is the value driver (with $j$ indicating sales and total assets), $\lambda_i(\delta_{i,j})$ is the enterprise value multiplier on the corresponding value driver, $E_i$ is the market value of equity, $T_i$ is the market value of third-party (minority) shares in subsidiaries, $I_i$ is the market value of investments in unconsolidated companies (associates and joint ventures), $D_i$ is the market value of straight debt (interest-bearing liabilities), $C_i$ is the market value of cash and cash equivalents, $L_i$ is the market value of finance leases, $P_i$ is the market value of pension reserves, and $A_i$ is the market value of accounts payable.
The earnings multiples EPV to EBITDA and EPV to EBIT are denoted by [1]
$\lambda_i(\delta_{i,j}) = \dfrac{E_i + T_i - I_i + D_i - C_i + L_i + P_i}{\delta_{i,j}}$

where the pension reserves $P_i$ are added only if the interest on pensions is not part of the cost of goods sold.
Finally, the EQV to EBT multiple is denoted by [1]
$\lambda_i(\delta_{i,j}) = \dfrac{E_i}{\delta_{i,j}}$
Descriptive statistics for the FM are reported in Table 2.
A high skewness indicates many outliers in the output variable data, increasing requirements for data processing. A high kurtosis indicates a high concentration of data around the mean rather than a normal-like distribution. Therefore, a fat-tailed distribution was expected. Missing data again made data processing more challenging.
With respect to the data, we can conclude that unsupervised ML techniques theoretically provide at least two significant benefits over supervised ML techniques: (i) First, only a single model is trained instead of building models for each output variable. Consequently, data are mapped using this single model by searching for similar input instances and calculating peer mean values for the a priori known values of the FM. (ii) Second, missing output data do not cause the input training data to diminish, while supervised ML techniques suffer from missing output data as they search for an optimal transformation from inputs to outputs.

3.2. Linear Regression

Linear regression (hereinafter: LR) is a widely used statistical approach to model the connection between one or more explanatory variables (inputs) and a dependent variable (output, response variable). It is also widely used in ML due to its advantages of (i) transparency (which is typically heavily limited for other ML methods due to the black-box problem), (ii) determinism (enabling efficient computing), and (iii) adaptability (since many remediations may be employed if LR assumptions are not met). The drawbacks of classic LR are typically its sensitivity to outliers and its inability to emphasize the dominant input variables while disregarding the less important input variables that are either redundant or contribute little (feature selection). Both drawbacks may be compensated during the pre-processing and data preparation stages (prior to using LR), but at the expense of increased complexity. The general multiple regression problem is defined as [53]
$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + u_i$

where $y_i = E(y_i \mid x_{1i}, x_{2i}, \dots, x_{ki}) + u_i$ decomposes the dependent variable for the $i$-th input data instance into its expected value and the stochastic error $u_i$, and $\beta_j$ indicates the partial regression coefficient of the $j$-th explanatory variable. Regularly, $x_{1i}$ is set to $1$, so that $\beta_1$ represents the intercept and the remaining $\beta_2, \dots, \beta_k$ indicate partial slopes.
The equation of the Ordinary Least Squares estimator (hereinafter: OLS) using vector notation is denoted by [53]
$\hat{\beta} = (X^T X)^{-1} X^T y$

where $\hat{\beta}$ indicates the vector of estimates of the partial regression coefficients $\beta_1, \dots, \beta_k$, $X$ is the design matrix, and $X^T X$ is a covariance matrix. Under certain assumptions, the OLS vector estimate $\hat{\beta}$ provides the best linear unbiased estimator (BLUE).
OLS was used as a benchmark method, following the process described in Algorithm 1: First, the explanatory and dependent variables were employed to estimate the regression coefficients $\hat{\beta}$. Since the output data capture five dependent variables $j$ (the five FMs), five different models with five different sets of regression coefficients $\hat{\beta}^{(j)}$ were built. Next, all input instances within the input sample were predicted, and the pool $p_i$ of the $n$ most similar peers was selected. Finally, the median value $r_i$ of the respective dependent variable $j$ was calculated, indicating the result of the OLS peer selection.
Algorithm 1: OLS process.
Input: explanatory and dependent variables
Output: estimated values in the OLS model

Begin Algorithm
    For j in 1 to 5
        Estimate β̂(j) for the j-th dependent variable
        For i in 1 to sample size
            Predict ŷ_i for the i-th input instance from β̂(j)
        End For
        For i in 1 to sample size
            p_i ← select the n instances most similar to instance i by value of ŷ
            r_i ← median(p_i)
        End For
    End For
    Return results r_i
End Algorithm
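The following Python sketch mirrors Algorithm 1 under simplifying assumptions: the data matrices are placeholders, missing values are assumed to have been removed beforehand, and the peer count n is illustrative.

```python
import numpy as np

def ols_peer_medians(X, Y, n_peers=5):
    """X: (m, k) explanatory variables; Y: (m, 5) observed multiples FM1..FM5."""
    X1 = np.column_stack([np.ones(len(X)), X])        # add intercept column
    results = np.empty_like(Y, dtype=float)
    for j in range(Y.shape[1]):                       # one model per multiple
        beta, *_ = np.linalg.lstsq(X1, Y[:, j], rcond=None)  # estimate beta_hat(j)
        y_hat = X1 @ beta                             # predict all input instances
        for i in range(len(X)):
            # Peers: instances with the closest predicted value, target excluded.
            order = np.argsort(np.abs(y_hat - y_hat[i]))
            peers = [p for p in order if p != i][:n_peers]
            results[i, j] = np.median(Y[peers, j])    # synthetic peer group multiple
    return results
```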

3.3. Feature Selection Using the F-Test

Since OLS LR is unable to exclude redundant and non-contributing features, its performance was optimized by applying two feature selection (hereinafter: FS) algorithms in the data processing stage: (i) the parametric F-test FS and (ii) the non-parametric Neighbourhood Component Analysis FS. As expected, both FS algorithms reduced the number of input features substantially. Employing the F-test, the importance of each input feature is evaluated individually. The null hypothesis states that the mean value of each input feature $X_k$, $\frac{1}{N} \sum_{m=1}^{N} X_{k,m}$, was drawn from a population with the same mean as the output variable, $\frac{1}{N} \sum_{m=1}^{N} y_{j,m}$ (where $N$ indicates the sample size),

$H_0: \ \frac{1}{N} \sum_{m=1}^{N} X_{k,m} = \frac{1}{N} \sum_{m=1}^{N} y_{j,m}$

contrary to the alternative hypothesis

$H_1: \ \frac{1}{N} \sum_{m=1}^{N} X_{k,m} \neq \frac{1}{N} \sum_{m=1}^{N} y_{j,m}$

with the corresponding $p$-value indicating the importance of the $k$-th feature $X_k$, a decreasing $p$-value indicating an increasing feature importance. Furthermore, $p^* = -\log(p)$ was calculated. If $p^* > 1$, the $k$-th feature was selected into the subset of features $N_k(F)$ and excluded otherwise. This allowed for a non-fixed number of selected features.
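A minimal sketch of this screening rule follows, using scikit-learn's f_regression as a stand-in for the univariate F-test described above; the base-10 logarithm in the threshold is an assumption.

```python
import numpy as np
from sklearn.feature_selection import f_regression

def f_test_select(X, y, threshold=1.0):
    """Return indices of features whose p* = -log10(p) exceeds the threshold."""
    _, p_values = f_regression(X, y)   # one univariate p-value per input feature
    p_star = -np.log10(p_values)
    return np.flatnonzero(p_star > threshold)
```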

3.4. Feature Selection Using Neighbourhood Component Analysis

The Neighbourhood Component Analysis (hereinafter: NCA) is a more complex algorithm. It was developed as a simplification of the k-nearest neighbour algorithm and has proven to be a widely applicable and powerful FS algorithm [54]. The objective is to find the transformation matrix $A$ that iteratively minimizes the target function, typically using gradient descent optimization. A metric of squared errors $Q = A^T A$ is introduced into the Mahalanobis distance

$d(x, y) = (x - y)^T Q \, (x - y) = (Ax - Ay)^T (Ax - Ay)$
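For illustration, the sketch below evaluates this distance for a given transformation matrix A; the matrix here is randomly generated and therefore hypothetical, whereas in practice it is fitted iteratively by gradient descent.

```python
import numpy as np

def nca_sq_distance(x, y, A):
    """Squared Mahalanobis distance with Q = A^T A, i.e., (Ax - Ay)^T (Ax - Ay)."""
    diff = A @ x - A @ y
    return diff @ diff

A = np.random.default_rng(0).normal(size=(2, 4))  # placeholder for a learned transform
x, y = np.ones(4), np.zeros(4)
print(nca_sq_distance(x, y, A))
```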

3.5. Self-Organizing Maps

A self-organizing map (hereinafter: SOM) is an unsupervised ML algorithm whose primary purpose is clustering and dimensionality reduction [55,56]. Clustering searches for similar input instances, labels distinctive classes, and classifies instances into these classes [57]. The $N$ input instances capture $d$ dimensions, which may be either correlated or uncorrelated. The weight vector of an SOM artificial neuron is denoted by [58]

$m_i = [m_{i1}, \dots, m_{id}]$
where dimensions range from 1 to d . SOMs run two modes, that is, a training and a mapping mode. During training, the neuronal weights are updated by the learning algorithm. Instead, mapping is a propagation-only procedure. An SOM consists of topologically organized artificial neurons. These neurons represent the output (mapping) space and are typically organized into a two-dimensional hexagonal or rectangular grid architecture. Each artificial neuron can be located using two parameters x and y that assign an underlying weight vector. The length of the weight vector equals d . During training, the weight vector is compared to the input instance by calculating the error distance (regularly, the Euclidean distance). The objective is to find the neuron with the lowest error distance (Best Matching Unit, hereinafter: BMU)
$\| x - m_c \| = \min_i \| x - m_i \|$

where $i$ runs over all neurons of the map and the operator $\| \cdot \|$ indicates the Euclidean distance. Next, the SOM is trained using a competitive learning algorithm. Many different versions of training algorithms exist, such as the sequential algorithm (instances are presented one at a time) or the batch algorithm (all instances are presented at once) [58], as well as more advanced variants such as the Growing SOM, which expands the map as it learns [59], and Learning Vector Quantization (LVQ), which assigns instances to predefined groups [56], among others. This paper employs the sequential training algorithm, which processes the training instances one by one. Training modifies the weight vector by reducing the error distance while keeping the position and topological order of the neurons in the map space constant. The weight vector of the BMU is modified the most, the weight vectors of the neurons in the vicinity of the BMU are modified at a decreasing rate, and the weight vectors of the neurons far away from the BMU are kept constant. The modification of the BMU is denoted by
$m_i(t+1) = m_i(t) + \alpha(t) \cdot h_{ci}(t) \cdot [x(t) - m_i(t)]$

where $\alpha(t)$ indicates the learning rate, $h_{ci}(t)$ indicates the neighbourhood kernel around the BMU, and $x(t) - m_i(t)$ indicates the difference per dimension. The learning schedule is divided into a (primary) coarse-tuning and a (secondary) fine-tuning phase. Additionally, learning rate decay is implemented. After training is finished (a fixed number of epochs), the mapping phase is run to test the performance of the trained SOM.
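The following self-contained Python sketch implements the sequential training loop described above, with a Gaussian neighbourhood kernel and exponentially decaying learning rate and neighbourhood width. The grid size, rates, and epoch count are illustrative and do not reproduce the setup reported in Table 3.

```python
import numpy as np

rng = np.random.default_rng(0)
grid_x, grid_y, d = 10, 10, 4
W = rng.normal(size=(grid_x, grid_y, d))        # neuron weight vectors m_i
coords = np.stack(np.meshgrid(np.arange(grid_x), np.arange(grid_y),
                              indexing="ij"), axis=-1)

X = rng.normal(size=(500, d))                   # training instances
epochs, alpha0, sigma0 = 50, 0.5, 3.0
for t in range(epochs):
    alpha = alpha0 * np.exp(-t / epochs)        # learning rate decay
    sigma = sigma0 * np.exp(-t / epochs)        # shrinking neighbourhood
    for x in X:
        dist = np.linalg.norm(W - x, axis=2)    # error distance per neuron
        bmu = np.unravel_index(dist.argmin(), dist.shape)  # Best Matching Unit
        # Gaussian neighbourhood kernel h_ci(t) around the BMU on the grid;
        # it is close to zero for neurons far from the BMU.
        grid_d2 = ((coords - np.array(bmu)) ** 2).sum(axis=2)
        h = np.exp(-grid_d2 / (2 * sigma ** 2))
        W += alpha * h[:, :, None] * (x - W)    # weight update rule
```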
Prior to training, the data samples (input instances) for training and mapping are split into two non-overlapping parts. The training data sample is employed for training only, while the testing data sample is utilized for mapping. The SOM projects the higher-dimensional input space instances into a two-dimensional position where clusters can easily be identified. To evaluate the performance of the SOM, visualization (graphical) as well as statistical techniques may be employed. Visualization techniques utilize the colour-coding of the information, such as (i) visualizing input planes (for each of the d dimensions separately), (ii) visualizing neighbour weight distances, (iii) visualizing the number of BMU hits, and (iv) visualizing the weight positions, where a two-dimensional coordinate system with actual input data and trained weight vectors is reported.

4. Research Methodology

The motivation for the experiments was to compare the performance of OLS LR against the SOM model in peer selection, predicting the actual values of five types of FMs. To this end, we employed a two-fold approach with OLS LR as the primary benchmark and the SOM algorithm as the comparative method. The analysis is built upon three datasets: (i) the full dataset, (ii) a dataset reduced by the F-test, and (iii) a dataset reduced by NCA. The datasets for each FM were computed employing the standard holdout routine, with each transaction serving once as the target for which peers are searched and otherwise as a potential member of the peer group.
Performance was evaluated in two ways: first, employing statistical metrics based on three error measures, and second, directly comparing the number of superior predictions (winners). To evaluate bias, the relative absolute prediction error (hereinafter: RAPE) was employed, since it is exposed to a systematic upwards bias while preventing positive and negative deviations from netting out (and hence allows for a one-dimensional result figure). It is denoted by [25,60]

$RAPE_{i,k}(\delta_{i,j}) = \dfrac{\left| \lambda_{i,k}(\hat{\delta}_{i,j}) - \lambda_i(\delta_{i,j}) \right|}{\lambda_i(\delta_{i,j})}$
where $i$ is the index on firm transactions, $\delta_{i,j}$ indicates the value drivers of the financial multipliers FM1 to FM5, $k$ indicates the underlying technique (i.e., OLS or SOM), $\lambda_{i,k}(\hat{\delta}_{i,j})$ is the predicted financial multiple on the corresponding value driver, and $\lambda_i(\delta_{i,j})$ is the observed financial multiple on the corresponding value driver. To evaluate accuracy, the relative log-scaled absolute prediction error (hereinafter: RLAPE) was employed, since it avoids both an upwards bias and the netting effect of positive and negative deviations and hence considers solely the strength of the deviation. It is denoted by [19,51,60,61]

$RLAPE_{i,k}(\delta_{i,j}) = \left| \ln \dfrac{\lambda_{i,k}(\hat{\delta}_{i,j})}{\lambda_i(\delta_{i,j})} \right|$
Finally, for robustness testing on accuracy, the relative squared prediction error (hereinafter: RSPE) was employed, since it not only indicates the strength of the deviation between the predicted and the observed multiple but is also sensitive to extreme values, yielding larger errors in case of severe over- or underestimation. This incorporates valuable information on distribution and homogeneity. It is denoted by [60,62]

$RSPE_{i,k}(\delta_{i,j}) = \left( \dfrac{\lambda_{i,k}(\hat{\delta}_{i,j}) - \lambda_i(\delta_{i,j})}{\lambda_i(\delta_{i,j})} \right)^2$
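For reference, the three measures can be computed elementwise as in the following sketch; the array names are placeholders, and the sample construction guarantees positive multiples, so the logarithm is defined.

```python
import numpy as np

def error_measures(lam_hat, lam):
    """lam_hat: predicted multiples; lam: observed multiples (both positive arrays)."""
    rape = np.abs(lam_hat - lam) / lam        # bias-oriented measure
    rlape = np.abs(np.log(lam_hat / lam))     # log-scaled accuracy measure
    rspe = ((lam_hat - lam) / lam) ** 2       # outlier-sensitive robustness check
    return rape, rlape, rspe
```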
Directly comparing the number of superior predictions (winners), the Euclidean distance between the OLS LR and SOM predicted outputs for FM1 to FM5 was computed. For each input instance, the superior technique scored +1 and the other 0. Subsequently, the count of +1 over all input instances was computed for each FM. The decision rule for the direct comparison between OLS and the SOM (with initial conditions prior to incrementing $SOM_{sup} = OLS_{sup} = 0$) is denoted by

$\left( \lambda_{i,SOM}(\hat{\delta}_{i,j}) - \lambda_i(\delta_{i,j}) \right)^2 < \left( \lambda_{i,OLS}(\hat{\delta}_{i,j}) - \lambda_i(\delta_{i,j}) \right)^2 \Rightarrow SOM_{sup} = SOM_{sup} + 1; \ \text{otherwise } OLS_{sup} = OLS_{sup} + 1$
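The winner count then reduces to an elementwise comparison of squared deviations, as in this short sketch with placeholder array names.

```python
import numpy as np

def count_winners(lam_hat_som, lam_hat_ols, lam):
    """Score +1 for the technique with the smaller squared deviation per instance."""
    som_better = (lam_hat_som - lam) ** 2 < (lam_hat_ols - lam) ** 2
    return int(som_better.sum()), int((~som_better).sum())  # (SOM_sup, OLS_sup)
```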
The setup for the SOM algorithm for clustering similar input instances is outlined in Table 3. To ensure fairness between the algorithms, the process of removing missing data instances was run uniformly. Therefore, the sample size decreased substantially.

5. Results

Section 5 is divided into two parts, the first exhibiting the results of the error measures for OLS LR and the SOM, as reported in Table 4 and Table 5, and the second exhibiting the results of the direct comparison of winners, as reported in Table 6.
Note that the mean and standard deviation values for the RSPE indicator show unstable behaviour; their magnitudes increase heavily as compared to the remaining two measures. In contrast, the median performs well in all cases, indicating that the error measures include many outliers. In general, the values of the error measures decreased (and, ceteris paribus, the performance of the peer groups increased) as the number of peers increased.
In contrast, the SOM does not employ a fixed number of peers. Instead, the neighbourhood size $N_{neigh} \in \{1, 2\}$ is held constant when searching for similar data samples. No FS algorithm was employed for the SOM. The results indicate a remarkable increase in performance for the RAPE measure; the median values for $N_{neigh} = 1$ were much lower than for $N_{neigh} = 2$. Although this is seen as a positive sign, one must note that approximately 10 percent (2184) of the predictions were output as "NaN", since $N_{neigh} = 1$ did not assure any neighbours. Furthermore, the results were frequently unstable due to the very low pool size setting. Therefore, a separate case study for $N_{neigh} = 2$ was conducted, resulting in increased error stability and reducing the number of "NaN" outputs from 2184 to only 4. Table 5 reports the results of the performance analysis of the SOM model.
The boxplots reported in Figure 1 indicate no statistically significant differences between the error measures for $N_{pool} = 5$ (OLS LR) and $N_{neigh} \in \{1, 2\}$ (SOM), as the interquartile ranges overlap. However, since no post-processing of the output data was conducted, numerous outliers still drive the fat-tailed behaviour of the distribution.
Table 6 reports the results of the direct comparison of winners, indicating that the SOM clearly outperforms OLS LR. On average, in 93 to 95 percent of all instances, it creates superior peer groups entailing superior multiple predictions. The size of the peer group (i.e., the number of peers) is not fixed but is governed by the neighbourhood width parameter: the wider the neighbourhood, the larger the peer group. In contrast, especially in the corners of the SOM, the size of the peer group is very low or even zero. In the latter case, computing the mean value of the synthetic multiple is impossible, causing OLS LR to win. As a remedy, an adjustable neighbourhood width for the SOM is recommended to overcome low peer group sizes.

6. Discussion

In our research, our focus was exclusively on unlisted companies. Consequently, our dataset comprised solely transactions within this specific domain, rendering a direct comparison with existing research unfeasible. To address this challenge, we developed a bespoke benchmark model through rigorous regression analysis. This model serves as a reference point, enabling us to juxtapose its outcomes with those derived from our unique approach. By employing this comparative framework, we aim to enrich our understanding of the distinctive dynamics inherent in unlisted company transactions.
OLS LR as a benchmark and the SOM algorithm as a comparative method were employed to predict the values of five enterprise and equity value multipliers. The objective was to select a pool of $n$ similar peer firm transactions and to estimate the a priori known multiplier utilizing the synthetic peer group multipliers. Based on a sample of over 20,000 private firm transactions, the performance of OLS LR (employing a full set as well as optimized sets of selection criteria) and the SOM algorithm was evaluated by (i) employing various valuation error measures and (ii) directly comparing the number of superior predictions.
The quality of the results could be improved by more rigorous pre-processing. The quality of the output results was observed to be inferior for some instances, and the magnitude ranges across the five value drivers were substantial, implying some instability. Additional research should examine whether the two employed FS techniques improve the OLS LR methodology. This disadvantage of the OLS LR methodology highlights the large and promising potential of the SOM algorithm, which may effectively identify and filter unexpected deviations, since the processing logic of the SOM algorithm causes sample majorities to prevail constantly. However, unexpected behaviour may still occur for some input instances if an island of such unexpected instances is formed. In that case, all predictions for instances with a BMU close to the centre of the island are heavily biased. Again, further research exhibiting the detailed shape of the SOM architecture should be conducted to highlight potential areas of improvement. Nevertheless, the SOM algorithm (and unsupervised learning in general) has proven to be a superior framework and offers a promising way to improve peer selection, entailing superior estimates of private firm multiples. Finding similar instances, as needed to identify peers, is a core capability of clustering algorithms like the SOM, as compared to classical regression methodology like OLS LR. This finding could be further supported by employing the k-nearest neighbours algorithm in an additional experimental setting.
The pool size of the SOM was varied and conditioned by (i) the density of the SOM hits and (ii) the neighbourhood width parameter $N_{neigh}$. It appears that a larger density (e.g., for SOM islands) as well as a larger neighbourhood both require a larger pool size. Implementing a varying pool size comes at the cost of allowing for extreme (close to zero or close to infinity) pool sizes that negatively impact the performance of the SOM. Further research should, therefore, implement related remediations. Alternatively, a sensitivity analysis could be conducted to approximate the optimal pool size, although we expect the related results to vary over time. Statistical analysis indicates that larger peer groups entail lower prediction errors for OLS LR. In contrast, the SOM shows a reverse effect: increasing $N_{neigh}$ from 1 to 2 partly contributed to the enhanced stability of the results due to the larger pool and the avoidance of "NaN" outputs. This holds especially for the RAPE measure indicating bias. However, for all three error measures employed (indicating bias as well as accuracy), increasing the neighbourhood entails that both the median and the standard deviation increase, thus lowering multiplier estimation performance. Additionally, one must note that increasing $N_{neigh}$ from 1 to 2 increases $N_{pool}$ exponentially rather than linearly. This is especially important for countries with a low private firm transaction frequency, since forming larger peer groups entails adding a large fraction of foreign peers and/or outdated transactions, in its entirety reducing peer selection and multiple estimation performance for companies in these countries.
Nevertheless, the pronounced superiority of the SOM over OLS LR can be seen. Notably, our findings reveal that the SOM consistently outperforms OLS LR across the spectrum of analyses conducted. On average, the SOM generates superior peer groups and delivers more accurate multiple predictions. This robust performance underscores the efficacy and reliability of the SOM as a powerful analytical tool for predicting outcomes and delineating optimal peer group compositions within our research domain.

7. Conclusions

Forming optimal peer groups is a crucial step in multiplier valuation. Traditional regression methodology requires, among others, defining a priori the optimal peer pool, the size of the peer group, the time horizon of the firm transactions regarded, and the optimal set of selection criteria. Since research lacks direction, leaving numerous open-ended questions, employing the machine learning methodology offers a promising way to improve peer selection. This study compares the performance of OLS linear regression to the self-organizing map algorithm, which automatically excludes redundant and non-contributing parameters and simultaneously optimizes further parameters. To stretch the performance of the linear regression method to the maximum, various datasets of selection criteria (the full set as well as F-test- and NCA-optimized sets) were employed. Using a sample of over 20,000 private firm transactions, model performance was evaluated employing multiplier prediction error measures (emphasizing bias and accuracy) as well as direct prediction superiority.
Emphasizing five enterprise and equity value multiples, the results allow for the overall conclusion that the self-organizing map algorithm outperforms the traditional linear regression model both in minimizing the valuation error, as measured by the various error measures, and in direct prediction superiority. Consequently, the machine learning methodology offers a promising way to improve peer selection.
This study contributes to the ongoing debate on peer selection parameters highlighting the importance of further methodological development. Upcoming studies should focus on refining existing methods, addressing data quality issues, and exploring further innovative approaches.

Author Contributions

Conceptualization, T.J., D.F., S.O.G. and A.H.; methodology, T.J., D.F., S.O.G. and A.H.; software, T.J. and D.F.; validation, T.J., D.F., S.O.G. and A.H.; formal analysis, T.J. and D.F.; investigation, T.J., D.F., S.O.G. and A.H.; resources, T.J., D.F., S.O.G. and A.H.; data curation, T.J. and D.F.; writing, D.F. and A.H; writing—review and editing, T.J., D.F., S.O.G. and A.H.; visualization, D.F. and A.H.; supervision, T.J.; project administration, T.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

No human subjects, identifiable private information, or ethical considerations were involved in the research.

Informed Consent Statement

This study did not involve human subjects.

Data Availability Statement

The data presented in this study are available on Kaggle platform at https://www.kaggle.com/datasets/charanpuvvala/company-classification.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Grbenic, S.O. Private Firm Valuation in the Technology Sector: Illuminating the Interaction Between Multiple Performance and Peer Pool Setting. Int. J. Econ. Financ. Manag. Sci. 2021, 9, 77.
  2. Asche, F.; Misund, B. Who's a major? A novel approach to peer group selection: Empirical evidence from oil and gas companies. Cogent Econ. Financ. 2016, 4, 1264538.
  3. Cooper, I.A.; Cordeiro, L. Optimal Equity Valuation Using Multiples: The Number of Comparable Firms. SSRN Electron. J. 2008, 29, 1025–1053.
  4. Sommer, F.; Woehrmann, A. Triangulating the Accuracy of Comparable Company Valuations: A Multidimensional Analysis Considering Interaction Effects. SSRN Electron. J. 2013.
  5. Plenborg, T.; Pimentel, R.C. Best practices in applying multiples for valuation purposes. J. Priv. Equity 2016, 19, 55–64.
  6. Nel, W.S.; Bruwer, B.W.; Le Roux, N.J. Equity- and entity-based multiples in emerging markets: Evidence from the JSE Securities Exchange. J. Appl. Bus. Res. 2013, 29, 829–852.
  7. Herrmann, V.; Richter, F. Pricing with Performance-Controlled Multiples. Schmalenbach Bus. Rev. 2003, 55, 194–219.
  8. Miciuła, I.; Kadłubek, M.; Stepien, P. Modern methods of business valuation-case study and new concepts. Sustainability 2020, 12, 2699.
  9. Kazlauskiene, V.; Christauskas, Č. Business Valuation Model Based on the Analysis of Business Value Drivers. 2008. Available online: https://www.ceeol.com/search/article-detail?id=952729 (accessed on 10 March 2024).
  10. Chan, L.K.C.; Lakonishok, J.; Swaminathan, B. Industry classifications and return comovement. Financ. Anal. J. 2007, 63, 56–70.
  11. Liu, J.; Nissim, D.; Thomas, J. Equity valuation using multiples. J. Account. Res. 2002, 40, 135–172.
  12. Kim, M.; Ritter, J.R. Valuing IPOs. J. Financ. Econ. 1999, 53, 409–437.
  13. Kaplan, S.N.; Ruback, R.S. The Market Pricing of Cash Flow Forecasts: Discounted Cash Flow vs. the Method of "Comparables". J. Appl. Corp. Financ. 1996, 8, 45–60.
  14. Kaplan, S.N.; Ruback, R.S. The Valuation of Cash Flow Forecasts: An Empirical Analysis. J. Financ. 1995, 50, 1059–1093.
  15. Boatsman, J.R.; Baskin, E.F. Asset Valuation with Incomplete Markets. Account. Rev. 1981, 56, 38–53.
  16. Agnes Cheng, C.S.; McNamara, R.A.Y. The valuation accuracy of the price-earnings and price-book benchmark valuation methods. Rev. Quant. Financ. Account. 2000, 15, 349–370.
  17. Grbenic, S.O.; Zunk, B.M. The Formation of Peer Groups in the Pricing Process of Privately Held Businesses: Can Firm Size Serve as a Selection Criterion? Empirical Evidence from Europe. Int. J. Bus. Humanit. Technol. 2014, 4, 73–90.
  18. Schreiner, A. Equity Valuation Using Multiples: An Empirical Investigation; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2007.
  19. Lie, E.; Lie, H.J. Multiples Used to Estimate Corporate Value. Financ. Anal. J. 2002, 58, 44–54.
  20. Alford, A.W. The Effect of the Set of Comparable Firms on the Accuracy of the Price-Earnings Valuation Method. J. Account. Res. 1992, 30, 94.
  21. Nel, W.S.; Le Roux, N.J. An Analyst's Guide to Sector-Specific Optimal Peer Group Variables and Multiples in the South African Market. Econ. Manag. Financ. Mark. 2017, 12, 25–54.
  22. Dittmann, I.; Weiner, C. Selecting Comparables for the Valuation of European Firms. SSRN Electron. J. 2005.
  23. Asness, C.S.; Porter, R.B.; Stevens, R.L. Predicting Stock Returns Using Industry-Relative Firm Characteristics. SSRN Electron. J. 2000.
  24. Codau, C. Influencing Factors of Valuation Multiples of Companies. Ann. Univ. Apulensis Ser. Oeconomica 2013, 15, 391–401.
  25. Henschke, S.; Homburg, C. Equity valuation using multiples: Controlling for differences amongst peers. SSRN 2009.
  26. Cooper, E.W.; Barenbaum, L.; Schubert, W. Using Guideline Company Multiples for Small Firm Valuations. Valuat. Strateg. 2013, 16, 4–17.
  27. Paglia, J.K.; Harjoto, M. The discount for lack of marketability in privately owned companies: A multiples approach. J. Bus. Valuat. Econ. Loss Anal. 2010, 5.
  28. Harvard Law School. Determining Control Premiums: A Better Approach. Valuat. Strateg. 2014, 17, 44–46.
  29. Alexandridis, G.; Fuller, K.P.; Terhaar, L.; Travlos, N.G. Deal size, acquisition premia and shareholder gains. J. Corp. Financ. 2013, 20, 1–13.
  30. Hertzel, M.; Smith, R.L. Market Discounts and Shareholder Gains for Placing Equity Privately. J. Financ. 1993, 48, 459–485.
  31. Madura, J.; Ngo, T.; Viale, A.M. Why do merger premiums vary across industries and over time? Q. Rev. Econ. Financ. 2012, 52, 49–62.
  32. Bouwman, C.H.S.; Fuller, K.; Nain, A.S. Market valuation and acquisition quality: Empirical evidence. Rev. Financ. Stud. 2009, 22, 633–679.
  33. Aggarwal, R.; Bhagat, S.; Rangan, S. The impact of fundamentals on IPO valuation. Financ. Manag. 2009, 38, 253–284.
  34. Hrazdil, K.; Scott, T. The role of industry classification in estimating discretionary accruals. Rev. Quant. Financ. Account. 2013, 40, 15–39.
  35. Arora, P.; Kweh, Q.L.; Mahajan, D. Performance comparison between domestic and international firms in the high-technology industry. Eurasian Bus. Rev. 2018, 8, 477–490.
  36. Kahle, K.M.; Walkling, R.A. The Impact of Industry Classifications on Financial Research. J. Financ. Quant. Anal. 1996, 31, 309.
  37. Harbula, P. Valuation Multiples: Accuracy and Drivers Evidence from the European Stock Market. Bus. Valuat. Rev. 2009, 28, 186–200.
  38. De Franco, G.; Hope, O.K.; Larocque, S. Analysts' choice of peer companies. Rev. Account. Stud. 2015, 20, 82–109.
  39. Bhojraj, S.; Lee, C.M.C. Who is my peer? A valuation-based approach to the selection of comparable firms. J. Account. Res. 2002, 40, 407–439.
  40. Albuquerque, A.M.; De Franco, G.; Verdi, R.S. Peer choice in CEO compensation. J. Financ. Econ. 2013, 108, 160–181.
  41. Abbott, A.B. Estimating the Discount for Lack of Marketability: A Best Fit Model. Valuat. Strateg. 2012, 15, 20–25.
  42. Da Silva Rosa, R.; Limmack, R.; Woodliff, D. The Equity Wealth Effects of Method of Payment in Takeover Bids for Privately Held Firms. Aust. J. Manag. 2004, 29, 93–110.
  43. Yin, Y.; Peasnell, K.; Lubberink, M.; Hunt, H.G. Determinants of Analysts' Target P/E Multiples. J. Investig. 2014, 23, 35–42.
  44. Bonacchi, M.; Marra, A.; Zarowin, P. Earnings Quality of Private and Public Firms: Business Groups versus Stand-Alone Firms; Social Sciences Research Network (SSRN): New York, NY, USA, 2017.
  45. Ding, K.; Peng, X.; Wang, Y. A machine learning-based peer selection method with financial ratios. Account. Horiz. 2019, 33, 75–87.
  46. Husmann, S.; Shivarova, A.; Steinert, R. Company classification using machine learning. Expert Syst. Appl. 2022, 195, 116598.
  47. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
  48. Hoberg, G.; Phillips, G. Text-based network industries and endogenous product differentiation. J. Polit. Econ. 2016, 124, 1423–1465.
  49. Geertsema, P.; Lu, H. Relative Valuation with Machine Learning. J. Account. Res. 2023, 61, 329–376.
  50. Guryanova, L.; Panasenko, O.; Gvozditskyi, V.; Ugryumov, M.; Strilets, V.; Chernysh, S. Methods and Models of Machine Learning in Managing the Market Value of the Company. 2021. Available online: https://ceur-ws.org/Vol-2927/paper5.pdf (accessed on 3 May 2024).
  51. Sommer, F.; Rose, C.; Wöhrmann, A. Negative value indicators in relative valuation-An empirical perspective. J. Bus. Valuat. Econ. Loss Anal. 2014, 9, 23–54.
  52. Benninga, S.Z.; Sarig, O.H. Corporate Finance: A Valuation Approach; McGraw-Hill: New York, NY, USA, 1996.
  53. Gujarati, D.N.; Porter, D.C. Basic Econometrics; McGraw-Hill: New York, NY, USA, 2009.
  54. Goldberger, J.; Roweis, S.; Hinton, G.; Salakhutdinov, R. Neighbourhood components analysis. Proc. Adv. Neural Inf. Process. Syst. 2004, 17, 4.
  55. Kohonen, T. Self-organized formation of topologically correct feature maps. Biol. Cybern. 1982, 43, 59–69.
  56. Kohonen, T. Learning Vector Quantization. In Self-Organizing Maps; Springer: Berlin/Heidelberg, Germany, 1995; pp. 175–189.
  57. Ezugwu, A.E.; Ikotun, A.M.; Oyelade, O.O.; Abualigah, L.; Agushaka, J.O.; Eke, C.I.; Akinyelu, A.A. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. Eng. Appl. Artif. Intell. 2022, 110, 104743.
  58. Vesanto, J. Neural Network Tool for Data Mining: SOM Toolbox. 2000. Available online: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=0e9bee375e885c4740ba0dab007167c485fa1e48 (accessed on 18 March 2024).
  59. Villmann, T.; Bauer, H.U. Applications of the growing self-organizing map. Neurocomputing 1998, 21, 91–100.
  60. Grbenic, S.O. Private Firm Valuation using Enterprise Value Multiples: An Examination of Relative Performance on Minority and Majority Share Transactions. SSRN Electron. J. 2021.
  61. Chullen, A.; Kaltenbrunner, H.; Schwetzler, B. Does consistency improve accuracy in multiple-based valuation? J. Bus. Econ. 2015, 85, 635–662.
  62. LeClair, M.S. Valuing the Closely-Held Corporation: The Validity and Performance of Established Valuation Procedures. Account. Horiz. 1990, 4, 31–42.
Figure 1. Boxplots—comparison of error measures. Comment: Comparison of error measures between N_pool = 5 (OLS LR) and N_neigh = {1,2} (SOM).
Table 1. Input data descriptive statistics.

| | Fundamental Characteristics (Target) | Industry Sector (Target) | Country and Continent (Target) | Period of Transaction | Total |
| Number of input variables | 42 | 20 ** | 35+4 *,** | 3 | 104 |
| Number of input dummy variables | 22 | 20 ** | 35+4 *,** | 3 | 84 |

Comment: * 35+4 represents 35 countries from 4 continents. ** Transactions are not limited to a single dummy variable; if various types were included, the respective values were set to 1.
Table 2. Output data descriptive statistics.

| | FM1 EPV/Sales | FM2 EPV/EBITDA | FM3 EPV/EBIT | FM4 EPV/Total Assets | FM5 EQV/EBT |
| Number of samples | 41,427 | 28,465 | 28,441 | 41,427 | 30,498 |
| Number of missing data | 0 | 12,962 | 12,986 | 0 | 10,929 |
| Mean | 9826.59 | 15.52 | 519.35 | 151.98 | 69.7 |
| Standard deviation | 1.5 × 10^6 | 1.1 × 10^5 | 3.8 × 10^5 | 9.8 × 10^5 | 9.3 × 10^4 |
| Skewness | 1.9 × 10^2 | 1.7 × 10^2 | 1.7 × 10^2 | 2.0 × 10^2 | 1.5 × 10^2 |
| Kurtosis | 3.8 × 10^4 | 2.8 × 10^4 | 2.8 × 10^4 | 4.1 × 10^4 | 2.3 × 10^4 |
Table 3. SOM algorithm setup.

| Parameter | Value |
| Input data dimension | 104 |
| Number of epochs | 10,000 * |
| SOM dimensions | 100 × 100 |
| Topology function | Grid |
| Initial neighbourhood size | 10 |
| Distance function | Euclidean |
| Sample size | 20,713 |

Comment: * 10,000 epochs divided into two halves, the first 5000 epochs for coarse tuning and the remaining 5000 for fine tuning.
Table 4. Error measures between OLS LR and SOM.

| | Mean | Median | Std.dev | CV |
| RAPE_5 | 4.938 | 0.838 | 242.800 | 49.166 |
| RLAPE_5 | 1.664 | 1.222 | 1.554 | 0.934 |
| RSPE_5 | 6.645 × 10^6 | 0.820 | 8.663 × 10^8 | 130.371 |
| RAPE_6 | 4.675 | 0.834 | 252.560 | 54.025 |
| RLAPE_6 | 1.654 | 1.220 | 1.518 | 0.918 |
| RSPE_6 | 5.942 × 10^6 | 0.845 | 7.808 × 10^8 | 131.398 |

Comment: Std.dev indicates standard deviation; CV indicates the coefficient of variation.
Table 5. Statistical performance analysis of results for SOM.

| | RAPE, N_neigh = 1 | RAPE, N_neigh = 2 | RLAPE, N_neigh = 1 | RLAPE, N_neigh = 2 | RSPE, N_neigh = 1 | RSPE, N_neigh = 2 |
| Mean | 44.533 | 1.758 | 1.091 | 1.211 | 1.599 × 10^7 | 5.365 × 10^4 |
| Median | 0.592 | 0.720 | 0.617 | 0.814 | 0.318 | 0.504 |
| Std.dev | 3.997 × 10^3 | 58.464 | 1.338 | 1.282 | 2.048 × 10^9 | 7.339 × 10^6 |
| CV | 89.301 | 33.247 | 1.226 | 1.059 | 128.017 | 136.798 |

Comment: Std.dev indicates standard deviation; CV indicates the coefficient of variation.
Table 6. Direct comparison of winners.

| | All LR Variables Included: SOM_sup | All LR Variables Included: LR_sup | F-Test Selection: SOM_sup | F-Test Selection: LR_sup | NCA Selection: SOM_sup | NCA Selection: LR_sup |
| EPV/Sales | 18,147 | 2,566 | 18,515 | 2,198 | 17,086 | 3,627 |
| EPV/EBITDA | 11,885 | 2,347 | 11,859 | 2,373 | 11,763 | 2,469 |
| EPV/EBIT | 11,300 | 2,920 | 11,250 | 2,970 | 10,962 | 3,258 |
| EPV/Total assets | 18,371 | 2,342 | 18,366 | 2,347 | 18,299 | 2,414 |
| EQV/EBT | 12,768 | 2,481 | 12,802 | 2,447 | 12,733 | 2,516 |

