1. Introduction
In the last decade, the economic and financial crisis in Europe, triggered by the US subprime mortgage crisis, has created a pressing need for evaluation tools able to provide ‘slender’ and reliable mass appraisals [
1,
2,
3]. The inability to update properties’ market values over time with respect to the current selling prices trend, and the inadequacy of the methodologies used to assess market values, mainly based on direct estimates that require long processes and lead to results influenced by significant approximations [
4,
5,
6,
7,
8], are the main causes of the global economic crisis that started in 2008. Moreover, the negative effects triggered by inappropriate valuations have highlighted the need for adequate professional skills in property appraisals. In this context, the International Valuation Standards define uniform and shared guidelines in order to guarantee a unique code based on the same principles and rules among professionals and the public interest in the valuation models [
9].
Controlling uncertainty is aimed at avoiding or, at least, reducing the likelihood of systemic financial and economic market crises, like the US subprime crisis of 2007. The Covid-19 emergency and the resulting global pandemic have also created a huge amount of uncertainty around the world, in terms of the enormous market volatility that could affect real estate sales and prices. In fact, the current Covid-19 crisis is of neither financial nor economic origin, is not the consequence of a war, and is causing a global upheaval in all sectors. The impacts are affecting all fields, starting from the collapse of global and national Gross Domestic Product. Phases of recession, a slowdown in national and international trade, an increase in unemployment with consequent lower spending power and income impoverishment, an increase in crime and an overall modification of social relationships are the main effects. Inevitably, this will have a significant impact on individuals’ ability to pay rent, mortgages and various household expenditures [
10,
11,
12].
In the context of real estate evaluations, the main impacts derive from the need to limit travel and contacts, and from the introduction of drive-by assessments, which only provide for the external inspection of the property to be valued. In the case of bank lending, this typology of preliminary investigation aimed at exploring the property does not ensure the necessary guarantee to lending institutions [
13]. In this sense, Aronsohn A. (IVSC Technical Director) underlines that uncertainty, already present in ‘normal’ times, is inherent in most market valuations, since there is rarely a single price with which it is possible to compare the valuation [
14]. In a scenario such as the one marked by the ongoing health and economic emergency, the only way to make predictions is to proceed by hypothesis [
15].
In recent years, the static nature of the traditional valuation methods and their inability to consider the complex and changing socio-economic dynamics, and their effects on the real estate market, have generated the experimentation and diffusion of innovative mass appraisal models (genetic algorithms, spatial analysis models, fuzzy logic, artificial neural networks, etc.).
Starting from the spread of spatial big data, i.e., large amounts of data from heterogeneous sources, the innovative assessment techniques are supported by elaborated computing technologies, which are able to automate the implementation processes that, otherwise, would require more time. The Automated Valuation Methods (AVMs) are characterized by a strong theoretical and methodological basis, and are able to: (i) automatically capture the causal functional links between explanatory variables and selling prices; and (ii) obtain reliable forecasts of property market values over the medium-long term [
16,
17,
18].
In the framework outlined, the use of innovative statistical valuation methods has become necessary for the different market operators (buyers, sellers, institutions, real estate funds, insurance companies, banks, etc.) in order to determine more appropriate and objective property values, and to effectively monitor the evolution of property values [
19,
20].
In the sector of real estate valuations, the widespread interest in these techniques testifies to the growing central role played by AVMs in the support of evaluation processes and the periodic updates of the public and private assets values [
21,
22,
23].
The present research proposes a contribution to the debate on the use of AVMs through the definition of, and experimentation with, an evaluation model for corporate properties, i.e., for those properties characterized by large size, with non-residential intended uses and the widespread interest of professional and/or institutional investors. In particular, two different techniques have been implemented in order to identify the functional relationships between the selling prices and the considered factors. In this research, a GIS-based Territorial Information Tool that processes the properties selling prices and the explanatory variables was developed. A Geographic Information System (GIS) is a tool that aims to receive, store, process, analyse, manage and represent geographic data. Many studies have used the potentialities of GIS tools to investigate different economic phenomena, such as dynamics related to per capita income in Europe [
24], the links between urban morphology and economic growth [
25], and the relationship between human capital, represented through the level of education, and productivity in the different areas of Europe [
26]. The common goal of all the applications is to identify, through spatial analysis, the links between a deductive theoretical system and an inductive empirical one. The GIS environment can support: (i) the analysis of spatial variables and interdependencies; (ii) the identification of the value of the spatial component variables; and (iii) the definition of predictive models. The use of GIS-based systems allows us to investigate the influence of spatial elements in determining the price of a property [
27], and to effectively define a series of spatial variables, increasing the objectivity of the process and also supporting the user through an intuitive graphic representation [
28].
The model proposed constitutes an expeditious assessment tool that allows the Public Administration to identify the potential future value of public assets following enhancement processes. In particular, the model could be useful in the initial evaluation phases of redevelopment initiatives, which can also be integrated through multi-criteria analysis [
29]. At the same time, the proposed tool can be used by private investors to identify the areas where the market is most dynamic, in terms of the number of transactions occurring in the short term, and where the profit opportunities are highest, and by independent experts to formulate reliable value judgments on the properties. The appraisers and the core valuation of the Asset Management Companies could implement the model as a comparison tool with the classic assessment procedures in order to verify the congruity of the values assessed by independent experts. Finally, for the institutional subjects involved (banks, Public Administrations, insurance companies, etc.), the proposed model can be applied to monitor, directly and in a more transparent way, the evolution over time of the market value of the Fund’s assets, and consequently the progress of their investment. The proposed tool could also be used for the representation of alternative scenarios related to different intended uses, in order to enhance abandoned or under-utilized property assets and, with reference to the current crisis triggered by Covid-19, to analyse the market trend of relevant and large assets.
This work is organized as follows. In
Section 2, the main mass appraisal methods and their respective predictive potentialities are illustrated. In
Section 3, the proposed method and the two assessment techniques implemented are explained: the first is a non-linear regressive procedure, named Evolutionary Polynomial Regression, the other one is a linear spatial regressive procedure, named Geographically Weighted Regression. In
Section 4, the two different techniques are applied to two sample corporate properties, respectively, located in the cities of Rome and Milan (Italy), in order to determine their market value, taking into account a series of factors; the results of the two implemented techniques are then compared. The elaboration of the input and output data in the GIS environment allowed the development of an intuitive platform for the immediate representation of the results, and their easy interpretation, even to non-expert users. In
Section 5, the conclusions are discussed by describing the results of the research and identifying the limits, the innovative elements and the possible lines of development for future research.
2. Background on Mass Appraisal Techniques
In the real estate valuations sector, the role of mass appraisal techniques has become strategic: (i) to define the urban policies aimed at the enhancement of existing property assets [
30]; (ii) to develop technical and economic refunctionalization initiatives [
31]; (iii) to evaluate the risk related to the provision of mortgage loans by the credit institutions [
32]; and (iv) to assess the urban planning choices carried out by the Public Administration for the definition of territorial strategic programs [
33,
34]. According to the International Association of Assessing Officers [
35], mass appraisal concerns ‘the process of valuing a group of properties as of a given date and using common data, standardized methods, and statistical testing’. In fact, with reference to an appropriate spatial and temporal horizon, mass appraisal methods concern large samples of properties similar to each other, collected in a systematic way, to be assessed through the implementation of mathematical algorithms.
Mass appraisal is a statistical procedure for the definition of a representative sample of a larger database in order to assess the overall value of the database [
36] through an inferential approach. The Appraisal Institute Foundation defines mass appraisal techniques as including the following steps: (i) the identification of the property being assessed; (ii) the definition of the market trade area relating to the property to be assessed; (iii) the selection of the factors (demand and supply) that influence the value formation in this trade area; (iv) the elaboration of the model that returns the value formation by starting from the characteristics of the trade area; (v) the application of the model and assessment of the property value; (vi) the analysis of the model’s results (statistical error, more significant variables, etc.) [
37].
In the international reference literature, numerous contributions demonstrate the potentialities of mass appraisal techniques and the wide interest for this issue [
38,
39,
40]. In particular, in these applications, the ‘advanced’ mass appraisal procedures have been primarily implemented to assess the influence of locational, productive, technological and socio-economic factors on properties’ selling prices. In this sense, the aim of mass appraisal techniques concerns the analysis of the contribution of each component in the market value formation processes, in order to support public subjects in the planning decision-making phases.
In the framework outlined, Pagourtzi et al. [
19] proposed the following mass appraisal technique classifications:
- The Hedonic Price method;
- Artificial Neural Networks;
- Fuzzy logic methods;
- ARIMA (autoregressive integrated moving average) models;
- Spatial analysis methods.
The Hedonic Price (HP) method allows the assessment of the marginal contributions of the influencing factors on property price, taking into account that this value depends on the utilities obtainable from the qualitative and quantitative characteristics that compose it [
41]; therefore, according to the hedonic price method, the value of the property can be expressed as the sum of the contributions of its characteristics. In fact, the aim of the HP model is to estimate the price of a property as a function of its characteristics [
42]. In theory, the price paid for the property purchase can be decomposed into the hedonic prices (implicit prices) of the individual attributes that constitute the whole unit. The resulting regression coefficients provide the assessment of the individual property features value.
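As a minimal sketch of the decomposition described above, the following Python snippet fits a linear hedonic model by ordinary least squares on a synthetic sample, so that each regression coefficient can be read as the implicit price of one attribute. The attributes (floor area, distance from the centre), the data and the helper `fit_hedonic` are illustrative assumptions and are not taken from the cited studies.

```python
import numpy as np

def fit_hedonic(features, prices):
    """Return OLS coefficients [intercept, b_1, ..., b_k] of a linear hedonic model."""
    X = np.column_stack([np.ones(len(prices)), features])  # add the intercept column
    coeffs, *_ = np.linalg.lstsq(X, prices, rcond=None)
    return coeffs

# Hypothetical sample: columns are floor area (m2) and distance from the centre (km).
features = np.array([
    [120.0, 2.0],
    [300.0, 5.0],
    [80.0, 1.0],
    [450.0, 8.0],
    [200.0, 3.0],
])
# Synthetic prices generated from known implicit prices, so the fit can be checked.
prices = 50_000 + 3_000 * features[:, 0] - 10_000 * features[:, 1]

coeffs = fit_hedonic(features, prices)
print(coeffs)  # intercept, implicit price per m2, implicit price per km
```

Because the synthetic prices are an exact linear function of the attributes, the estimated coefficients reproduce the implicit prices used to generate them; with real transaction data they would be estimates affected by the omitted-variable and sample-size issues typical of hedonic models.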
The main application fields in which the HP method has been most used, classified by Capello [
43], concern the assessment of the negative environmental externalities in urban areas [
44,
45,
46] and of the ex post urban planning policies in order to analyse the effects of initiatives already carried out through land rent variation [
47,
48]. Moreover, through the HP method, it has been possible to determine the effects of social, environmental and urban factors on real estate values [
49,
50,
51,
52], highlighting the importance of the proximity to urban-type services [
53,
54,
55] and environmental attractors such as green areas [
56,
57]. The major limitation of the hedonic price method is the impossibility of considering interactions among the variables, as it is based on multivariate regression techniques. Another weakness of the HP method concerns the likelihood of omitting significant model variables. In fact, this technique requires a detailed database of each property variable (intrinsic and extrinsic variables). In addition, it is necessary to collect a large amount of data, which is not always available, in order to obtain a study sample that is sufficiently wide and representative of the phenomena, and to reliably analyse the weight of each factor on selling prices.
Artificial Neural Networks (ANN) constitute complex systems [
58] composed of a set of elementary units (neurons), suitably combined in a network structure made of highly interconnected layers, that is able to associate an output $y$ with a set of inputs ($x_1, x_2, \ldots, x_n$). The input layer represents the first level, and includes the neurons that contain the exogenous information, translated into pulses for the neurons of the upper level. The output layer, instead, is formed by the neurons that return the result generated by the network’s implementation. In the intermediate levels, called the hidden layers, the information deriving from the input layer is processed and transformed into outputs.
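As a minimal sketch of the layered structure just described, the Python snippet below propagates a set of inputs through one hidden layer to a single output. The `forward` helper, the hyperbolic tangent transfer function and all the weights are illustrative assumptions; they are not taken from the models cited in this section.

```python
import math

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate the inputs through one hidden layer and return the output y."""
    # Hidden layer: each neuron applies tanh to a weighted sum of the inputs.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: linear combination of the hidden activations.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Hypothetical network with two inputs, two hidden neurons and one output.
y = forward(
    inputs=[0.5, -1.0],
    hidden_weights=[[0.8, 0.2], [-0.4, 0.6]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.5, -0.7],
    output_bias=0.3,
)
print(y)
```

In a real application the weights would be learned from the sample during training, while the number of layers and neurons would still have to be fixed in advance.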
The complexity of the ANN structure depends on the number of neurons and existing connections. The ANN have been widely used both for the prediction of real estate values in the short and medium term [
59,
60,
61,
62,
63] and for the determination of market micro-zones [
64,
65]. The limit of ANNs is that they depend on exact information about the system under study and on the training methods used, as the ANN algorithm is able to identify unnecessary data during its training [
66]. ANN models require that the structure of the neural network (e.g., model inputs, transfer functions, the number of hidden layers, etc.) is exogenously defined. Furthermore, other disadvantages are related to the over-fitting problems that are frequent in parameter estimation, and to the inability to incorporate known economic laws into the learning processes.
Fuzzy logic constitutes a linguistic-mathematical approach useful for describing ‘vague’ concepts through a formal logical support that allows us to create analytically treatable models [
67,
68]. In particular, the fuzzy rules are able to translate the mechanism which the decision maker adopts to assume the choice into formal models [
69] by associating an input linguistic relationship with an output linguistic expression [
70]. A fuzzy rule describes, in words, the rational but intuitive process that a subject follows to define the action to be taken, i.e., to reach a final decision starting from qualitative and quantitative information on the phenomenon, and on the basis of similar experiences that have already been addressed [
71]. Fuzzy logic methods have been applied in the context of property valuations in several scientific works [
72,
73,
74,
75], developing an alternative and flexible approach to uncertainty [
76]. In this context, Sarip and Hafez [
77] have developed a theoretical formulation for selling price prediction through the implementation of a fuzzy regression model. Furthermore, Renigier-Biłozor et al. [
78] have elaborated upon a decision-making algorithm based on fuzzy logic and rough set theory in order to obtain real estate values.
The limits of the fuzzy logic methods are connected to the preliminary definition of the membership function and of the operators to be used in the different steps; moreover, the computational burden connected to the use of many variables makes this method difficult to apply in cases where it is necessary to accurately describe the problem by using a large number of factors [
79]. Furthermore, fuzzy logic methods are suitable to those applications in which the low definition of the problem being analysed requires an approach that is not particularly rigid (‘fuzzy’), which allows us to intercept the errors connected to a model described in an insufficiently accurate way [
80].
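The role of the membership function and of the operators can be illustrated with a short Python sketch. It assumes triangular membership functions and the minimum as the fuzzy AND operator; the linguistic terms, the thresholds and the rule `rule_high_price` are hypothetical and only serve to show the mechanism.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def rule_high_price(distance_km, quality_score):
    """IF location is 'central' AND quality is 'high' THEN price is 'high'."""
    central = triangular(distance_km, -5.0, 0.0, 5.0)
    high_quality = triangular(quality_score, 5.0, 10.0, 15.0)
    return min(central, high_quality)  # fuzzy AND taken as the minimum

# Degree to which the rule fires for a property 1 km from the centre
# with a quality score of 8 on a hypothetical 0-10 scale.
degree = rule_high_price(distance_km=1.0, quality_score=8.0)
print(degree)
```

Both the shape of the membership functions and the choice of the AND operator are modelling decisions that must be made before the method is applied, which is the preliminary burden mentioned above.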
An Autoregressive Integrated Moving Average (ARIMA) model concerns a particular econometric technique which aims at investigating historical or temporal series in order to describe their main characteristics and to predict the future values of the series. This approach is useful in the situation in which limited information on the process of generating the data is available, or when there is no efficient explanatory model that links the forecast variable to other variables.
An ARIMA process constitutes an extension of the autoregressive moving average (ARMA) model, which is a combination of an autoregressive model (AR) and a moving average model (MA). Therefore, while ARMA models are linear dynamic models that generate stationary processes and are able to represent and approximate the autocorrelation structure of any stationary process, ARIMA models first transform the data, by differencing, in order to obtain a stationary series from a non-stationary one (e.g., a random walk). The main limitation of ARIMA models concerns the presumed linear form of the model and the exclusion of all non-linear correlation schemes [
81].
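The data transformation step can be sketched in a few lines of Python: first-order differencing of a non-stationary series (here a synthetic random walk) returns the stationary increments that generated it. The series, the seed and the `difference` helper are illustrative assumptions; fitting a complete ARIMA model would require a dedicated statistical library.

```python
import random

def difference(series, order=1):
    """Apply first-order differencing `order` times (the 'I' in ARIMA)."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(200)]  # stationary white noise

# A random walk is the cumulative sum of the shocks, hence non-stationary.
walk = []
level = 0.0
for s in shocks:
    level += s
    walk.append(level)

diffed = difference(walk, order=1)  # recovers the shocks from the second one onwards
print(len(walk), len(diffed))
```

Each differencing pass shortens the series by one observation, which is one reason why short or irregular real estate time series are difficult to handle with this family of models.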
ARIMA models have been used to support valuation and real estate issues [
82,
83,
84,
85,
86], especially in the analysis of correlations among housing prices, population income and bank mortgage loans [
87,
88,
89,
90]. ARIMA models find their most effective application when there are substantial data time series, whereas the data that characterize the real estate market, even if they are referable to a specific historical moment, often do not have the characteristics of homogeneity and frequency such as to make this type of model applicable in an effective manner.
Spatial analysis aims to examine the aggregation forms of a phenomenon and their relationships in space. The spatial units to be studied have to be geo-referenced, i.e., the specific geographic coordinates (longitude, latitude) capable of uniquely locating each unit in space must be known. Furthermore, the spatial dimension is analyzed and interpreted from the illustrative-descriptive point of view in order to investigate the existence of a spatial dependence between what happens in a territorial unit and what occurs elsewhere in space. The first law of geography, according to which ‘everything is related to everything else, but near things are more related than distant things’ [
91] is the logic behind the models, aimed at studying dynamic phenomena in the spatial dimension.
Spatial analysis methods have been implemented as a methodological approach for the study of different types of spatial problems [
92,
93,
94]. The spatial analysis methods are particularly useful when a series of geo-referenced data is available: this allows us to analyse a multiplicity of problems by adding, among the different variables, the ‘spatial’ one. In these methods, different problems can arise related to the difficulty in determining the elementary analysis unit, to the behaviour near the boundaries, to the spatial interpolation, to the spatial autocorrelation, to the non-static space-time parameters and to the different definition scales of the parameters [
95]. For these reasons, the statistical and geostatistical analysis of the database constitutes the preparatory phase for the application of spatial analysis methods.
In recent years, spatial analysis applications, implemented with GIS-based tools, have introduced new perspectives in order to investigate different economic phenomena.
Applications of GIS-based tools have been carried out by several authors in order to measure the impact of spatial attributes on real estate prices and to define a prediction model in terms of the spatial estimation of residential values [
96,
97,
98,
99]. In particular, Oud [
100] highlighted the role of GIS applied to automated regression in order to assess the value of a panoramic view by considering two clusters in the residential market of the Dutch municipality of Alkmaar.
Moreover, Sesli [
101] used a GIS tool to define Real Estate Evaluation Maps integrated by the Multi-Criteria Decision-Making Analysis, with reference to the Atakum neighborhood in Samsun Province (Turkey). Finally, Connor [
102] demonstrated the effectiveness of using GIS technology to enhance data review, market and locational analysis, and the appraisers’ market analysis abilities.
3. Outlines of Evolutionary Polynomial Regression and Geographically Weighted Regression
With reference to Automated Valuation Methods (AVMs), in this research, two different techniques were implemented for corporate property evaluation. The application of the method allows us to identify the functional relationships between the selling prices and the main influencing factors.
3.1. Evolutionary Polynomial Regression (EPR)
Evolutionary Polynomial Regression (EPR) can be considered to be a generalization of the classical regressive methods. EPR is a technique aimed at the construction of polynomial symbolic models that uses a genetic algorithm to search for the best mathematical structures that describe the phenomenon being analyzed. The methodology underlying EPR limits the set of operators used in the symbolic regression to a subset consisting of addition, multiplication, power, logarithm and exponentials. EPR is linear with respect to parameters but not linear with respect to the model structure, which is obtained through the combination of Genetic Programming and classical numerical regression [
103].
Given the dependent variable ($Y$) and the independent factors ($X_i$) from which the functional form that defines $Y$ is to be built, the generic structure of the non-linear model implemented in EPR can be summarized by Equation (1):
$$Y = a_0 + \sum_{i=1}^{n} \left[ a_i \cdot (X_1)^{(i,1)} \cdot \ldots \cdot (X_j)^{(i,j)} \cdot f\left( (X_1)^{(i,j+1)} \cdot \ldots \cdot (X_j)^{(i,2j)} \right) \right] \quad (1)$$
where $n$ is the number of additive terms, $a_i$ represents the numerical parameters to be identified, $X_i$ are the potential explanatory variables, $(i, l)$, with $l = (1, \ldots, 2j)$, is the exponent of the $l$-th input related to the $i$-th term, and $f$ is a function identified by the user among a set of possible mathematical expressions. The exponents $(i, l)$ are also selected by the user from a range of possible real numbers.
The iterative analysis of the mathematical model, carried out through the combinations of exponents to be attributed to each of the potential inputs, is optimized by means of a population generated by a genetic algorithm, whose individuals are constituted by the set of exponents chosen by the user.
The underlying EPR algorithm does not require an a priori definition of the mathematical expression and of the variables that best represent the database, since it is the iterative process of the genetic algorithm that returns the best solution.
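The statement that EPR is linear in its parameters but non-linear in its structure can be made concrete with a small Python sketch. For one fixed candidate matrix of exponents, each additive term is a product of powers of the inputs and the coefficients are estimated by ordinary least squares. The helpers `epr_design_matrix` and `fit_epr_structure`, the data and the exponents are illustrative assumptions; the user-selected function of the inputs is omitted (treated as absent), and the genetic search over exponent matrices is not reproduced here.

```python
import numpy as np

def epr_design_matrix(X, exponents):
    """Build one column per additive term: prod_j X[:, j] ** exponents[i, j]."""
    terms = [np.prod(X ** row, axis=1) for row in exponents]
    return np.column_stack([np.ones(X.shape[0])] + terms)  # leading column for the bias

def fit_epr_structure(X, y, exponents):
    """Estimate the bias and term coefficients by least squares for fixed exponents."""
    Z = epr_design_matrix(X, exponents)
    coeffs, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coeffs, Z @ coeffs

# Hypothetical sample with two strictly positive explanatory variables.
X = np.array([[1.0, 4.0], [2.0, 1.0], [3.0, 9.0],
              [4.0, 16.0], [5.0, 4.0], [6.0, 25.0]])
# Synthetic target: y = 2 + 3 * x1 * x2^0.5 + 0.5 * x2^2.
y = 2.0 + 3.0 * X[:, 0] * np.sqrt(X[:, 1]) + 0.5 * X[:, 1] ** 2

# One candidate structure: row i holds the exponents of the i-th additive term.
exponents = np.array([[1.0, 0.5],
                      [0.0, 2.0]])
coeffs, fitted = fit_epr_structure(X, y, exponents)
print(coeffs)
```

In the full procedure, a genetic algorithm would propose many such exponent matrices, each one would be fitted as above, and the candidates would then be ranked by accuracy and parsimony.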
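The accuracy of each equation elaborated by EPR is verified through its Coefficient of Determination (CoD), defined by Equation (2):
$$CoD = 1 - \frac{N-1}{N} \cdot \frac{\sum_{N} \left( y_{\mathrm{estimated}} - y_{\mathrm{detected}} \right)^2}{\sum_{N} \left( y_{\mathrm{detected}} - \mathrm{mean}(y_{\mathrm{detected}}) \right)^2} \quad (2)$$
where $y_{\mathrm{estimated}}$ is the value of the dependent variable assessed by the EPR algorithm, $y_{\mathrm{detected}}$ is the detected value of the dependent variable and $N$ is the size of the analyzed sample. The closer the CoD value is to one, the higher the accuracy of the expression returned by the EPR algorithm.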
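A minimal Python sketch of this accuracy check follows. It assumes that the CoD takes the form 1 - (N - 1)/N * SSE/SST, where SSE is the sum of squared differences between estimated and detected values and SST is the sum of squared deviations of the detected values from their mean; the function name and the sample values are hypothetical.

```python
def coefficient_of_determination(estimated, detected):
    """CoD = 1 - (N - 1) / N * SSE / SST (assumed form, see the lead-in)."""
    n = len(detected)
    mean_detected = sum(detected) / n
    sse = sum((e - d) ** 2 for e, d in zip(estimated, detected))
    sst = sum((d - mean_detected) ** 2 for d in detected)
    return 1.0 - (n - 1) / n * sse / sst

detected = [1.0, 2.0, 3.0, 4.0]      # hypothetical observed prices
perfect = list(detected)             # a model that reproduces every observation
naive = [2.5, 2.5, 2.5, 2.5]         # a model that always predicts the sample mean

print(coefficient_of_determination(perfect, detected))
print(coefficient_of_determination(naive, detected))
```

Under this assumed form, a model that merely predicts the sample mean does not score zero but 1/N, so CoD values from small samples should be compared with some care.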
The genetic algorithm underlying EPR provides a multi-objective optimization function, which pursues a Pareto optimization strategy. The objectives that are optimized in the model are: (i) the statistical accuracy of the model, by satisfying appropriate performance criteria; (ii) the reduction of the computational burden, by reducing the number of coefficients; and (iii) the reduction of the complexity of the model, through the minimization of the number of explanatory variables of the final equation. Therefore, the obtained equations must combine statistical accuracy in the explanation of the investigated phenomenon with simplicity of interpretation of the outputs by the end user.
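The trade-off among these objectives can be sketched as a Pareto filter over candidate equations: a model is kept only if no other model is at least as good on every objective and strictly better on one. The candidate models, their scores and the `pareto_front` helper below are hypothetical and only illustrate the selection principle, not the search performed by the genetic algorithm.

```python
def pareto_front(models):
    """Keep the models not dominated in (higher CoD, fewer terms, fewer variables)."""
    def dominates(a, b):
        no_worse = (a["cod"] >= b["cod"] and a["terms"] <= b["terms"]
                    and a["variables"] <= b["variables"])
        better = (a["cod"] > b["cod"] or a["terms"] < b["terms"]
                  or a["variables"] < b["variables"])
        return no_worse and better
    return [m for m in models if not any(dominates(o, m) for o in models)]

# Hypothetical candidate equations returned by a search.
candidates = [
    {"name": "A", "cod": 0.95, "terms": 5, "variables": 4},
    {"name": "B", "cod": 0.93, "terms": 3, "variables": 3},
    {"name": "C", "cod": 0.90, "terms": 3, "variables": 4},  # dominated by B
    {"name": "D", "cod": 0.80, "terms": 1, "variables": 1},
]

front = pareto_front(candidates)
print([m["name"] for m in front])  # → ['A', 'B', 'D']
```

The final choice among the non-dominated models is left to the analyst, who weighs the gain in accuracy against the loss of interpretability.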
With reference to the applications of the EPR technique to the real estate market sector, the literature contributions are very few and recent. In particular, Tajani et al. [
104] carried out a first experimentation of the EPR technique for mass appraisal, comparing it with the ANN methods and with the HP Methods. Morano et al. [
105] used EPR techniques for an analysis of the functional relationships between the socio-economic factors in the Municipalities of the Puglia Region (Italy) and the selling prices. Morano et al. [
106] compared EPR with the Utility Additive Model for mass appraisal related to residential properties in the Italian real estate market, in order to interpret and forecast the formation of the selling prices. Morano et al. [
34] tested an evolution of EPR on three different Italian cities, which was able to generate a ‘unique’ functional form in order to simultaneously identify the best set of significant explanatory variables to describe the same phenomenon in the different selected study samples.
Morano et al. [
107], finally, analyzed the contribution of the energy performance component to housing prices in the city of Bari (Italy).
3.2. Geographically Weighted Regression (GWR)
Spatial statistics includes a series of methods to describe and model spatial data; in several cases, it can be interpreted as an extension of what the cognitive abilities intuitively perform through a formal representation in a spatial model aimed at capturing the distribution, the trend, the processes and the relationships of the investigated phenomenon [
108]. Unlike the traditional non-spatial statistical techniques, the spatial statistics methodologies use the ‘spatial’ concept in its mathematical meaning, i.e., through the analysis of mono or two-dimensional characteristics, and of proximity and orientation relationships [
109] allowing us, for example, to define spatial clusters, excluding any anomalous values, or to identify spatial relationships among different elements. This methodology has been successfully applied in the study of the residential market of the city of Wroclaw, located in Poland, in Lower Silesia [
110].
Geographically Weighted Regression (GWR) is a non-parametric weighted local regression technique, developed in statistics for curve-fitting and smoothing applications, in which regression coefficients are estimated using a spatial proximity variation model that allows the local calibration of the coefficients. The spatial coordinates of the points associated to the data are used to calculate the distance among the points: this represents the input of the kernel function that allows us to calculate the weight that represents the spatial dependence among the observations. This methodology is based on the assumption that there is a spatial correlation among the regression coefficients.
For the mathematical description of the GWR model, starting from the concept underlying regression models, i.e., the determination of the relationship between two or more sets of variables, the most common situations involve a response variable $y$ and a number of input variables $x_1$, $x_2$,…, $x_n$. If the regression is linear, Equation (3) shows the respective functional form:
$$y_i = \beta_0 + \sum_{k=1}^{n} \beta_k x_{ik} + \varepsilon_i \quad (3)$$
where $y_i$ indicates the dependent variable of the $i$-th observation, $\beta_k$ represents the constant to be associated with the independent variables $x_{ik}$, and $\varepsilon_i$ is the random variable that returns the random error. In a generic multiple regression model, the unknown parameters are estimated through the Least Squares technique. If the available observations are geographically referenced, the hypothesis of homogeneity may not hold, which happens in the case of spatial data characterized by heterogeneity [111]. GWR allows us to investigate this typology of phenomena through the introduction of the geographic coordinates $(u_i, v_i)$ assigned to each observation in space. The representative functional structure of the linear regression becomes Equation (4):
$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{n} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \quad (4)$$
The substantial difference between GWR and a classical linear regression is that, in the GWR, a coefficient is assessed for each observation, as well as for each independent variable.
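The local calibration can be sketched in Python as a weighted least squares fit repeated at every observation, with weights that decay with distance. The sketch assumes a Gaussian kernel with a fixed bandwidth and Euclidean distances; the function `gwr_coefficients`, the bandwidth and the synthetic data are illustrative assumptions and do not reproduce the implementation used in this research.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Return one row of local coefficients [b0, b1, ..., bk] per observation."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)  # distances to point i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian kernel weights
        XtW = Xd.T * w                                   # same as Xd.T @ diag(w)
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)    # local weighted least squares
    return betas

# Synthetic example (assumed data): the effect of x grows from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(50, 2))
x = rng.uniform(1.0, 5.0, size=(50, 1))
y = 100.0 + (1.0 + 0.5 * coords[:, 0]) * x[:, 0]

betas = gwr_coefficients(coords, x, y, bandwidth=3.0)
print(betas[:3])  # local intercept and slope for the first three observations
```

The bandwidth controls how quickly the weights decay: a very large value makes every local fit approach the global regression, while a very small one leaves too few effective observations for each local estimate.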
The dissemination of databases from different sources, as evidenced by the various applications developed thanks to the open data [
112], represents an important preparatory resource for the increase of the transparency of the evaluation processes. When these information data are associated with a shape file, they become particularly useful, because they allow: (i) an effective map display that makes the information intelligible even to a non-expert user; (ii) the geostatistical analyses to be performed [
113]; and (iii) the users to create interactive queries.
Several applications of GWR have been carried out in the scientific literature concerning property valuations [
114,
115,
116]. Dziauddin et al. [
117] assessed the effect of a light rail transit system (LRT) on residential property values in Greater Kuala Lumpur, Malaysia. Geographically Weighted Regression was implemented to assess the increased land value as a result of improved accessibility related to the construction of the LRT systems. Dimopoulos and Moulas [
118] highlighted the importance of GWR in an ArcGIS environment to identify the critical parameters that affect property values in the Municipality of Thessaloniki (Greece), and to create a market value forecasting tool for a fairer taxation system. Cohen et al. [
119] developed a new methodology for obtaining accurate and equitable property value assessments, that adds a time dimension to the Geographically Weighted Regressions (GWR) framework; this method was also applied to sales data for residential properties in 50 municipalities in Connecticut for 1994–2013 and 145 municipalities in Massachusetts for 1987–2012 to compare results over a long time period and across cities of two different states. Bujanda and Fullerton [
120] implemented a GWR analysis to determine the geographic footprint and to quantify the impacts of transportation infrastructure proximity and accessibility on real property values in El Paso (Texas).
With reference to the present research, a GIS-based Territorial Information Tool called SIT Valuation has been developed. In particular, the proposed tool, on the basis of the property prices and of a series of independent variables, using the two different EPR and GWR techniques, has allowed us to define the ‘price’ function by which the contribution of each variable to the selling prices has been analysed. Property prices have a strong spatial component, as neighbouring buildings are developed in the same historical period and according to similar building typologies. In this sense, the buildings will have homogeneous intrinsic characteristics. Moreover, the neighbouring buildings will be influenced by the same positional characteristics, such as the shops, schools or the presence of green areas. SIT Valuation populates the database with the independent variables (geographic, socio-economic, etc.) in order to define the map basis, and the introduction of qualitative information allows the identification of independent variables.
4. Application
4.1. Case Studies
The case studies considered in the research concern the city of Rome (Central Italy) and the city of Milan (Northern Italy), which are the Italian cities with the most dynamic real estate markets, in terms of both the number of transactions and the turnover of the real estate industry. Furthermore, these are the only two cities characterized by a high number of transactions in the corporate sector and, at the same time, by a consistent public property asset to be enhanced. These elements allow the construction of a database, named DB Corporate Real Estate, which is representative of corporate properties, as it comprises a sufficiently large population even after the necessary operations aimed at excluding anomalous data. In particular, the database for the city of Rome comprises 170 corporate properties sold in the period from 2004 to 2016, whereas the database for the city of Milan comprises 188 corporate properties, for an overall market value equal to 10 billion euros, which represents approximately 25% of the total value of corporate property investments in the two considered metropolitan cities (
Figure 1).
These two Italian cities were chosen because they attract the highest interest in corporate properties at the national level.
Figure 1 shows the percentage of corporate investments from 2008 to 2016 with reference to the cities of Milan and Rome and compared to the Italian context [
121]. The graph confirms the prominence of the two cities in terms of the volume of corporate investments.
In the last few years, Milan has offered a markedly larger number of prime properties than Rome. The international interest in the Rome and Milan markets is relevant: it confirms the significance of these cities for foreign investment and the partial saturation of the European markets.
The city of Rome presents a consistent number of trophy properties, i.e., those buildings characterized by high architectural quality that attracts prestigious brands, and consequently high standing tenants. The trophy asset concerns a specific market segment composed of high-end range buildings with a strategic and relevant location, and excellent architectural levels and finishes [
122]. The strong tourist vocation and the centralization of the governmental functions of the Italian capital are further elements that guarantee the high return of real estate investments made in Rome.
The city of Milan, on the other hand, is the Italian city with the most European style: it has always been the capital of finance and fashion; in recent years it was the location of important international events, such as Expo and the Winter Olympics, and real estate redevelopment initiatives that added further appeal to the city, i.e., the Porta Nuova project, the interventions of CityLife and Milano Innovation District (MInD) in the Expo 2015 area.
Attention has therefore focused on Rome and Milan, taking into account the interest of institutional investors and foreign professional operators. For the city of Rome, investors are looking for properties situated in prestigious and central locations, characterized by a continuity of tenants and a low vacancy rate (<10%). Conversely, for the city of Milan, investors have a higher risk appetite due to the confidence climate that characterizes this metropolitan city. This also translates into the search for peripheral properties, with a rather high market rent compared to the value, even if there is less certainty about the tenants’ persistence and vacancy times. Further development of the research may address different national or international cities in order to select the most representative ones for the analysis of the corporate property markets and to apply the evaluation methods.
4.2. Variables
The construction of the database required the identification of the variables that most influence the selling price formation as a preliminary step [
123].
The dependent variable is the selling price per unit surface of the corporate properties sold in the period 2004–2016 in the cities of Rome and Milan.
With reference to the choice of the influencing factors for the price formulation, it should be pointed out that the selection of the explanatory variables to involve in a mass appraisal model is always somewhat arbitrary, and requires an unavoidable trade-off between bias from omitted variables and increased sampling variance associated with collinearity [
124,
125,
126,
127,
128,
129]. There is relative agreement, however, on what represents the major influencing factors [
130,
131]. Some authors have studied the main characteristics to be considered in the assessment of corporate properties [
132] by identifying the fundamental classes of influencing factors [
133] and outlining the importance of location and site selection factors [
134].
Several studies highlighted that better ‘comfort’ in the workplace [
135,
136,
137] increases buildings’ attractiveness for occupiers and decreases the risk for investors, determining a higher occupancy rate and a premium on rents or property values. Some authors [
138,
139] point out the linkages between the two viewing angles from which a corporate real estate can be observed, i.e., the owner perspective, which aims at maximizing the value of the assets, and the user perspective, which aims at ensuring a suitable work environment for all operational processes [
140]. Rymarzak and Sieminska [
141] illustrated that the demand and supply factors affecting the general location choice of corporate real estate are linked not only to the ordinary locational, technological and market factors (accessibility in terms of transport networks, parking capacity, age and technical standard of existing space, market rents/sale prices, office building pattern and size) but also to the features that make the environment of work familiar and comfortable for the employees (e.g., the office space per employee).
Taking into account the mentioned literature, and through the support of the experience of the appraisers and real estate agents directly consulted, the independent variables considered are:
- C: the units’ average selling price provided by the Real Estate Market Observatory (OMI) of the Italian Revenue Agency, relating to the semester in which the sale occurred, the specific market micro-zone and the intended use of the property;
- -
: the units’ average rent provided by the OMI, clustered as the variable C;
- -
: the resident population per unit surface relative to the year of sale, built starting from the Italian Institute of Statistics (ISTAT) surveys, processing the data through a grid modeling that considers the subdivision of the municipal territory into grids of 90 meters on each side [
142];
- -
: the saleable surface of the property;
- -
: the architectural quality of the property. In particular, the variable is a dummy set equal to ‘0’ if there is no evident architectural quality; vice versa, it is set to ‘1’. The importance of identifying this variable is connected to the market appreciation for trophy buildings;
- -
: the representative coefficient of the presence of public green areas (and their size) around the property. For the assessment of the influences connected to the public green areas, the parks, gardens and historic villas within two kilometers of the property were considered, and the surface extension of the green area and the distance of the -th property estate from the green areas were simultaneously determined. The green index is obtained through the sum of the ratios of the root of the areas and the distance of the property from the green area: ;
- M: the representative coefficient of the subways around the property [
143,
144]. A maximum distance of 2 km was considered, in order to limit the computational burdens. Note the distance
of the
-th property from the
-th subway within the 2 km radius; the value of the proximity coefficient from the subway will be:
.
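The green index described in the list above (the sum, over the green areas within two kilometres, of the square root of each area divided by its distance from the property) can be sketched as follows. This is only a minimal illustration: the function name `green_index`, the tuple layout, the use of planar coordinates in metres with straight-line distances to a single reference point per green area, and the sample values are all assumptions, not details taken from the study.

```python
import math

def green_index(prop_xy, green_areas, max_dist_m=2000.0):
    """Sum of sqrt(area) / distance over the green areas within max_dist_m.

    prop_xy: (x, y) of the property in metres (assumed planar coordinates).
    green_areas: iterable of (x, y, area_m2) tuples (hypothetical layout).
    """
    total = 0.0
    for gx, gy, area in green_areas:
        dist = math.hypot(gx - prop_xy[0], gy - prop_xy[1])
        # Ignore areas beyond the search radius; a zero distance is skipped
        # to avoid dividing by zero (an assumption, not stated in the text).
        if 0.0 < dist <= max_dist_m:
            total += math.sqrt(area) / dist
    return total

# Hypothetical green areas around a property located at the origin.
parks = [
    (300.0, 400.0, 10_000.0),   # 500 m away, 1 ha
    (0.0, 1000.0, 40_000.0),    # 1 km away, 4 ha
    (3000.0, 0.0, 90_000.0),    # 3 km away, outside the 2 km radius
]
index = green_index((0.0, 0.0), parks)
```

With lengths in metres the index is dimensionless, and larger or closer green areas both raise it.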
In order to identify the data related to each property, i.e., the value of the dependent variable , the value of the independent variable , the address and the date of sale, the databases provided by the Immobilium site (I) and by Nomisma (N) were considered.
With reference to the aims of this research, it was decided to exclude properties characterized by a saleable surface of less than 500 square meters, as they are not representative of the corporate concept, as previously defined.
In the corporate sector, the most frequent intended uses are executive and commercial, on which this research has consequently concentrated.
In the city of Rome, 59% of the total transactions for the reference period concern executive properties, with an average unit selling price (Y_avg) equal to 4123 €/m², whereas 26% is attributable to commercial ones (see
Table 1). In the city of Milan too, the executive market (71%) is the most widespread sector; the commercial one follows with 14%, and the remaining 15% is divided among the other intended uses (see
Table 2).
An analysis of the Moran index for the DB Corporate Real Estate of the city of Rome (see
Table 3) shows a high spatial autocorrelation for the variables
,
,
,
and
C, a good autocorrelation for
, and an absence of correlation for
and
. The analysis of the Moran index relating to the DB Corporate Real Estate for the city of Milan (see
Table 3) shows a high spatial autocorrelation for the variables
,
,
and
, a good autocorrelation for
, and an absence of correlation for
,
and
.
These results can be interpreted taking into account the construction of the database itself; in fact, for both the cities, the most autocorrelated variable is , which reflects a logic of the continuity of the variation of the population in the space, independent from the temporal component. For the variables and there is also a high spatial autocorrelation both in Rome and in Milan: the OMI of the Italian Revenue Agency has detected that there is a difference between the selling prices in the central areas and those in the peripheral areas equal approximately to 50% in the city of Rome, and to 70% in the city of Milan. This observation is consistent with the spatial distribution, and the consequent autocorrelation, of the variable —especially for the city of Milan—and of the variable —especially for the city of Rome, where the properties with the highest architectural quality are located in the central areas.
and are spatially heterogeneous, both for Rome and Milan. This behaviour outlines the absence of a specific spatial distribution of these variables, taking into account the relationships observed through the values associated with the properties in the database. The variable is the only one for which the two cities present a very different Moran I, which is probably more connected to the specific location of the properties than to reasons related to the construction of the variables, or to the geography of the public green areas.
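As an illustration of the spatial autocorrelation measure discussed above, the global Moran index can be computed as in the following sketch. The binary distance-band weighting scheme, the function name `morans_i` and the toy data are assumptions; the weights actually used for Table 3 are not specified in this section.

```python
import numpy as np

def morans_i(values, coords, bandwidth):
    """Global Moran's I with binary distance-band weights (assumed scheme)."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between observations.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # Neighbours are distinct locations within the bandwidth.
    w = ((d > 0) & (d <= bandwidth)).astype(float)
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

# Four hypothetical properties on a line, one unit apart.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
clustered = morans_i([1.0, 1.0, 5.0, 5.0], line, bandwidth=1.0)
alternating = morans_i([1.0, 5.0, 1.0, 5.0], line, bandwidth=1.0)
```

Similar values located next to each other give a positive index, whereas an alternating pattern gives a negative one; values near zero correspond to the absence of spatial autocorrelation.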
4.3. EPR Implementation
The EPR technique was applied to the database by considering the following inputs: (i) the maximum number of terms is equal to 7, that is, the number of independent variables; (ii)
is the dependent variable in the models A and B, and
is the dependent variable in model C of
Table 4; (iii) the exponents of the dependent variables are positive in the models A and C of
Table 4, and negative in model B of
Table 4.
At the end of the elaborations carried out according to the three models A, B and C on the cities of Rome and Milan, the maximum CoD relative to each model was compared (
Table 5): for the city of Rome, models A and B are characterized by a CoD higher than 75%, which exceeds the statistical accuracy determined for model C; for the city of Milan, all the models perform well, with a CoD of around 80%. For these reasons, model A was selected for both cities, as it combines good statistical performance with a simpler interpretation, owing to the absence of negative exponents.
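The model comparison above relies on the CoD. A minimal sketch of how such an indicator can be computed is given below, assuming that the CoD coincides with the usual coefficient of determination; the exact definition adopted for Table 5 is not given in this section, and the function name `cod` and the sample values are hypothetical.

```python
import numpy as np

def cod(observed, estimated):
    """Coefficient of determination, 1 - SSE / SST (assumed definition)."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(estimated, dtype=float)
    sse = np.sum((y - y_hat) ** 2)        # squared estimation errors
    sst = np.sum((y - y.mean()) ** 2)     # spread of the observed values
    return 1.0 - sse / sst

# Hypothetical observed and estimated values.
score = cod([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
```

A value of 1 corresponds to a perfect fit, while a value of 0 means that the model does no better than the sample mean.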
For the city of Rome, the Equation (5) is generated by EPR (model A):
For the city of Milan, the Equation (6) is returned by EPR (model A):
In order to determine the influence of each independent variable on the formation of the selling price according to the EPR models, the function shown in the Equation (7) was determined for each independent variable:
where
represents the independent variable in the analysis, and
is the average value of the other independent variables. Therefore, the contribution of each independent variable on the price formation can be expressed by the Equation (8). (
Table 6 and
Table 7):
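The one-variable-at-a-time evaluation described for Equation (7), in which one independent variable spans its range while the others are held at their average values, can be sketched as follows. The functions `profile` and `range_shares`, the stand-in `toy_model` and the range-based contribution measure are assumptions made for illustration only; they are not the EPR expressions or the contribution formula of the study.

```python
import numpy as np

def profile(model, X, k, n_points=50):
    """Evaluate model while variable k spans its observed range and all
    other variables are held at their sample means."""
    X = np.asarray(X, dtype=float)
    grid = np.linspace(X[:, k].min(), X[:, k].max(), n_points)
    pts = np.tile(X.mean(axis=0), (n_points, 1))
    pts[:, k] = grid
    return grid, model(pts)

def range_shares(model, X):
    """Assumed contribution measure: range of each profile over the sum of ranges."""
    X = np.asarray(X, dtype=float)
    ranges = []
    for k in range(X.shape[1]):
        _, y = profile(model, X, k)
        ranges.append(y.max() - y.min())
    ranges = np.array(ranges)
    return ranges / ranges.sum()

def toy_model(X):
    # Hypothetical price function standing in for an EPR expression.
    return 1000.0 + 3.0 * X[:, 0] + 1.0 * X[:, 1]

X = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 2.0]])
shares = range_shares(toy_model, X)
```

Plotting each `grid` against the corresponding profile gives curves of the kind used to read the functional relationship between a single variable and the estimated price.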
The analysis of the results allows interesting considerations.
For the city of Rome, the variables
and
are the most significant ones in the determination of
: for both the variables, a direct linear relationship with the estimated unit selling prices was detected (graphs (a) and (c) in
Figure 2), which reflects the widespread interest in the Italian capital for properties characterized by large size (>4000 sqm), and the high correlation between the quotations returned by the OMI and the property prices (the correlation between
and
is equal to 0.6). The variables
and
also have a significant contribution to the determination of
; however, contrary to what is empirically expected, the functional relationship between
and
is characterized by an inverse proportionality (graph (d) in
Figure 2), as if the most distant properties from the green areas are more appreciated than those near them. This behavior can be explained by observing that the central properties, which are also those farthest from the large city parks, have a higher selling price (>5000 €/sqm) related to their location close to the city center.
The relationship between the market rent reported by OMI (
) and
is also inverse (
Figure 2a), which is an indication of how the market appreciation is not strictly connected with the ability of the property to generate income, or that there is little consistency in OMI data with this asset typology. The variable
is also characterized by a good contribution to the price formation, since a higher price (>5000 €/sqm) is detected in areas of low population density (<3000 inhabitants/sqkm) or commercial-executive vocations (
Figure 2b). In the expression returned by EPR, the variables
(
Figure 2c) and
M (
Figure 2d) behave like two constants, resulting in little influence, although it is possible to capture a slight increase in the unit price for properties characterized by high architectural quality.
For the city of Milan,
and
are the most significant independent variables (
Figure 3a) and their behavior is practically symmetrical: contrary to what was found for the city of Rome, the price formation seems to be connected more to the market rent detected by OMI, given the exponential relationship linking
to
.
The interpretation of what was described regarding the variables and could lead to the conclusion that, in the city of Rome, property price is essentially linked to its positional characteristics, as returned by the market quotations of the OMI, whereas in the city of Milan, the selling price is more connected to the rental status: this contingency describes the different attitudes with which the investors approach the real estate market in the cities of Rome and Milan. In fact, for the city of Rome, the investors are mainly interested in properties located in prestigious locations, for which a continuity of tenants and a low vacancy rate (<10%) are detected, even if they have a lower entry or initial yield (about 4.5%), whereas, for the city of Milan, the investors show a higher risk appetite, which determines the market demand of peripheral properties with a high entry yield (about 7%), even if they come with more tenant risk and a physiological vacancy (about 20%).
The variables
and
(
Figure 3d) are characterized by a good contribution to the price formation: the closer the property is to a subway, the higher its value is, as was intuitively expected for the city of Milan, where the subway constitutes an effective means of public transport. The relationship between
and
is inverse, probably for the same reasons previously explained for the city of Rome. The variables
and
influence the price formation: the areas with a lower density (<5000 inhabitants/sqkm) are those for which there is a higher price (
Figure 3b), and properties with smaller sizes (<2000 sqm) have a higher unit price (>7000 €/sqm) (
Figure 3c), probably due to the central position or the lower unit price (<2500 €/sqm) that can be obtained for large property assets (>10,000 sqm). As for the city of Rome, the architectural quality (graph III in
Figure 3) does not decisively affect the determination of
for the city of Milan, even if the model detects a higher price for the properties with good architectural quality (>5000 €/sqm).
Figure 4 summarizes the contribution that each independent variable makes to the price formation in the corporate property market of the cities of Rome and Milan, according to the EPR equations: green indicates the most influential variables, red the least influential ones, yellow those with intermediate influence, and a slash the variables that behave like a constant. The ‘
’ symbol indicates the existence of a direct relationship by which the price increases as the value of the variable increases, contrary to what happens for the variables marked by the symbol ‘
’. The ‘
’ symbol indicates the absence of the specific variable in the analysed model. It should be noted that the behaviour of the variables most strictly related to the property market, i.e., those related to OMI market quotations (
and
), is diametrically opposite for the two case studies, whereas the behavior associated with the
and
variables is similar, both in the functional form and in the intensity of the related contribution, which is exactly the opposite of what was found for the variable
.
4.4. GWR Implementation
Similarly to what was described for the EPR models, for both the cities analysed, different elaborations were carried out by considering as the dependent variable either the unit price or its natural logarithm, and by implementing both the fixed and the adaptive kernel models. The statistical analysis of the error between the estimated and the detected unit prices showed that, for both Rome and Milan, the best-performing model is the one that uses the natural logarithm of the unit price as the dependent variable together with the adaptive kernel.
In the elaborations on the city of Milan, there is a multicollinearity caused by the simultaneous presence of and (a correlation index between and equal to 0.99), which prevents the elaboration of the regression. In order to overcome this problem, the variable was excluded, even if it had a higher correlation with than the variable (a correlation index between and equal to 0.73, while the coefficient of correlation between and is equal to 0.74), maintaining the descriptive independent variable of the quotation provided by the OMI, since the latter is more easily interpretable in the analysis of the results.
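A simple screening of this kind of multicollinearity can be sketched as follows, by listing the pairs of explanatory variables whose correlation exceeds a threshold. The function name `collinear_pairs`, the 0.95 threshold, the column names and the data are hypothetical.

```python
import numpy as np

def collinear_pairs(X, names, threshold=0.95):
    """Return (name_i, name_j, r) for variable pairs with |r| >= threshold."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Hypothetical columns: sale quotation, rent quotation, population density.
sale = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rent = 2.0 * sale + 1.0                      # moves in lockstep with sale
density = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
X = np.column_stack([sale, rent, density])
pairs = collinear_pairs(X, ["sale_quote", "rent_quote", "density"])
```

When a pair is flagged, one of the two variables has to be dropped before fitting, and the choice of which to keep remains a matter of interpretability, as discussed above.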
The functional form considered for the DB Corporate Real Estate concerning the city of Rome is shown in the Equation (9):
whereas the corresponding function for the city of Milan is represented by the Equation (10):
The values assumed by the variables are always positive, whereas each coefficient determined through the GWR technique, taking into account its variability, could be always positive, always negative or both positive and negative. For this reason, in the previous expressions, the ‘’ symbol indicates that the coefficient is almost always negative, but that it can also assume positive values; vice versa, the ‘’ symbol indicates that the coefficient is almost always positive, but that it can locally assume negative values.
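To make the local nature of the GWR coefficients more concrete, the following sketch fits a weighted least squares regression at every observation with an adaptive kernel. It is a simplified illustration and not the software implementation used in the study: the bisquare kernel, the function name `gwr_adaptive`, the number of neighbours and the synthetic data are assumptions, and distinct coordinates are assumed for all observations.

```python
import numpy as np

def gwr_adaptive(coords, X, y, n_neighbors):
    """Local weighted least squares at every observation.

    Adaptive bisquare kernel (assumed): the bandwidth at each point is the
    distance to its n_neighbors-th nearest observation.
    Returns an (n, p + 1) array: local intercept followed by local slopes.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    betas = np.empty_like(A)
    for i in range(len(y)):
        d = np.linalg.norm(coords - coords[i], axis=1)
        h = np.sort(d)[n_neighbors]          # index 0 is the point itself
        w = np.where(d < h, (1.0 - (d / h) ** 2) ** 2, 0.0)
        sw = np.sqrt(w)
        # Weighted least squares: scale rows by the square root of the weights.
        betas[i], *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return betas

# Synthetic data with a spatially constant relation, for checking only.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(30, 2))
x = rng.uniform(1.0, 5.0, size=(30, 1))
log_price = 2.0 + 3.0 * x[:, 0]
betas = gwr_adaptive(coords, x, log_price, n_neighbors=10)
```

Each row of `betas` holds the intercept and slopes valid at one property, which is why the sign of a coefficient can change from one location to another on real data.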
The contribution of each independent variable to the price formation is equal to the form expressed in the Equation (11):
where
is the independent variable related to the
-th property of the considered DB Corporate Real Estate. Therefore, the dimensionless contribution of each variable with respect to the others was determined by the Equation (12):
The analysis of the coefficients
related to the GWR techniques applied to the DB Corporate Real Estate highlighted that, for both the case studies (see
Table 8 and
Table 9), the most significant variable is the average selling price detected by OMI (
).
In the case of the city of Rome (
Table 8), the average market rent (
) is characterized by a relevant contribution to the price formation, even if the coefficient
is negative and the coefficient
is positive. The variables
and
give a positive (but not relevant) contribution to the determination of
, which indicates the positive influence of proximity to green areas and of high architectural quality on the price formation. The influence of the other variables is not significant, as for the population density (
) and the property size (
), or it is null with respect to the presence of subways around the property (
).
For the city of Milan (
Table 9),
is the most significant variable, followed by
and
; the areas with a lower population density (<5000 inhabitants/sqkm) are characterized by a higher price (>7000 €/sqm) than the more populated ones; the property price grows with the increase of the distance from green areas, which can be justified by interpreting
as a proxy variable for the distance from the center. Even the variable relating to the size of the properties (
) affects, albeit marginally, the price formation, indicating how the properties characterized by smaller sizes (<2000 sqm) have a higher unit price (>7000 €/sqm). The contribution of the architectural quality (
) and the proximity to the subway (
) is not relevant.
By comparing the results obtained for the two GWR models (
Figure 5), according to the logic already illustrated for the EPR models, and indicating with ‘NA’ the variables that have not been considered, it can be observed that
C is the only variable that has the same behaviour for the two case studies.
4.5. Comparison of the Results Obtained by the Implementation of the Two Techniques
The elaborations produced with the two different techniques (EPR and GWR) for the two case studies considered were compared in terms of statistical performance and the empirical reliability of the results.
In
Table 10, three statistical indicators of EPR and GWR models for the cities of Rome and Milan were determined: the Maximum Absolute Percentage Error (MaxAPE), the Mean Absolute Percentage Error (MAPE) and the Root Mean Square Error (RMSE). In particular, from the comparison of the statistical indicators, it emerges that both the techniques were more effective in their application to the city of Milan, probably due to the lower spatial extension (182 sqkm for the city of Milan and 1285 sqkm for the city of Rome) and the higher concentration of corporate properties.
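The three indicators named above can be computed as in the following sketch, which assumes the usual definitions (percentage errors taken relative to the detected price); the function name `error_metrics` and the sample prices are hypothetical.

```python
import numpy as np

def error_metrics(observed, estimated):
    """MaxAPE and MAPE (in %) and RMSE (in the units of the price)."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(estimated, dtype=float)
    ape = np.abs(y_hat - y) / np.abs(y) * 100.0   # absolute percentage errors
    return {
        "MaxAPE": float(ape.max()),
        "MAPE": float(ape.mean()),
        "RMSE": float(np.sqrt(np.mean((y_hat - y) ** 2))),
    }

# Hypothetical detected and estimated unit prices in €/sqm.
metrics = error_metrics([4000.0, 5000.0, 8000.0], [4400.0, 4500.0, 8000.0])
```

MaxAPE highlights the single worst estimate, whereas MAPE and RMSE summarise the average accuracy, with RMSE penalising large errors more heavily.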
In order to compare the discrepancy between the detected and the estimated selling prices through the EPR and GWR techniques, the residual values
and
were determined by the Equations (13) and (14):
where, in relation to the
-th property,
(
) is the value estimated through the EPR (GWR) technique, and
is the selling price detected through the DB Corporate Real Estate.
The representation of the distance between the
curve and the
and
curves (
Figure 6 and
Figure 7) also indicates that neither technique clearly outperforms the other. However, for both the cities of Rome and Milan, the deviation of the
curve from the
curve is less than that of the
curve for the properties characterized by high unit prices; therefore, the EPR technique would be more reliable in the assessment of properties characterized by a unit price higher than 8000 €/m² for the city of Rome and 10,000 €/m² for the city of Milan.
By graphing the absolute residual values (
and
) through circles whose size increases with the error between the estimated and the detected price, it was found that, for the city of Rome, both the EPR model (
Figure 8a) and the GWR model (
Figure 8b) perform poorly in forecasting the selling prices in the eastern area; conversely, both models, especially the one returned by the GWR technique, are reliable in the central areas. In summary, in the application of the two techniques to the DB Corporate Real Estate for the city of Rome, the EPR technique performs better for 58% of the properties, whereas the GWR technique should be preferred for the remaining 42%, confirming what was previously reported by the comparison of the statistical tests.
For the city of Milan, the EPR (
Figure 9a) and GWR (
Figure 9b) models are almost equivalent: by observing the absolute residual values, the EPR technique is more reliable for 51% of the properties, whereas for the remaining 49%, the GWR technique generates better outputs. By representing the properties in which the EPR/GWR technique is more reliable with a green/red dot, respectively, it can be seen that both in the city of Rome (
Figure 10a) and in the city of Milan (
Figure 10b) it is not possible to outline a spatially relevant behaviour for the two techniques.
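The property-by-property comparison reported above (the share of properties for which one technique yields the smaller absolute residual) can be sketched as follows. The function name `share_better` and the sample prices are hypothetical, and ties are assigned to the second technique by assumption.

```python
import numpy as np

def share_better(observed, est_a, est_b):
    """Fraction of properties where estimate A has the smaller absolute residual."""
    y = np.asarray(observed, dtype=float)
    res_a = np.abs(np.asarray(est_a, dtype=float) - y)
    res_b = np.abs(np.asarray(est_b, dtype=float) - y)
    return float(np.mean(res_a < res_b))

# Hypothetical detected prices and estimates from two techniques (€/sqm).
detected = [4000.0, 6000.0, 9000.0, 3000.0]
epr_est = [4100.0, 6500.0, 9100.0, 3600.0]
gwr_est = [4300.0, 6200.0, 9500.0, 3500.0]
epr_share = share_better(detected, epr_est, gwr_est)
```

The boolean mask `res_a < res_b` can also be used directly to colour the properties on a map according to the better-performing technique.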
The qualitative comparison between the results obtained with the two techniques for the city of Rome (
Figure 11), according to the symbolism previously introduced, shows a concordance of the signs and the influence for the variables
,
,
and
. The two models agree in indicating the OMI selling price quotations as the most significant variable, unlike the market rent, and a positive contribution is given by the architectural quality of the properties; the relationship of the properties with the subways is almost irrelevant for the price formation.
For the city of Milan (
Figure 12), as for the city of Rome, the comparison between the two models allows us to identify four variables with common behaviour (
,
,
and
); for the city of Milan, the most significant variable is not the same according to the two models, probably due to the exclusion of the variable
in the GWR model, which was necessary to solve the multicollinearity effect. The proximity to the city center (interpreted as a proxy variable of the distance from green areas), the location of the property in areas with lower population density (<5000 inhabitants/sqkm) and smaller sizes (<2000 sqm) appear to be factors that affect the price formation.
The applications of the EPR and GWR techniques to the DB Corporate Real Estate for the cities of Rome and Milan allow us to outline some useful indications for the definition of the SIT Valuation.
The GWR technique has the great limitation of being scarcely usable for forecasting purposes. For each property, the GWR technique determines the coefficients and the intercept that define the local linear equation; therefore, in the estimation phase, given a generic property for which the values of all the independent variables are known, it is impossible to assess the selling price without knowing the coefficients of the equation at that location. To overcome this limitation, the spatial variation of the coefficients for the DB Corporate Real Estate could be determined, and specific areas with homogeneous coefficients, defined through cluster analysis, could be assumed for use in the estimation phase [
145]. This elaboration presupposes that the coefficients of the independent variables vary uniformly, which is unlikely given the nature of the independent variables.
The EPR technique fulfils the forecasting objective, as well as the descriptive one, even if the developed functional form is rather complex and its validation requires a considerably larger and sufficiently representative study sample. Furthermore, the results are often difficult to interpret, and the contribution of each independent variable cannot be intuitively captured. This obstacle is partially overcome through the graphic representation of the relationship between the dependent variable (selling prices) and the explanatory variables.
5. Conclusions
The global economic crisis triggered by the US subprime mortgage collapse highlighted the urgent need for innovative valuation methodologies able to produce more reliable valuations and to effectively monitor the evolution of property values.
The present research intended to address the lack of a valuation tool supporting the determination of the market value of corporate properties, i.e., properties characterized by large sizes, non-residential intended use and the widespread interest of professional and/or institutional investors, through the integration of geographical information data and mass appraisal techniques in dealing with complex valuation issues. This goal was pursued by defining a GIS-based Territorial Information Tool (the SIT Valuation) that, based on the selling prices of corporate properties and on a series of factors influencing the price formation process, uses two different techniques to identify the functional relationships that link the selling prices to the considered characteristics.
The SIT Valuation aims to increase the transparency in the valuation of these property assets, and to create a support for both the Public Administration and private operators. Through the SIT Valuation, it will therefore be possible: (i) to represent the geographical distribution of the input data, i.e., the dependent variable and the influencing factors that constitute the DB Corporate Real Estate; (ii) to display the geographical distribution of the output data, i.e., the various indicators that represent the difference between the detected and the estimated property prices; (iii) to assess the market value of a property with a moderate average percentage of error; (iv) to periodically update the market value of large property assets.
The SIT Valuation integrates the innovative aspects of the GIS environment, in terms of database construction and map visualization, and the calculation potentialities of the EPR technique, which is useful in providing expressions that can be implemented in the estimation phase. Furthermore, the proposed method contains a broader information set than many similar applications in the reference literature [
146], and directly incorporates the spatial component in the database construction, not only as a correction factor.
The developed method can constitute a valid support for all public (e.g., Italian Revenue Agency) and private entities (buyers, sellers, investors, institutions, insurance companies, banks, etc.) that manage relevant property assets and which, for various reasons (periodic reviews of the balance sheets, sales, enhancement, investment, etc.), require cyclically updated values. Therefore, the method can represent an additional tool, to be integrated with the canonical evaluative procedures (market approach, income approach, cost approach), to verify the results and to better monitor the evolution of the values of real estate portfolios, taking into account the indications of Capital Requirements Regulation (EU) No. 575/2013, art. 208 (3) (b), which states the possibility to “use statistical methods to monitor the value of the immovable property and to identify the immovable property that needs revaluation”. Moreover, the method could also be an important reference in the assessment of the market values of the properties that constitute the Real Estate Funds, which have become widespread in recent decades. Finally, by extending the input database to a greater number of cities and including ordinary properties, it would be possible to obtain an AVM as a support for credit institutions and financial operators who are interested in quickly evaluating large property assets, in order to monitor and update the properties’ market values.
In fact, the advantages for the various entities consist in the possibility to identify value creation strategies through the best use of property assets; in particular, through the developed method, on the one hand, the professional valuers, who are responsible for producing highly reliable property valuations, could verify the adequacy of the assessed values, as well as determining and suggesting the most appropriate strategies of investment enhancement; on the other hand, the investors and the company managers could directly and transparently monitor the trends of the property market values and of the investment performance over time.
The main limitations of the SIT Valuation tool are linked to the limited time series available for the detected DB Corporate Real Estate: the low transparency that has always characterized the Italian real estate market has not allowed us to collect a relevant database in terms of the considered time period and, consequently, to develop an econometric model (e.g., a Vector Autoregressive model) capable of appropriately describing and interpreting the analysed property markets and forecasting the future trends in the medium–long term. This weakness of the tool could be overcome through specific agreements with the institutions and the public entities that manage large amounts of property data (e.g., the Italian Revenue Agency), in order to detect more relevant samples of transaction temporal series.
Future developments of the research may consist of the automated implementation in ArcGIS of the elaborated algorithms, in order to obtain an AVM to be especially implemented in the valuation of corporate properties. Furthermore, the method could be applied by government agencies that, in order to guarantee the fair payment of the taxes [
146], can periodically update the market values of the properties according to actual and current real estate trends.