Artificial Neural Networks (ANNs [22]) represent the smart core of the Decision Support System (DSS) presented in this paper. ANNs are analytical structures whose behavior depends on both their topology and a set of parameters. Several ANN paradigms exist, differing in topology, in the kind of problem they address, and in their time dependency. The most popular type of ANN is the Multi-Layer Perceptron (MLP, Figure 5), which can be used to solve both classification and nonlinear regression problems. MLPs have a unidirectional (feed-forward) structure with no feedback connections. The layout consists of an input layer of nodes (neurons), one or more intermediate (hidden) layers, and an output layer. Whatever its type, an ANN typically builds a model of a physical system from data, rather than from knowledge of the analytical relationship among the variables. In particular, the aim of an MLP is to reproduce the relationship between the independent (input) and dependent (output) variables that describe the physical system. To this end, a training process is performed to calculate the parameters of the ANN (the weights), starting from a randomly assigned set of values. The training is an iterative procedure in which the minimum of a performance function is sought. Most of the algorithms in use are first- or second-order minimization procedures, where the performance function is typically the mean squared error of the network, the error being the gap between the output and the measured (target) values.
The training process can be a challenging task, since the performance depends on choosing an appropriate training set of examples, adopting a suitable layout of the MLP, and assuming an efficient training strategy. Usually, a trial-and-error procedure is performed to determine the best set of these hyperparameters.
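For illustration only, the following sketch (not the authors' code) shows how such a surrogate MLP could be trained and how its layout could be selected by trial and error; the data, layer sizes, and solver are hypothetical placeholders.

```python
# Illustrative sketch only: training an MLP surrogate and selecting its layout
# by trial and error. All data and hyperparameters below are hypothetical.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 4))                 # hypothetical design parameters
Y = np.sin(X @ rng.uniform(size=(4, 2)))       # hypothetical performance outputs

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)

for n_hidden in (8, 16, 32):                   # candidate hidden-layer sizes
    mlp = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="tanh",
                       solver="lbfgs", max_iter=2000, random_state=0)
    mlp.fit(X_tr, Y_tr)                        # minimizes the mean squared error
    print(n_hidden, mlp.score(X_te, Y_te))     # R^2 on the held-out examples
```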
3.1. Inversion Algorithm
The trained MLP conveys a representation of the relationship between inputs and outputs. This information can be exploited to solve the inverse problem, that is, to determine the design parameters which meet the set performance requirements [14,15,16,17,23,24,25,26,27]. Since the design parameters represent the input and the performance of the physical system is the output, finding the input corresponding to a given output means determining the design parameters which guarantee the fulfilment of the given requirements. The model frozen inside the MLP is represented by a set of equations (Equation (1)), which describe the relationship between the input $\mathbf{x}$ and the output $\mathbf{y}$ of the network (see Figure 5).
To disentangle the algebraic structure of the MLP, two auxiliary variables, namely $\mathbf{k}$ and $\mathbf{h}$, are introduced; they represent the input and the output of the hidden layer, respectively. Equation (1)(a) describes the linear relation between the output of the hidden layer and the output of the MLP ($\mathbf{h}$ and $\mathbf{y}$, respectively). Equation (1)(b) relates the input and the output of the hidden layer ($\mathbf{k}$ and $\mathbf{h}$, respectively), this being a nonlinear relation. Finally, Equation (1)(c) describes the linear relation between the input of the MLP and the input of the hidden layer ($\mathbf{x}$ and $\mathbf{k}$, respectively). Once the MLP has been trained, Equation (1) allows the input $\mathbf{x}$ to be propagated up to the output $\mathbf{y}$, which amounts to calculating the performance of the system corresponding to a given set of design parameters. Conversely, if the output variable $\mathbf{y}$ is set, the three parts of Equation (1) can be solved in series to obtain the input $\mathbf{x}$, which means finding the design parameters that correspond to the given performance.
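For a single-hidden-layer MLP, a plausible form of Equation (1) is the following, where the hidden-layer weights and biases are denoted by $W_1$ and $\mathbf{b}_1$ and the output-layer ones by $W_2$ and $\mathbf{b}_2$ (this notation is an assumption, not necessarily the paper's; the hyperbolic tangent is the activation adopted later in this section):
$$\text{(a)}\;\; \mathbf{y} = W_2\,\mathbf{h} + \mathbf{b}_2, \qquad \text{(b)}\;\; \mathbf{h} = \tanh(\mathbf{k}), \qquad \text{(c)}\;\; \mathbf{k} = W_1\,\mathbf{x} + \mathbf{b}_1.$$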
In general, a feasibility domain can be set, rather than requiring the fulfilment of a single target value of the performance. For the sake of simplicity, in this formulation this domain is assumed to be linear and convex, so that it can be expressed by the set of inequalities given by Equation (2).
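A plausible form of Equation (2), with assumed matrices $A_y$ and $\mathbf{b}_y$ collecting the linear requirements on the output, is
$$A_y\,\mathbf{y} \le \mathbf{b}_y.$$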
The feasibility domain of the output, introduced in Equation (2), sets the requirements of the building. The constraints on the output correspond to as many constraints on the input, namely on the design parameters. Thanks to Equation (1), the feasibility domain can be transferred from the output space to the input space; in other words, the requirements are translated into a feasibility domain of the design parameters. In turn, a set of bounds is also stated on the design parameters themselves, defining a feasibility domain of the input. Solving the inverse problem means finding a design solution that falls within the intersection of the two feasibility domains, defined on the input and on the output, respectively.
As described in Figure 6, we will make use of the following four geometrical spaces: the Input Space X, where the design parameters are defined; the Upstream Hidden Space K, representing the input of the hidden layer; the Downstream Hidden Space H, representing the output of the hidden layer; and the Output Space Y, representing the output of the network. Equation (1) is subdivided into three subsystems: a linear system of equations that relates space X to space K, a nonlinear system that relates space K to space H, and, finally, a linear system that relates space H to the output space Y.
Equation (1) allows us to project both points and domains from any one of these spaces to any other. In particular, the feasibility domain of the output, expressed by Equation (2), namely the performance requirements of the building, can be projected into space H by substituting the output variable $\mathbf{y}$, as derived from Equation (1)(a), into Equation (2), which yields Equation (3).
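With the notation assumed above for Equation (1), Equation (3) would read
$$A_y\,(W_2\,\mathbf{h} + \mathbf{b}_2) \le \mathbf{b}_y, \qquad\text{i.e.}\qquad A_y W_2\,\mathbf{h} \le \mathbf{b}_y - A_y\,\mathbf{b}_2,$$
a linear system of inequalities in $\mathbf{h}$.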
Equation (3) denotes a constraint on variable $\mathbf{h}$ deriving from the feasibility domain of the output. At the same time, $\mathbf{h}$ is limited by the range of the nonlinear activation function of the hidden neurons. Such an activation function typically has a sigmoidal shape and a bounded range; in this work, the hyperbolic tangent function, whose range is the interval $(-1, 1)$, is assumed (see Figure 7).
The vector $\mathbf{h}$ corresponding to the sought solution must be attainable from a feasible $\mathbf{k}$, the feasibility range being the interval $(-1+\varepsilon,\ 1-\varepsilon)$, where $\varepsilon$ is a margin from the saturation of the activation function. This constraint can be rewritten in the form of Equation (4).
The two systems (3) and (4) can be combined into a single system, Equation (5).
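With the same assumed notation, Equation (4) would be the saturation-margin constraint and Equation (5) the stacked system obtained by joining it with Equation (3):
$$-(1-\varepsilon)\,\mathbf{1} \le \mathbf{h} \le (1-\varepsilon)\,\mathbf{1},
\qquad
\begin{bmatrix} A_y W_2 \\ I \\ -I \end{bmatrix} \mathbf{h} \le
\begin{bmatrix} \mathbf{b}_y - A_y \mathbf{b}_2 \\ (1-\varepsilon)\,\mathbf{1} \\ (1-\varepsilon)\,\mathbf{1} \end{bmatrix},$$
the latter abbreviated below as $A_h\,\mathbf{h} \le \mathbf{b}_h$.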
A first check is required to establish whether the domain (5) is empty. In that case, no design solution exists which meets the requirements, and the requirements should be relaxed.
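As a sketch (assumed, not taken from the paper), this check can be carried out with a linear program having a zero objective over the stacked constraints abbreviated above as $A_h\,\mathbf{h} \le \mathbf{b}_h$:

```python
# Sketch: the domain {h : A_h h <= b_h} is non-empty iff this LP is feasible.
import numpy as np
from scipy.optimize import linprog

def requirements_domain_is_empty(A_h, b_h):
    res = linprog(c=np.zeros(A_h.shape[1]), A_ub=A_h, b_ub=b_h,
                  bounds=(None, None), method="highs")
    return not res.success   # infeasible LP: no design meets the requirements
```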
The domain (5) can be projected onto space K by means of Equation (1)(b), which yields the domain (6).
Finally, the domain (6) can be projected onto the input space X, which yields Equation (7). Equation (7) describes a nonlinear domain in the input space X.
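Continuing with the assumed notation, Equations (6) and (7) would follow from substituting $\mathbf{h} = \tanh(\mathbf{k})$ and then $\mathbf{k} = W_1\,\mathbf{x} + \mathbf{b}_1$ into the domain (5):
$$A_h\,\tanh(\mathbf{k}) \le \mathbf{b}_h
\qquad\text{and}\qquad
A_h\,\tanh(W_1\,\mathbf{x} + \mathbf{b}_1) \le \mathbf{b}_h,$$
the latter being nonlinear in $\mathbf{x}$.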
Generally, a feasibility domain is stated a priori on the input space. Often, this domain contains a feasibility range for each design parameter, making the feasibility domain a box, i.e., $x_{i,\min} \le x_i \le x_{i,\max}$ for each design parameter $x_i$. More generally, the feasibility domain of the input could be defined by constraints of any kind. For the sake of simplicity, in this work such constraints are assumed to be linear, so that the domain can be expressed as Equation (8).
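A plausible form of Equation (8), with assumed matrices $A_x$ and $\mathbf{b}_x$ collecting the linear constraints on the design parameters (the box bounds being a special case), is
$$A_x\,\mathbf{x} \le \mathbf{b}_x.$$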
The solution of the problem must fall within both feasibility domains (7) and (8). Therefore, the design problem is transformed into an existence problem. Unfortunately, one of the two systems is nonlinear; therefore, finding a feasible solution can be a challenging task.
The existence problem can be solved iteratively by making use of linear programming (LP) [28]. The number of neurons in the hidden layer is usually larger than the number of inputs, which implies that the number of equations in Equation (1)(c) is larger than the number of input variables; the system of equations is therefore overdetermined. By means of the Moore–Penrose pseudoinverse matrix, Equation (1)(c) can be solved with respect to $\mathbf{x}$ according to the least-squares criterion, yielding Equation (9).
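Denoting the Moore–Penrose pseudoinverse of the assumed hidden-layer weight matrix $W_1$ by $W_1^{+}$, Equation (9) would read
$$\mathbf{x} = W_1^{+}\,(\mathbf{k} - \mathbf{b}_1).$$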
By substituting Equation (9) into Equation (8), the feasibility domain of the input is projected into space K, where it is still linear (Equation (10)).
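Under the same assumptions, Equation (10) would be the linear domain obtained in space K by this substitution:
$$A_x\,W_1^{+}\,(\mathbf{k} - \mathbf{b}_1) \le \mathbf{b}_x.$$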
By exploiting Equation (1)(b), any point of space K can be projected into space H and vice versa. The feasibility domain of the input is linear in space K and nonlinear in space H, while the feasibility domain of the output is linear in space H and nonlinear in space K. Therefore, an iterative procedure can be defined in which the current solution is projected alternately onto the two domains, switching between spaces K and H, so that the domain onto which the solution is projected is always linear.
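A minimal sketch of one way to implement this alternating projection is given below. It is not the authors' implementation: the names W1, b1, W2, b2, A_y, b_y, A_x, b_x and eps follow the notation assumed above, and the LP-based projection (an L1-type projection onto a polytope) is one possible choice among several.

```python
# Sketch of the alternating-projection inversion (assumed notation, not the
# paper's code). W1, b1: hidden layer; W2, b2: output layer; A_y, b_y: output
# requirements (Eq. (2)); A_x, b_x: input constraints (Eq. (8)).
import numpy as np
from scipy.optimize import linprog

def project_onto_polytope(p, A, b):
    """Point of {z : A z <= b} closest to p in the L1 sense, found by LP."""
    n = p.size
    c = np.concatenate([np.zeros(n), np.ones(n)])        # minimize sum of slacks t
    A_ub = np.block([[A, np.zeros((A.shape[0], n))],      # A z <= b
                     [np.eye(n), -np.eye(n)],             # z - p <= t
                     [-np.eye(n), -np.eye(n)]])           # p - z <= t
    b_ub = np.concatenate([b, p, -p])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None), method="highs")
    return res.x[:n] if res.success else None

def invert(W1, b1, W2, b2, A_y, b_y, A_x, b_x, eps=0.05, n_iter=50):
    n_h = W1.shape[0]
    W1_pinv = np.linalg.pinv(W1)
    # Output requirements in space H (Eqs. (3)-(5)): linear in h.
    A_h = np.vstack([A_y @ W2, np.eye(n_h), -np.eye(n_h)])
    b_h = np.concatenate([b_y - A_y @ b2, (1 - eps) * np.ones(2 * n_h)])
    # Input constraints projected into space K (Eqs. (9)-(10)): linear in k.
    A_k = A_x @ W1_pinv
    b_k = b_x + A_x @ W1_pinv @ b1
    k = np.zeros(n_h)                                      # arbitrary starting point
    for _ in range(n_iter):
        h = project_onto_polytope(np.tanh(k), A_h, b_h)    # project in H (linear there)
        if h is None:
            return None                                    # requirements cannot be met
        k = project_onto_polytope(np.arctanh(h), A_k, b_k) # project in K (linear there)
        if k is None:
            return None                                    # input constraints infeasible
    return W1_pinv @ (k - b1)                              # design parameters x (Eq. (9))
```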