**2. Problem Statement**

The training problem for a multilayer perceptron (MP) is formulated as a problem of multicriteria optimization under uncertainty (MCOU):

$$\Gamma = \langle \mathbf{W}, \mathbf{Z}, \mathbf{F}(\mathbf{w}, \mathbf{z}) \rangle, \tag{1}$$

where **w** ∈ **W** ⊂ *E*<sup>*r*<sub>w</sub></sup> is the vector of weight coefficients of the MP synaptic connections; **z** ∈ **Z** ⊂ *E*<sup>*r*<sub>z</sub></sup> is the vector of uncertain factors; **Z** is a finite set of possible values of the uncertain factor; and **F**(**w**,**z**) = [*f*<sub>1</sub>(**w**,**z**), ..., *f*<sub>m</sub>(**w**,**z**)]<sup>*T*</sup> ∈ *E*<sup>*m*</sup> is a vector criterion defined on the Cartesian product **W** × **Z**.

**Citation:** Serov, V.A.; Dolgacheva, E.L.; Kosyuk, E.Y.; Popova, D.L.; Rogalev, P.P.; Tararina, A.V. Artificial Neural Networks Multicriteria Training Based on Graphics Processors. *Eng. Proc.* **2023**, *33*, 57. https://doi.org/10.3390/engproc2023033057

Academic Editors: Askhat Diveev, Ivan Zelinka, Arutun Avetisyan and Alexander Ilin

Published: 25 July 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In problem (1), it is required to determine the value of the vector **w** ∈ **W**, which provides the minimum values for the components of the vector criterion **F(w,z)** under the influence of an uncertain factor **z** ∈ **Z**, about which it is known only that it can take values from a finite set **Z**.

To solve this problem, it is proposed to use the vector minimax principle. In this case, the original statement of problem (1) is reduced to a deterministic multiobjective optimization problem:

$$\mathbf{V}(\mathbf{w}) \to \min_{\mathbf{w} \in \mathbf{W}}, \tag{2}$$

where **V(w)** is a vector indicator whose components are the points of extreme pessimism of the vector criterion **F(w,z)** on the set **Z** for fixed **w**, i.e., the worst-case values *V*<sub>i</sub>(**w**) = max<sub>**z** ∈ **Z**</sub> *f*<sub>i</sub>(**w**,**z**).
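As a minimal illustration of this worst-case reduction, the following NumPy sketch computes the extreme-pessimism point **V(w)** for a fixed **w** over a finite set **Z**. The function name and interface are hypothetical; the paper itself performs this reduction inside a GPU kernel.

```python
import numpy as np

def extreme_pessimism(F, w, Z):
    """Worst-case (componentwise maximum) of the vector criterion F(w, z)
    over a finite set Z of uncertain factors, for a fixed w.

    F : callable returning an m-vector for a given (w, z) pair
    w : parameter vector
    Z : array of shape (|Z|, r_z), the finite set of uncertain factors
    """
    values = np.array([F(w, z) for z in Z])   # shape (|Z|, m)
    return values.max(axis=0)                 # V(w): componentwise worst case
```

Minimizing each component of this worst-case vector over **w** is exactly the vector minimax principle of problem (2).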

To solve problem (1), (2), the hierarchical evolutionary algorithm (HEA) for MCOU developed in [6,7] is used. As studies [5,6] show, the HEA MCOU exhibits high computational complexity when used in ANN training tasks. Therefore, it is proposed to implement the HEA software for solving problem (1), (2) on the GPU architecture using OpenCL technology.

#### **3. GPU-Based Parallel Implementation of the Hierarchical Evolutionary MCOU Algorithm**

The architecture of the developed HEA MCOU software reflects the following main stages of the HEA MCOU implementation.

**Stage 1.** Formation of a set of points of extreme pessimism (Figure 1).

Step 1. The initial population **W**, |**W**| = *n*, is formed on the host (CPU).

Step 2. Constant memory is allocated on the GPU, into which arrays **Z** and **W** are entered. A buffer is allocated in the global memory of the GPU for the set of points of extreme pessimism **V(W)**.

**Figure 1.** Parallel algorithm for finding a set of points of extreme pessimism on the GPU.

Step 3. A grid is formed on the GPU that determines the number of working blocks and the threads executed in them.

Step 4. The kernel is called, and the CPU passes each thread its set of instructions for execution. The threads run in parallel. Within each thread, the set of values of the vector criterion **F(w,Z)** is computed for the corresponding **w** ∈ **W**, and the extreme pessimism point **V(w)** is computed on the set **F(w,Z)** and stored in the thread's local memory. Upon completion, each thread transfers its value **V(w)** to the global memory of the GPU. After the GPU signals that all threads have terminated, the CPU moves the array **V(W)** from the GPU's global memory to the CPU's RAM.
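The per-thread work of Step 4 can be sketched on the CPU as a batched reduction: each "thread" handles one **w**, evaluates **F(w,z)** for every **z** ∈ **Z**, and reduces to **V(w)**. This NumPy stand-in (the function names are hypothetical, not the authors' kernel code) makes the data shapes of the kernel explicit.

```python
import numpy as np

def stage1_pessimism_points(F_batch, W, Z):
    """CPU stand-in for the Step 4 kernel: each GPU thread handles one
    w in W, evaluates F(w, z) for every z in Z, and reduces to V(w).

    F_batch : callable mapping (W, Z) -> array of shape (|W|, |Z|, m)
    W       : population, shape (|W|, r_w)
    Z       : uncertain factors, shape (|Z|, r_z)
    Returns V(W) of shape (|W|, m).
    """
    values = F_batch(W, Z)      # all criterion values, as computed on the GPU
    return values.max(axis=1)   # per-w worst case over Z
```

On the GPU, the outer axis of `values` corresponds to the thread index and the inner reduction runs in each thread's local memory.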

**Stage 2.** Assessment of the fitness of each point **w** ∈ **W** (Figure 2).

Step 5. Constant memory is allocated on the GPU, into which array **V(W)** is entered. In the GPU's global memory, a buffer is allocated for the set of values of the fitness function *Φ*(**V(W)**).

**Figure 2.** Parallel algorithm for calculating fitness function values on GPU.

Step 6. A new grid is formed on the GPU.

Step 7. The kernel is called, and the threads run in parallel. Within each thread, the value of the fitness function *Φ*(**V**) is calculated for the corresponding element **V** ∈ **V(W)** and stored in the thread's local memory. Upon completion, each thread transfers its *Φ*(**V**) value to the GPU's global memory. After the GPU signals that all threads have terminated, the CPU transfers the array *Φ*(**V(W)**) from the GPU's global memory to the CPU's RAM.

Next, a population of descendants is formed on the CPU, and stages 1 and 2 are repeated.

The developed algorithm can be easily modified to solve the MCOU problem (1), (2) in the case where the set of uncertain factors is infinite.

The developed software is cross-platform, as CUDA and OpenCL technologies are available on various operating systems, including both Windows and Linux.

#### **4. Computational Experiment**

The effectiveness of the developed technology was tested on the following MCOU test problem:

$$
\Gamma = \langle \mathbf{X}, \mathbf{Z}, \mathbf{F}(\mathbf{x}, \mathbf{z}) \rangle,\tag{3}
$$

where **x** = [*x*<sub>1</sub>, *x*<sub>2</sub>]<sup>*T*</sup> ∈ **X** is the vector of control parameters; **z** = [*z*<sub>1</sub>, *z*<sub>2</sub>]<sup>*T*</sup> ∈ **Z** is the vector of uncertain factors; and **F**(**x**,**z**) = [*f*<sub>1</sub>(**x**,**z**), *f*<sub>2</sub>(**x**,**z**)]<sup>*T*</sup> is the vector performance indicator with components:

$$f_1(\mathbf{x}, \mathbf{z}) = x_1^2 + x_2^2 - x_1(z_1^2 - z_2^2), \tag{4}$$

$$f_2(\mathbf{x}, \mathbf{z}) = x_1^2 - x_2^2 - x_1(z_1^2 + z_2^2). \tag{5}$$

The restrictions were set in the form:

$$\mathbf{X} = \{0 \le x_1, x_2 \le 2\}, \tag{6}$$

$$\mathbf{Z} = \{\mathbf{z}^i,\ i = \overline{1, |\mathbf{Z}|} \mid 0 \le z_1, z_2 \le 2\}. \tag{7}$$

It is required to maximize the components of the vector efficiency indicator on the set **X** × **Z** based on the vector maximin principle.
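The test problem (3)–(7) is small enough to state directly in code. The sketch below implements the criteria (4), (5) and the guaranteed (worst-case) value that the vector maximin principle maximizes; the helper name `guaranteed_value` is ours, not the paper's.

```python
import numpy as np

def f1(x, z):
    # Criterion (4)
    return x[0]**2 + x[1]**2 - x[0] * (z[0]**2 - z[1]**2)

def f2(x, z):
    # Criterion (5)
    return x[0]**2 - x[1]**2 - x[0] * (z[0]**2 + z[1]**2)

def guaranteed_value(x, Z):
    """Componentwise minimum of F(x, z) over the finite set Z: the
    guaranteed (worst-case) vector to be maximized over x under the
    vector maximin principle."""
    vals = np.array([[f1(x, z), f2(x, z)] for z in Z])
    return vals.min(axis=0)
```

For example, at **x** = (1, 1) and **z** = (1, 1), equations (4) and (5) give *f*<sub>1</sub> = 2 and *f*<sub>2</sub> = −2.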

Figures 3–5 show the results of searching for a set of vector maximins using the HEA MCOU (elite points in each generation are highlighted in red). Algorithm parameters: population cardinality |**X̃**| = 1000; |**Z**| = 1000; real coding and SBX crossover were used.
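The SBX (simulated binary crossover) operator used for the real-coded population can be sketched as follows; this is the standard formulation of SBX, not the authors' specific implementation, and the distribution index `eta` shown here is an assumed default.

```python
import numpy as np

def sbx_crossover(p1, p2, eta=2.0, rng=None):
    """Simulated binary crossover (SBX) on real-coded parent vectors.

    eta : distribution index; larger eta keeps children closer to parents.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(p1.shape)
    # Spread factor beta from the polynomial SBX distribution
    beta = np.where(u <= 0.5,
                    (2.0 * u) ** (1.0 / (eta + 1.0)),
                    (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0)))
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2
```

A useful property for checking an implementation: the two children always preserve the parents' componentwise mean, c1 + c2 = p1 + p2.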

**Figure 3.** Evolutionary MCOU algorithm, generation No. 1. Elite points in each generation are highlighted in red.

**Figure 4.** Evolutionary MCOU algorithm: generation No. 5.

Table 1 provides a comparative analysis of the running time of sequential and parallel evolutionary algorithms for solving the considered MCOU test task.


**Table 1.** Comparative analysis of sequential and parallel MCOU algorithms.

A comparative analysis shows that, for a small population size |**X̃**| ≤ 500, the running time *t*<sub>par</sub> of the parallel evolutionary MCOU algorithm is greater than or comparable to the running time *t*<sub>seq</sub> of the sequential algorithm. This is because the parallel algorithm spends additional time preparing and transferring data to the GPU. However, as the population size grows further, the advantage of the parallel evolutionary MCOU algorithm over its sequential analog increases. In particular, for |**X̃**| = 100,000, the running time of the parallel evolutionary MCOU algorithm is *t*<sub>par</sub> ≈ 10<sup>−4</sup> *t*<sub>seq</sub>.

#### **5. Conclusions**

The MP training problem was formalized as an MCOU problem, and the vector minimax principle was used for its solution.

A parallel implementation of a hierarchical evolutionary algorithm for searching for a set of vector minimaxes in the MCOU problem based on GPU and OpenCL technology is presented. The developed algorithm can be easily modified to solve the MCOU problem (1), (2), where the set of uncertain factors is infinite.

The results of the computational experiment on the test task show a significant advantage of the parallel GPU implementation of the developed co-evolutionary MCOU algorithm in relation to the sequential analog.

**Author Contributions:** Conceptualization and methodology, V.A.S.; software and validation, D.L.P. and P.P.R.; computational experiments, E.L.D., E.Y.K. and A.V.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
