**Citation:** Melnikov, A.; Levin, I.; Dordopulo, A.; Slasten, L. Evaluation of Computer Technologies for Calculation of Exact Approximations of Statistics Probability Distributions. *Eng. Proc.* **2023**, *33*, 40. https://doi.org/10.3390/engproc2023033040

Academic Editors: Askhat Diveev, Ivan Zelinka, Arutun Avetisyan and Alexander Ilin

Published: 21 June 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

In general, criteria that minimize the number of false decisions concerning the truth of tested hypotheses are used to solve application problems that require statistical processing of texts as meaningful character sequences. The criteria based on the exact distributions of reference statistics [1] have the greatest relative efficiency, but the calculation of exact distributions is a computationally laborious task [1,2] whose cost depends on the power of the alphabet *N* and the sample size *n* (the length of the text sequence).

We can reduce the computational complexity of the problem if we use exact approximations (limit distributions) instead of exact distributions, which only minimally reduces the efficiency of the criteria used [1,2]. As exact approximations of the distributions of reference statistics, we use Δ-exact distributions [2], which differ from the initial ones by an arbitrarily small Δ. Several methods exist for calculating exact distributions, such as the method for calculating the exact distributions of Kolmogorov–Smirnov-type statistics [3,4], the well-known classical Monte Carlo method [5], etc. The most preferable is the second multiplicity method [2], which calculates the exact approximations of distributions for the maximum values of the sample parameters with the same resources. The second multiplicity method, based on solving systems of linear equations, has polynomial complexity, but its computational complexity for real application problems is still quite large [2], so calculating exact approximations in a reasonable time using modern computational means is difficult. An evaluation of the required computing resources and the solution time achievable with modern processors, graphics accelerators and FPGAs for the distribution parameters required in practice is given in [2]. In this paper, we evaluate the required resources and possible solution times for calculating exact approximations of probability distributions of statistics by means of promising computer technologies: quantum and photon computer systems.

#### **2. Setting of the Problem**

We consider probability distributions of statistics for an alphabet $A_N = \{a_1, \dots, a_N\}$ with a power $|A_N| = N$ and a sample of size $n$ drawn from it. To calculate the exact approximations of statistics probability distributions $P_\Delta\{S_{N,n} \ge c\}$, which differ from the exact distributions $P_T\{S_{N,n} \ge c\}$ by a specified arbitrarily small value Δ

$$\left| P_T\{S_{N,n} \ge c\} - P_\Delta\{S_{N,n} \ge c\} \right| \le \Delta, \tag{1}$$

we use the second multiplicity method (SMM) based on the solution of a system of linear equations

$$\begin{cases} \mu\_0^{(v)} + \mu\_1^{(v)} + \dots + \mu\_n^{(v)} = N, \\ 1 \cdot \mu\_1^{(v)} + 2 \cdot \mu\_2^{(v)} + \dots + n \cdot \mu\_n^{(v)} = n \end{cases} \tag{2}$$

where $\mu_i^{(v)}$ is the number of characters of the alphabet $A_N$ that occur exactly $i$ times in the sample $v$, and $n$ is the size of the sample $v$.
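As a minimal illustration (a sketch of our own; the function name is not from [2]), testing whether a candidate vector satisfies system (2) reduces to checking two sums:

```python
def is_solution(mu, N, n):
    """Return True if the candidate vector mu = (mu_0, ..., mu_k)
    satisfies system (2): its components sum to N and the
    occurrence-weighted sum equals the sample size n."""
    return sum(mu) == N and sum(i * m for i, m in enumerate(mu)) == n

# Toy example: alphabet power N = 3, sample size n = 3.
# (1, 1, 1, 0): one absent symbol, one occurring once, one occurring twice.
print(is_solution((1, 1, 1, 0), 3, 3))  # True
print(is_solution((3, 0, 0, 0), 3, 3))  # False (weighted sum is 0, not 3)
```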

The SMM [2] is based on sequential enumeration of the vectors $\mu^{(v)} = \{\mu_0^{(v)}, \mu_1^{(v)}, \dots, \mu_n^{(v)}\}$ and determination of whether each $\mu^{(v)}$ is a solution of the system of linear Equations (2) or not. A detailed theoretical review of the use and implementation of the SMM is given in [2]. The SMM algorithm's complexity is defined by

$$C_{MVK}(P_\Delta\{S_{N,n} \ge c\}) = L_{\mu(N,n,r)} \cdot (5 \cdot (r+1) + 2(N+r) + 3) + 2 \cdot K_{\mu(N,n,r)} \cdot \log_2 K_{\mu(N,n,r)} + 2 \cdot K_{\mu(N,n,r)} \tag{3}$$

where $L_{\mu(N,n,r)}$ is the reduced number of tested vectors of possible solution candidates of (2) under the limitations $\{r \le n \mid \forall i = \overline{r+1, n}:\ \mu_i^{(v)} = 0\}$, and $K_{\mu(N,n,r)}$ is the number of non-negative integer solutions of (2) under the limitation $r$. According to [2], the main complexity of (3) depends on the first term of the polynomial

$$L_{\mu(N,n,r)} = (N+1)^{\min(\lceil n/N \rceil,\, r)+1} \cdot \frac{(\min(n, N, r))!}{(n + \min(n, N, r))!} \cdot \frac{(n+r)!}{r!} \tag{4}$$

It follows from (3) and (4) that the algorithmic complexity of calculating the exact approximations $C_{MVK}(P_\Delta\{S_{N,n} \ge c\})$ is a polynomial which depends both on the sample parameters $N$ and $n$ and on the limitation parameter $r$, which in turn depends functionally on the sample parameters and the accuracy Δ, i.e., $r = m(N, n, \Delta)$.

The most laborious part of the SMM is the procedure of calculating and testing solution candidate vectors. For practical problems, the power of the alphabet $N$ lies in the range from 128 to 256 and the sample size lies in the range from 320 to 1280, while the required total solution time must not exceed 30 days. Therefore, to evaluate the algorithmic complexity [2], we specified the following samples as (alphabet power, sample size): (256, 1280), (128, 640), (128, 320) and (192, 320) with the accuracy $\Delta = 10^{-5}$. The algorithmic complexity and the required performance for these sample parameters are given in Table 1.

According to Table 1, the computational complexity ranges from $9.68 \times 10^{22}$ to $1.60 \times 10^{52}$ operations (the average complexity is about $4.55 \times 10^{25}$ operations), and it is necessary to test from $6.50 \times 10^{23}$ to $1.39 \times 10^{50}$ vectors and to obtain from $4.67 \times 10^{12}$ to $5.60 \times 10^{25}$ solutions.

The number of tested variables in (2) does not exceed $(r+1)$, i.e., it does not exceed 24 for the parameters given in Table 1. Accordingly, $\mu^{(v)} = \{\mu_0^{(v)}, \mu_1^{(v)}, \dots, \mu_{23}^{(v)}\}$, and all other variables are equal to zero: $\{\mu_j^{(v)} = 0 \mid j = 24, \dots, n\}$. The values $\mu_i^{(v)}$ are integers in the range $\{0 \le \mu_i^{(v)} \le 23 \mid i = \overline{0, 23}\}$. One $\mu_i^{(v)}$ therefore requires no less than 5 bits ($2^4 < 23 < 2^5$), and the whole vector $\mu^{(v)}$, which contains 24 components $\mu_i^{(v)}$, requires no less than $5 \times 24 = 120$ bits.
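The bit-width arithmetic above can be checked directly (a sketch; the variable names are ours):

```python
import math

r = 23                            # limitation parameter (Table 1)
components = r + 1                # mu_0 .. mu_23 -> 24 components
max_component_value = 23          # each mu_i lies in the range 0..23
bits_per_component = math.ceil(math.log2(max_component_value + 1))
vector_bits = bits_per_component * components

print(bits_per_component)  # 5 (since 2**4 < 23 < 2**5)
print(vector_bits)         # 120
```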

An important property of the method is that it can be parallelized by data: any two candidate vectors $\mu^{(i)}$ and $\mu^{(j)}$ with $i \ne j$ can be tested independently and concurrently.
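A sketch of this data parallelism (illustrative only; the chunk count and worker pool are our choices): independent chunks of candidate vectors are tested by separate workers, and the per-chunk solution counts are summed afterwards.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

N, n = 3, 3  # a toy alphabet and sample so that full enumeration is cheap

def is_solution(mu):
    return sum(mu) == N and sum(i * m for i, m in enumerate(mu)) == n

# Enumerate every candidate vector (mu_0, ..., mu_n) with values 0..N.
candidates = list(product(range(N + 1), repeat=n + 1))

# Split the candidates into 4 independent chunks; no candidate depends
# on another, so each chunk can be tested concurrently.
chunks = [candidates[i::4] for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(lambda chunk: sum(map(is_solution, chunk)), chunks))

print(sum(counts))  # 3 solutions: (0,3,0,0), (1,1,1,0), (2,0,0,1)
```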

**Table 1.** Characteristics of the calculation method of exact approximations for various parameters of samples.


#### **3. Use of General Purpose Processors to Calculate Exact Approximations of Distributions**

The x86 processors with the traditional von Neumann architecture and 32-bit and 64-bit data processing compose the main and most widespread general purpose computing architecture for the design of cluster multiprocessor high-performance systems (MHPS) [6].

For calculating exact approximations, the performance *PCPU* of a CPU-based MHPS with unlimited scalability is

$$P_{CPU} = N_{CPU} \cdot K_{CPU} \cdot H_{1\_CPU} \tag{5}$$

where $N_{CPU}$ is the number of processors, $K_{CPU}$ is the number of parallel threads provided per processor and $H_{1\_CPU}$ is the clock frequency of one processor.

To estimate the number of processors needed to solve the problem within the time constraints, let us consider a hypothetical CPU that supports eight parallel threads at 3000 MHz, i.e., $K_{CPU} = 8$ and $H_{1\_CPU} = 3 \times 10^9$. Then, to achieve the minimal required performance (according to Table 1) of $P_{CPU} \ge 1.75 \times 10^{19}$ op/s, it is necessary to use

$$N_{CPU} \ge \frac{1.75 \times 10^{19}}{K_{CPU} \cdot H_{1\_CPU}} = \frac{1.75 \times 10^{19}}{8 \times 3.0 \times 10^9} = 7.29 \times 10^8,$$

i.e., about 729 million hypothetical processors. The obtained number exceeds the number of cores (not the number of processors!) of the most high-performance supercomputers in the world (such as Fugaku [6,7] with 7 million cores and Sunway TaihuLight [6,7] with 10 million cores, both included in the TOP500 [6]) by 1.5–2 orders of magnitude.
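The estimate follows from a simple lower bound that assumes ideal linear scaling (a sketch; the function name is ours):

```python
import math

def nodes_required(target_ops_per_s, node_ops_per_s):
    """Minimum node count to reach a target aggregate performance,
    assuming perfectly linear scaling (an optimistic lower bound)."""
    return math.ceil(target_ops_per_s / node_ops_per_s)

# Hypothetical CPU from the text: 8 parallel threads at 3 GHz.
per_cpu = 8 * 3.0e9
print(nodes_required(1.75e19, per_cpu))  # 729166667, i.e. about 7.29e8
```

The same helper applies to the GPU, FPGA and photon node estimates later in the paper by swapping in the per-node performance.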

To calculate exact approximations for all values of the samples' parameters $\{N = \overline{2, 256},\ n = \overline{1, 5N}\}$, the number of required nodes is

$$N_{CPU} \ge \frac{6.15 \times 10^{45}}{K_{CPU} \cdot H_{1\_CPU}} = \frac{6.15 \times 10^{45}}{8 \times 3.0 \times 10^9} = 2.56 \times 10^{35},$$

which cannot be achieved at the modern level of processor architectures and technologies in cluster MHPS designs. The performance of modern CPUs is insufficient for the solution of the considered problem: calculations of exact approximations cannot be provided by a computer system based only on modern CPUs, because hundreds of millions of CPUs are needed to solve the problem even in its minimal form.

#### **4. Use of Graphics Accelerator Technology to Calculate Exact Approximations of Distributions**

The development of GPU (Graphics Processing Unit) technology [7], originally designed to calculate 3D graphics in real time, has led to its application in high-performance computing. Modern graphics accelerators, containing thousands of special purpose cores, provide a high degree of parallelism and can perform tasks in a multithread parallel mode. For example, the standard GEFORCE RTX-3090 game graphics card contains 10,496 NVIDIA CUDA cores operating at 1.70 GHz ($1.7 \times 10^9$ op/s) [8]. We can roughly define the performance of a GPU-based computer system $P_{GPU}$ as the following product

$$P_{GPU} = N_{GPU} \cdot K_{CP} \cdot H_{CP} \tag{6}$$

where $N_{GPU}$ is the number of graphics accelerators (GPUs) in the system, $K_{CP}$ is the number of cores per accelerator and $H_{CP}$ is their clock frequency.

To provide $P_{GPU} \ge 1.75 \times 10^{19}$ op/s, a system based on the GEFORCE RTX-3090 must contain no less than

$$N_{RTX3090} \ge \frac{1.75 \times 10^{19}}{K_{RTX} \cdot H_{RTX}} = \frac{1.75 \times 10^{19}}{1.05 \times 10^4 \times 1.7 \times 10^9} = 9.80 \times 10^5$$

nodes, and the system based on an NVIDIA Quadro K6000 [9] with the performance

$$K\_{K6000} \cdot H\_{K6000} = 16.3 \times 10^{12} \text{ op/s}$$

of one videocard, must contain no less than

$$N\_{K6000} \geq \frac{1.75 \times 10^{19}}{K\_{K6000} \cdot H\_{K6000}} = \frac{1.75 \times 10^{19}}{1.63 \times 10^{13}} = 1.07 \times 10^6$$

nodes. Despite the difference in the properties of the considered graphics accelerators, the number required for calculation of exact approximations is estimated at about one million, which is impossible with the current level of development of graphics accelerator architectures and design technologies of GPU-based computer systems. Thus, due to the insufficient performance of modern graphics accelerators, it is impossible to use them as the only basis to calculate exact approximations.

#### **5. Use of Parallel Pipeline FPGA Technologies to Calculate Exact Approximations of Distributions**

The FPGA (Field Programmable Gate Array) technology combines the capabilities of parallel and pipeline computing. In contrast to computer systems with a fixed architecture designed on a CPU and GPU basis, it allows reconfiguration [10] of the architecture of an FPGA-based computer system to the architecture of the problem to be solved. Taking into account information dependencies in the structure of the application [10], it is possible to provide high real performance for labor-intensive problems in various fields of science and technology [11]. For example, the computational block (CB) of the latest Seguin system based on the UltraScale+ FPGAs (3U height, 96 XCVU9P-1FLGC2104E chips designed using 16 nm technology) achieves a performance of 240 Tflops ($2.4 \times 10^{14}$ op/s) [12].

For the problem of calculating exact approximations, the performance of an FPGA-based reconfigurable multiprocessor computer system $P_{FPGA}$ can be roughly estimated as the product of the number of computational blocks (CB) $N_{CB\_FPGA}$ in the system and the performance $P_{CB\_FPGA}$ of one CB. To provide $P_{FPGA} \ge 1.75 \times 10^{19}$ op/s, it is necessary to use no less than

$$N_{CB\_FPGA} \ge \frac{1.75 \times 10^{19}}{P_{CB\_FPGA}} = \frac{1.75 \times 10^{19}}{2.4 \times 10^{14}} = 7.29 \times 10^4 \tag{7}$$

To calculate exact approximations for all values of the considered parameters of the samples $\{N = \overline{2, 256},\ n = \overline{1, 5N}\}$, it is necessary to use

$$N\_{CB\\_FPGA} \ge \frac{6.15 \times 10^{45}}{P\_{CB\\_FPGA}} = \frac{6.15 \times 10^{45}}{2.4 \times 10^{14}} = 2.56 \times 10^{31}$$

computational blocks. Parallel pipeline FPGA technologies, with a real performance of up to $10^{14}$ operations per second for one computational block, have the greatest potential for implementation of the second multiplicity method for the largest values of the sample parameters. However, a computer system based only on FPGA technologies cannot feasibly be designed for the considered problem because it requires a very large number of computational blocks.

According to the analysis of the capabilities of computer technologies to calculate exact approximations of distributions, none of the existing computer technologies can provide the required computing resources. To solve this computationally expensive problem during the specified time, it is necessary to analyze the capabilities of promising computer technologies such as quantum and photon computers.

#### **6. Use of Quantum Computer Technologies to Calculate Exact Approximations of Distributions**

Quantum computer technologies were proposed in the 1980s by a number of famous scientists, such as Richard P. Feynman [13], Paul Benioff [14], K.A. Valiev and A.A. Kokin [15] and Yu. I. Manin [16]. The main idea of quantum computing is the following: a quantum system of $q$ concurrently operating qubits has $2^q$ linearly independent states. According to the principle of quantum superposition, the state space of such a quantum register is a $2^q$-dimensional Hilbert space [17]. One operation on a group of $q$ qubits is applied simultaneously to all of its possible values, unlike a group of classical bits, which holds only one current value. This ensures the maximum possible parallelization of calculations and a performance increase of up to $2^q$ times, which is called "quantum acceleration". Any object with two quantum states, such as the polarization states of photons, the electronic states of isolated atoms or ions or the spin states of atomic nuclei, can serve as a physical system implementing qubits. The structure of the quantum computer proposed by the Russian scientists K.A. Valiev and A.A. Kokin [15] is shown in Figure 1. According to the principle formulated by R. Feynman [13], for any algorithm it is possible to obtain an implementation on a quantum system that is no worse than its implementation on a classical von Neumann system. At the same time, it has been shown [16] that "quantum acceleration" is not achievable for every algorithm: for an arbitrary algorithm, the possibility of quantum acceleration is not guaranteed. A further feature of quantum calculations is the probabilistic nature of the result, i.e., the result is true only with some probability.

Quantum computers have been designed at Harvard University (a 51-qubit system) [18] and at the Polytechnic School of the University of Paris-Saclay (a 70-qubit system) [19]. The American company IonQ [20] announced the first commercial quantum computer based on ion traps, which contains 160 qubits, but only 79 qubits are used for quantum operations and only 11 qubits for implementations of arbitrary quantum algorithms. In 2020, IBM [21] introduced its most powerful quantum computer with 64 qubits. In Russia, the design of a general purpose quantum computer of 100 qubits is expected by 2024 [19]. At present, no quantum computer in the world is capable of operating with 120-bit data.

Let us evaluate the possibility of calculating exact approximations of distributions on a quantum computer. The structure of the vector testing procedure of the considered problem [2] corresponds to the structure of the quantum computer (Figure 1), which allows us to expect a significant effect when solving the problem on quantum computers. According to the minimum evaluation, 120 bits are required to hold the candidate test vector $\mu^{(v)}$; therefore, the quantum computer must contain 120 concurrently operating qubits. Let us suppose that a quantum computer operating with 120 qubits (call it QC-120) exists and that the problem of validating results is solved. If exact approximations are calculated on QC-120, we obtain all $\{\mu^{(1)}, \mu^{(2)}, \dots, \mu^{(2^{120})}\}$ for the 120-bit test vector $\mu^{(v)} = \{\mu_0^{(v)}, \mu_1^{(v)}, \dots, \mu_{23}^{(v)}\}$, i.e., $2^{120} \cong 1.3292 \times 10^{36}$ possible solutions. Then, it is necessary to select the $K_{\mu(N,n,r)}$ real solutions. For $N = 256$ and $n = 1280$, we obtain $K_{\mu(256, 1280, 23)} = 5.60075 \times 10^{25}$ from Table 1.

**Figure 1.** The structure of the quantum computer.

Thus, when we read the $i$-th possible solution, the quantum register is in the state $\{\mu^{(1)}, \mu^{(2)}, \dots, \mu^{(2^{120})}\}$, and the measurement yields only one candidate. To read the $(i+1)$-th possible solution, the state must be prepared again. Hence, to test all possible solutions and to obtain the $K_{\mu(N,n,r)}$ real solutions, we need $2^{120}$ accesses to QC-120, which drastically reduces the effect of the concurrent calculation of $2^{120}$ possible solutions.
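The scale of this readout bottleneck is easy to see numerically (a sketch; the one-nanosecond access time is our illustrative assumption, not a figure from the text):

```python
total_states = 2 ** 120          # concurrently prepared candidate solutions
print(f"{total_states:.4e}")     # 1.3292e+36

# Even at an (assumed, optimistic) one readout per nanosecond,
# performing 2**120 sequential accesses would take:
seconds = total_states * 1e-9
years = seconds / (3600 * 24 * 365)
print(f"{years:.2e}")            # 4.21e+19 years
```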

The need to check all the obtained solutions, whose number corresponds to the dimension of the problem, prevents quantum acceleration and is a significant, fundamental limitation on the use of promising quantum computer technologies for calculating exact approximations of distributions. The lack of a technical and technological base, not only in the Russian Federation but worldwide, is an additional, technological limitation: it does not allow the creation of a quantum computer system operating the 120 qubits required to solve the problem of calculating exact approximations of statistical probability distributions. Therefore, despite the potentially high performance, the prospects of using quantum computer systems for the considered problem are quite modest at the existing level of development of science and technology. These evaluations may be revised as quantum computing develops.

#### **7. Use of Photon Computers to Calculate Exact Approximations of Distributions**

Another relevant area for the development of promising computer facilities based on new physical principles is the design of computer systems that use the effects of interactions of coherent light waves generated by laser radiation and its carriers, i.e., photons [22,23].

The structure of the photon computer developed at the All-Russian Research Institute of Experimental Physics (Sarov) [22] is shown in Figure 2. The photon computer (Figure 2) consists of four large units: a unit for converting a task into a program for the photon computer (UCT), a laser radiation source (LRS), an input–output unit (IOU) and a photon processor (PP). In turn, the photon processor contains four processor elements (PEs) which comprise arithmetic logic units (ALU), control devices (CD) and switches (SW). The units of the photon computer are interconnected by electronic and optical channels, while the components of the processor elements are connected only by optical channels.

**Figure 2.** The structure of the photon computer.

The UCT transfers the photon program to the LRS, which generates laser radiation and supplies it to the IOU, where it is divided into light rays according to the number of digits simultaneously supplied to the PEs. The IOU generates a photon program and transmits it to the photon processor, where calculations are performed by the processor elements. Light beams interact within the photon processor, and optical delay lines [23] perform synchronization. Within the photon processor, the PEs can be connected by optical channels into a multiprocessor environment of any topology [23]. Low power consumption and high performance are the important advantages of photon computer systems.

It has been shown that the performance of a photon computing node can reach up to 10<sup>18</sup> op/s per 100 W of power consumption [24].

To achieve the required performance of the photon computer *PPHOTON*, it is necessary to use *NPH*\_*ND* nodes with the performance *PPH*\_*ND* = 10<sup>18</sup> op/s each:

$$P_{PHOTON} = N_{PH\_ND} \times P_{PH\_ND} \tag{8}$$

The performance level $P_{PHOTON} \ge 1.75 \times 10^{19}$ op/s for calculating exact approximations (the parameters of sample №3 in Table 1) is provided by

$$N\_{PH\\_ND} \ge 1.75 \times 10^{19} / P\_{PH\\_ND} = 1.75 \times 10^{19} / 1.0 \times 10^{18} \cong 17.5$$

nodes, i.e., no less than 18 photon computing nodes. To achieve the performance required for calculating exact approximations over the whole spectrum of the considered sample parameters $\{N = \overline{2, 256},\ n = \overline{1, 5N}\}$, it is necessary to use no less than

$$N\_{PH\\_ND} \ge 6.15 \times 10^{45} / P\_{PH\\_ND} = 6.15 \times 10^{45} / 1.0 \times 10^{18} = 6.15 \times 10^{27}$$

computing nodes. This is far fewer than required by any of the technologies reviewed earlier.
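Both node-count estimates can be reproduced from the claimed per-node performance (a sketch; the variable names are ours):

```python
import math

PH_NODE_OPS = 1.0e18   # claimed performance of one photon node [24], op/s

minimal_nodes = math.ceil(1.75e19 / PH_NODE_OPS)   # sample No. 3 in Table 1
full_range_nodes = 6.15e45 / PH_NODE_OPS           # whole parameter range

print(minimal_nodes)              # 18
print(f"{full_range_nodes:.2e}")  # 6.15e+27
```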

Unlike quantum computer systems, there is no available information about existing prototypes of digital photon computer systems. According to most works [22–24], scientific teams simulate the functioning of individual nodes and then evaluate the possible parameters of the calculator. There are works by academician V.A. Soifer [25–27] and some other scientists in the field of analog photonics; there, each computing node is created for a task with certain parameters, and the technology of developing an analog photon computing node for an arbitrary task requires a whole cycle of research and development work. Therefore, there are no working prototypes of photon computer systems, technological bases or commercially available components today. At the current level of development of science and technology, the use of photon computer systems to solve the problem of calculating exact approximations of statistical probability distributions cannot be considered within the next 5–7 years, although such computer systems will potentially have very high performance and require the smallest number of low-power computing nodes. Owing to the development of photon computer technologies, it may become possible to revise and improve the presented evaluations in the next few years.

#### **8. Architecture-Independent Representation of the Exact Approximation Calculation Problem for Hybrid Computer Systems**

According to the reviewed computer technologies and their current level of development, a possible solution is to design hybrid or heterogeneous computer systems containing the computing nodes of modern architectures (such as general purpose processors, graphics accelerators and FPGAs) capable of solving the problem with the given parameters in a reasonable time. Since the calculation of distributions is carried out by one and the same method for all sample parameters, the ability to program different computer architectures within a single development cycle, which is provided by the architecture-independent programming paradigm, is especially important for such a system [28]. Taking into account the possible use of promising architectures of quantum computers and/or photon computing nodes, we consider an architecture-independent representation of the problem of calculating exact approximations for hybrid computer systems especially significant.

Such capabilities are provided by the architecture-independent aspect-oriented language Set@l, which allows the developer to focus their attention on the parallelizing methods used when solving a problem, and not on their technical implementation in the selected computer architecture. The Set@l language reduces the problem of software transfer between different configurations and architectures of the hybrid computer system (HCS) to the creation of aspects (descriptions) of technical means, which describe the key points of parallelization of the method on those technical means. At the same time, the source program implementing the data processing method remains unchanged [28].

The language Set@l is based on the paradigm of aspect-oriented programming (AOP) [29], according to which the algorithm of an application problem and the peculiarities of its implementation are described in the form of separate program modules. The Set@l program represents the information graph of the computational problem in the form of sets, whose partitioning and typing specify different parallelization options and other aspects of the implementation of the algorithm. Unlike other programming languages based on the set theory, for example, the languages SETL, SETL2 and SETLX, the Set@l language uses typing of sets according to different criteria and allows operations with fuzzy collections in accordance with the alternative set theory [30].

The Set@l language provides separate descriptions of the algorithm and the peculiarities of its implementation on a computer system through the AOP paradigm. According to this paradigm, the cross-cutting concern of the program, which causes the negative effects of code tangling and scattering, is presented in the form of separate program modules (aspects). The source code, implementing the main functionality of the program, contains the user's markup, which determines its interaction with aspects during translation or execution. Analyzing the markup and source code, the preprocessor-translator forms a new executable program incorporating the cross-cutting concern. As a rule, the use of AOP technology simplifies the development and further support of software and increases the adaptability of programs to various modifications.

To implicitly describe various methods of parallelizing algorithms, the architecture-independent programming language Set@l classifies collections by the type of parallelism of their elements during processing. When the parallelism of the set's elements is clearly defined, the types "par" (parallel-independent processing), "seq" (sequential processing), "pipe" (pipeline processing) and "conc" (parallel-dependent processing by iterations) are used. However, in some aspects of the program, the type of parallelism of a number of sets cannot be determined uniquely, since the architecture of the computer system on which the algorithm will be implemented is unknown. In this case, a special type "imp" (implicit) is used, and the typing of collections is refined in other aspects of the program using special syntactic structures.

If the aspects of the Set@l program do not change the algorithm in the process of its adaptation to the computer system's architecture, then the solution to the problem can be presented according to the classical Cantor–Bolzano set theory. However, in some cases it is reasonable to modify the algorithm in accordance with the peculiarities of its implementation on a computer system with a certain architecture. In such cases, some collections are fuzzy and are not sets, so they cannot be specified as objects of the classical set theory. Using the Set@l language, we can describe different implementations of the same algorithm in a single aspect-oriented program. To do this, collections are classified by the definiteness of their elements according to the alternative set theory of P. Vopenka: the type "set" corresponds to the classical clearly distinguished collection of elements, the type "semi-set" (sm) corresponds to a fuzzy collection and the type "class" (cls) corresponds to a collection of objects whose type and partition cannot be uniquely determined at a given level of abstraction.
An example of a computational problem that requires the Set@l fuzzy collections for its description is the Jacobi SLAE algorithm [31]. Unlike other approaches to parallel programming of high-performance computer systems, the Set@l programming language specifies not only boundary cases of the algorithm implementation, but also a family of intermediate options. They cannot be distinguished from the point of view of procedural programming but provide continuity of the description of the calculation model. Owing to the use of the Set@l language, it is possible to synthesize many variants of the problem solution and to switch between its elements depending on the architecture and configuration of the computer system. The program of fast Fourier transform illustrates this possibility. According to the available memory of the computer system, complex coefficients W are calculated in advance and loaded from memory or calculated with the help of basic and auxiliary components [32].

Thus, the algorithm of the problem in the Set@l language is presented as an architecture-independent source code by means of aspect-oriented programming, set-theoretical code representation and relational calculus. The peculiarities of the algorithm's implementation are represented as separate aspects that define the partitioning into subsets and the typing of the main collections of the program. The program can be quickly ported between computer systems with different architectures and adapted to any changes in hardware resources owing to the use of aspects of the processing method, architecture and configuration. Owing to the suggested approach to parallel programming of high-performance computer systems in the Set@l language, there are new prospects for the development of architecture-independent and resource-independent software.

The language Set@l has been successfully used to solve algorithmically complex and resource-intensive problems, such as solving SLAE by the Gaussian [28] and Jacobi [31] methods and implementing the fast Fourier transform algorithm [32] and others with the same structure.

Therefore, the use of the Set@l language for calculating exact approximations of distributions on a hybrid computer system will significantly decrease the complexity of programming computing nodes of various architectures, such as general purpose processors, graphics accelerators, FPGAs, quantum computers and/or photon computing nodes.

#### **9. Conclusions**

In this paper, we have analyzed the use of promising computer technologies to solve the computationally expensive problem of calculating Δ-exact approximations of statistical probability distributions by the second multiplicity method, which is based on solving a system of linear equations for the second multiplicities. The method has polynomial complexity and allows parallelization into fragments by data. We have evaluated the possibilities of promising computer technologies based on quantum and photon computers for calculating distributions by the second multiplicity method. An analysis of the capabilities of quantum and photon computer technologies shows that they have great potential for solving this problem. However, at present, these technologies cannot provide a solution to the problem of calculating exact approximations of sample distributions with an alphabet power of up to 256, a sample size of up to 1280 characters and an accuracy of $\Delta = 10^{-5}$.

We have performed a theoretical analysis of quantum computer systems and their applicability to the problem, which showed that quantum acceleration is impossible here. The current level of development of quantum computers is insufficient, and there are also no algorithms or criteria for choosing a suitable solution from the huge number of obtained solutions to the problem. In addition, the level of development of photon computers does not yet allow the creation of a computer with the required number of computing nodes.

**Author Contributions:** Conceptualization, A.M., I.L. and A.D.; methodology, A.M. and I.L.; validation, A.M., A.D. and L.S.; formal analysis, I.L.; investigation, A.M.; resources, A.M.; writing—original draft preparation, A.D. and L.S.; writing—review and editing, A.D. and L.S.; supervision, A.M.; project administration, I.L.; funding acquisition, I.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Russian Foundation for Basic Research, project number 20-07-00545.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
