Article

Analytical and Numerical Study of Information Retrieval Method Based on Single-Layer Neural Network with Optimization of Computing Algorithm Performance

by Konstantin Kostromitin *, Konstantin Melnikov and Dar’ya Nikonova
Department of Information Security, South Ural State University, Chelyabinsk 454080, Russia
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(17), 3648; https://doi.org/10.3390/math11173648
Submission received: 26 June 2023 / Revised: 12 August 2023 / Accepted: 15 August 2023 / Published: 23 August 2023

Abstract

This work presents a mathematical model of a fast single-layer artificial neural network applied to the task of reconstructing images corrupted by noise. For research purposes, the algorithm was implemented in the Python and C++ programming languages. Numerical simulation of the recovery efficiency of the described neural network was performed for different values of the noise factor, the number of samples used for training, the number of elements in a sample, and the dimensionality of the coupling coefficients, w. A study of the mathematical model of this neural network is presented; as a result, it was possible to identify its essence, to reduce the number of operations required to recover a single element, and to increase recovery accuracy by changing how the coupling coefficients, w, are calculated.

1. Introduction

Currently, mathematical models of artificial neural networks are actively used to solve a large number of different problems.
Neural networks have revolutionized the field of machine learning and artificial intelligence. Over the past few decades, significant advancements have been made in the development and application of neural networks, leading to remarkable achievements in various domains. This introduction will provide an overview of the accomplishments in neural networks and their implications for the field.
Neural networks, inspired by the structure and function of the human brain, consist of interconnected nodes (neurons) that process and transmit information. These networks can learn complex patterns and relationships from vast amounts of data, enabling them to perform tasks that were previously challenging or impossible for traditional machine learning algorithms [1].
One of the major achievements in neural networks is their success in image and speech recognition tasks. Convolutional neural networks (CNNs) have demonstrated exceptional performance in image classification, object detection, and segmentation. Recurrent neural networks (RNNs) have made significant contributions to speech recognition and natural language processing [2].
Neural networks have also shown remarkable progress in other domains, including natural language processing, machine translation, recommender systems, and medical diagnosis. These advancements have been fueled by the availability of large-scale datasets, improved network architectures, and advancements in computational power [3,4].
The field of neural networks is constantly evolving, with the number of publications and research efforts growing every day. Below, we briefly review some of the most relevant works.
Ref. [5] introduces a novel approach to reinforcement learning that leverages unsupervised auxiliary tasks. The authors demonstrate how incorporating auxiliary objectives, such as predicting future states or generating synthetic data, can significantly improve the learning efficiency and generalization capabilities of reinforcement learning agents. The proposed method achieves strong performance across a variety of tasks.
The authors of ref. [6] propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive method for automatic model design. ENAS constructs a large computational graph in which each subgraph represents a candidate network architecture, so that all candidate architectures share their parameters. A controller is trained with policy gradients to select the subgraph that maximizes the expected reward on a validation set, while the model corresponding to the selected subgraph is trained to minimize the classification loss. Sharing parameters among sub-models allows ENAS to deliver strong performance while using far less GPU time than existing methods; in fact, it requires about 1000 times less GPU time than standard Neural Architecture Search.
In recent years, the development of new methods for data analysis has received considerable attention. One such technology is the artificial neural network (ANN). Artificial neural networks have many attractive theoretical properties; in particular, they can detect relations that are not specified in advance, such as non-linear effects and/or interactions. These benefits come at the cost of reduced model interpretability. For this reason, many authors have analyzed the same data with both conventional statistical methods (e.g., logistic regression or Cox regression) and ANNs [7].
Uncertainties associated with the simulation and prediction of solar photovoltaic (PV) system performance can be addressed easily and efficiently through an intelligent design approach. During the decade from 2009 to 2019, artificial neural networks (ANN), fuzzy logic (FL), genetic algorithms (GA) and their hybrid models became established intelligent tools for solar PV system prediction and simulation. However, a comprehensive review of the forecasting and modeling of solar PV systems using ANN, FL, GA and their hybrid models over this decade had been lacking [8].
In ref. [9], the authors focus on neural networks that can learn to control behavior according to specified goals and on how control strategies or goals can be formed. The operation management system consists of four parts, namely the forecasting network, the operating production network, the control network and the optimization network. This method and its operation are described and explained in the article.
The authors of ref. [10] provide an overview of recent developments in deep reinforcement learning (RL). The paper starts with the concepts of deep learning and reinforcement learning, including test beds. It then discusses deep Q-networks (DQNs) and their extensions, policy optimization, reward, and planning. Next, it covers attention and memory, unsupervised learning, and learning to learn. The paper then discusses various applications of reinforcement learning, including games (notably AlphaGo), robotics, dialogue systems (chatbots), machine translation, speech recognition, neural architecture design, personal web applications, healthcare, finance, and music production. The authors also cite unreviewed topics and papers and close with a brief discussion [10].
High-dimensional data can be converted into low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used to fine-tune the weights in such an “autoencoder” network, but this works well only if the initial weights are close to a good solution. The authors describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that outperform principal component analysis as a tool for data dimensionality reduction [11,12].
The book “Neural Networks and Deep Learning” covers both classical and modern models of deep learning. Its main focus is the theory and algorithms of deep learning; the concepts and algorithms of neural networks are presented so that the reader can understand the fundamentals of neural architecture design for various applications. Why are neural networks effective? In what ways do they perform better than conventional machine learning methods? Does depth matter? Why is it so difficult to train neural networks, and what are the problems? The book also covers a wide range of applications that show how neural architectures can be used to solve various problems in fields such as dynamical systems, machine learning, image interpretation, image segmentation, learning to play games, and information analysis [13].
Since this article studies the mathematical model of an ANN, the remainder of this literature review presents key works on elementary ANN models, starting from their discovery and the first studies.
Minsky and Papert’s book is the first example of a mathematical analysis aimed at uncovering the true limits of a class of computing systems that can be viewed as models of the brain. Since then, new mathematical tools, physicists’ interest in the theory of random objects, new views and mental models of brain function, and the development of fast computers capable of simulating networks of automata have given new importance to perceptrons. The authors point to a major open problem for such networks: the difficulty of understanding exactly how individual “things” and “agents” emerge in a network. Advances in this area can be linked to what the authors call “social thinking” [14].
Rosenblatt’s work was supported as part of an internal research program of Cornell Aeronautical Laboratory, Inc. The concepts discussed originate from the author’s independent studies in physiological psychology, the goal of which was to formulate a brain analogue useful for analysis. The result was the initial concept of the perceptron, a product of this research program; the author’s further efforts were directed toward establishing the technical and economic feasibility of the perceptron [15].
Using the tools of computational complexity theory, Stephen Judd gives a formal treatment of learning in connectionist networks. His work carefully reveals the computational difficulty of training neural networks and examines how certain design principles can or cannot alleviate these problems. As more neurons are used, new questions arise and provide new insights into strategies for building artificial and biological connections. The book outlines the key ideas, formulates the general learning problem, reviews existing learning theory and standard learning models, and examines the computational complexity of the corresponding tasks; it then studies restricted families of networks, establishes the complexity of learning within them, and explains the implications of these results for specific network families. Judd also discusses shallow networks and their particular characteristics, as well as the ability of neural networks to remain stable, and summarizes the results and their implications [16].
In ref. [17], the authors describe a new learning procedure, back-propagation, for networks of neuron-like units. The procedure repeatedly adjusts the weights of the connections in the network so as to minimize the difference between the actual output vector of the network and the desired output vector. As a result of the weight adjustments, internal “hidden” units, which are not part of the input or output, come to represent important features of the task, and the regularities of the problem are captured through the interactions of these units. The ability to create useful new features distinguishes back-propagation from earlier, simpler methods such as the perceptron-convergence procedure [17].
For a more general introduction to the topic of neural networks, the book by Simon S. Haykin was considered; it is a comprehensive guide to the mathematical foundations of neural networks. The book covers a wide range of topics, including supervised and unsupervised learning, deep learning, optimization algorithms, and various types of neural networks such as feed-forward, recurrent, and convolutional networks [18].
Kevin P. Murphy also provides a comprehensive introduction to machine learning that uses probabilistic models and inference as a unifying approach. Combining breadth and depth, the book offers background material on topics such as probability, optimization, and linear algebra, and discusses recent developments in the field, including conditional random fields, L1 regularization, and deep learning [19].
The practical aspects of implementation using modern libraries are considered in the book by Aurélien Géron.
This bestselling book uses simple examples, small projects, and Python libraries (Scikit-Learn, Keras, and TensorFlow) to help readers understand the concepts and tools for building intelligent systems. The author explores a variety of approaches, from linear regression all the way to deep neural networks [20].
In conclusion, this introduction highlights the diverse range of publications and research efforts focused on neural networks. These studies span various domains, from incorporating unsupervised auxiliary tasks to improve reinforcement learning, to automating the design of efficient neural architectures for image classification, to applying deep networks to recognition and control problems. Each of these contributions showcases the continuous advancements and innovations in the field of neural networks, pushing the boundaries of what is possible in machine learning and artificial intelligence. By exploring new techniques, algorithms, and architectures, researchers strive to enhance the learning capabilities, efficiency, and generalization of neural networks, leading to improved performance and practical applications in a wide array of domains. The works cited above give deeper insight into the specific approaches and findings that contribute to the overall progress of neural network research.
One field of application of ANNs is the recovery of data that have been distorted by external interference during transmission through communication channels, or received from analog sources.
In addition, in some cases system performance is a critical parameter, for example, when solving computational problems in real-time systems such as process control systems, automation, SCADA systems, tomography equipment and others.
One area of application of ANNs is machine vision, the ability of computers to extract information and meaning from images and videos. With the help of neural networks, computers can distinguish and recognize images much as humans do. Machine vision is used in several fields: visual recognition in driverless cars, so that they can react to road signs and other road users; content moderation, to automatically remove unsafe or inappropriate content from image and video archives; facial recognition, to identify people and recognize attributes such as open eyes, glasses and facial hair; and image labeling, to identify brand logos, clothing, protective gear and other image details. Neural networks are also used in natural language processing (NLP), the ability to process natural, human-generated text. They help computers extract information and meaning from textual data and documents, with applications such as automated virtual agents and chatbots.
Given these tasks and operating conditions, the application of a fast algorithm for data recovery based on a set of initial samples appears feasible and of practical value.
The purpose of this work is to investigate and improve an original fast algorithm for recovering information distorted by noise, based on the work in [21]; a numerical study of the influence of the noise level, the number and size of the samples, and the dimensionality of the coefficients, w, on recovery efficiency is carried out.
An analytical study of the restoration algorithm was carried out, which made it possible to reveal the mechanism of operation of this method, to find a matrix-free method of data restoration, and to improve the accuracy of restoration by preserving the values of the weight coefficients on the main diagonal of the matrix W.

2. Materials and Methods

This work presents a study of the ANN algorithm for recovering noisy information, consisting of the stages of sample formation, training, noising and recovery, following [21].

2.1. Formation of Samples for ANN Training

To test the algorithm from [21], the initial samples were formed as two-dimensional arrays filled with the values −1 and 1. For clarity, they were chosen as patterns resembling the Cyrillic letters “E” (Figure 1), “O” (Figure 2) and “Ш” (Figure 3):
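To make the data layout concrete, a minimal sketch in Python is given below. This is not the authors' code: the 5 × 5 ring pattern standing in for a letter and the variable names are illustrative only, whereas the samples actually used in this work are the 10 × 10 patterns of Figures 1–3.

```python
import numpy as np

# Illustrative 5x5 stand-in for a letter pattern; a real sample in this work
# is a 10x10 array of -1/1 values shaped like a Cyrillic letter (Figures 1-3).
sample_O = np.array([
    [-1,  1,  1,  1, -1],
    [ 1, -1, -1, -1,  1],
    [ 1, -1, -1, -1,  1],
    [ 1, -1, -1, -1,  1],
    [-1,  1,  1,  1, -1],
])

# For training, each two-dimensional sample is flattened into a vector of
# n = rows * cols neuron values.
x_O = sample_O.flatten()   # shape (25,)
```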

2.2. ANN Training on the Generated Samples

The ANN training procedure is performed by calculating the weight coefficients, w_ij, characterizing the connections between neurons (1). Within the framework of the model [21], this coefficient is assumed to be zero for a neuron's connection with itself.
$$w_{ij} = \begin{cases} \sum_{k=1}^{m} a_i^k a_j^k, & i \neq j,\ i,j = \overline{1,n} \\ 0, & i = j \end{cases} \quad (1)$$
where w_ij—weight of the connection between neurons i and j;
  • n—number of neurons in the ANN;
  • m—number of samples used to train the network;
  • a_i^k—the i-th element of the k-th training sample.
When calculating the coefficients w_ij, the two-dimensional tables with the training samples are converted into strings (vectors); in the original algorithm they are then transposed, and the products a_i^k a_j^k are calculated for each sample. Finally, summation is performed over all m samples.
As a result, a matrix of weights, W, of dimension n × n is formed. In this case, n = 100, according to the total number of neurons in the samples of the letters “E”, “O”, and “Ш”.
Graphical representations of the overall matrix of coupling coefficients and of the separate matrices for the three samples are presented in Figure 4, Figure 5, Figure 6 and Figure 7.
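A minimal Python sketch of this training step is given below (an illustrative reimplementation of Equation (1), not the authors' code; the function name and the commented call are assumptions):

```python
import numpy as np

def train_weights(samples):
    """Equation (1): w_ij = sum_k a_i^k * a_j^k for i != j, and w_ii = 0.

    `samples` is a sequence of flattened +/-1 vectors of equal length n."""
    n = samples[0].size
    W = np.zeros((n, n), dtype=int)
    for a in samples:
        W += np.outer(a, a)      # adds a_i^k * a_j^k for every pair (i, j)
    np.fill_diagonal(W, 0)       # zero the main diagonal, as assumed in [21]
    return W

# For the three 10x10 letter samples, n = 100 and W is a 100 x 100 matrix:
# W = train_weights([x_E, x_O, x_Sh])
```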

2.3. Sample Noise and Recovery

To obtain a noisy image, one of the original samples is selected; then, (num·n) times, a randomly chosen node is switched from −1 to 1 if its value is −1, otherwise its value does not change. The value num is set during testing and determines the final noise level of the sample.
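One possible reading of this noising procedure is sketched below in Python (not the authors' code; the random choice of positions is an assumption consistent with the observation in Section 3.1 that the random selection of noised points affects the result):

```python
import numpy as np

def add_noise(sample, num, rng=np.random.default_rng()):
    """Pick round(num * n) random positions; a node holding -1 is switched to 1,
    while nodes already equal to 1 are left unchanged. `num` is the noise factor."""
    noisy = sample.copy()
    n = noisy.size
    for idx in rng.integers(0, n, size=int(num * n)):
        if noisy[idx] == -1:
            noisy[idx] = 1
    return noisy

# noisy_O = add_noise(x_O, num=0.4)   # noise factor of 40%
```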
The restoration of a sample element a_i^r is defined by the following expression (2):
$$a_i^r = f\left(\sum_{j=1}^{n} w_{ji}\, a_j^t\right) \quad (2)$$
where f is the activation function (3):
$$f(x) = \begin{cases} 1, & x > 0 \\ -1, & x \le 0 \end{cases} \quad (3)$$
Thus, to recover a single element, it is necessary to form the matrix of weight coefficients, W, of n × n elements, and then reduce it back to the recovered vector of n elements.
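Continuing the sketches above, the recovery step of Equations (2) and (3) can be written compactly as follows (an illustrative vectorized reimplementation, not the authors' code):

```python
import numpy as np

def recover(W, noisy):
    """a_i^r = f(sum_j w_ji * a_j^t), with f(x) = 1 for x > 0 and -1 otherwise."""
    s = W.T @ noisy                 # computes sum_j w_ji * a_j^t for every i at once
    return np.where(s > 0, 1, -1)   # threshold activation from Equation (3)

# restored = recover(W, noisy_O)
# recovery_percent = 100 * np.mean(restored == x_O)
```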
Increasing the dimensionality of the neuron connectivity coefficients, w, with this approach requires a proportional increase in the dimensionality of the matrix W. For example, to connect three elements of the ANN, it is necessary to calculate the coefficients w_jik for a matrix W containing n·n·n elements, which significantly increases the computational complexity of the algorithm.
The computational scheme is presented in Figure 8.

3. Results

3.1. Results of Numerical Investigation of the Reconstruction Algorithm

Recovery testing was carried out with a sequential increase in the noise factor in steps of 1% from 0% to 100%. In total, 100 tests were carried out; in each, the percentage of deviations of the restored sample from the original was calculated and averaged (Figure 9).
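One possible reading of this testing loop is sketched below, reusing the add_noise and recover sketches above; the number of repetitions per noise level is an assumption, not a value reported by the authors.

```python
import numpy as np

def deviation_curve(W, original, trials=100):
    """Sweep the noise factor from 0% to 100% in 1% steps and record the average
    percentage of elements of the restored sample that differ from the original."""
    curve = []
    for pct in range(0, 101):
        deviations = [
            np.mean(recover(W, add_noise(original, pct / 100)) != original)
            for _ in range(trials)
        ]
        curve.append(100 * np.mean(deviations))
    return curve   # one averaged deviation percentage per noise level
```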
Based on the data obtained, the following conclusions can be drawn:
At any noise level, the algorithm is able to recover at least 60% of the data in the samples.
The sample of the letter “Ш” is restored worst at high noise levels, but at medium noise levels it has the highest average recovery percentage compared with the other samples. This is due to the larger amount of useful information it contains.
In general, all symbols are recovered equally well at 35–60% noise and show significant recovery errors at noise levels below 20% and above 65%.
The choice of the elements subjected to noise has a significant effect on the restoration result: at the same noise level, the random selection of different points can lead to radically different restoration results.

3.1.1. Recovery Error Analysis

Sample letter «E» (typical errors are shown in Figure 10):
With a low percentage of noise (0–30%), the algorithm restores the image with an error of 1;
At a noise level from 30% to 65%, the algorithm often successfully restores the initial letter «E»;
At 70% noise, the algorithm restores the image with error 2; at 80% noise, with error 3; at 85% noise, with error 2; at 90% noise, another sample (the letter “O”) is erroneously recovered or the image is restored with error 3; at 100% noise, the image is restored with error 4 (inverted error 1).
Sample letter «O» (typical errors are shown in Figure 11):
With a low percentage of noise (0–25%), the algorithm restores the image with an error of 1;
At a noise level of 25% to 70%, the algorithm often successfully restores the initial letter “O”;
At 75% and 80% noise, the sample is restored almost completely, with minor artifacts; up to 100% noise, the sample is restored with error 2, and at 100% noise, with error 3 (inverted error 1).
Sample letter «Ш» (typical errors are shown in Figure 12):
Up to 15% noise, there is an error of 1;
From 20% to 80% noise, the algorithm often successfully restores the initial letter «Ш»;
At 70% noise, the restoration is similar to that with an error of 2;
At 85–95% noise, there is an error of 3 and an error of 4.
At 100% noise, there is an error of 5 (inverted error 1).
Based on the results of this study, the following conclusion can be drawn:
Under low noise, the algorithm often does not work correctly, since all the training samples appear equivalent to it, and it draws an image that is common to all samples.
With medium noise, the algorithm often successfully restores the initial sample. With high noise, recovery occurs with errors, and erroneous samples are sometimes restored, since under heavy noise it is impossible to uniquely determine the original sample among the many samples on which the model was trained.

3.1.2. Influence of the Number of Samples on the Recovery Result

For this study, three additional data samples with a matrix size of 10 × 10 were introduced. The original samples were not changed; their graphical and numerical representation can be seen in Section 2.2. A graphical representation of the additional samples is given below (Figure 13).
To build a graph for each of the six samples, the percentage of recovery was determined at various noise levels. The data obtained are shown in Figure 14.
Based on the results obtained, it can be concluded that, as the number of initial samples increases, the accuracy of restoring each sample decreases. Namely, for the sample «E», at 50–60% noise the program produces a letter «E» that is incompletely drawn to varying degrees, and above 60% various artifacts begin to appear during restoration, or other samples begin to be restored.
For the sample «O», at up to 50% noise the program produces a letter «O» that is incompletely drawn to varying degrees; at 55–65% noise the recovery percentage is high and the answer is almost always correct, although sometimes a couple of elements are missing; beyond that, a strongly incomplete sample is produced.
With this set of samples, the algorithm works best on the «Γ» sample, restoring about 80% of the information at any noise level. It performs worst on the «Ш» sample: at a noise level of about 55%, only 25% of the information can be restored. This is due to the overlap with other samples: the larger the area of the object that needs to be restored, the worse the restoration.

3.1.3. Influence of Sample Matrix Sizes on Signal Recovery

For this study, the initial data samples were changed: their size was increased to a 40 × 40 matrix (Figure 15).
To construct graphs for both the standard and the enlarged samples, several runs of the program were performed for each sample at different noise levels, and the final recovery result was recorded.
A graph for standard matrix sizes is shown in Figure 16, and a graph for enlarged ones is shown in Figure 17.
Based on the data obtained, the following conclusions can be drawn:
The restoration of enlarged samples is generally better: full restoration of the sample (100%) occurs at noise values from 25% to 80%, with complete restoration of the original samples in the noise range from 50% to 65%.
This is due to the small number of samples and the large amount of data being recovered, so the data overlap less in the weighting matrix.

3.1.4. Influence of the Dimensionality of the Matrix of Weight Coefficients, w

Consider the mathematical model of the neural network and expression (1) for finding the weight coefficients. If the dimensionality of the matrix of weight coefficients, w, is increased, we obtain the following expression (4):
$$w_{ijl} = \begin{cases} \sum_{k=1}^{m} a_i^k a_j^k a_l^k, & i \neq j \neq l,\ i,j,l = \overline{1,n} \\ 0, & i = j \text{ or } i = l \text{ or } j = l \end{cases} \quad (4)$$
For each sample, x, a three-dimensional matrix is created by multiplying the elements at positions i, j, l (wt[i][j][l] = x[i]·x[j]·x[l]); the resulting three-dimensional matrices are then added to create the final matrix of weights: w[i][j][l] = wt_1[i][j][l] + wt_2[i][j][l] + … + wt_m[i][j][l], where m is the number of samples.
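A sketch of this higher-order training step is shown below (an illustrative reimplementation of Equation (4), not the authors' code), which also makes the n³ storage cost explicit:

```python
import numpy as np

def train_weights_3d(samples):
    """Equation (4): w_ijl = sum_k a_i^k * a_j^k * a_l^k, zeroed whenever any two
    of the indices i, j, l coincide. Requires n**3 entries for n neurons."""
    n = samples[0].size
    W3 = np.zeros((n, n, n), dtype=int)
    for a in samples:
        W3 += np.einsum('i,j,l->ijl', a, a, a)   # a_i * a_j * a_l for all triples
    i, j, l = np.indices((n, n, n))
    W3[(i == j) | (i == l) | (j == l)] = 0       # zero degenerate index combinations
    return W3
```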
To evaluate the effectiveness of increasing the dimensionality of the coefficients, w, of the weight matrix, the noise is increased in steps of 5%, after which the recovery algorithm is executed and the number of errors is counted. As in the previous cases, the recovery percentage is determined by a character-by-character comparison of the noise-free sample with the recovered one (Figure 18).
It is worth noting that these calculations are much more demanding on computer performance than the standard weight matrices: approximately 4,000,000 computational operations are required, compared with 40,000 for the standard matrix form.
However, the recovery results are comparable to those of the original algorithm. Successful recovery occurs in the noise range of 50–70%. Significant artifacts are also noticeable; in particular, owing to the odd dimensionality of the matrix, inverted restoration of the sample occurs. This problem can be solved by counting the amounts of useful and useless information and comparing them at the reconstruction stage, after applying the threshold function: if there is more useful information than useless information, the output must be inverted, and the output information will then be reconstructed more accurately. Also, if the recovery percentage is 0 (no value matches the sample), inverting the output yields 100% recovery.
However, within the proposed algorithm, some samples are recovered more successfully than others. Namely, the letter «O» was restored in the 0–80% noise range; it was also erroneously recovered when attempting to recover other samples under strong noise.
The analysis revealed that the accuracy of image reconstruction did not increase, while the computational requirements became many times higher. Thus, increasing the dimensionality of the weight coefficient matrices, w, does not improve the quality of the solution.
For the analysis of the reduced-dimensionality case, the expression for finding the weight coefficients takes the following form (5).
$$w_i = \sum_{k=1}^{m} a_i^k, \quad i = \overline{1,n} \quad (5)$$
To analyze recovery efficiency, 1000 test runs were performed with a noise step of 5%, and the average recovery percentage for each noise level was calculated (Figure 19).
With this algorithm, the original sample was not restored to 100% in any of the tests. The algorithm often produces noise that does not resemble the desired sample. In addition, the recovery percentage tends to decrease in proportion to the noise level.
This modification of the algorithm is therefore also not suitable for improving the considered neural network, since it cannot perform the required recovery tasks.

3.2. Investigation of the Mathematical Model

3.2.1. Hypothesis Statement

For this study, let us change the formulation of the problem:
(1)
The samples may be noised in an arbitrary way, and the data may be distorted, losing information;
(2)
The main diagonal of the matrix of weight coefficients is not zeroed, because zeroing it increases the uncertainty of sample recovery and degrades the quality of the algorithm.

3.2.2. Investigation of the Hypothesis: The Possibility of Obtaining a Reconstructed Image without Forming a Matrix of Weight Coefficients

Formulation of the hypothesis: «A matrix-free way to calculate a_i^r exists».
Substituting the expression for the matrix of weight coefficients (1) into the expression for recovering a sample element (2) gives expression (6):
$$a_1^r = f\left(\sum_{j=1}^{n} w_{j1}\, a_j^t\right) = f\left(\sum_{j=1}^{25} a_1 a_j a_j^n\right) = f\left(a_1 a_1 a_1^n + a_1 a_2 a_2^n + a_1 a_3 a_3^n + \dots + a_1 a_{25} a_{25}^n\right) = f\left(a_1\left(a_1 a_1^n + \dots + a_{25} a_{25}^n\right)\right) \quad (6)$$
where the superscript n marks elements of the noisy sample and a single training sample of 25 elements is considered.
Thus, we can see that there is a constant expression (7),
$$a_1 a_1^n + \dots + a_{25} a_{25}^n \quad (7)$$
which does not depend on which element is being recovered. This sum of products reflects the degree to which the values of the cells in the original and noisy samples coincide: if most of them coincide, the value is positive (the original image is restored); otherwise, it is negative (the inverted image is restored).
The sign and value of the sum vary depending on the ratio of noisy to original values of the corresponding elements of the samples.
The weight matrix can therefore be omitted, because expression (7) is a constant for a given pair of original and noisy samples and can be calculated separately from the values of the original signal, Q.E.D.
Block diagrams of the original [21] and improved recovery algorithms are presented below in Figure 20.
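The matrix-free recovery can be sketched in a few lines of Python (an illustrative reimplementation of the derivation above, keeping the main diagonal as proposed in Section 3.2.5; it is not the authors' code):

```python
import numpy as np

def recover_matrix_free(train_samples, noisy):
    """For each training sample a^k, compute the constant S_k = sum_j a_j^k * a_j^t
    once, then restore a_i^r = f(sum_k a_i^k * S_k); no n x n matrix W is formed."""
    s = np.zeros(noisy.size)
    for a in train_samples:
        S = int(a @ noisy)     # correlation of sample k with the noisy input
        s = s + a * S          # each element weighted by its sample's correlation
    return np.where(s > 0, 1, -1)

# With a single training sample, a positive S reproduces the original pattern and
# a non-positive S yields its inversion, matching the interpretation in Section 3.2.3.
```

This requires on the order of m·n multiplications for m samples of n elements, instead of the n² operations needed to form and apply the matrix W.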

3.2.3. Identifying the Nature of the Recovery Mechanics

Let us investigate this constant expression and find its mathematical meaning. For the case of several samples, the recovery of an element takes the following form (8):
$$a_1^r = f\left(\sum_{j} a_j^{n} \sum_{k=1}^{n} a_1^k a_j^k\right) \quad (8)$$
where n is the number of samples, the superscript n on a_j marks elements of the noisy sample, and the outer sum over j runs over all elements (neurons) of the sample.
Then, for three samples the multiplication will look as follows (9):
$$a_1^r = f\left(\sum_{j=1}^{36} \left(a_1^1 a_j^1 + a_1^2 a_j^2 + a_1^3 a_j^3\right) a_j^n\right) \quad (9)$$
That is, for n = 3,
$$a_1^r = f\left(\sum_{j=1}^{36}\left(a_1^1 a_j^1 a_j^n + a_1^2 a_j^2 a_j^n + a_1^3 a_j^3 a_j^n\right)\right).$$
The product a_j^k a_j^n estimates the degree of matching between sample k and the noisy sample (a Pearson-correlation-like measure).
Thus, we can conclude that, for any sample, its contribution is determined by the sum of the products of the values in the cells of the original and noisy samples, which lies in the range [−n, n].

3.2.4. Investigation of the Influence of the Dimensionality of the Matrices w

Consider the case of multidimensional matrices of weight coefficients, in which the dimensionality of the coupling of the weight coefficients is increased (10):
$$a_1^r = f\left(\sum_{j=1}^{n}\sum_{k=1}^{n} a_1\, a_j^1\, a_k^1\, a_k^n\right) = f\left(a_1\left(a_1^1\left(a_1^1 a_1^n + \dots + a_{25}^1 a_{25}^n\right) + a_2^1\left(a_1^1 a_1^n + \dots + a_{25}^1 a_{25}^n\right) + \dots + a_{25}^1\left(a_1^1 a_1^n + \dots + a_{25}^1 a_{25}^n\right)\right)\right) = f\left(a_1\left(a_1^1 + a_2^1 + \dots + a_{25}^1\right)\left(a_1^1 a_1^n + \dots + a_{25}^1 a_{25}^n\right)\right) \quad (10)$$
It can be noted that a_1^1 + a_2^1 + … + a_n^1 is a constant expression.
Then, for the four-dimensional case, by analogy, the element recovery will be determined by the following expression (11):
$$a_1^r = f\left(a_1\left(a_1 + \dots + a_{25}\right)_j\left(a_1 + \dots + a_{25}\right)_k\left(a_1 a_1^n + \dots + a_{25} a_{25}^n\right)_m\right) \quad (11)$$
Thus, the recovery expression is simply multiplied by an additional constant factor, which has no effect on the nature or rate of recovery. Consequently, increasing the dimensionality does not make sense.
For problems of this kind, we can avoid forming the weight coefficient matrices by multiplying the expression for the recovery function by a constant factor of the form n^k, where n is the number of elements and k is the difference between the new and the initial dimensionality of the weight coefficient coupling, w.

3.2.5. Effect of Zeroing of Coefficient, w, on the Main Diagonal

To clearly demonstrate the operation of the algorithm, we present it in expanded form: Figure 21 shows its operation with zeroing of the main diagonal, and Figure 22 shows its operation without zeroing. One sample is used in this analysis.
In the block “INITIAL SAMPLE”, the original sample is entered.
In the “NOISED SAMPLE” block, a noisy sample is entered.
The “RESTORED SAMPLE” block generates a restored image.

3.2.6. Reconstructing a Noisy Image for Several Samples

An example of reconstruction for three samples is shown in Figure 23. In the upper part, the three initial samples (“First sample”, “Second sample”, and “Third sample”), the noised sample and the restored sample are represented.
In the editable block, “Noised Sample”, noising is entered, and the resulting item “Restoration” displays the result of restoration.
All samples in the table are converted, according to their numbers (1, 2, and 3), into lines for a better representation of the signal sequence. The line “Noise” similarly shows the noisy signal in line form.
The lines “1—Noise”, “2—Noise”, and “3—Noise” show the result of multiplying the corresponding samples by the noisy signal. The last column of these lines (Sum) shows the correlation of each sample, in order to represent analytically the influence of each sample on the recovery result.
Figure 23 also shows the process of reconstructing the noisy sample. The resulting noisy image differed from all three original images; the optimized algorithm restored it toward the third sample.

4. Conclusions

This work presents a study of a mathematical model of a high-speed single-layer artificial neural network applied to the problem of recovering information distorted by interference of various natures, for example, by electromagnetic noise during the transmission and reception of data via analog channels, which is an urgent task in communication systems.
To visualize and simplify the study, the samples of the considered neural network model were implemented as two-dimensional square arrays of various sizes with signal values of ±1. Next, the weight coefficients, w, describing the interaction between individual nodes (neurons) were calculated from the samples formed for training. After the training stage was completed for all samples, a noisy sample was generated and then restored on the basis of the weight coefficients, w.
The described algorithm was implemented and studied in the Python and C++ programming languages, and numerical modeling of the recovery efficiency of the described neural network was carried out for various values of the ANN parameters: (1) the noise factor, (2) the number of training samples, (3) the number of elements in a sample, and (4) the dimensionality of the coupling coefficients, w.
A study of the mathematical model of the described neural network is presented, as a result of which a number of key results were obtained:
(1)
The essence of this class of neural networks and the mechanism by which the original and noisy samples influence the final result of information recovery in the nodes have been identified.
(2)
A method for improving the speed of the information recovery algorithm has been proven analytically; it allows this operation to be carried out without forming an intermediate matrix of weight coefficients, W. The matrix-free image recovery method makes it possible to move from a power-law growth in the number of operations with increasing dimensionality of the coupling coefficients, w, to multiplying the sums of products of the elements of the original and noisy samples by a constant factor.
(3)
It has been shown that zeroing the weight coefficients on the main diagonal of the matrix W worsens the result of data recovery due to the formation of artifacts in the resulting sample; it is therefore proposed to calculate them in the same way as the other elements.
(4)
It has been shown that increasing the dimensionality of the coupling coefficients, w, does not improve the quality of data recovery, and the initial dimensionality of the model is optimal.
(5)
The calculation algorithm has been improved by eliminating matrix transposition operations through the use of a string (vector) representation of the data.

Author Contributions

Conceptualization: K.K.; methodology: K.K., D.N. and K.M.; software: D.N. and K.M.; validation: K.K., D.N. and K.M.; formal analysis: K.K., D.N. and K.M.; investigation, K.K., D.N. and K.M.; resources: D.N. and K.M.; data curation: K.K., D.N. and K.M.; writing—original draft preparation: D.N.; writing—review and editing: K.K.; visualization: D.N., K.M.; supervision: K.K.; project administration: K.K.; funding acquisition, K.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by RSF (project no. 22-71-10095).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; pp. 27–54. [Google Scholar]
  2. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  3. Schmidhuber, J. Deep Learning in Neural Networks: An Overview. Neural Netw. 2015, 61, 85–117. [Google Scholar] [CrossRef] [PubMed]
  4. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  5. Jaderberg, M.; Mnih, V.; Marian Czarnecki, W.; Schaul, T.; Leibo, J.Z.; Silver, D.; Kavukcuoglu, K. Reinforcement Learning with Unsupervised Auxiliary Tasks. arXiv 2016, arXiv:1611.05397. [Google Scholar]
  6. Pham, H.; Guan, M.; Zoph, B.; Le, Q.; Dean, J. Efficient Neural Architecture Search via Parameters Sharing. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 4095–4104. [Google Scholar]
  7. Daniel, J.S. Conference on Prognostic Factors and Staging in Cancer Management: Contributions to Artificial Neural Networks and Other Statistical Methods. Medicine 2001, 91, 1589–1697. [Google Scholar]
  8. Kunal, S.G.; Simon, J.; Lee, M.-Y. Special Issue: Smart Energy Technologies. Int. J. Energy Res. 2021, 45, 6–35. [Google Scholar]
  9. Pao, Y.H.; Phillips, S.M.; Sobajic, D.J. Neural-net computing and the intelligent control of systems. Int. J. Control 1992, 56, 263–289. [Google Scholar] [CrossRef]
  10. Li, Y. Deep Reinforcement Learning: An Overview. arXiv 2018, arXiv:1701.07274. [Google Scholar]
  11. Kriesel, D. Neural Networks. Available online: https://www.dkriesel.com/_media/science/neuronalenetze-en-zeta2-2col-dkrieselcom.pdf (accessed on 9 June 2023).
  12. Hinton, G.; Salakhutdinov, R. Reducing the Dimensionality of Data with Neural Networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed]
  13. Aggarwal, C.C. Neural Networks and Deep Learning: A Textbook; Springer: Berlin/Heidelberg, Germany, 2018; pp. 1–493. [Google Scholar]
  14. Minsky, M.; Papert, S. Perceptrons: An Introduction to Computational Geometry; MIT Press: Cambridge, MA, USA, 1969; Volume 258, pp. 1–311. [Google Scholar]
  15. Rosenblatt, F. The Perceptron, a Perceiving and Recognizing Automaton Project Para; Cornell Aeronautical Laboratory: Buffalo, NY, USA, 1957; pp. 1–33. [Google Scholar]
  16. Judd, J.S. Neural Network Design and the Complexity of Learning; A Bradford Book; MIT Press: Cambridge, MA, USA, 1990; pp. 1–170. [Google Scholar]
  17. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
  18. Haykin, S. Neural Networks: A Comprehensive Foundation; Prentice Hall: Upper Saddle River, NJ, USA, 1998; pp. 1–842. [Google Scholar]
  19. Murphy, K.P. Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning Series); The MIT Press: Cambridge, MA, USA, 2012; pp. 1–1104. [Google Scholar]
  20. Géron, A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems; O’Reilly Media: Sebastopol, CA, USA, 2022; pp. 1–861. [Google Scholar]
  21. Kasatikov, N.N.; Brekhov, O.M.; Zhelanov, S.A. Neural network programming for pattern recognition. Sci. Bus. Ways Dev. 2021, 2021, 104–110. [Google Scholar]
Figure 1. Sample letter «E».
Figure 2. Sample letter «O».
Figure 3. Sample letter «Ш».
Figure 4. The total matrix of weights, W. The red color of a cell corresponds to a sum of weights equal to −3. Yellow: −1; white: 0; green: 1; blue: 3.
Figure 5. Matrix of weights for the letter «E». Yellow: 1; white: 0; pink: −1.
Figure 6. Matrix of weights for the letter «O». Yellow: 1; white: 0; pink: −1.
Figure 7. Matrix of weights for the letter «Ш». Yellow: 1; white: 0; pink: −1.
Figure 8. Graphical representation of sample processing. (a) Shaping, (b) noising, and (c) recovery. The symbol * marks the initial, noised and recovered samples.
Figure 9. Detailed recovery results. Green corresponds to the letter “E”, blue to the letter “O”, and yellow to the letter “Ш”.
Figure 10. Typical errors of the sample letter «E», from left to right, with characteristic errors 1, 2, 3 and 4. The symbol * marks the initial, noised and recovered samples.
Figure 11. Typical errors of the sample letter «O», from left to right, with errors 1, 2, and 3. The symbol * marks the initial, noised and recovered samples.
Figure 12. Typical errors of the sample letter «Ш», from left to right, with errors 1, 2, 3, 4, and 5. The symbol * marks the initial, noised and recovered samples.
Figure 13. Graphic representation of the additional samples: letters «A», «Γ», and «C». The symbol * marks the initial, noised and recovered samples.
Figure 14. The results of recovering different data samples (blue—letter «E»; orange—letter «O»; gray—letter «Ш»; yellow—letter «A»; blue—letter «Γ»; green—letter «C»).
Figure 15. Graphical representation of 40 × 40 samples. From left to right: letter «E», letter «O», and letter «Ш».
Figure 16. Restoration of standard samples (10 × 10). Green—«E»; blue—«O»; yellow—«Ш».
Figure 17. Restoration of enlarged samples (40 × 40). Green—«E»; blue—«O»; yellow—«Ш».
Figure 18. Results of sample reconstruction with the connection coefficient, w, coupling three neurons. Green—letter “E”; blue—letter “O”; yellow—letter “Ш”.
Figure 19. Results of image reconstruction in the case of a one-dimensional matrix of weight coefficients, w. Green—«E»; blue—«O»; yellow—«Ш».
Figure 20. On the left, the original algorithm for restoring the noisy image from [21]; on the right, the improved algorithm with matrix-free computation.
Figure 21. Initial sample, noised sample, restored sample (left), and matrix W (right). Artifacts in the recovered sample are marked in red.
Figure 22. Initial sample, noised sample, restored sample (left), and matrix W (right). There are no recovery artifacts in the restored sample.
Figure 23. Noise reconstruction based on three samples using the optimized algorithm. Model interpretation: in the case of several samples, the reconstructed image depends on the sum of the products a_ij·a_ij^n over all three samples, and the decisive influence on the value of an element in a cell is the total correlation/anti-correlation of the original and noisy samples. In the case of a single sample, either the original or the inverted original sample is recovered, depending on the sign of the sum of a_ij·a_ij^n, which lies in the range [−n, n] and effectively describes the Pearson correlation coefficient.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
