1. Introduction
Deep learning refers to the learning of deep neural networks, where a network is called deep if it has multiple hidden layers. Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction [1]. The convolution in a convolutional neural network (CNN) is the tool for obtaining a feature map from the original image data: it sweeps the original image with a kernel matrix and transforms the original data into a different shape. This transformed image is called a feature map. Therefore, in a CNN, the convolution can be regarded as a tool that creates a feature map from the original image. Herein, the concept of convolution in artificial intelligence is demonstrated mathematically.
The core concept of a CNN is the convolution, which applies weights to the receptive fields only and transforms the original data into a feature map. This is similar in principle to an integral transform: an integral transform maps a problem from its original domain to another domain in which it can be solved more easily. Since the matrix expression of convolution is an essential concept in artificial intelligence, we believe that this study is meaningful. In addition, the generalized continuous form of the convolution has also been studied, and this form is expressed as a new variant of Laplace-type transform.
On the one hand, transform theory is extensively utilized in fields involving medical diagnostic equipment, such as magnetic resonance imaging and computed tomography. Typically, projection data are obtained by an integral transform, and an image is produced using an inverse transform. Although many plausible integral transforms exist, few of them are fully general, and almost all existing integral transforms can be interpreted as Laplace-type transforms. One of us proposed a comprehensive form of the Laplace-type integral transform in [2]. The present study investigates the matrix expression of convolution and its generalized continuous form.
In [2], a Laplace-type integral transform was proposed, expressed as
$$G_\alpha(u)=u^\alpha\int_0^\infty f(t)\,e^{-t/u}\,dt.\qquad(1)$$
For values of $\alpha$ as $0$, $-1$, $1$, and $-2$, we have, respectively, the Laplace [3], Sumudu [4], Elzaki [5], and Mohand transforms [6]. This form can be expressed in various manners. Replacing $t$ by $ut$, we have
$$G_\beta(u)=u^\beta\int_0^\infty f(ut)\,e^{-t}\,dt,\qquad(2)$$
where $\beta=\alpha+1$. In the form (2), $\beta$ values of $1$, $0$, $2$, and $-1$ correspond to the Laplace, Sumudu, Elzaki, and Mohand transforms, respectively. If we substitute $u=1/s$ in (1), we then obtain the simplest form of the generalized integral transform as follows:
$$F_\gamma(s)=s^\gamma\int_0^\infty f(t)\,e^{-st}\,dt,\qquad(3)$$
where $\gamma=-\alpha$. In this form, the Laplace, Sumudu, Elzaki, and Mohand transforms have $\gamma$ values of $0$, $1$, $-1$, and $2$, respectively. Although somewhat roundabout, an essentially simple way to derive the Sumudu transform is to multiply the Laplace transform by $s$. Similarly, the Elzaki transform can be obtained by multiplying by $s^{-1}$, and the Mohand transform by multiplying by $s^2$. The natural transform [7] can be obtained by substituting $f(t)$ with $f(ut)$ in the Laplace transform. Additionally, by a suitable change of variables, the Laplace-type transform (1) can be expressed in further equivalent forms.
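The relations among these transforms can be checked numerically. The following is a minimal Python sketch; the function choice $f(t)=t$, the truncation of the integral, and the midpoint-rule integrator are our own illustrative assumptions, not part of the original formulation:

```python
import numpy as np

# Simplest generalized form: s**gamma * integral_0^inf f(t) e^(-s t) dt.
# gamma = 0, 1, -1, 2 give the Laplace, Sumudu, Elzaki, and Mohand
# transforms, respectively. The integral is approximated by the
# midpoint rule on [0, T]; the tail beyond T is negligible here.
def generalized_transform(f, s, gamma, T=40.0, n=400_000):
    dt = T / n
    t = (np.arange(n) + 0.5) * dt
    return s**gamma * np.sum(f(t) * np.exp(-s * t)) * dt

s = 2.0
f = lambda t: t                              # illustrative f(t) = t
laplace = generalized_transform(f, s, 0)     # 1/s**2 = 0.25
sumudu = generalized_transform(f, s, 1)      # 1/s    = 0.5
mohand = generalized_transform(f, s, 2)      # s * (1/s**2) * s = 1.0
```

Multiplying the Laplace value by $s$ indeed reproduces the Sumudu value, in line with the discussion above.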
As a similar form, there is the Mellin transform [8] of the form
$$M(f)(s)=\int_0^\infty t^{s-1}f(t)\,dt.$$
As shown above, many integral transforms wear their own fancy masks, but most of them can essentially be interpreted as Laplace-type transforms. From a different point of view, a slight change in the kernel results in a significant difference in integral transform theory. Meanwhile, other plausible transforms exist, such as the Fourier, Radon, and Mellin transforms. Typically, if the interval of integration and the power of the kernel are different, the result can be interpreted as a completely different transform. Studies using the Laplace transform were conducted in [9,10]: the generalized solutions of the third-order Cauchy–Euler equation in the space of right-sided distributions were found in [9], the solution of the heat equation without boundary conditions was studied in [10], and further properties of the Laplace-type transform were investigated in [11]. As an application, a new class of Laplace-type integrals involving generalized hypergeometric functions has been studied [12,13]. As for research related to integral equations, Noeiaghdam et al. [14] presented a new scheme based on stochastic arithmetic, designed to guarantee the validity and accuracy of the homotopy analysis method. Different kinds of integral equations, such as singular equations and equations of the first kind, are considered to find optimal results by applying the proposed algorithms.
The main objective of this study is to investigate the matrix expression of convolution and its generalized continuous form. The generalized continuous form of the matrix expression was carried out in the form of a new variant of Laplace-type transform. The obtained results are as follows:
- (1)
If the matrix representing the function (image) $f$ is $A$ and the matrix representing the function $g$ is $B$, then the convolution is represented by the sum of all elements of $A\circ B$, and this is the same as $\operatorname{tr}(AB^T)$, where $\circ$ is array multiplication, $T$ is the transpose, and $\operatorname{tr}$ is the trace. Thus, the convolution in artificial intelligence (AI) is the same as $\operatorname{tr}(AB^T)$.
- (2)
The generalized continuous form of the convolution in AI can be represented as
$$F_\xi(s)=\xi(s)\int_0^\infty f(t)\,e^{-st}\,dt,$$
where $\xi(s)$ is an arbitrary bounded function.
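As a quick numerical illustration of result (2), the sketch below evaluates the variant transform for a hypothetical bounded $\xi(s)=s/(s+1)$ and $f(t)=e^{-t}$; both choices, and the midpoint-rule integrator, are ours, purely for illustration:

```python
import numpy as np

# Variant Laplace-type transform:
#   F_xi(s) = xi(s) * integral_0^inf f(t) e^(-s t) dt,
# approximated by the midpoint rule on a truncated interval [0, T].
def variant_transform(f, xi, s, T=40.0, n=400_000):
    dt = T / n
    t = (np.arange(n) + 0.5) * dt
    return xi(s) * np.sum(f(t) * np.exp(-s * t)) * dt

f = lambda t: np.exp(-t)        # illustrative f
xi = lambda s: s / (s + 1.0)    # illustrative bounded xi
val = variant_transform(f, xi, 3.0)
# Closed form: (s/(s+1)) * 1/(s+1) = s/(s+1)**2 = 3/16 at s = 3
```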
2. Matrix Expression of Convolution in Convolutional Neural Network (CNN)
Note that functions can be interpreted as images in artificial intelligence (AI). The convolution is changed from the continuous form
$$(f*g)(t)=\int_{-\infty}^{\infty}f(\tau)\,g(t-\tau)\,d\tau$$
to the discrete form
$$(f*g)(n)=\sum_{m}f(m)\,g(n-m)$$
by discretization. The convolution in a CNN is the tool for obtaining a feature map from the original image data: it sweeps the original image with a kernel matrix (or filter) and transforms the original data into a different shape. In order to calculate the convolution, each part of the original matrix with the same size as the kernel is element-wise multiplied by the kernel matrix, and all of its components are added. Typically, a small square kernel matrix is used. On the one hand, pooling (or sub-sampling) is a simple job that reduces the size of the image made by the convolution; the principle is that the apparent resolution increases when the screen is reduced.
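The passage from the integral to the sum can be illustrated with NumPy's discrete convolution; the sample sequences below are made up:

```python
import numpy as np

# Discrete convolution: (f * g)[n] = sum_m f[m] * g[n - m].
f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])
result = np.convolve(f, g)  # -> 0, 1, 2.5, 4, 1.5
```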
Let the matrix representing the function $f$ be $A$ and the matrix representing the function $g$ be $B$. For two matrices $A$ and $B$ of the same dimension, the array multiplication (or sweeping) $A\circ B$ is given by
$$(A\circ B)_{ij}=a_{ij}\,b_{ij}.$$
For example, the array multiplication for $2\times 2$ matrices is
$$\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix}\circ\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}=\begin{pmatrix}a_{11}b_{11}&a_{12}b_{12}\\a_{21}b_{21}&a_{22}b_{22}\end{pmatrix}.$$
The array multiplication appears in lossy compression, such as the Joint Photographic Experts Group (JPEG) format, and in the decoding step. Let us look at an example.
Example 1. In the classification field of AI, a pixel array is treated as a matrix. When we array-multiply the kernel matrix with the first part of the original image matrix and add all of the components of the result, we obtain the first entry of the output. Next, we array-multiply the kernel matrix with the next part of the matrix, shifted to the right by the stride, and add all the components to obtain the next entry. If we continue this process to the final part of the matrix, the original matrix changes into a new matrix by the convolution kernel. This is called the convolved feature map. This is just an example for understanding; in a perceptron, the output uses a value between $0$ and $1$ obtained by the activation function. Note that the perceptron is an artificial network designed to mimic the brain's cognitive abilities. Therefore, the output of a neuron (or node) $Y$ can be represented as
$$Y=X(w\cdot x-\theta),$$
where $w$ is a weight, $\theta$ is the threshold value, and $X$ is the activation function. In the backpropagation algorithm of a deep neural network, the sigmoid function
$$\sigma(x)=\frac{1}{1+e^{-x}}$$
is used as the activation function [15]. This function is easy to differentiate and ensures that the neuron output is in $(0,1)$. If max-pooling is applied to the above convolved feature map, the resulting matrix becomes a smaller matrix consisting of the maximum value of each pooling window.
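The sweep described in Example 1 can be sketched in Python as follows; the image, kernel, shapes, and stride here are illustrative assumptions, not the matrices of the example:

```python
import numpy as np

# Slide a kernel over an image with a given stride, array-multiply
# each patch by the kernel, and add all components to build the
# convolved feature map; then reduce it by max-pooling.
def conv2d(image, kernel, stride=1):
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # array multiply, then add
    return out

def max_pool(fmap, size=2):
    oh, ow = fmap.shape[0] // size, fmap.shape[1] // size
    return np.array([[fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
                      for j in range(ow)] for i in range(oh)])

image = np.arange(16, dtype=float).reshape(4, 4)  # toy "pixel" matrix
kernel = np.eye(2)                                # toy 2x2 kernel
fmap = conv2d(image, kernel)                      # 3x3 feature map
pooled = max_pool(fmap, 2)                        # 1x1 after pooling
```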
As discussed above, convolution in AI can be obtained by array multiplication. We would like to associate this definition with matrix multiplication in mathematics.
Definition 1. (Convolution in AI) If the matrix representing the function (image) $f$ is $A$ and the matrix representing the function $g$ is $B$, then the convolution is represented by the sum of all elements of $A\circ B$, and this is the same as $\operatorname{tr}(AB^T)$, where $\circ$ is array multiplication, $T$ is the transpose, and $\operatorname{tr}$ is the trace. Thus, the convolution in AI is the same as $\operatorname{tr}(AB^T)$, the sum of all elements on the diagonal with the right side facing down in $AB^T$.
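Definition 1 can be verified numerically; the $2\times 2$ matrices in the sketch below are made up:

```python
import numpy as np

# Hypothetical example matrices: an image patch A and a kernel B.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# Convolution in AI (Definition 1): sum of all elements of the
# array (element-wise) product A o B.
conv_ai = np.sum(A * B)

# Matrix-algebra form: trace of A B^T.
conv_tr = np.trace(A @ B.T)

print(conv_ai, conv_tr)  # both equal 1*5 + 2*6 + 3*7 + 4*8 = 70
```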
Typically, the convolution kernel is a square matrix of moderate size, but for easy understanding, let us consider a small matrix.
Example 2. If the matrix representing $f$ is $A$ and the matrix representing $g$ is $B$, then the convolution in AI is calculated by the sweeping as the sum of all elements of $A\circ B$. On the other hand, computing $AB^T$ and taking $\operatorname{tr}(AB^T)$, for $T$ the transpose and $\operatorname{tr}$ the trace, gives the same result as in AI.

3. Generalized Continuous Form of Matrix Expression of Convolution
If the matrix representing a function $f$ is $A$ and the matrix representing a function $g$ is $B$, then the convolution of the functions $f$ and $g$ can be denoted by $\operatorname{tr}(AB^T)$. Intuitively, the diagonal part of $AB^T$ corresponds to a graph of the product of $f$ and $g$. The overlapping part of the graph can be interpreted as the concept of intersection, that is, the concept of multiplication. Thus, the generalized continuous form of the convolution in AI can be represented in a variant of Laplace-type transform, given below.
If $f(t)$ is a function defined for all $t\ge 0$, an integral of Laplace-type transform $F(s)$ is given by
$$F(s)=\int_0^\infty f(t)\,e^{-st}\,dt$$
for $s>0$, provided that the integral converges. Additionally, let $\xi(s)$ be an arbitrary bounded function, and let $F_\xi(s)$ be a variant of Laplace-type transform of $f$. If $f(t)$ is a function defined for all $t\ge 0$, then $F_\xi(s)$ is defined by
$$F_\xi(s)=\xi(s)\int_0^\infty f(t)\,e^{-st}\,dt$$
for $s>0$, provided that the integral converges.
Based on the above two definitions, it is clear that the above variant of Laplace-type transform is represented as $F_\xi(s)=\xi(s)F(s)$ for an arbitrary function $f$. If so, let us see the relation with other integral transforms. If $\xi(s)=s^\gamma$ for an integer $\gamma$, then it corresponds to the generalized Laplace-type transform. When we take $\xi(s)=1$, we get the Laplace transform. Similarly, when we take $\xi(s)=s$ or $\xi(s)=s^{-1}$, we get the Sumudu transform or the Elzaki transform, respectively. In order to obtain a simple form of generalization, it is better to set $\xi(s)$ to $s^\gamma$ for an arbitrary integer $\gamma$. However, it is judged that an arbitrary $\xi(s)$ is better than $s^\gamma$ as a suitable generalization, where $\xi(s)$ is a bounded arbitrary function. The reason is that $\xi(s)$ can express more integral transforms.
Lemma 1. (Lebesgue dominated convergence theorem [16,17]). Let $(X,\mathcal{M},\mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$ such that (a) $f(x)=\lim_{n\to\infty}f_n(x)$ exists $\mu$-a.e.
(b) There is an integrable function $g$ so that, for each $n$, $|f_n|\le g$ $\mu$-a.e.
Then, $f$ is integrable and
$$\lim_{n\to\infty}\int_X f_n\,d\mu=\int_X f\,d\mu.$$
Beppo Levi's theorem is a special form of Lemma 1. Its contents are as follows:
$$\lim_{n\to\infty}\int_X f_n\,d\mu=\int_X \lim_{n\to\infty}f_n\,d\mu$$
for $\{f_n\}$ a nondecreasing sequence of nonnegative measurable functions. The details can be found on page 71 in [16]. Note that the convolution of $f$ and $g$ is given by
$$(f*g)(t)=\int_0^t f(\tau)\,g(t-\tau)\,d\tau.$$
The main properties are collected in the following theorem. Since the proof is not difficult, we prove only a few of the items.
Theorem 1. Let $\mathcal{L}_\xi(f)(s)=F_\xi(s)=\xi(s)\int_0^\infty f(t)\,e^{-st}\,dt$ denote the variant of Laplace-type transform.
- (1)
(Duality with Laplace transform) If $F(s)$ is the Laplace transform of a function $f$, then it satisfies the relation of $F_\xi(s)=\xi(s)F(s)$.
- (2)
(Shifting theorem) If $f(t)$ has the transform $F_\xi(s)$, then $e^{at}f(t)$ has the transform $\xi(s)F(s-a)$, where $F$ is the Laplace transform of $f$. That is, $\mathcal{L}_\xi\big(e^{at}f(t)\big)(s)=\xi(s)F(s-a)$. Moreover, if $f(t)$ has the transform $F_\xi(s)$, then the shifted function $f(t-a)h(t-a)$ has the transform $e^{-as}F_\xi(s)$. In formula, $\mathcal{L}_\xi\big(f(t-a)h(t-a)\big)(s)=e^{-as}F_\xi(s)$, for $h$ the Heaviside function (we write $h$ since we need $u$ to denote $u$-space).
- (3)
(Linearity) Let $\mathcal{L}_\xi$ be the variant of Laplace-type transform. Then $\mathcal{L}_\xi$ is a linear operation.
- (4)
(Existence) If $f(t)$ is defined and piecewise continuous on every finite interval on the semi-axis $t\ge 0$ and satisfies $|f(t)|\le Me^{kt}$ for all $t\ge 0$ and some constants $M$ and $k$, then the variant of Laplace-type transform exists for all $s>k$.
- (5)
(Uniqueness) If the variant of Laplace-type transform of a given function exists, then it is uniquely determined.
- (6)
$\mathcal{L}_\xi\big(h(t-a)\big)(s)=\xi(s)\,\dfrac{e^{-as}}{s}$, where $h$ is the Heaviside function.
- (7)
(Dirac's delta function) We consider the function $f_k(t-a)=1/k$ if $a\le t\le a+k$ and $0$ otherwise. In a similar way to the Heaviside function, taking the integral of Laplace-type transform, we get
$$\mathcal{L}_\xi\big(f_k(t-a)\big)(s)=\xi(s)\,e^{-as}\,\frac{1-e^{-ks}}{ks}.$$
If we denote the limit of $f_k$ as $k\to 0$ by $\delta(t-a)$, then $\mathcal{L}_\xi\big(\delta(t-a)\big)(s)=\xi(s)\,e^{-as}$.
- (8)
(Shifted data problems) For a given differential equation $y''+ay'+by=r(t)$ subject to $y(t_0)=k_0$ and $y'(t_0)=k_1$, where $t_0\neq 0$ and $a$ and $b$ are constant, we can set $t=\tilde{t}+t_0$. Then $\tilde{t}=t-t_0$ gives $\tilde{y}(\tilde{t})=y(\tilde{t}+t_0)$, and so we have $\tilde{y}''+a\tilde{y}'+b\tilde{y}=\tilde{r}(\tilde{t})$ for input $\tilde{r}(\tilde{t})=r(\tilde{t}+t_0)$. Taking the variant, we can obtain the output $y(t)=\tilde{y}(t-t_0)$.
- (9)
(Transforms of derivatives and integrals) Let a function $f$ be $n$-times differentiable and integrable. Then the transform of the $n$-th derivative of $f$ satisfies
$$\mathcal{L}_\xi\big(f^{(n)}\big)(s)=s^nF_\xi(s)-\xi(s)\big(s^{n-1}f(0)+s^{n-2}f'(0)+\cdots+f^{(n-1)}(0)\big).$$
- (10)
(Convolution) If two functions $f$ and $g$ are integrable and $*$ is the convolution, then $f*g$ satisfies
$$\mathcal{L}_\xi(f*g)(s)=\frac{1}{\xi(s)}F_\xi(s)\,G_\xi(s)$$
for $\xi(s)\neq 0$.
Proof. (5) Assume that the transform of $f$ exists as both $F_1$ and $F_2$. If $F_1(s)\neq F_2(s)$ for some $s$, then
$$0\neq F_1(s)-F_2(s)=\xi(s)\int_0^\infty f(t)\,e^{-st}\,dt-\xi(s)\int_0^\infty f(t)\,e^{-st}\,dt=0.$$
This is a contradiction, and hence the transform is uniquely determined. Conversely, if two functions $f_1$ and $f_2$ have the same transform (i.e., if $\mathcal{L}_\xi(f_1)=\mathcal{L}_\xi(f_2)$), then
$$\xi(s)\int_0^\infty\big(f_1(t)-f_2(t)\big)e^{-st}\,dt=0,$$
and so $f_1-f_2=0$ a.e. Hence $f_1=f_2$ excepting for the set of measure zero.
(9) Note that $F_\xi(s)=\xi(s)\int_0^\infty f(t)\,e^{-st}\,dt$, and let us approach the proof by induction. In case of $n=1$, integrating by parts, we have
$$\mathcal{L}_\xi(f')(s)=\xi(s)\int_0^\infty f'(t)\,e^{-st}\,dt=\xi(s)\Big(\big[f(t)\,e^{-st}\big]_0^\infty+s\int_0^\infty f(t)\,e^{-st}\,dt\Big)=sF_\xi(s)-\xi(s)f(0),$$
which is the statement for $n=1$. Next, let us suppose that
$$\mathcal{L}_\xi\big(f^{(m)}\big)(s)=s^mF_\xi(s)-\xi(s)\big(s^{m-1}f(0)+\cdots+f^{(m-1)}(0)\big)$$
is valid for some $m$, where $f^{(m)}$ is the $m$-th derivative of $f$. Let us show that
$$\mathcal{L}_\xi\big(f^{(m+1)}\big)(s)=s^{m+1}F_\xi(s)-\xi(s)\big(s^{m}f(0)+\cdots+f^{(m)}(0)\big).$$
Applying the case $n=1$ to $f^{(m)}$, we have
$$\mathcal{L}_\xi\big(f^{(m+1)}\big)(s)=s\,\mathcal{L}_\xi\big(f^{(m)}\big)(s)-\xi(s)f^{(m)}(0)=s^{m+1}F_\xi(s)-\xi(s)\big(s^{m}f(0)+\cdots+f^{(m)}(0)\big).$$
Therefore, this theorem is valid for an arbitrary natural number $n$. Putting $n=1$, $\mathcal{L}_\xi(f')(s)=sF_\xi(s)-\xi(s)f(0)$ follows. □
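The $n=1$ case of item (9) can be checked symbolically with SymPy under the definition $F_\xi(s)=\xi(s)\int_0^\infty f(t)e^{-st}\,dt$; the choices $f(t)=\cos t$ and $\xi(s)=s^2$ below are illustrative:

```python
import sympy as sp

# Symbolic check of the derivative rule for the variant transform
# with illustrative choices f(t) = cos(t) and xi(s) = s**2.
t, s = sp.symbols('t s', positive=True)
f = sp.cos(t)
xi = s**2

F = xi * sp.integrate(f * sp.exp(-s*t), (t, 0, sp.oo))           # F_xi(s)
Fprime = xi * sp.integrate(sp.diff(f, t) * sp.exp(-s*t), (t, 0, sp.oo))

# Rule (9) with n = 1: L_xi(f')(s) = s F_xi(s) - xi(s) f(0).
assert sp.simplify(Fprime - (s*F - xi*f.subs(t, 0))) == 0
```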
As direct results of (9), the transforms of the first and second derivatives follow. For example, we consider a constant-coefficient second-order equation subject to initial conditions for $y(0)$ and $y'(0)$. Taking the integral of Laplace-type transform on both sides and organizing the resulting equation, we obtain an algebraic equation for the transform of the solution. Simplification and inversion then give the solution in terms of hyperbolic functions.
Example 3. (Integral equations of Volterra type) Find the solutions of the given integral equations of Volterra type.
Solution.
- (1)
Since the first equation contains the convolution of the unknown function with a known kernel, taking the integral of Laplace-type transform on both sides yields an algebraic equation for the transform; solving it and inverting, we obtain the solution. The solution can be checked by expanding it in a power series and substituting it into the equation.
- (2)
The second equation is rewritten as a convolution. Taking the integral of Laplace-type transform, solving for the transform, and inverting gives the answer.
- (3)
Note that the third equation is likewise the same as a convolution equation. Taking the transform and simplifying, we obtain the answer by the convolution relation.
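The transform method for Volterra equations can be sketched symbolically; the equation $y(t)=1+\int_0^t y(\tau)\,d\tau$ below is our own illustrative instance (its solution is $e^t$), not the example of the text:

```python
import sympy as sp

# Illustrative Volterra equation of the second kind:
#   y(t) = 1 + integral_0^t y(tau) d tau.
# The Laplace transform turns the convolution 1 * y into Y(s)/s:
#   Y = 1/s + Y/s, an algebraic equation for Y.
t, s = sp.symbols('t s', positive=True)
Y = sp.symbols('Y')

Ysol = sp.solve(sp.Eq(Y, 1/s + Y/s), Y)[0]        # Y(s) = 1/(s - 1)
y = sp.inverse_laplace_transform(Ysol, s, t)      # y(t) = e**t for t > 0
```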
Let us turn the topic to the initial value problem of the convolution. The initial value problem $y''+ay'+by=r(t)$ with $y(0)=y'(0)=0$ gives
$$(s^2+as+b)\,Y_\xi(s)=R_\xi(s),$$
where $Y_\xi$ and $R_\xi$ are the transforms of $y$ and $r$. Simplification gives
$$Y_\xi(s)=Q(s)\,R_\xi(s).$$
If we put the system function $Q(s)=1/(s^2+as+b)$, then $Y_\xi=QR_\xi$. Since $Q(s)=\mathcal{L}_\xi(q)(s)/\xi(s)$ for $q$ the inverse Laplace transform of $Q$, taking the inverse transform, we have
$$y(t)=(q*r)(t)=\int_0^t q(t-\tau)\,r(\tau)\,d\tau.$$
Theorem 2. (Differentiation and integration of transforms) Let us put $F_\xi(s)=\xi(s)\int_0^\infty f(t)\,e^{-st}\,dt$ and assume that $\xi$ is differentiable. Then
$$\mathcal{L}_\xi\big(tf(t)\big)(s)=\frac{\xi'(s)}{\xi(s)}F_\xi(s)-F_\xi'(s).$$
Proof. This is an immediate consequence of differentiation under the integral sign and the product rule. For this reason, detailed proofs are omitted. □
The statements below are the immediate results of Theorem 2. Let us check examples for the temperature in an infinite bar and the displacement in a semi-infinite string by the variant of Laplace-type transform.
Example 4. (Semi-infinite string) Find the displacement $w(x,t)$ of an elastic string subject to the following conditions [3].
- (a)
The string is initially at rest on the $x$-axis from $x=0$ to $\infty$.
- (b)
For $t>0$, the left end of the string is moved in a given fashion, namely, according to a single sine wave $w(0,t)=f(t)$.
- (c)
Furthermore, $w(x,t)\to 0$ as $x\to\infty$ for $t\ge 0$.
Then the displacement is
$$w(x,t)=f\Big(t-\frac{x}{c}\Big)\,h\Big(t-\frac{x}{c}\Big),$$
where $h$ is the Heaviside function and $c$ is the wave speed. The proof is simple, and the interchangeability of limit and integral in the proof process is guaranteed by the Lebesgue dominated convergence theorem.
Example 5. (Temperature in an infinite bar) Find the temperature $w$ in an infinite bar if the initial temperature is $f(x)$.
Solution. Taking the integral of Laplace-type transform with respect to $t$ on both sides of the heat equation $w_t=c^2w_{xx}$, we obtain an ordinary differential equation in $x$ for the transform $W(x,s)$. Solving this equation by variation of parameters, with the Wronskian of the two homogeneous solutions, and applying the boundedness conditions, we obtain $W$. Taking the inverse transform, we obtain the temperature $w$ as the convolution of the initial temperature with the heat kernel:
$$w(x,t)=\frac{1}{2c\sqrt{\pi t}}\int_{-\infty}^{\infty}f(v)\,e^{-(x-v)^2/(4c^2t)}\,dv,$$
where $*$ denotes the convolution in the spatial variable. In the case of a piecewise constant initial temperature, the solution can be written in terms of the error function. In the above equality, the interchange of limit and integral is justified by the Lebesgue dominated convergence theorem.
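The convolution form of the solution for the infinite bar can be checked numerically; the grid, diffusivity, time, and step initial profile below are illustrative assumptions:

```python
import numpy as np

# Heat kernel K(x, t) = exp(-x**2 / (4 c**2 t)) / (2 c sqrt(pi t));
# the solution is the spatial convolution w(., t) = f * K(., t).
c, t = 1.0, 0.5
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
f = np.where(np.abs(x) < 1.0, 1.0, 0.0)       # step initial temperature

kernel = np.exp(-x**2 / (4 * c**2 * t)) / (2 * c * np.sqrt(np.pi * t))
w = np.convolve(f, kernel, mode='same') * dx  # w(x, t) at time t

# The total heat is (approximately) conserved by the convolution.
print(w.sum() * dx, f.sum() * dx)
```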