Article

Analysis, Evaluation and Exact Tracking of the Finite Precision Error Generated in Arbitrary Number of Multiplications

by Constantin Papaodysseus 1,*, Dimitris Arabadjis 2, Fotios Giannopoulos 1, Athanasios Rafail Mamatsis 1 and Constantinos Chalatsis 1
1 School of Electrical and Computer Engineering, National Technical University of Athens, Iroon Polytechneiou 9, 15780 Athens, Greece
2 School of Engineering, University of West Attica, Petrou Ralli & Thivon 250 Egaleo, 12241 Athens, Greece
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(11), 1199; https://doi.org/10.3390/math9111199
Submission received: 3 March 2021 / Revised: 7 April 2021 / Accepted: 19 May 2021 / Published: 25 May 2021
(This article belongs to the Special Issue Numerical Analysis and Scientific Computing)

Abstract

In the present paper, a novel approach is introduced for the study, estimation and exact tracking of the finite precision error generated and accumulated during any number of multiplications. It is shown that, as a rule, this operation is very “toxic”, in the sense that it may force the finite precision error accumulation to grow arbitrarily large, under specific conditions that are fully described here. First, an ensemble of definitions of general applicability is given for the rigorous determination of the number of erroneous digits accumulated in any quantity of an arbitrary algorithm. Next, the exact number of erroneous digits produced in a single multiplication is given as a function of the involved operands, together with formulae offering the corresponding probabilities. When the statistical properties of these operands are known, the aforementioned probabilities are evaluated exactly. Subsequently, the statistical properties of the accumulated finite precision error during any number of successive multiplications are explicitly analyzed. A method for exact tracking of this accumulated error is presented, together with associated theorems. Moreover, numerous dedicated experiments are developed and the corresponding results that fully support the theoretical analysis are given. Finally, a number of important probable and possible applications are proposed, all of them based on the methodology and the results introduced in the present work. The proposed methodology is expandable, so as to tackle the round-off error analysis in all arithmetic operations.

1. Introduction

All contemporary computing machines store both integer and floating-point numbers with a finite number of digits. A piece of fixed-size data that is handled as a unit by the instruction set or the processor’s hardware is called a finite word; the number of bits that form this piece of data is frequently called the “finite word length” or “employed precision”. In addition, at the hardware level, a computer performs fundamental operations using a finite word length. Nowadays, dedicated software programs have been developed, which perform operations with a finite number of digits, the value of which is chosen by the programmer and/or the user, the only limitation being the memory and time constraints. We shall also use the terms “finite word length” or “employed precision” for this number of digits.
Due to the fact that the precision with which all arithmetic operations are made is always limited, a numerical error is, as a rule, accumulated during the execution of most algorithms. In particular, in various algorithms and corresponding applications, the obtained results may be totally erroneous and/or unreliable due to the aforementioned reasons, which are inherent to all computing machines. We stress that this problem exists even when an arbitrarily large finite word length is employed for the execution of the algorithm, as it will become evident from the analysis introduced in the present paper. For this reason, we will use the term “finite precision error” for this numerical error; various authors and researchers also use the term “quantization error”, “round-off error” or other equivalent terms.
Consequently, a number of articles address the associated issues and the problems they generate. Thus, for example, authors in [1] study the finite precision error in the least mean square (LMS) adaptive algorithm and they show that the error’s mean squared value is inversely proportional to the adaptation step size μ. Reference [2] introduces a fast algorithm for exponentially weighted least squares Kalman filtering, which suffers less from finite precision error drawbacks, intrinsic to this class of algorithms. Reference [3] presents algorithms for accurately converting floating-point numbers to decimal representation. Article [4] studies the finite precision effects on the execution of the Lanczos algorithm for solving the standard non-symmetric eigenvalue problem. The authors of [5] study round-off error propagation in an algorithm which computes the orthonormal basis of a Krylov subspace with Householder orthonormal matrices. Authors in [6] study the propagation of round-off error near the periodic orbits of a discretized linear area-preserving map. The round-off error probability distribution, considered as a function of time, is shown to be a calculable algebraic number. In [7] it is shown that there are theoretically convergent schemes that solve non-linear partial differential equations, which can produce numerical steady state solutions that do not correspond to steady state solutions of the boundary value problem. In [8], it is pointed out that the convergence of Gegenbauer polynomials at the endpoints is affected by round-off error; the article proposes both parameter optimization and reduction of the round-off error for the Gegenbauer reconstruction method. In [9], a set of specific semantics is introduced which describes the propagation of round-off error during a calculation. The authors of [10] give an estimation of the round-off error generated in long-time integration in a number of standard, nonlinear systems. 
Authors in [11] introduce an algorithm for the computation of the orthogonal Fourier–Mellin moments which is of linear complexity and is resistant to finite precision error effects. Reference [12] proposes a method for dealing with the instability of the digital frequency synthesis (DFS), caused by the round-off error. Article [13] presents bounds for round-off error, generated in various algorithms. Moreover, another approach is presented in [14], according to which, the evolution of round-off error in chaotic maps is treated as an additive noise to the expected exact solutions; the introduced method spots a threshold below which global errors may be ignored. Article [15] studies the round-off error generated during computation of Hardy’s multiquadric and its related interpolators and proposes the use of arbitrary precision arithmetics to circumvent the associated finite precision error problems. Authors of [16] propose a fast, resistant to finite precision error method for evaluation of high order Zernike moments. Article [17] proposes a method for an improved scaling of finite precision error analysis. Finally, in [18,19], a preliminary form of the approach introduced here is presented.
In the present paper, we introduce a novel approach for studying and evaluating the finite precision error generated during the operation of multiplication. It is shown that, as a rule, this operation is very “toxic”, in the sense that it may force the finite precision error accumulation to grow arbitrarily large. The exact number of erroneous digits added to or subtracted from the result (product) of this operation is given. Consequently, the probabilities that the number of erroneous digits of the product differs by k from the maximum number of erroneous digits of the operands are explicitly computed. In the process of doing so, a set of general definitions is given, applicable to all operations performed with finite word length by a computing machine. Then, the accumulation of erroneous digits after an arbitrary number of successive multiplications is extensively analyzed. Statistical properties of this accumulated error are stated that allow for exact error prediction when the distribution of the associated operands is given. In addition, a number of theoretical results are introduced, which allow for the exact tracking of the generated and accumulated finite precision error during any number of multiplications in general. Numerous experimental results are presented, which fully support the presented theoretical analysis. We stress that the introduced methodology is expandable, so as to tackle all arithmetic operations.

2. A Set of Basic Definitions, Notations and Abbreviations

The entire analysis will be mainly made in the decimal arithmetic system, only because this system is far more familiar to most users. However, all results referred to in the present work hold perfectly well for the binary system too, or any other radix; the corresponding analysis and deductions may be obtained by means of a quite straightforward and slight modification of the approach introduced here.
In any arithmetic system, we assume that all numbers are expressed in scientific/canonical form. Thus, any number $x$ is written as $\mathrm{mantissa}\cdot 10^{\tau}$ in the decimal arithmetic system, where $\mathrm{mantissa}\in[1,10)$, $\tau\in\mathbb{Z}$; in the binary system, the same number is expressed as $\mathrm{mantissa}\cdot 2^{\tau}$, where $\mathrm{mantissa}\in[1,2)$, $\tau\in\mathbb{Z}$. Independently of the employed radix, we will use the symbols $\operatorname{man}(x)$ and $E(x)$ for the mantissa and the exponent of any quantity $x$ respectively. We shall demonstrate in the following that the analysis introduced here based on the decimal radix offers accurate results and prediction for the multiplication(s) performed by computing machines.
Abbreviations 1.
We will use the acronym e. d. d. in place of “erroneous decimal digits” and c. d. d. for “correct decimal digits”. In general, the abbreviation “d. d.” stands for “decimal digits”. The abbreviation f. p. e. stands for “finite precision error”. The symbols $\#\mathrm{edd}(x)$ and $\#\mathrm{cdd}(x)$ stand for the number of e. d. d. accumulated in quantity $x$, due to f. p. e., and its number of c. d. d. respectively.
Notation 1.
The expressions the algorithm “has failed” or it “has been destroyed due to f. p. e.” mean that the algorithm in hand offers completely unreliable results, at a certain iteration.
Suppose that any two numbers α and β are given, both written in canonical form. In order to unambiguously verify if these two numbers share a common number of initial digits (i.e., stem of digits), starting from the most significant one, we shall employ the following:
Definition 1.
Consider two numbers, $\alpha$, $\beta$ of the same sign, written in scientific form, with the same number, $n$, of decimal digits in the mantissa:
$$\alpha = \operatorname{man}(\alpha)\cdot 10^{\tau},\qquad \beta = \operatorname{man}(\beta)\cdot 10^{\rho}$$
where $\tau = E(\alpha)$ and $\rho = E(\beta)$. Let us assume, without any loss of generality, that $\tau \ge \rho$ holds. Then, these two numbers share the first $\mu$, $\mu\in\mathbb{N}$, digits (they have the first $\mu$ digits in common) if and only if:
$$|\alpha-\beta| = w\cdot 10^{\tau-\mu},\qquad w\in[1,10).$$
Consequently, $\alpha$ and $\beta$ differ in the last $\lambda$, $\lambda\in\mathbb{N}$, digits if and only if:
$$|\alpha-\beta| = z\cdot 10^{\tau-(n-\lambda)},\qquad z\in[1,10).$$
Evidently, in the binary system, the two numbers share the first $\mu$, $\mu\in\mathbb{N}$, bits if and only if
$$|\alpha-\beta| = w\cdot 2^{E(\alpha)-\mu},\qquad w\in[1,2),$$
where $n$ is the finite word length, while they differ in the last $\lambda$, $\lambda\in\mathbb{N}$, digits if and only if
$$|\alpha-\beta| = z\cdot 2^{E(\alpha)-(n-\lambda)},\qquad z\in[1,2).$$
If the aforementioned relation offers a negative $\lambda$, then, by definition, $\lambda = 0$, namely $\alpha$ and $\beta$ are identical as far as all their $n$ digits are concerned.
Now, we shall give a couple of examples in order to clarify the content of Definition 1.
Example 1.
$$n_1 = 4.269587962400597\cdot 10^{4}$$
$$n_2 = 4.269587962393951\cdot 10^{4}$$
A simple inspection might lead someone to deduce that these two numbers differ by six (6) decimal digits. Actually, according to Definition 1, the following holds:
$$\tau = \rho = 4,\qquad n = 16,$$
where the absolute difference is $|n_1-n_2| = 6.6459\cdot 10^{-8}$.
Hence,
$$\tau-(n-k) = -8 \;\Leftrightarrow\; k = 4.$$
Therefore, the two aforementioned numbers differ in four (4) decimal digits, contrary to a probable initial expectation.
Example 2.
According to Definition 1, the two numbers with 32 decimal digits in the mantissa,
$$n_1 = 6.98765432123456789\mathbf{77777777777777}\cdot 10^{1},$$
$$n_2 = 6.98765432123456789\mathbf{01523451234533}\cdot 10^{1},$$
differ by 14 decimal digits, shown in bold, since $\tau=\rho=1$, $n=32$, while their absolute difference is $|n_1-n_2| = 7.6254326543244\cdot 10^{-17}$; hence, $\tau-(n-k) = -17 \Leftrightarrow k = 14$.
Additionally, the two numbers $n_1 = 1.112324567422342\cdot 10^{4}$ and $n_2 = 1.112324567421112\cdot 10^{4}$ differ by 4 decimal digits in the mantissa, since $\tau=\rho=4$, $n=16$ and $|n_1-n_2| = 1.230000634677708\cdot 10^{-8}$; therefore, $\tau-(n-k) = -8 \Leftrightarrow k = 4$.
Similarly, the two numbers $n_1 = 1.00000000\cdot 10^{\tau}$ and $n_2 = 9.99999999\cdot 10^{\tau-1}$ do not differ at all, since $|n_1-n_2| = 9.999999717180685\cdot 10^{-10}$; hence $\tau-(n-k) = -10 \Leftrightarrow k = -1-\tau < 0$.
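The counting rule of Definition 1 can be checked mechanically. The following is a minimal sketch, not the authors' implementation: the function name `differing_digits` is hypothetical, and Python's `decimal` module is assumed so that the subtraction is carried out exactly in radix 10.

```python
from decimal import Decimal, getcontext

getcontext().prec = 80  # enough working digits for exact subtraction here


def differing_digits(a: str, b: str, n: int) -> int:
    """Number of trailing mantissa digits in which a and b differ (Definition 1).

    a, b: same-sign numbers in scientific notation with n mantissa digits.
    """
    a_d, b_d = Decimal(a), Decimal(b)
    diff = abs(a_d - b_d)
    if diff == 0:
        return 0
    tau = max(a_d.adjusted(), b_d.adjusted())  # the larger exponent (tau >= rho)
    # |a - b| = z * 10**(tau - (n - lam)) with z in [1, 10); solve for lam
    lam = diff.adjusted() - tau + n
    return max(lam, 0)  # a negative lam is set to 0 by definition


print(differing_digits("4.269587962400597E4", "4.269587962393951E4", 16))  # 4
print(differing_digits("6.9876543212345678977777777777777E1",
                       "6.9876543212345678901523451234533E1", 32))        # 14
```

Because the subtraction is exact in decimal arithmetic, the differences do not carry the binary round-off visible in some of the values quoted above, but the resulting digit counts agree with Examples 1 and 2.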
Suppose, moreover, that all operations were made with infinite precision; then, let an arbitrary quantity $\alpha$ have the value $\tilde{\alpha}^{c}$, where superscript $c$ indicates the ideally correct value of $\alpha$. Next, suppose that the very same quantity $\alpha$ is calculated in a computing machine which performs the same operations as in the infinite precision case, using $n$ digits in the mantissa; suppose that this machine generates the representation $\alpha_n$ for the specific quantity $\alpha$. Then, a rigorous relation between $\alpha_n$ and $\tilde{\alpha}^{c}$ is obtained via the following:
Definition 2.
Let us assume that we restrict the infinite precision quantity $\tilde{\alpha}^{c}$ to its first $n$ digits, obtaining quantity $\alpha^{c}$. Let us also assume that comparison of $\alpha_n$ and $\alpha^{c}$ by means of Definition 1 shows that these two quantities differ by $\lambda_\alpha$ digits. Then, we deduce that quantity $\alpha_n$ has the first $\lambda^{c} = n - \lambda_\alpha$ digits correct and all its other digits erroneous. The aforementioned statement holds for both the binary system, which is the base of contemporary computing machines, as well as for the decimal radix.
A number of practical examples associated with the above Definition, will be given in Section 6.
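Pending those examples, Definition 2 can be illustrated with a short sketch. It rests on two assumptions that are not stated in the text: the “restriction” to the first n digits is taken to be truncation, and the helper names `restrict` and `erroneous_digits` are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, getcontext

getcontext().prec = 80


def restrict(value: Decimal, n: int) -> Decimal:
    """Keep the first n significant digits of value (truncation is assumed)."""
    quantum = Decimal(1).scaleb(value.adjusted() - n + 1)
    return value.quantize(quantum, rounding=ROUND_DOWN)


def erroneous_digits(computed: str, exact: str, n: int) -> int:
    """Number of e. d. d. of `computed`, following Definitions 1 and 2."""
    a_n = Decimal(computed)
    a_c = restrict(Decimal(exact), n)  # restriction of the ideal value
    diff = abs(a_n - a_c)
    if diff == 0:
        return 0
    tau = max(a_n.adjusted(), a_c.adjusted())
    return max(diff.adjusted() - tau + n, 0)


lam = erroneous_digits("3.141593", "3.14159265358979323846", 7)
print(lam, 7 - lam)  # 1 6  (one erroneous digit, six correct digits)
```

Here the 7-digit quantity 3.141593 is compared against the truncated ideal value 3.141592, so the difference sits in the last mantissa position only.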
It is known that a mantissa represented by a number of bits, say $\nu$, $\nu\in\mathbb{N}$, in a computing machine is approximated in the decimal radix by a number $n$ of d. d. close to the nearest integer of $\nu\cdot\log_{10}2$. Since $\nu\cdot\log_{10}2$ is, as a rule, not an integer, the number of correct digits of a quantity’s decimal representation may fluctuate by one digit at most.
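As a quick numerical illustration of this rule of thumb, the sketch below evaluates $\nu\cdot\log_{10}2$ for a few mantissa widths. The chosen widths (24, 53, 64 and 113 bits) are the significand precisions of common IEEE 754 and x87 formats and are an assumption of this example, not values taken from the text.

```python
import math


def decimal_digits(nu: int) -> int:
    """Nearest integer to nu * log10(2) for a nu-bit mantissa."""
    return round(nu * math.log10(2))


for nu in (24, 53, 64, 113):
    print(nu, round(nu * math.log10(2), 2), decimal_digits(nu))
```

For instance, a 53-bit mantissa corresponds to about 15.95, i.e., 16 decimal digits, which is the value of n used in Example 1.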

3. Generation of Finite Precision Error in a Single Multiplication and Corresponding Probabilities

This Section presents a solution to the following problem: consider two arbitrary numbers, say $\alpha_n$, $\beta_n$, found in a computing machine that uses a finite word length of $n$ decimal digits in the mantissa. Moreover, suppose that, due to an ensemble of previous calculations, $\alpha_n$ has been computed with $\lambda_\alpha$ erroneous decimal digits (e. d. d.) in its mantissa, while $\beta_n$ has been computed with $\lambda_\beta$ e. d. d. in the mantissa. In addition, consider that multiplication $\gamma_n = \alpha_n\beta_n$ is executed in this computing machine. So far, it has been an open question to determine the exact number of e. d. d. with which $\gamma_n$ is evaluated; moreover, the corresponding probabilities that $\gamma_n$ is computed with a specific number of e. d. d. must be evaluated.

3.1. Bounds and Evaluation of the Finite Precision Error Produced in a Single Multiplication

Consider any two quantities $\alpha$, $\beta$ having $\tilde{\alpha}^{c}$ and $\tilde{\beta}^{c}$ as ideally correct values, should all operations and representations be made with infinite precision. Next, suppose that quantities $\alpha$ and $\beta$ have been evaluated in a computing machine using $n$ d. d. in the mantissa; we let the representations of these two numbers in this computing machine be $\alpha_n$ and $\beta_n$, respectively. In addition, following Definition 2, we let the restriction of $\tilde{\alpha}^{c}$ and $\tilde{\beta}^{c}$ in this machine be $\alpha^{c}$, $\beta^{c}$ respectively. We would like to emphasize that the difference between $\alpha_n$ and $\alpha^{c}$ is the following: quantity $\alpha_n$ may have been evaluated with finite precision error, due to previous calculations. On the contrary, $\alpha^{c}$ is free of finite precision error since it is always considered to be a restriction of the ideally correct value $\tilde{\alpha}^{c}$ to $n$ decimal digits.
Consider, moreover, the product $\gamma = \alpha\cdot\beta$, executed both with infinite precision, yielding product $\gamma^{c} = \alpha^{c}\cdot\beta^{c}$, as well as in a computing machine using $n$ digits in the mantissa, generating $\gamma_n = \alpha_n\cdot\beta_n$. In addition, suppose that, due to previous calculations, $\alpha_n$ has been computed with $\lambda_\alpha$ erroneous decimal digits (e. d. d.) ($\lambda_\alpha^{c} = n - \lambda_\alpha$ correct decimal digits), while $\beta_n$ with $\lambda_\beta$ e. d. d. ($\lambda_\beta^{c} = n - \lambda_\beta$ c. d. d.), due to the fact that all operations have been made with a finite word length. We note, as will become evident in the following analysis, that the finite precision error generated in the multiplication process is located only in the mantissae of the involved terms. Hence, we may assume that $\alpha_n$, $\beta_n$, $\alpha^{c}$ and $\beta^{c}$ are plain mantissae, namely that $E(\alpha_n) = E(\beta_n) = \tau = 0$. In order to study the finite precision error generated in the computation of the product $\gamma$, we distinguish a number of cases, which are analytically presented below; in addition, a concise presentation of all these cases takes place in Table 1 and Table 2, positioned at the end of the present sub-section. Thus:
Case 1. Quantities $\alpha_n$ and $\beta_n$ share the same number of correct decimal digits, $\lambda_\alpha^{c} = \lambda_\beta^{c}$.
Therefore, according to Definition 2, it holds that
$$|\alpha_n - \alpha^{c}| = z\cdot 10^{0-(n-\lambda_\alpha)},\qquad z\in[1,10),$$
from which we deduce that we can express quantities $\alpha^{c}$ and $\beta^{c}$ as follows:
$$\alpha^{c} = \alpha_n + y\cdot 10^{-\lambda_\alpha^{c}},\qquad \beta^{c} = \beta_n + x\cdot 10^{-\lambda_\beta^{c}}, \tag{3.1}$$
where $x$ and $y$ are the signed mantissae of the finite precision error. Taking (3.1) into consideration, we may write:
$$\gamma^{c} = \alpha^{c}\cdot\beta^{c} \overset{(3.1)}{=} \alpha_n\beta_n + \alpha_n x\cdot 10^{-\lambda_\beta^{c}} + \beta_n y\cdot 10^{-\lambda_\alpha^{c}} + xy\cdot 10^{-(\lambda_\alpha^{c}+\lambda_\beta^{c})}.$$
Since, by hypothesis, $\lambda_\alpha^{c} = \lambda_\beta^{c} = \lambda^{c}$, the above expression becomes
$$\gamma^{c} = \alpha_n\beta_n + (\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}} + xy\cdot 10^{-2\lambda^{c}}. \tag{3.2}$$
Thus, according to Definition 2, the finite precision error (f. p. e.) with which product $\gamma_n$ has been evaluated is
$$\varepsilon_\gamma = (\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}} + xy\cdot 10^{-2\lambda^{c}}. \tag{3.3}$$
We point out that the subsequent analysis may use (3.3) with slight, straightforward modifications; in fact, in practice, it is sufficient to keep the first-order terms when $\lambda^{c} \ge 3$, since the term $xy\cdot 10^{-2\lambda^{c}}$ is practically negligible. Should the algorithm tend to fail, i.e., if $\lambda^{c} < 3$, then $\varepsilon_\gamma$ of (3.3) can be used in the subsequent analysis in a very straightforward manner. To compute the number of erroneous decimal digits (e. d. d.) of $\gamma_n$, it is absolutely necessary to distinguish the cases $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) < 10$ and $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) \ge 10$, for reasons that will become evident in the following. In fact:
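The expansion of the product error into a first-order and a second-order term can be verified with exact rational arithmetic. The sketch below is only an illustration: the function name `product_error_terms` and the sample operand and error mantissae are hypothetical.

```python
from fractions import Fraction


def product_error_terms(alpha_n, beta_n, x, y, lam_c):
    """Return (exact error, first-order term, second-order term) of the product."""
    scale = Fraction(1, 10 ** lam_c)   # 10**(-lam_c)
    alpha_c = alpha_n + y * scale      # y: signed error mantissa of alpha
    beta_c = beta_n + x * scale        # x: signed error mantissa of beta
    exact = alpha_c * beta_c - alpha_n * beta_n
    first_order = (alpha_n * x + beta_n * y) * scale
    second_order = x * y * scale ** 2
    return exact, first_order, second_order


exact, first, second = product_error_terms(
    Fraction("2.5"), Fraction("3.2"), Fraction("4.1"), Fraction("-7.3"), lam_c=8
)
print(exact == first + second)      # True: the expansion is an identity
print(float(abs(second / first)))   # relative size of the dropped x*y term
```

With eight correct digits in each operand, the second-order term is many orders of magnitude smaller than the first-order one, which is consistent with dropping it when the operands have at least three correct digits.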
Case 1.i. It refers to inequality
$$\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) < 10. \tag{3.4}$$
Immediately below we will show that, in this case, the maximum number of additional erroneous decimal digits generated in the multiplication $\gamma_n = \alpha_n\cdot\beta_n$ is 2. Indeed, here, since we have assumed that all involved quantities have zero exponents, the product $\alpha_n\beta_n$ is given by $\alpha_n\beta_n = \operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) = \operatorname{man}(\alpha_n\beta_n)$; now, (3.2) becomes
$$\gamma^{c} = \alpha^{c}\beta^{c} = \operatorname{man}(\alpha_n\beta_n) + (\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}} \tag{3.5}$$
using the aforementioned first-order approximation. Hence, given that $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) < 10$, it is rather straightforward to show that the supremum that quantity $|\alpha_n x + \beta_n y|$ may acquire is $UB = 110$, since all terms, $\alpha_n$, $\beta_n$, $x$, $y$, are mantissae. Therefore, we distinguish the following sub-cases:
Case 1.i.a:
$$100 \le |\alpha_n x + \beta_n y| < UB. \tag{3.6}$$
Then, $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{2}$, which implies that
$$\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}-2)}. \tag{3.7}$$
The above relation (3.7) implies that
$$\left|\operatorname{man}(\alpha^{c}\beta^{c}) - \operatorname{man}(\alpha_n\beta_n)\right| = \left|\operatorname{man}(\alpha_n x + \beta_n y)\right|\cdot 10^{-(\lambda^{c}-2)},$$
which, according to Definition 2, shows that quantity $\gamma_n = \alpha_n\beta_n$ has been computed with two fewer correct decimal digits, namely with $\lambda^{c}-2$ correct decimal digits (c. d. d.), or equivalently with two more erroneous decimal digits than the operands $\alpha_n$ and $\beta_n$.
Case 1.i.b:
$$10 \le |\alpha_n x + \beta_n y| < 100. \tag{3.8}$$
In this sub-case, $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10$, which implies that
$$\gamma^{c} = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}-1)}. \tag{3.9}$$
Consequently, Definition 2 implies that $\gamma_n$ has been computed with one fewer c. d. d. than $\alpha_n$ and $\beta_n$.
Case 1.i.c:
$$1 \le |\alpha_n x + \beta_n y| < 10. \tag{3.10}$$
Now, $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)$, implying that
$$\gamma^{c} = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}}. \tag{3.11}$$
Together with Definition 2, this means that $\gamma_n$ has the same number of c. d. d. as $\alpha_n$ and $\beta_n$, namely $\lambda^{c}$.
Case 1.i.d:
$$10^{-1} \le |\alpha_n x + \beta_n y| < 1. \tag{3.12}$$
Here it holds that $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-1}$, implying that
$$\gamma^{c} = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}+1)}. \tag{3.13}$$
Consequently, one may deduce that the number of erroneous decimal digits (e. d. d.) of $\gamma_n$ has been reduced by one.
Case 1.i.e. This constitutes a generalization of Case 1.i.d; in fact, now, we assume that the following inequality holds:
$$10^{-k} \le |\alpha_n x + \beta_n y| < 10^{-(k-1)},\qquad k = 2, 3, 4. \tag{3.14}$$
In this more general case, it holds that $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-k}$; therefore,
$$\gamma^{c} = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}+k)}. \tag{3.15}$$
Hence, the number of correct decimal digits (c. d. d.) of $\gamma_n$ has been increased by $k$. The same approach may be applied for $k \ge 5$; however, we will show that the corresponding probabilities are negligible in practice.
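The bound $UB = 110$ quoted in Case 1.i is stated without proof. One possible justification is sketched below, under the assumption that all four quantities have magnitudes in $[1,10)$ and that the product of the operand mantissae is below 10; the auxiliary function $g$ is introduced here only for the argument.

```latex
\begin{align*}
|\alpha_n x + \beta_n y| &\le |\alpha_n|\,|x| + |\beta_n|\,|y|
   < 10\,\bigl(|\alpha_n| + |\beta_n|\bigr), && \text{since } |x|,|y| < 10,\\
|\alpha_n| + |\beta_n| &< |\alpha_n| + \frac{10}{|\alpha_n|} =: g(|\alpha_n|) \le 11,
   && \text{since } |\alpha_n|\,|\beta_n| < 10,\\
\Rightarrow\quad |\alpha_n x + \beta_n y| &< 110 = UB .
\end{align*}
```

The inequality $g(t) \le 11$ on $[1,10]$ follows because $g(t) = t + 10/t$ is convex there and $g(1) = g(10) = 11$. The value 110 is approached, but not attained, when one operand mantissa tends to 10, the other equals 1, and both error mantissae tend to 10 with the same sign.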
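Case 1.ii, which concerns inequality
$$\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) \ge 10. \tag{3.16}$$
Since $\alpha_n$, $\beta_n$, $x$, $y$ are mantissae, $|\alpha_n x + \beta_n y| < 200$ holds. Therefore, we distinguish the following cases:
Case 1.ii.a:
$$100 \le |\alpha_n x + \beta_n y| < 200. \tag{3.17}$$
In this case, $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{2}$, which implies that
$$\gamma^{c} = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}-2)}. \tag{3.18}$$
However, $\alpha_n\beta_n = \operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) = \operatorname{man}(\alpha_n\beta_n)\cdot 10 \Rightarrow \alpha^{c}\beta^{c} = \operatorname{man}(\alpha^{c}\beta^{c})\cdot 10$, if the algorithm has not failed, which means that $E(\alpha^{c}\beta^{c}) = E(\alpha_n\beta_n)$. Thus, (3.18) now reads:
$$\begin{aligned}
\gamma^{c} = \alpha^{c}\beta^{c} = \operatorname{man}(\alpha^{c}\beta^{c})\cdot 10 &= \operatorname{man}(\alpha_n\beta_n)\cdot 10 + (\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}}\\
&= \operatorname{man}(\alpha_n\beta_n)\cdot 10 + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{2}\cdot 10^{-\lambda^{c}}\\
\Rightarrow\ \operatorname{man}(\alpha^{c}\beta^{c}) &= \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}-1)}\\
\Rightarrow\ \left|\operatorname{man}(\alpha^{c}\beta^{c}) - \operatorname{man}(\alpha_n\beta_n)\right| &= \left|\operatorname{man}(\alpha_n x + \beta_n y)\right|\cdot 10^{-(\lambda^{c}-1)}.
\end{aligned} \tag{3.19}$$
The above equality (3.19), together with Definition 2, dictates that $\gamma_n$ has been evaluated with $\lambda^{c}-1$ correct decimal digits (c. d. d.). Even though (3.6) and (3.17) are quite similar, now, the number of erroneous decimal digits (e. d. d.) of $\gamma_n$ has been reduced by one, due to the right shift the computing machine has performed to represent $\gamma_n$ in its canonical form.
Case 1.ii.b:
$$10 \le |\alpha_n x + \beta_n y| < 100. \tag{3.20}$$
In this case, $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10$ holds. However, now, once more, provided that the algorithm has not failed, one obtains $\alpha_n\beta_n = \operatorname{man}(\alpha_n\beta_n)\cdot 10$ and $\alpha^{c}\beta^{c} = \operatorname{man}(\alpha^{c}\beta^{c})\cdot 10$. Hence,
$$\begin{aligned}
\gamma^{c} = \alpha^{c}\beta^{c} = \operatorname{man}(\alpha^{c}\beta^{c})\cdot 10 &= \left[\operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}}\right]\cdot 10\\
\Rightarrow\ \operatorname{man}(\alpha^{c}\beta^{c}) &= \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}}.
\end{aligned} \tag{3.21}$$
Definition 2 indicates that $\gamma_n$ has been evaluated with $\lambda^{c}$ c. d. d. (i.e., with no additional finite precision error (f. p. e.)).
Case 1.ii.c:
$$1 \le |\alpha_n x + \beta_n y| < 10. \tag{3.22}$$
Now it holds that $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)$. Supposing that the algorithm has not failed, one deduces
$$\begin{aligned}
\gamma^{c} = \alpha^{c}\beta^{c} = \operatorname{man}(\alpha^{c}\beta^{c})\cdot 10 &= \operatorname{man}(\alpha_n\beta_n)\cdot 10 + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-\lambda^{c}}\\
\Rightarrow\ \operatorname{man}(\alpha^{c}\beta^{c}) &= \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}+1)}\\
\Rightarrow\ \left|\operatorname{man}(\alpha^{c}\beta^{c}) - \operatorname{man}(\alpha_n\beta_n)\right| &= \left|\operatorname{man}(\alpha_n x + \beta_n y)\right|\cdot 10^{-(\lambda^{c}+1)}.
\end{aligned} \tag{3.23}$$
The latter implies that quantity $\alpha_n\beta_n$ has been computed with an additional correct decimal digit, i.e., that the multiplication operation has relaxed the finite precision error (f. p. e.) by one decimal digit.
Case 1.ii.d:
$$10^{-k} \le |\alpha_n x + \beta_n y| < 10^{-(k-1)},\qquad k = 1, 2, 3, 4. \tag{3.24}$$
In this case, it holds that $\alpha_n x + \beta_n y = \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-k}$; hence,
$$\begin{aligned}
\gamma^{c} = \operatorname{man}(\alpha^{c}\beta^{c})\cdot 10 &= \operatorname{man}(\alpha_n\beta_n)\cdot 10 + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}+k)}\\
\Rightarrow\ \operatorname{man}(\alpha^{c}\beta^{c}) &= \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}(\alpha_n x + \beta_n y)\cdot 10^{-(\lambda^{c}+k+1)}.
\end{aligned} \tag{3.25}$$
Consequently, the number of correct decimal digits (c. d. d.) of product $\gamma_n$ has been increased by $k+1$ in this case.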
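The sub-cases of Case 1 follow a single pattern: the change in the number of e. d. d. equals the decimal exponent of the first-order error sum, reduced by one when the product of the operand mantissae reaches 10. The sketch below encodes this reading of the case analysis; the function name `extra_erroneous_digits` is hypothetical, and the operands are assumed to be plain mantissae with equal numbers of correct digits.

```python
from decimal import Decimal, getcontext
from typing import Optional

getcontext().prec = 50


def extra_erroneous_digits(alpha_n: str, beta_n: str, x: str, y: str) -> Optional[int]:
    """First-order change in the number of e. d. d. of alpha_n * beta_n (Case 1).

    Positive result: additional erroneous digits; negative: digits regained.
    Returns None when the first-order term cancels exactly.
    """
    a, b = Decimal(alpha_n), Decimal(beta_n)
    s = abs(a * Decimal(x) + b * Decimal(y))
    if s == 0:
        return None                       # only the x*y term would remain
    shift = 1 if abs(a * b) >= 10 else 0  # right shift of Case 1.ii
    return s.adjusted() - shift           # decimal exponent of s, minus the shift


print(extra_erroneous_digits("9.5", "1", "9.9", "9.9"))  # 2   (Case 1.i.a)
print(extra_erroneous_digits("5", "4", "9", "9"))        # 0   (Case 1.ii.b)
print(extra_erroneous_digits("2", "3", "1.5", "-1.1"))   # -1  (Case 1.i.d)
```

Passing the mantissae as strings keeps the arithmetic exact in radix 10, so boundary values such as a sum of exactly 10 or 100 fall into the intended sub-case.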
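Case 2. $\alpha_n$ and $\beta_n$ have been calculated with a different number of correct decimal digits, $\lambda_\alpha^{c} \ne \lambda_\beta^{c} \Leftrightarrow \lambda_\alpha \ne \lambda_\beta$. Without any loss of generality, we may assume that $\lambda_\alpha < \lambda_\beta \Leftrightarrow \lambda_\alpha^{c} > \lambda_\beta^{c}$. Consequently, once more it holds that:
$$\begin{aligned}
\alpha^{c} &= \alpha_n + y\cdot 10^{-\lambda_\alpha^{c}},\\
\beta^{c} &= \beta_n + x\cdot 10^{-\lambda_\beta^{c}},\\
\gamma^{c} = \alpha^{c}\cdot\beta^{c} &= \alpha_n\beta_n + \alpha_n x\cdot 10^{-\lambda_\beta^{c}} + \beta_n y\cdot 10^{-\lambda_\alpha^{c}} + xy\cdot 10^{-(\lambda_\alpha^{c}+\lambda_\beta^{c})}.
\end{aligned} \tag{3.26}$$
As in Case 1, we will use a first-order approximation in (3.26).
Again, the introduced analysis may be extended in a straightforward manner to incorporate the higher order term, too; however, as will become clear from the subsequent sections, the accuracy improvement is negligible, given also the dramatic increase in complexity. Thus, we may safely assume that $\gamma^{c} = \alpha_n\beta_n + \alpha_n x\cdot 10^{-\lambda_\beta^{c}} + \beta_n y\cdot 10^{-\lambda_\alpha^{c}} \Leftrightarrow \alpha^{c}\beta^{c} = \alpha_n\beta_n + \left(\alpha_n x + \beta_n y\cdot 10^{-(\lambda_\alpha^{c}-\lambda_\beta^{c})}\right)\cdot 10^{-\lambda_\beta^{c}}$. After setting $\delta = \lambda_\alpha^{c} - \lambda_\beta^{c} \ge 1$, we obtain:
$$\alpha^{c}\beta^{c} = \alpha_n\beta_n + \left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-\lambda_\beta^{c}}.$$
We must now repeat the analysis previously made in connection with Case 1, by letting $\alpha_n x + \beta_n y\cdot 10^{-\delta}$ play the role of $\alpha_n x + \beta_n y$ and $\lambda_\beta^{c}$ play the role of $\lambda^{c}$. Hence, we again distinguish the cases $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) < 10$ and $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) \ge 10$, thus getting:
Case 2.i: $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) < 10$
Case 2.i.a:
$$100 \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right|.$$
In this case, $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-(\lambda_\beta^{c}-2)}$.
Namely, product $\gamma_n$ is computed with two more erroneous decimal digits (e. d. d.) than $\lambda_\beta$.
Case 2.i.b:
$$10 \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right| < 100.$$
Now $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-(\lambda_\beta^{c}-1)}$, which means that product $\gamma_n$ is calculated with one more erroneous decimal digit (e. d. d.) than $\beta_n$.
Case 2.i.c:
$$1 \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right| < 10.$$
Here, it holds that $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-\lambda_\beta^{c}}$.
Hence, product $\gamma_n$ is calculated with no additional e. d. d. when compared to $\beta_n$, namely with $\lambda_\beta$ e. d. d.
Case 2.i.d:
$$10^{-k} \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right| < 10^{-(k-1)},\qquad k = 1, 2, 3.$$
In this case, $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-(\lambda_\beta^{c}+k)}$.
Then, $\gamma_n$ is computed with $k$ fewer e. d. d. than $\lambda_\beta = \max\{\lambda_\alpha, \lambda_\beta\}$. The same approach may be applied for $k \ge 4$; however, the probability that such a case holds is negligible in practice.
Case 2.ii: $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) \ge 10$
For this case, we distinguish the following sub-cases:
Case 2.ii.a:
$$100 \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right| < 200.$$
In this case we obtain $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-(\lambda_\beta^{c}-1)}$.
The above equation dictates that product $\gamma_n$ has been evaluated with $\lambda_\beta^{c}-1$ correct decimal digits (c. d. d.).
Case 2.ii.b:
$$10 \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right| < 100.$$
Then, $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-\lambda_\beta^{c}}$. Consequently, Definition 2 implies that $\gamma_n$ has exactly the same number of erroneous decimal digits (e. d. d.) as $\beta_n$.
Case 2.ii.c:
$$1 \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right| < 10.$$
Now $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-(\lambda_\beta^{c}+1)}$. Therefore, quantity $\alpha_n\beta_n$ has been computed with an additional c. d. d., as compared to $\lambda_\beta^{c}$.
Case 2.ii.d:
$$10^{-k} \le \left|\alpha_n x + \beta_n y\cdot 10^{-\delta}\right| < 10^{-(k-1)},\qquad k = 1, 2, 3.$$
Here, it holds that $\operatorname{man}(\alpha^{c}\beta^{c}) = \operatorname{man}(\alpha_n\beta_n) + \operatorname{man}\!\left(\alpha_n x + \beta_n y\cdot 10^{-\delta}\right)\cdot 10^{-(\lambda_\beta^{c}+k+1)}$. Hence, the number of c. d. d. of $\gamma$ is greater by $k+1$ than $\lambda_\beta^{c} = \min\{\lambda_\alpha^{c}, \lambda_\beta^{c}\}$.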
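The Case 2 rules can be cross-checked against a direct, exact evaluation of the product error. The sketch below is illustrative only: `xi_case2` encodes the first-order sub-case analysis, `xi_direct` measures the exact error with the counting rule of Definition 1 (without truncating the exact product to n digits, which is a simplification), and both names as well as the sample values are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60


def xi_case2(alpha_n, beta_n, x, y, lam_c_alpha, lam_c_beta):
    """First-order change in e. d. d. relative to lambda_beta (Case 2)."""
    if lam_c_alpha <= lam_c_beta:
        raise ValueError("Case 2 assumes lam_c_alpha > lam_c_beta")
    a, b = Decimal(alpha_n), Decimal(beta_n)
    delta = lam_c_alpha - lam_c_beta
    s = abs(a * Decimal(x) + b * Decimal(y) * Decimal(1).scaleb(-delta))
    if s == 0:
        return None
    shift = 1 if abs(a * b) >= 10 else 0
    return s.adjusted() - shift


def xi_direct(alpha_n, beta_n, x, y, lam_c_alpha, lam_c_beta, n):
    """Same quantity, obtained from the exact error of the product."""
    a, b = Decimal(alpha_n), Decimal(beta_n)
    a_c = a + Decimal(y) * Decimal(1).scaleb(-lam_c_alpha)
    b_c = b + Decimal(x) * Decimal(1).scaleb(-lam_c_beta)
    diff = abs(a_c * b_c - a * b)
    lam_gamma = diff.adjusted() - (a * b).adjusted() + n  # e. d. d. of the product
    return lam_gamma - (n - lam_c_beta)                   # minus lambda_beta


args = ("2", "3", "9", "9", 11, 8)
print(xi_case2(*args), xi_direct(*args, n=16))  # 1 1
```

In this example the less accurate operand has 8 correct digits out of 16, and the product ends up with 7, i.e., one more erroneous digit, in agreement with Case 2.i.b.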

3.2. Probabilities for Obtaining a Specific Number of Erroneous Digits in the Execution of a Single Multiplication

We once more adopt the distinction in cases made in Section 3.1, which are presented in Table 1 and Table 2 below, in a very concise manner. In fact,
Case 1: $\lambda_\alpha = \lambda_\beta = \lambda$.
Moreover, in connection with Case 2 ($\lambda_\alpha^{c} \ne \lambda_\beta^{c}$), we cite the following Table 2:
Consider any multiplication of two numbers $\alpha_n$ and $\beta_n$ sharing the same number $\lambda$ of e. d. d. Thus, quantity $\gamma_n = \alpha_n\beta_n$ is computed with $\lambda + \xi$ e. d. d. If $\xi > 0$, $\gamma_n$ is computed with $\xi$ additional e. d. d., while if $\xi < 0$, $\gamma_n$ is computed with $|\xi|$ fewer e. d. d. Then, following Section 3.1, $\xi$ is a random variable, independent of $\lambda$. Therefore, the probabilities for obtaining a specific value of $\xi$ are independent of $\lambda$; this suggests the use of the following notation:
Notation 2.
Let $\gamma_n = \alpha_n\beta_n$; then, quantity $\gamma_n$ is computed with $\lambda + \xi$ e. d. d., $\xi = 2, 1, 0, -1, -2, \ldots$. We denote the corresponding probabilities by $P_{EQ}(\xi; \alpha_n, \beta_n)$.
As before, $\alpha_n$ and $\beta_n$ are mantissae and $x$, $y$ are the mantissae of the f. p. e. stochastic part. Hence, for the evaluation of $P_{EQ}(\xi; \alpha_n, \beta_n)$, it is necessary to know the joint probability density function (pdf) of the random variables $X$, $Y$, which express the f. p. e. mantissae $x$, $y$ respectively; we shall symbolize this joint pdf as $f_{XY}(x, y)$. We shall give the general formulae of the sought-for probabilities for a generic pdf. Later on, we shall specify a class of pdfs encountered in practice, calculate the corresponding probabilities and present the associated numeric results. At this point, since $|x|, |y| \in [1, 10)$, we form the square shown in Figure 1, where every couple of mantissae $(x, y)$ corresponds to a certain point of the sub-domain
$$J = (\mathrm{A}\Lambda_1\mathrm{T}_1\Lambda_8\mathrm{A}) \cup (\mathrm{B}\Lambda_3\mathrm{T}_2\Lambda_2\mathrm{B}) \cup (\Gamma\Lambda_5\mathrm{T}_3\Lambda_4\Gamma) \cup (\Delta\Lambda_7\mathrm{T}_4\Lambda_6\Delta).$$
We point out that the joint pdf $f_{XY}(x, y)$ is a conditional pdf, where $(x, y) \in J$, in the sense that it satisfies relation $\iint_J f_{XY}(x, y)\,dx\,dy = 1$. If the initial pdf $f_{XY}^{I}(x, y)$ is defined in a superset of $J$, then we restrict it to $J$ by means of the conditional probability rule. Notice that the points of the “cross” $C = (\Lambda_1\Lambda_2\mathrm{T}_2\Lambda_3\Lambda_4\mathrm{T}_3\Lambda_5\Lambda_6\mathrm{T}_4\Lambda_7\Lambda_8\mathrm{T}_1\Lambda_1)$ do not belong to $J$, since $x$ and $y$ are mantissae. We again distinguish the sub-cases introduced in Section 3.1.
Case 1.i: $\operatorname{man}(\alpha_n)\operatorname{man}(\beta_n) < 10$, namely relation (3.4).
Case 1.i.a: $100 \le |\alpha_n x + \beta_n y| < UB$, which is (3.6).
In order to determine the sub-domain of $J$ where inequality (3.6) holds, we assume, first, that both $\alpha_n$, $\beta_n$ are positive mantissae and we draw the straight lines:
$$E_{100}: \alpha_n x + \beta_n y = 100,\qquad E_{ub}: \alpha_n x + \beta_n y = UB.$$
Let $P_a^{i}$ be the set of points $(x, y)$ of $J$ that lie between $E_{100}$ and $E_{ub}$, where superscript $i$ and subscript $a$ express the last two letters of the Case in hand. Further, consider the straight lines:
$$E_{-100}: \alpha_n x + \beta_n y = -100,\qquad E_{-ub}: \alpha_n x + \beta_n y = -UB.$$
Let $N_a^{i}$ be the set of points $(x, y)$ of $J$ lying between $E_{-100}$ and $E_{-ub}$, and let $D_a^{i} = P_a^{i} \cup N_a^{i}$; $D_a^{i}$ includes all points of $J$ satisfying (3.6). Then, probability $P_{EQ}\big((x, y) \in D_a^{i}\big) = \iint_{D_a^{i}} f_{XY}(x, y)\,dx\,dy$. However, in this case only, according to the analysis of Section 3.1, $\gamma_n = \alpha_n\beta_n$ is computed with $\xi = 2$ more e. d. d. than $\lambda$. Hence,
$$P_{EQ}(2; \alpha_n, \beta_n) = \iint_{D_a^{i}} f_{XY}(x, y)\,dx\,dy.$$
Case 1.i.b: $10 \le |\alpha_n x + \beta_n y| < 100$, namely inequality (3.8).
For an arbitrary pair of multiplication operands $\alpha_n$, $\beta_n$, consider, now, the straight lines:
$$E_{10}: \alpha_n x + \beta_n y = 10 \quad\text{and}\quad E_{-10}: \alpha_n x + \beta_n y = -10.$$
Let $P_b^{i}$ be the set of points $(x, y) \in J$ lying between $E_{100}$ and $E_{10}$, and $N_b^{i}$ be the set of points $(x, y) \in J$ lying between $E_{-100}$ and $E_{-10}$. $D_b^{i} = P_b^{i} \cup N_b^{i}$ is the entire ensemble of points in $J$ satisfying (3.8), depicted in magenta in Figure 2. Then, probability
$$P_{EQ}\big((x, y) \in D_b^{i}\big) \equiv P_{EQ}(1; \alpha_n, \beta_n) = \iint_{D_b^{i}} f_{XY}(x, y)\,dx\,dy.$$
Case 1.i.c: $1 \le |\alpha_n x + \beta_n y| < 10$, that is (3.10).
Next, in accordance with the previous analysis, we draw the straight lines:
$$E_{1}: \alpha_n x + \beta_n y = 1 \quad\text{and}\quad E_{-1}: \alpha_n x + \beta_n y = -1.$$
Then, $P_c^{i}$ is the sub-domain of $J$ bounded by $E_{1}$ and $E_{10}$, while $N_c^{i}$ is the sub-region bounded by $E_{-1}$ and $E_{-10}$. Setting $D_c^{i} = P_c^{i} \cup N_c^{i}$ (cyan area in Figure 2), the probability that a pair $(x, y)$ of error mantissae satisfies (3.10) is:
$$P_{EQ}\big((x, y) \in D_c^{i}\big) \equiv P_{EQ}(0; \alpha_n, \beta_n) = \iint_{D_c^{i}} f_{XY}(x, y)\,dx\,dy.$$
Finally, concerning the remaining Case 1.i.d, it holds that:
Case 1.i.d: $10^{-k} \le |\alpha_n x + \beta_n y| < 10^{1-k}$, $k = 1, 2, 3$, namely the condition (3.14).
With a similar reasoning, we define the lines $E_{0.1}: \alpha_n x + \beta_n y = 10^{-1}$, $E_{0.01}: \alpha_n x + \beta_n y = 10^{-2}$, $E_{0.001}: \alpha_n x + \beta_n y = 10^{-3}$, $E_{-0.1}: \alpha_n x + \beta_n y = -10^{-1}$, $E_{-0.01}: \alpha_n x + \beta_n y = -10^{-2}$, $E_{-0.001}: \alpha_n x + \beta_n y = -10^{-3}$, which in turn give rise to the sub-domains $D_{d1}^{i}$, $D_{d2}^{i}$ and $D_{d3}^{i}$. Sub-domains $D_{d1}^{i}$ and $D_{d2}^{i}$ are depicted in green and yellow respectively in Figure 2. Eventually, the corresponding probabilities are
$$\begin{aligned}
P_{EQ}\big((x, y) \in D_{d1}^{i}\big) &\equiv P_{EQ}(-1; \alpha_n, \beta_n) = \iint_{D_{d1}^{i}} f_{XY}(x, y)\,dx\,dy,\\
P_{EQ}\big((x, y) \in D_{d2}^{i}\big) &\equiv P_{EQ}(-2; \alpha_n, \beta_n) = \iint_{D_{d2}^{i}} f_{XY}(x, y)\,dx\,dy,\\
P_{EQ}\big((x, y) \in D_{d3}^{i}\big) &\equiv P_{EQ}(-3; \alpha_n, \beta_n) = \iint_{D_{d3}^{i}} f_{XY}(x, y)\,dx\,dy.
\end{aligned}$$
Case 1.ii: $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) \ge 10$, specifically inequality (3.16).
This case may be treated as Case 1.i; however, here, as stated in Section 3.1, the computing machine performs a right shift in order to restore the product $\gamma_n = \alpha_n\beta_n$ to its canonical form. Therefore, product $\gamma_n$ is computed with a number of e. d. d. reduced by one with respect to the previous Case 1.i. Thus, briefly, we note the following:
Case 1.ii.a: $100 \le |\alpha_n x + \beta_n y| < 200$, which is (3.17).
Once again, lines $E_{100}$ and $E_{ub}$ confine $P_a^{ii} \subset J$, and lines $E_{-100}$ and $E_{-ub}$ confine $N_a^{ii} \subset J$. Let, again, $D_a^{ii} = P_a^{ii} \cup N_a^{ii}$ (shown in magenta in Figure 3, for a specific pair of multiplication operands $\alpha_n, \beta_n$). When $(x,y)\in D_a^{ii}$, product $\gamma_n$ is computed with exactly one additional e. d. d., with probability
$$P_{EQ}\big((x,y)\in D_a^{ii}\big) \equiv P_{EQ}(1;\alpha_n,\beta_n) = \iint_{D_a^{ii}} f_{XY}(x,y)\,dx\,dy.$$
Case 1.ii.b: $10 \le |\alpha_n x + \beta_n y| < 100$, i.e., (3.20).
We draw the straight lines $E_{100}$, $E_{10}$ to obtain $P_b^{ii}$, lines $E_{-100}$, $E_{-10}$ to confine $N_b^{ii}$, and we let $D_b^{ii} = P_b^{ii} \cup N_b^{ii}$ (shown in cyan in Figure 3). The probability is
$$P_{EQ}(0;\alpha_n,\beta_n) \equiv P_{EQ}\big((x,y)\in D_b^{ii}\big) = \iint_{D_b^{ii}} f_{XY}(x,y)\,dx\,dy.$$
Case 1.ii.c: $1 \le |\alpha_n x + \beta_n y| < 10$, that is condition (3.22).
We select the points $(x,y)\in J$ lying between $E_{10}$ and $E_{1}$, forming $P_c^{ii}$, and the points $(x,y)\in J$ lying between $E_{-10}$ and $E_{-1}$, forming $N_c^{ii}$; again, we let $D_c^{ii} = P_c^{ii} \cup N_c^{ii}$ (green area in Figure 3). Now, the probability that the f. p. e. is relaxed by one digit is
$$P_{EQ}\big((x,y)\in D_c^{ii}\big) \equiv P_{EQ}(-1;\alpha_n,\beta_n) = \iint_{D_c^{ii}} f_{XY}(x,y)\,dx\,dy.$$
Case 1.ii.d: $10^{-k} \le |\alpha_n x + \beta_n y| < 10^{-k+1}$, $k = 1, 2$, namely the condition (3.24).
Along very similar lines we define the sub-domains $D_{d1}^{ii}$ and $D_{d2}^{ii}$ of $J$ (shown in yellow and blue, respectively, in Figure 3 for a specific pair of operands $\alpha_n$, $\beta_n$ and $n = 5$). We eventually evaluate
$$\begin{aligned}
P_{EQ}\big((x,y)\in D_{d1}^{ii}\big) &\equiv P_{EQ}(-2;\alpha_n,\beta_n) = \iint_{D_{d1}^{ii}} f_{XY}(x,y)\,dx\,dy,\\
P_{EQ}\big((x,y)\in D_{d2}^{ii}\big) &\equiv P_{EQ}(-3;\alpha_n,\beta_n) = \iint_{D_{d2}^{ii}} f_{XY}(x,y)\,dx\,dy.
\end{aligned}$$
Case 2: $\lambda_\alpha \neq \lambda_\beta$.
Suppose that, without any loss of generality, $\lambda_\alpha < \lambda_\beta$ (i.e., $\lambda_\alpha^c > \lambda_\beta^c$) and that $\lambda_\gamma$ is the number of erroneous d. d. with which product $\gamma_n = \alpha_n\beta_n$ is computed; let, moreover, $\xi = \lambda_\gamma - \lambda_\beta$.
Notation 3.
In this case, the f. p. error also depends on $\delta$ (see Section 3.1). Hence, for the corresponding probability, we use the notation $P_{UN}(\xi,\delta;\alpha_n,\beta_n)$, where, as always, without any loss of generality, we assume that $\alpha_n$ and $\beta_n$ are mantissae and that $\lambda_\alpha < \lambda_\beta$. If the opposite inequality $\lambda_\alpha > \lambda_\beta$ holds, then we use the notation $P_{UN}(\xi,\delta;\beta_n,\alpha_n)$.
We once more consider the straight lines $E_{100}^{\delta}$, $E_{-100}^{\delta}$, $E_{10}^{\delta}$, $E_{-10}^{\delta}$, $E_{1}^{\delta}$, $E_{-1}^{\delta}$, $E_{0.1}^{\delta}$, $E_{-0.1}^{\delta}$, $E_{0.01}^{\delta}$, $E_{-0.01}^{\delta}$, $E_{0.001}^{\delta}$, $E_{-0.001}^{\delta}$, which confine the corresponding sub-domains of $J$: $G_a^i$, $G_b^i$, $G_c^i$, $G_{d1}^i$, $G_{d2}^i$, $G_{d3}^i$, $G_a^{ii}$, $G_b^{ii}$, $G_c^{ii}$, $G_{d1}^{ii}$, $G_{d2}^{ii}$. The probabilities that a pair of mantissae $(x,y)$ lies in one of the aforementioned domains are:
Case 2.i: $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) < 10$, namely condition (3.4).
Case 2.i.a: $100 \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}|$, i.e., (3.28).
$$P_{UN}\big((x,y)\in G_a^i\big) = \iint_{G_a^i} f_{XY}(x,y)\,dx\,dy = P_{UN}(2,\delta;\alpha_n,\beta_n).$$
Case 2.i.b: $10 \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}| < 100$, which corresponds to (3.29).
$$P_{UN}\big((x,y)\in G_b^i\big) = \iint_{G_b^i} f_{XY}(x,y)\,dx\,dy = P_{UN}(1,\delta;\alpha_n,\beta_n).$$
Case 2.i.c: $1 \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}| < 10$, which is the one of (3.30).
$$P_{UN}\big((x,y)\in G_c^i\big) \equiv \iint_{G_c^i} f_{XY}(x,y)\,dx\,dy = P_{UN}(0,\delta;\alpha_n,\beta_n).$$
Case 2.i.d: $10^{-k} \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}| < 10^{-k+1}$, $k = 1, 2, 3$, i.e., (3.31).
$$\begin{aligned}
P_{UN}\big((x,y)\in G_{d1}^i\big) &\equiv P_{UN}(-1,\delta;\alpha_n,\beta_n) = \iint_{G_{d1}^i} f_{XY}(x,y)\,dx\,dy,\\
P_{UN}\big((x,y)\in G_{d2}^i\big) &\equiv P_{UN}(-2,\delta;\alpha_n,\beta_n) = \iint_{G_{d2}^i} f_{XY}(x,y)\,dx\,dy,\\
P_{UN}\big((x,y)\in G_{d3}^i\big) &\equiv P_{UN}(-3,\delta;\alpha_n,\beta_n) = \iint_{G_{d3}^i} f_{XY}(x,y)\,dx\,dy.
\end{aligned}$$
Case 2.ii: $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) \ge 10$, i.e., (3.16).
Case 2.ii.a: $100 \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}| < 200$, i.e., the inequality of (3.32).
$$P_{UN}\big((x,y)\in G_a^{ii}\big) \equiv P_{UN}(1,\delta;\alpha_n,\beta_n) = \iint_{G_a^{ii}} f_{XY}(x,y)\,dx\,dy.$$
Case 2.ii.b: $10 \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}| < 100$, which corresponds to (3.33).
$$P_{UN}\big((x,y)\in G_b^{ii}\big) \equiv P_{UN}(0,\delta;\alpha_n,\beta_n) = \iint_{G_b^{ii}} f_{XY}(x,y)\,dx\,dy.$$
Case 2.ii.c: $1 \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}| < 10$, namely inequality (3.34).
$$P_{UN}\big((x,y)\in G_c^{ii}\big) \equiv P_{UN}(-1,\delta;\alpha_n,\beta_n) = \iint_{G_c^{ii}} f_{XY}(x,y)\,dx\,dy.$$
Case 2.ii.d: $10^{-k} \le |\alpha_n x + \beta_n y\cdot 10^{-\delta}| < 10^{-k+1}$, $k = 1, 2, 3$, i.e., the case of (3.35).
$$\begin{aligned}
P_{UN}\big((x,y)\in G_{d1}^{ii}\big) &\equiv P_{UN}(-2,\delta;\alpha_n,\beta_n) = \iint_{G_{d1}^{ii}} f_{XY}(x,y)\,dx\,dy,\\
P_{UN}\big((x,y)\in G_{d2}^{ii}\big) &\equiv P_{UN}(-3,\delta;\alpha_n,\beta_n) = \iint_{G_{d2}^{ii}} f_{XY}(x,y)\,dx\,dy.
\end{aligned}$$
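The domain integrals of this sub-section can be approximated numerically. The following is a minimal Monte Carlo sketch, not the authors' code. It assumes that the error mantissae are uniformly distributed on $J$ (any other pdf can be substituted in `sample_error_mantissa`), that the smaller error term is scaled by $10^{-\delta}$ (with $\delta = 0$ recovering the equal-digits Case 1), and that the decade of $|\alpha_n x + \beta_n y \cdot 10^{-\delta}|$ selects $\xi$ as in the case lists above, reduced by one when the right shift of Case ii occurs. All function names are hypothetical.

```python
import math
import random


def xi_from_error_sum(s, right_shift):
    """Map s = alpha*x + beta*y*10**(-delta) to xi.

    Case i : [100, UB) -> 2, [10, 100) -> 1, [1, 10) -> 0, [0.1, 1) -> -1, ...
    Case ii: the same decades, each reduced by one (right shift of the product).
    """
    decade = math.floor(math.log10(abs(s)))
    return decade - 1 if right_shift else decade


def sample_error_mantissa(rng):
    """Assumed pdf: uniform on 1 <= |x| < 10, i.e. the square minus the cross."""
    return rng.choice((-1.0, 1.0)) * rng.uniform(1.0, 10.0)


def estimate_probabilities(alpha, beta, delta=0, trials=100_000, seed=0):
    """Estimate P_EQ (delta == 0) or P_UN (delta > 0) for mantissae alpha, beta."""
    rng = random.Random(seed)
    right_shift = abs(alpha * beta) >= 10  # Case ii when man(alpha)*man(beta) >= 10
    counts = {}
    for _ in range(trials):
        x = sample_error_mantissa(rng)
        y = sample_error_mantissa(rng)
        s = alpha * x + beta * y * 10.0 ** (-delta)
        if s == 0.0:
            continue  # measure-zero event, skipped to keep log10 defined
        xi = xi_from_error_sum(s, right_shift)
        counts[xi] = counts.get(xi, 0) + 1
    total = sum(counts.values())
    return {xi: c / total for xi, c in sorted(counts.items(), reverse=True)}


if __name__ == "__main__":
    print(estimate_probabilities(2.5, 3.0))           # Case 1.i (equal e. d. d.)
    print(estimate_probabilities(4.0, 5.0, delta=1))  # Case 2.ii with delta = 1
```

Replacing the sampler with the empirical error distribution of a given algorithm turns the same loop into an estimate of the probabilities for that algorithm.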

3.3. Experimental Confirmation of the Previous Theoretical Results

In order to test the validity of the analysis and the results of previous Section 3.1 and Section 3.2, we have performed the following experiment:
First, we have chosen a set $\Sigma_{16}$ consisting of 100,000 couples of randomly chosen mantissae $(\alpha_{16}^i, \beta_{16}^i)$, having $n = 16$ d. d. in the mantissa. We assume that these numbers are all correct, concerning the first 16 d. d.
We have “contaminated” all $(\alpha_{16}^i, \beta_{16}^i)$, each one with a different error obtained from a normal population, with various values $\sigma$ of the standard deviation. In fact, for each $\sigma$, we have produced 25,000 normally distributed error values $(\theta_{\alpha,i}^N, \theta_{\beta,i}^N)$ that play the role of the f. p. e. of $\alpha_{16}^i$ and $\beta_{16}^i$, had all operations been made with $n = 16$ d. d. precision, and the set of contaminated pairs $(\tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i) = (\alpha_{16}^i + \theta_{\alpha,i}^N, \beta_{16}^i + \theta_{\beta,i}^N)$. In addition, we have extended $\tilde{\alpha}_{16}^i$ and $\tilde{\beta}_{16}^i$ into a representation of $n = 64$ d. d., by simply zeroing all decimal digits from the seventeenth up to the 64th digit.
We have performed all multiplications $\tilde{\gamma}_{16}^i = \tilde{\alpha}_{16}^i \cdot \tilde{\beta}_{16}^i$, evidently in 16 d. d. precision, as well as the multiplications $\tilde{\gamma}_{64}^i = \tilde{\alpha}_{64}^i \cdot \tilde{\beta}_{64}^i$, in 64 d. d. precision. Then, using Definition 2 and Theorem 5, we have obtained the number of e. d. d. of quantity $\tilde{\gamma}_{16}^i$, with respect to the e. d. d. of $\tilde{\alpha}_{16}^i$ and $\tilde{\beta}_{16}^i$, and the set of f. p. e. differences $\#edd(\tilde{\gamma}_{16}^i) - \max\{\#edd(\tilde{\alpha}_{16}^i), \#edd(\tilde{\beta}_{16}^i)\}$. Using this set, we have compared the corresponding experimental frequencies with the theoretical probabilities predicted in the present section, for various standard deviations of the f. p. e. Representative results are shown in Table 3, Table 4, Table 5 and Table 6; Table 4 refers to the case where $\#edd(\tilde{\alpha}_{16}^i) = \#edd(\tilde{\beta}_{16}^i)$, for four arbitrarily chosen pairs $(\alpha^i, \beta^i)$, shown in Table 3. On the contrary, Table 5 and Table 6 refer to the case where $\#edd(\tilde{\beta}_{16}^i) - \#edd(\tilde{\alpha}_{16}^i) < 0$. Table 5 corresponds to the case in which $\delta = 1$, while Table 6 corresponds to the one in which $\delta = 2$. In both tables, the agreement between theory and experiment is evident.
We have repeated the previous step, using uniformly distributed “contamination numbers” $\theta_i^U$ in the interval $[10^{-16}, 10^{-3}]$, producing numbers $\hat{\alpha}_{16}^i = \alpha_{16}^i + \theta_i^U$.
By repeating all actions of the previous Step 3 for the case of uniform contamination, the obtained results have again confirmed the agreement between theory and practice.
A set of concrete experiments and associated tables.
We have randomly chosen 25,000 couples of mantissae $(\alpha_{16}^i, \beta_{16}^i)$, covering all cases referred to in Section 3.1 and Section 3.2. We have embedded both $\alpha_{16}^i$ and $\beta_{16}^i$ in 64 d. d. precision, as described previously in the present sub-section, thus forming the corresponding couples $(\alpha_{64}^i, \beta_{64}^i)$. We have contaminated each such pair $(\alpha_{16}^i, \beta_{16}^i)$ with 25,000 normally distributed error values for various distinct values of the standard deviation $\sigma$. In this way, we have generated 25,000 corresponding contaminated pairs $(\tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$.
We have performed all 25,000 multiplications $\tilde{\gamma}_{16}^i = \tilde{\alpha}_{16}^i \cdot \tilde{\beta}_{16}^i$, as well as the associated products $\gamma_{64}^i = \alpha_{64}^i \cdot \beta_{64}^i$ and, finally, we have evaluated the number of erroneous decimal digits of $\tilde{\gamma}_{16}^i$ by comparing it with $\gamma_{64}^i$.
The results of this experiment for a specific value of $\sigma$ are shown in Table 3, Table 4, Table 5 and Table 6. In Table 3, four arbitrarily chosen different pairs $(\alpha_{16}^i, \beta_{16}^i)$, $i = 1, 2, 3, 4$, are presented. Table 4 refers to the case where $\#edd(\tilde{\alpha}_{16}^i) = \#edd(\tilde{\beta}_{16}^i)$ for the corresponding contaminated pairs $(\tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$, while Table 5 and Table 6 refer to the cases in which $\delta = 1$ and $\delta = 2$, respectively, where $\delta = \#edd(\tilde{\beta}_{16}^i) - \#edd(\tilde{\alpha}_{16}^i)$.
For all arbitrarily chosen contaminated pairs $(\tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$, we have numerically evaluated the theoretical probabilities introduced in Section 3.1. In all tables, the agreement between theory and experiment is evident. We would like to point out that this agreement appears in all performed experiments, concerning tens of different values of the standard deviation $\sigma$.

4. Analysis of the Case of Many Successive Multiplications

In this section, we will compute the probability that M successive multiplications generate λ erroneous d. d. in the final product.
In fact, suppose that any two numbers, $\gamma_n^0$ and $\beta_n^0$, are multiplied in a computing machine using $n$ decimal digits (d. d.) in the mantissa; let $\gamma_n^1 = \gamma_n^0\beta_n^0$. Next, $\gamma_n^1$ is multiplied by an arbitrary number, say $\beta_n^1$, giving rise to $\gamma_n^2 = \gamma_n^1\beta_n^1$ and so on. The analysis of Section 3 indicates that a different number of erroneous d. d. emerges at each step, as is analytically presented in Table 1 and Table 2. Therefore, in order to estimate the number of erroneous decimal digits (e. d. d.) accumulated in a result of many successive multiplications, one may employ the following:
  • The mantissa $y$ of the finite precision error (f. p. e.) accumulated at an arbitrary quantity, say $\alpha$, is a random variable, already symbolized as $Y$. Therefore, when two quantities $\alpha_n$ and $\beta_n$ are multiplied with f. p. e. mantissae $x$ and $y$, then the f. p. error of the product $\gamma_n = \alpha_n\beta_n$ is itself a random variable.
  • As before, without any loss of generality, suppose that $\lambda_\beta$ is the maximum number of e. d. d. between $\alpha_n$ and $\beta_n$. Then, recalling that the symbol “$\#$” stands for cardinal number, $\#edd(\gamma_n)$ differs from $\lambda_\beta$ by $\xi$ e. d. d. Evidently, $\xi$ is a random variable itself, having integer values $\xi = 2, 1, 0, -1, -2, \ldots$
  • In Section 3, we have given a method for evaluating the probabilities $P_{EQ}(\xi;\alpha_n,\beta_n)$, namely the probability that product $\gamma_n$ is computed with a number of erroneous decimal digits (e. d. d.) differing by $\xi$ decimal digits from the common e. d. d. of $\alpha_n$ and $\beta_n$. In the same section, we have also proposed a method for evaluating the probabilities $P_{UN}(\xi,\delta;\alpha_n,\beta_n)$, i.e., the probability that product $\gamma_n$ is computed with a number of e. d. d. differing by $\xi$ decimal digits from the maximum number of e. d. d. between $\alpha_n$ and $\beta_n$. For brevity, in the present section, we will assume that in all successive multiplications the worst case always takes place, namely that the two multiplication operands share the same number of correct decimal digits (c. d. d.). Moreover, we will momentarily simplify notation by letting $P_2 = P_{EQ}(2;\alpha_n,\beta_n)$, $P_1 = P_{EQ}(1;\alpha_n,\beta_n)$, $P_{-k} = P_{EQ}(-k;\alpha_n,\beta_n)$, $k = 0, 1, 2, \ldots$
  • We have performed an extensive number of multiplications $\gamma_n^{i+1} = \gamma_n^i\beta_n^i$, $i = 0, 1, 2, \ldots$, where, initially, $\gamma_n^0$ and $\beta_n^0$ are chosen uniformly from the interval $[-10, 10]$ and with $n$ d. d. precision in the mantissa. Then, the f. p. e. mantissa $x$ of the product $\gamma_n^{i+1}$ follows a normal distribution with zero mean value and standard deviation $\sigma \in [1.1, 6.5]$. Hence, the probabilities $P_j$, $j = 2, 1, 0, -1, -2, \ldots$, are immediately obtained via the analysis of Section 3. However, the present analysis is valid for any distribution of error mantissae that gives rise to a set of probabilities $P_j$.
  • For brevity and simplicity, we shall assume that the ensemble of probabilities $P_j$ remains unaltered throughout the entire process of successive multiplications. Should any concern about that arise, as we will explicitly state below, a proper source code may be used in order to compute $P_{EQ}(\xi;\alpha_n,\beta_n)$ dynamically, while the essence of the following analysis remains intact.
Subsequently, we will compute the probability that $M$ successive multiplications generate $\lambda$ erroneous d. d. in the final product $\gamma_n^M$. In fact, suppose that one performs $M$ successive multiplications and that $i_1$ of them produce two additional e. d. d. ($\xi = 2$), $i_2$ of them produce one additional e. d. d. ($\xi = 1$), $i_3$ of them produce no additional e. d. d. ($\xi = 0$) and $i_{3+k}$ products “enjoyed” relaxation of the number of e. d. d. by $k$ digits ($\xi = -k$). Then, the number $\omega$ of e. d. d. with which the final product of $M$ successive multiplications is obtained is given by $\omega = \sum_{\mu} \xi_\mu i_\mu$. We are interested in the mean value and the variance of quantity $\omega$. To achieve that, we shall present a set of quite general lemmas and theorems; for this reason, for the present section only, we shall introduce an alternative, equivalent notation described below:
Notation 4.
Let $\xi$ be defined as in the previous analysis. Then, one may define events $A_1, A_2, A_3, \ldots$ as follows: $A_1: \xi = 2$, $A_2: \xi = 1$, $A_3: \xi = 0$, $A_{3+\kappa}: \xi = -\kappa$, $\kappa = 1, 2, \ldots$, where events $A_{3+\kappa}$ refer to error correction by $\kappa$ digits.
Intimately associated with Notation 4 is the following:
Hypothesis 1.
In order to obtain proper bounds of the number $\lambda$ of e. d. d. accumulated in the final product of $M$ successive multiplications, it is sufficient to assume that at each one of these $M$ successive multiplications, the corresponding probabilities $P_{EQ}(\xi;\gamma_n,\beta_n)$ remain constant.
Under this assumption, we let:
$$P_1 = P_{EQ}(2;\gamma_n,\beta_n) = P(A_1),\quad P_2 = P_{EQ}(1;\gamma_n,\beta_n) = P(A_2),\quad P_{3+k} = P_{EQ}(-k;\gamma_n,\beta_n) = P(A_{3+k}),\quad k = 0, 1, 2, \ldots$$
In order to obtain the aforementioned bounds for λ , we shall employ the subsequent quite general results.
Lemma 1.
Consider a multinomial distribution with possible outcomes $A_1, A_2, \ldots, A_N$, with corresponding probabilities of appearance $P_1, P_2, \ldots, P_N$. Suppose that one performs $M$ times an experiment whose outcome is modeled by this distribution. Let the first event, with outcome $A_1$, be observed $i_1$ times, the second event, with outcome $A_2$, $i_2$ times and so on. Then, quantity $\omega = A_1 i_1 + \cdots + A_N i_N = \sum_{\mu=1}^{N} A_\mu i_\mu$ has a mean value $\bar{\omega}$ and a variance $S_\omega^2$:
$$\bar{\omega} = M\sum_{\mu=1}^{N} A_\mu P_\mu. \tag{4.1}$$
$$S_\omega^2 = M\left(\sum_{\mu=1}^{N} A_\mu^2 P_\mu - \sum_{\mu=1}^{N} A_\mu^2 P_\mu^2 - 2\sum_{i=1}^{N}\sum_{j=i+1}^{N} A_i A_j P_i P_j\right). \tag{4.2}$$
Proof of Lemma 1.
The probability that ω occurs is given by
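$$P(\omega) = \binom{M}{i_1}P_1^{i_1}\binom{M-i_1}{i_2}P_2^{i_2}\cdots P_{N-1}^{i_{N-1}}P_N^{M-(i_1+i_2+\cdots+i_{N-1})}.$$
For the mean value $\bar{\omega}$: By definition:
$$\bar{\omega} = \sum_{i_1}\sum_{i_2}\cdots\sum_{i_N}\left(A_1 i_1 + A_2 i_2 + \cdots + A_N i_N\right)P(\omega) \tag{4.3}$$
$$\begin{aligned}
\bar{\omega} ={}& \sum_{i_1}\sum_{i_2}\cdots\sum_{i_N} A_1 i_1 \binom{M}{i_1}P_1^{i_1}\binom{M-i_1}{i_2}P_2^{i_2}\binom{M-i_1-i_2}{i_3}P_3^{i_3}\cdots\binom{M-i_1-\cdots-i_{N-2}}{i_{N-1}}P_{N-1}^{i_{N-1}}P_N^{M-(i_1+i_2+\cdots+i_{N-1})}\\
&+ \cdots + \sum_{i_1}\sum_{i_2}\cdots\sum_{i_N} A_N i_N \binom{M}{i_1}P_1^{i_1}\binom{M-i_1}{i_2}P_2^{i_2}\binom{M-i_1-i_2}{i_3}P_3^{i_3}\cdots\binom{M-i_1-\cdots-i_{N-2}}{i_{N-1}}P_{N-1}^{i_{N-1}}P_N^{M-(i_1+i_2+\cdots+i_{N-1})}
\end{aligned}$$
We treat each multiple sum separately. Therefore,
$$\begin{aligned}
\bar{\omega}_1 &= \sum_{i_1}\sum_{i_2}\cdots\sum_{i_{N-1}} A_1 i_1 \binom{M}{i_1}P_1^{i_1}\binom{M-i_1}{i_2}P_2^{i_2}\cdots P_{N-1}^{i_{N-1}}P_N^{M-(i_1+i_2+\cdots+i_{N-1})}\\
&= \sum_{i_1=0}^{M} A_1 i_1 \binom{M}{i_1}P_1^{i_1}\cdots\sum_{i_{N-1}=0}^{M-i_1-\cdots-i_{N-2}}\binom{M-i_1-\cdots-i_{N-2}}{i_{N-1}}P_{N-1}^{i_{N-1}}P_N^{M-i_1-\cdots-i_{N-1}}\\
&= A_1 P_1 M\sum_{i_1=1}^{M}\frac{(M-1)!}{(i_1-1)!\,(M-i_1)!}P_1^{i_1-1}\cdots\sum_{i_{N-1}=0}^{M-i_1-\cdots-i_{N-2}}\frac{(M-i_1-\cdots-i_{N-2})!}{i_{N-1}!\,(M-i_1-\cdots-i_{N-1})!}P_{N-1}^{i_{N-1}}P_N^{M-i_1-\cdots-i_{N-1}}
\end{aligned}$$
The $i_{N-1}$-sum is a version of the identity
$$(P_1 + P_2 + \cdots + P_N)^M = \sum_{i_1=0}^{M}\binom{M}{i_1}P_1^{i_1}(P_2 + \cdots + P_N)^{M-i_1}.$$
Collapsing the nested sums one after the other by means of this identity, and using $P_1 + P_2 + \cdots + P_N = 1$, we obtain
$$\bar{\omega}_1 = A_1 P_1 M.$$
By employing the same approach for the remaining terms, we obtain the previous relation (4.1):
$$\bar{\omega} = \bar{\omega}_1 + \bar{\omega}_2 + \cdots + \bar{\omega}_N \;\Rightarrow\; \bar{\omega} = M\sum_{\mu=1}^{N} A_\mu P_\mu.$$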
For the variance S ω 2 : By definition:
$$S_\omega^2 = \sum_{\omega}\omega^2 P(\omega) - \bar{\omega}^2.$$
By employing the previously given expression for $\omega$, we obtain:
$$S_\omega^2 = \sum_{i_1=0}^{M}\cdots\sum_{i_{N-1}=0}^{M-i_1-\cdots-i_{N-2}}\left(A_1 i_1 + \cdots + A_N i_N\right)^2\binom{M}{i_1}P_1^{i_1}\cdots\binom{M-i_1-\cdots-i_{N-2}}{i_{N-1}}P_{N-1}^{i_{N-1}}P_N^{M-(i_1+i_2+\cdots+i_{N-1})} - M^2\left(\sum_{\mu=1}^{N} A_\mu P_\mu\right)^2.$$
Expanding $\left(A_1 i_1 + \cdots + A_N i_N\right)^2$, we obtain the partial sums:
$$\begin{aligned}
s_1^2 &= \sum_{i_1=0}^{M}\cdots\sum_{i_{N-1}=0}^{M-i_1-\cdots-i_{N-2}} A_1^2 i_1^2\binom{M}{i_1}P_1^{i_1}\binom{M-i_1}{i_2}P_2^{i_2}\cdots\binom{M-i_1-\cdots-i_{N-2}}{i_{N-1}}P_{N-1}^{i_{N-1}}P_N^{M-(i_1+i_2+\cdots+i_{N-1})}\\
&= A_1^2\sum_{i_1=0}^{M} i_1^2\binom{M}{i_1}P_1^{i_1}\cdots\sum_{i_{N-1}=0}^{M-i_1-\cdots-i_{N-2}}\binom{M-i_1-\cdots-i_{N-2}}{i_{N-1}}P_{N-1}^{i_{N-1}}P_N^{M-(i_1+i_2+\cdots+i_{N-1})}\\
&= A_1^2\sum_{i_1=0}^{M} i_1(i_1-1)\binom{M}{i_1}P_1^{i_1}(P_2+\cdots+P_N)^{M-i_1} + A_1^2\sum_{i_1=0}^{M} i_1\binom{M}{i_1}P_1^{i_1}(P_2+\cdots+P_N)^{M-i_1}\\
&= A_1^2 M(M-1)P_1^2\sum_{i_1=2}^{M}\frac{(M-2)!}{(i_1-2)!\,(M-i_1)!}P_1^{i_1-2}(P_2+\cdots+P_N)^{M-i_1} + A_1^2 P_1 M\sum_{i_1=1}^{M}\frac{(M-1)!}{(i_1-1)!\,(M-i_1)!}P_1^{i_1-1}(P_2+\cdots+P_N)^{M-i_1}
\end{aligned}$$
$$s_1^2 = A_1^2 M(M-1)P_1^2 + A_1^2 P_1 M$$
Following an analogous process for the other similar terms of quantity $S_\omega^2$, we obtain
$$s_\mu^2 = A_\mu^2 M(M-1)P_\mu^2 + A_\mu^2 P_\mu M,\quad \mu = 1, 2, \ldots, N.$$
We now calculate the cross-product
$$\begin{aligned}
s_{1,2}^2 &= \sum_{i_1=0}^{M}\cdots\sum_{i_{N-1}=0}^{M-i_1-\cdots-i_{N-2}} 2A_1A_2\, i_1 i_2\binom{M}{i_1}P_1^{i_1}\binom{M-i_1}{i_2}P_2^{i_2}\cdots\binom{M-i_1-\cdots-i_{N-2}}{i_{N-1}}P_{N-1}^{i_{N-1}}P_N^{M-i_1-\cdots-i_{N-1}}\\
&= 2A_1A_2P_1P_2M(M-1)\sum_{i_1=1}^{M-1}\binom{M-2}{i_1-1}P_1^{i_1-1}\sum_{i_2=1}^{M-i_1}\binom{M-i_1-1}{i_2-1}P_2^{i_2-1}(P_3+\cdots+P_N)^{M-i_1-i_2}\\
&= 2A_1A_2P_1P_2M(M-1).
\end{aligned}$$
Similarly, for the remaining cross-products, we obtain
$$s_{i,j}^2 = 2A_iA_jP_iP_jM(M-1),\quad i, j = 1, 2, \ldots, N,\ i \neq j.$$
Summing up $s_\mu^2$ and $s_{i,j}^2$, we eventually obtain
$$S_\omega^2 = \sum_{\mu=1}^{N} s_\mu^2 + \sum_{\substack{i,j=1\\ i<j}}^{N} s_{i,j}^2 - \bar{\omega}^2$$
$$S_\omega^2 = M\left(\sum_{\mu=1}^{N} A_\mu^2 P_\mu - \sum_{\mu=1}^{N} A_\mu^2 P_\mu^2 - 2\sum_{i=1}^{N}\sum_{j=i+1}^{N} A_iA_jP_iP_j\right).$$
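The last summation step can be written out explicitly. The following worked lines are a sketch of the omitted algebra, using only the partial sums $s_\mu^2$, $s_{i,j}^2$ and the mean value (4.1) obtained above; they also show that (4.2) is $M$ times the variance of a single draw.

$$\begin{aligned}
\sum_{\mu=1}^{N} s_\mu^2 + \sum_{i<j} s_{i,j}^2
&= M(M-1)\left[\sum_{\mu=1}^{N} A_\mu^2P_\mu^2 + 2\sum_{i<j}A_iA_jP_iP_j\right] + M\sum_{\mu=1}^{N}A_\mu^2P_\mu\\
&= M(M-1)\left(\sum_{\mu=1}^{N}A_\mu P_\mu\right)^2 + M\sum_{\mu=1}^{N}A_\mu^2P_\mu,\\
S_\omega^2 &= M(M-1)\left(\sum_{\mu=1}^{N}A_\mu P_\mu\right)^2 + M\sum_{\mu=1}^{N}A_\mu^2P_\mu - M^2\left(\sum_{\mu=1}^{N}A_\mu P_\mu\right)^2\\
&= M\left[\sum_{\mu=1}^{N}A_\mu^2P_\mu - \left(\sum_{\mu=1}^{N}A_\mu P_\mu\right)^2\right].
\end{aligned}$$

Expanding the square in the last bracket gives exactly the three sums of (4.2).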
This Lemma, along with the central limit theorem, offers the following:
Lemma 2.
Suppose that one executes $M \ge 30$ successive multiplications. Then, the number of erroneous d. d. generated in the product obtained after these multiplications, $\omega = \sum_{\mu=1}^{N} A_\mu i_\mu$, follows a normal distribution with mean value $\bar{\omega}$ and variance $S_\omega^2$ given by (4.1) and (4.2).
We will apply all the previous results to the three more important cases, described below.
Case 1. The worst case, where in all multiplications $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) < 10$ holds.
Case 2. The most favorable case, where $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) \ge 10$ always holds.
Case 3. The general case, where the distribution of $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i)$ is arbitrary.
Case 1. If at each multiplication, inequality (3.4) holds, namely
$$\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) < 10,$$
then we choose the following values around which the corresponding probabilities are more frequently encountered:
  • $\xi = 2$: $P_{EQ}(2;\gamma_n^i,\beta_n^i) = P_1 \approx O(10^{-9})$ (i.e., almost negligible). We repeat that we use probabilities $P_{EQ}$ only, since we consider the worst case as far as f. p. error generation and accumulation is concerned, namely that $\#edd(\gamma_n^i) = \#edd(\beta_n^i)$.
  • $\xi = 1$: $P_{EQ}(1;\gamma_n^i,\beta_n^i) = P_2 \approx 0.5530$.
  • $\xi = 0$: $P_{EQ}(0;\gamma_n^i,\beta_n^i) = P_3 \approx 0.3934$.
  • $\xi = -1$: $P_{EQ}(-1;\gamma_n^i,\beta_n^i) = P_4 \approx 0.0457$.
  • $\xi = -2$: $P_{EQ}(-2;\gamma_n^i,\beta_n^i) = P_5 \approx 7.8\cdot 10^{-3}$.
  • $\xi = -3$: $P_{EQ}(-3;\gamma_n^i,\beta_n^i) = P_6 \approx O(10^{-6})$ (i.e., almost negligible).
Therefore, if $M$ such successive multiplications take place, then the overall number $\omega = \sum_{\mu=1}^{N} A_\mu i_\mu$ of generated e. d. d. follows a multinomial distribution, which may be very well approximated by a normal distribution with $\bar{\omega} = 0.4917M$ and $S_\omega^2 = 0.3881M$, $M \ge 30$. Hence, quantity $z = \frac{\omega - \bar{\omega}}{S_\omega}$ follows a standard Gauss distribution, i.e., $z \sim N(0,1)$.
However, now, inequality $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) < 10$ (3.4) holds; thus, $\bar{\omega}$ is positive and inequality $\frac{\omega - \bar{\omega}}{S_\omega} \le 4.2649$ holds with confidence 99.999%; coefficient 4.2649 corresponds to the aforementioned confidence level. With this level of significance, the accumulated number $\omega$ of erroneous d. d. in the final product, after $M$ successive multiplications obeying inequality (3.4), satisfies the relation
$$\omega \le 4.2649\,S_\omega + \bar{\omega}. \tag{4.6}$$
However, in this case, the right-hand side of inequality (4.6) is always positive and, moreover, is a monotonically increasing function of $M$. Consequently, the accumulated number $\omega$ of erroneous decimal digits of every product $\gamma_n^i$, $i = 1, 2, \ldots, M$, tends to increase rapidly, even for a particularly small number of successive multiplications $M$. This is fully supported by the contents of Table 7 and Table 8, below.
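The Case 1 figures can be reproduced from the listed probabilities. The sketch below is illustrative only: it treats $A_\mu$ as the numeric value of $\xi$, uses the identity $\mathrm{Var} = E[\xi^2] - E[\xi]^2$ per multiplication (algebraically the same as (4.2) divided by $M$), and replaces the two order-of-magnitude entries $O(10^{-9})$ and $O(10^{-6})$ by the placeholder values `1e-9` and `1e-6`, which is an assumption. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# xi -> probability, as listed for Case 1; the 1e-9 and 1e-6 entries are
# placeholders for the quoted orders of magnitude (an assumption).
CASE1 = {2: 1e-9, 1: 0.5530, 0: 0.3934, -1: 0.0457, -2: 7.8e-3, -3: 1e-6}


def per_step_moments(probs):
    """Mean and variance of xi for a single multiplication."""
    mean = sum(xi * p for xi, p in probs.items())
    second_moment = sum(xi * xi * p for xi, p in probs.items())
    return mean, second_moment - mean ** 2


def upper_bound(m, probs, confidence=0.99999):
    """One-sided bound z * S_omega + mean(omega) after m multiplications."""
    mean, var = per_step_moments(probs)
    z = NormalDist().inv_cdf(confidence)  # one-sided normal quantile
    return z * sqrt(var * m) + mean * m


if __name__ == "__main__":
    mean, var = per_step_moments(CASE1)
    print(f"mean per step = {mean:.4f}, variance per step = {var:.4f}")
    for m in (30, 100, 1000):
        print(m, round(upper_bound(m, CASE1), 2))
```

Because both terms of the bound grow with $M$ when the per-step mean is positive, the printed values increase with $M$, in line with the remark on inequality (4.6).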
Theorem 1.
Let us assume that a number of $M$ successive multiplications $\gamma_n^{i+1} = \gamma_n^i\beta_n^i$, $i = 0, \ldots, M-1$, is performed and that for every multiplication, inequality $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) < 10$ holds. In this case, the product $\gamma_n^i$, $i = 1, \ldots, M$, is prone to serious finite precision error accumulation. We also assume that Hypothesis 1 now holds. Let, in the $N$-th iteration, the number $\omega$, $\omega = \sum_{\mu=1}^{N} A_\mu i_\mu$, be the number of erroneous d. d. with which quantity $\gamma_n^N$ has been evaluated. Then, it holds that
$$\omega \le C_\alpha \cdot S_\omega + \bar{\omega},$$
where $\alpha$ is the desired level of significance and $C_\alpha$ is the lower bound of the corresponding confidence interval.
The theorem holds for any desired level of significance $\alpha$. Due to the fact that quantity $C_\alpha \cdot S_\omega + \bar{\omega}$ is always positive in this case and is a monotonically increasing function of $M$, quantity $\omega = \#edd(\gamma_n^N)$ tends to increase rapidly, even for particularly small values of $M$.
Hypothesis 2 and Associated Notation 5.
Suppose that the probabilities $P_i = P_{EQ}(\xi;\gamma_n^i,\beta_n^i)$ do not remain constant throughout the successive multiplications but, on the contrary, depend on the current $i$-th multiplication. In this case, we consider the following events and the corresponding probabilities for an arbitrary multiplication, say the $i$-th one:
$A_{1,i} = \{2 \text{ e. d. d. added to } \gamma_n^i\}$; $P_{1,i} = P_{EQ}(2;\gamma_n^i,\beta_n^i)$,
$A_{2,i} = \{1 \text{ e. d. d. added to } \gamma_n^i\}$; $P_{2,i} = P_{EQ}(1;\gamma_n^i,\beta_n^i)$,
$A_{3,i} = \{\text{no e. d. d. added to } \gamma_n^i\}$; $P_{3,i} = P_{EQ}(0;\gamma_n^i,\beta_n^i)$,
$A_{4,i} = \{\text{reduction of the e. d. d. of } \gamma_n^i \text{ by one}\}$; $P_{4,i} = P_{EQ}(-1;\gamma_n^i,\beta_n^i)$, etc.
If one adopts the above Hypothesis 2, the following result holds:
Theorem 2.
Under the conditions imposed by Hypothesis 2, one may dynamically compute the exact (up to $\pm 1$ d. d.) number of erroneous decimal digits accumulated at the arbitrary $i$-th product $\gamma_n^i$, $i = 1, 2, \ldots, M$, by applying the method introduced in Section 3. This dynamic computation of $\#edd(\gamma_n^i)$ can be made by a rather straightforward code based on the results of Section 3.
Case 2. Now, we assume that at each one of the $M$ successive multiplications, inequality (3.16) holds, i.e., that
$$\mathrm{man}(\gamma_n)\,\mathrm{man}(\beta_n) \ge 10.$$
Then, consider the following associated, quite representative probabilities, in accordance with the analysis of Section 3.2, Case 1.ii:
  • Probability that $A_1$ ($\xi = 2$) occurs is $P_1 = 0$, since in this case equality $\xi = 2$ can never occur.
  • Probability that $A_2$ ($\xi = 1$) occurs is $P_2 = 6.146\cdot 10^{-4}$.
  • Probability that $A_3$ ($\xi = 0$) occurs is $P_3 = 0.7468$.
  • Probability that $A_4$ ($\xi = -1$) occurs is $P_4 = 0.2224$.
  • Probability that $A_5$ ($\xi = -2$) occurs is $P_5 = 0.02720$.
  • Probability that $A_6$ ($\xi = -3$) occurs is $P_6 = 0.002979$.
As a rule, the probabilities of events $A_6, A_7, \ldots$ are very small, practically zero; however, the entire analysis remains valid if one incorporates the (very small) corresponding probabilities in it. Hence, according to Lemma 1, $\bar{\omega} = -0.2851M$ and $S_\omega^2 = 0.2783M$.
We would like to emphasize that in this case, the mean value $\bar{\omega}$ of generated e. d. d. is negative.
Now, quantity $z = \frac{\omega - \bar{\omega}}{S_\omega}$ follows a standard Gauss distribution, i.e., $z \sim N(0,1)$. Hence, inequality $\frac{\omega - \bar{\omega}}{S_\omega} \le 4.2649$ holds with confidence 99.999%. With this confidence level, the accumulated number $\omega$ of e. d. d. after $M$ successive multiplications obeying (3.16) satisfies
$$\omega \le 4.2649\,S_\omega + \bar{\omega}. \tag{4.8}$$
Here, $4.2649\,S_\omega + \bar{\omega}$ is a monotonically decreasing function of $M$ in the regime $M \ge 30$ considered in Lemma 2. Thus, the accumulated number $\omega$ of e. d. d. remains very close to zero, even for a very large number of multiplications. This has been fully experimentally verified, as described in Section 6. Hence, the following holds:
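As a worked check of the behavior of this bound, assume the quoted values $\bar{\omega} = -0.2851M$ and $S_\omega^2 = 0.2783M$ and treat $M$ as a continuous variable. The symbols $b(M)$, $M^{*}$ and $M_0$ are introduced here only for illustration.

$$\begin{aligned}
b(M) &= 4.2649\sqrt{0.2783\,M} - 0.2851\,M,\\
b'(M) = 0 &\;\Rightarrow\; M^{*} = \frac{4.2649^2\cdot 0.2783}{4\cdot 0.2851^2} \approx 15.6,\\
b(M) = 0 &\;\Rightarrow\; M_0 = \frac{4.2649^2\cdot 0.2783}{0.2851^2} \approx 62.3.
\end{aligned}$$

Under these assumptions the bound increases only up to roughly sixteen multiplications, decreases thereafter, and becomes negative beyond about 62 multiplications, so throughout the range $M \ge 30$ of Lemma 2 it is already past its maximum.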
Theorem 3.
Suppose that a number of $M$ successive multiplications $\gamma_n^{i+1} = \gamma_n^i\beta_n^i$, $i = 0, \ldots, M-1$, is performed. For every such multiplication, let inequality $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) \ge 10$ (3.16) hold. Then, for all practical purposes, these multiplications accumulate a negligible amount of f. p. error on the product $\gamma_n^{i+1} = \gamma_n^i\beta_n^i$ for all $i = 0, \ldots, M-1$.
Moreover, the number of erroneous decimal digits (e. d. d.) accumulated in the arbitrary product $\gamma_n^i$ is, as a rule, a decreasing function of $i$.
If, in addition, Hypothesis 1 is adopted, then the number $\omega$ of e. d. d. accumulated in the $i$-th multiplication satisfies inequality (4.8).
The theorem holds for any desired level of significance α , the only difference being the coefficient of S ω .
By a complete analogy with Theorem 2, one may adopt Hypothesis 2, in which case the following result holds:
Theorem 4.
Under the conditions imposed by Hypothesis 2, one may dynamically compute the exact (up to $\pm 1$ d. d.) number of erroneous decimal digits accumulated at the arbitrary $i$-th product $\gamma_n^i$, $i = 1, 2, \ldots, M$, by applying the method introduced in Section 3. This dynamic computation of $\#edd(\gamma_n^i)$ can be made by a rather straightforward code based on the results of Section 3.
Case 3. In the general case, either inequality $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) \ge 10$ (3.16) or inequality $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) < 10$ (3.4) arbitrarily holds. Then, in order to obtain a rigorous estimation of the number of e. d. d. in each multiplication, together with the corresponding probability, one must know the statistical distribution of $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i)$, as compared to ten (10). In general, these distributions may highly depend on the algorithm in hand. However, in order to obtain an estimation of the corresponding generated f. p. error, we will state the very interesting example where both $\mathrm{man}(\gamma_n^i)$ and $\mathrm{man}(\beta_n^i)$ follow a uniform distribution in the interval $[1, 10]$. In fact, in this case, the set of $(\gamma_n^i, \beta_n^i)$ in the $\gamma_n^i\beta_n^i$-plane satisfying (3.4) is the 2D domain bounded by the straight lines $\gamma_n^i = 1$ and $\beta_n^i = 1$ and the hyperbola $\gamma_n^i\beta_n^i = 10$. Dually, the 2D domain for which the alternative inequality (3.16) holds is the one limited by the straight lines $\gamma_n^i = 10$, $\beta_n^i = 10$ and the same hyperbola. Then, we follow the results of Section 3 and we use the graphical representation associated with the square of Figure 1 for the probability density function $f_{XY}^{UN}(x,y) = \frac{1}{81}$, defined on this square except the cross. Consequently, in a rather straightforward manner, we obtain $P\big(\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) < 10\big) = \frac{1}{81}\int_1^{10}\frac{10}{\beta_n^i}\,d\beta_n^i = \frac{1}{81}\Big[10\ln\beta_n^i\Big]_1^{10} \approx 0.7157$.
In case there is no discernible distribution of $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i)$ within the course of the algorithm, we may dynamically calculate the finite precision error accumulation for every product in order to estimate the accumulation of the finite precision error in the algorithm in general, as described in Theorems 2 and 4; we recall that Theorem 2 refers to the worst case, in which $\mathrm{man}(\gamma_n^i)\,\mathrm{man}(\beta_n^i) < 10$ always holds, while Theorem 4 is connected to the dual inequality (3.16), which is the most favorable one from the point of view of generation of finite precision error during multiplication. In any case, the following holds:
Theorem 5.
Suppose again that $M$ successive multiplications $\gamma_n^{i+1} = \gamma_n^i\beta_n^i$, $i = 0, \ldots, M-1$, are performed and that for a fraction, say $F_g$, of these multiplications, inequality (3.16) holds, while for the other fraction $F_s = 1 - F_g$ of them inequality (3.4) holds. Then, concerning the f. p. error accumulation in the products $\gamma_n^i$, $i = 1, \ldots, M$, the following two cases hold:
(i) 
If $F_g > 0.5$, product $\gamma_n^i$ tends to behave as described in Case 2, i.e., the overall number of e. d. d. of $\gamma_n^i$, $i = 1, \ldots, M$, is restrained. The closer to 1 the fraction $F_g$ is, the greater the restriction of the number of e. d. d. accumulated in products $\gamma_n^i$ (see Table 8).
(ii) 
If $F_g < 0.5$, the accumulated f. p. error in the products $\gamma_n^i$ is amplified. The closer to 0 the fraction $F_g$ is, the more rapidly the f. p. e. accumulated in products $\gamma_n^i$ grows (Table 7).
The theoretical approach and the associated results introduced in the present Section 4 have been fully confirmed experimentally, as will be described in Section 6 of the present work.

5. Comparing the Finite Precision Error Generation and Accumulation during Execution of the Same Algorithm Including Successive Multiplications, with Different Finite Word Length/Precision

We shall begin by giving a brief description of the goal of the present section: consider an algorithm $A$, involving multiplications at each iteration. We execute $A$ first with $n$ decimal digits in the mantissa (say $n \ge 7$) and simultaneously with $m$ decimal digits (d. d.) in the mantissa, where we assume that $m \ge 2n + 7$, using exactly the same input in both cases. Consider any quantity $\gamma$ of $A$ and let $\gamma_n^i$ be the value of this quantity at the $i$-th iteration of $A$, where all calculations are made with a precision of $n$ d. d. in the mantissa. Similarly, let $\gamma_m^i$ be the value of this quantity at the same iteration of $A$, when all operations are made with a precision of $m$ d. d. In the present section, we will compare the number of erroneous d. d. with which any such two quantities $\gamma_n^i$, $\gamma_m^i$ are calculated and, in particular, we will study the difference $\Delta = \left|\#edd(\gamma_n^i) - \#edd(\gamma_m^i)\right|$.
In Section 4 we have concluded that, independently of the finite word length, the number of e. d. d. of any product $\gamma$ follows a normal distribution if the number of successive multiplications which generated $\gamma$ is greater than or equal to 30. Thus, the difference in the number of e. d. d. between $\gamma_n^i$ and $\gamma_m^i$ also follows a normal distribution with mean value zero and a variance that can be immediately estimated from the results of Section 4. Hence, one may deduce:
Theorem 6.
Suppose that an algorithm $A$, including an arbitrary number of successive multiplications, is executed in parallel with two different finite word lengths corresponding to $n$ and $m$ decimal digits (d. d.). Let the two representations of an arbitrary quantity $\gamma$ of $A$ be $\gamma_n$ and $\gamma_m$ respectively, in these two finite word lengths. Consider the random variable
$$\Lambda = (\text{number of e. d. d. accumulated in } \gamma_m) - (\text{number of e. d. d. accumulated in } \gamma_n).$$
$\Lambda$ follows a normal distribution with mean value zero and variance $2 S_\omega^2$, where $S_\omega^2$ is given in (4.2). Let $F_{m,n}(t)$ be the cumulative distribution function of this normal distribution. Then, the probability that $\Delta = |\Lambda|$ is greater than $\zeta$ d. d. (where, clearly, $\zeta \ge 0$) is given by
$$P(\Delta > \zeta) = 2 \left( 1 - F_{m,n}(\zeta) \right).$$
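As a minimal numerical sketch of the formula in Theorem 6, the tail probability of a zero-mean normal law can be evaluated with the complementary error function. The function name `tail_probability` and the argument `s_omega_sq` are illustrative choices, and the value 0.5 used in the demonstration is only a placeholder, since the expression (4.2) for $S_\omega^2$ is not reproduced in this section.

```python
import math


def tail_probability(zeta: float, s_omega_sq: float) -> float:
    """Return P(|Lambda| > zeta) for Lambda ~ N(0, 2 * s_omega_sq)."""
    if zeta < 0:
        raise ValueError("zeta must be non-negative")
    if s_omega_sq <= 0:
        raise ValueError("s_omega_sq must be positive")
    sigma = math.sqrt(2.0 * s_omega_sq)  # standard deviation of Lambda
    # For a zero-mean normal law, 2 * (1 - F(zeta)) = erfc(zeta / (sigma * sqrt(2))).
    return math.erfc(zeta / (sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    # s_omega_sq = 0.5 is a placeholder, not a value taken from (4.2).
    for zeta in (0, 1, 2, 3, 7):
        print(zeta, tail_probability(zeta, s_omega_sq=0.5))
```

Using `math.erfc` rather than `1 - F` avoids the cancellation that would otherwise occur for large $\zeta$, where the probability is very small.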
Corollary 1.
Based on the analysis introduced in Section 3 and Section 4, one may deduce in a quite straightforward manner that the probability that the numbers of e. d. d. of $\gamma_n^i$ and $\gamma_m^i$ differ in absolute value by $\Delta \ge 7$ decimal digits is practically zero. This holds true for an arbitrarily large number $M$ of successive multiplications executed in $A$.
Theorem 7.
As in Theorem 6, we let $A$ be executed in parallel with the two different finite word lengths $n$ and $m$, where $m > 2n + 7$. Then, for an arbitrary quantity $\gamma$ of $A$, the following hold:
1. We project $\gamma_m$ to $n$ d. d. in the mantissa, obtaining a restricted representation $\tilde{\gamma}_n$ of $\gamma_m$. We compare $\gamma_n$ and $\tilde{\gamma}_n$ by means of Definitions 1 and 2. If the obtained result is $\kappa$ e. d. d. ($\kappa < n$), then we deduce that precisely the last $\kappa$ digits of $\gamma_n$ are erroneous.
2. We also deduce that $\gamma_m$ has at most $\kappa + 7$ e. d. d. or, equivalently, that the first $m - \kappa - 7$ d. d. of $\gamma_m$ are correct.
3. As long as $\kappa < n$ holds, $\tilde{\gamma}_n$ is a fully correct representation of $\gamma$ with $n$ d. d.
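The projection and comparison steps of Theorem 7 can be sketched with Python's `decimal` module. This is only an illustration under stated assumptions: Definitions 1 and 2 are not reproduced in this section, so the helper `erroneous_digits` uses a simplified proxy (the number of decimal digits of the difference, measured in units of the last retained place), and both function names are hypothetical.

```python
from decimal import Context, Decimal


def project(value: Decimal, n: int) -> Decimal:
    """Round `value` to n significant decimal digits (the projection step)."""
    return Context(prec=n).plus(value)


def erroneous_digits(gamma_n: Decimal, gamma_m: Decimal, n: int) -> int:
    """Proxy count of trailing erroneous digits of gamma_n.

    Assumption: this is a simplified stand-in for Definitions 1 and 2.
    """
    ref = project(gamma_m, n)
    if gamma_n == ref:
        return 0
    if ref == 0:
        return n
    wide = Context(prec=4 * n + 10)          # wide enough to subtract exactly
    ulp_exp = ref.adjusted() - (n - 1)       # exponent of the last kept digit of ref
    diff = wide.abs(wide.subtract(gamma_n, ref))
    ulps = int(wide.scaleb(diff, -ulp_exp))  # difference in units of the last place
    return min(n, max(1, len(str(ulps))))


if __name__ == "__main__":
    gamma_m = Decimal("1.2345678901234567890")  # high-precision value
    gamma_n = Decimal("1.2345712")              # 8-digit value with a contaminated tail
    print(erroneous_digits(gamma_n, gamma_m, 8))
```

In the example, the projection of `gamma_m` to 8 digits is 1.2345679, the two 8-digit values differ by 33 units of the last place, and the proxy therefore reports two erroneous digits.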

6. Experiments That Fully Support the Theoretical Analysis

In this section, we shall introduce a number of experiments that have been specifically designed by the authors, in order to test the validity and the reliability of the theoretical analysis and results presented in the previous sections.

6.1. Description of a First Class of Experiments That Confirm the Theoretical Approach

To test the methodology and the associated theoretical results introduced in Section 3 and Section 4, we have proceeded as follows: first, we have selected a set $S_n$ of $10^6$ randomly chosen floating point numbers having 16 decimal digits (d. d.) in the mantissa (subscript $n$ stands for 16); the elements of this set come from a uniform distribution. All numbers were expressed in scientific form.
Next, we have extended each number $a_n$ of $S_n$ into a 40 d. d. representation, in scientific form, setting the last 24 d. d. of each number's mantissa to zero. Thus, we have obtained floating point numbers $a_m$ forming set $S_m$ ($m = 40$).
Subsequently, we have chosen an arbitrary, momentarily fixed value of $F_g$ in the interval $[0, 1]$. We have performed $F_g \cdot 10^9$ multiplications with $n = 16$ d. d., for which the multiplication operands $\alpha_n$, $\beta_n$ satisfied (3.16). Next, we have performed $(1 - F_g) \cdot 10^9$ multiplications with 16 d. d. word length, for which the opposite inequality (3.4) holds, namely that the product of the operands' mantissae has absolute value smaller than 10. We have ensured that no multiplication was repeated.
The very same multiplications have been repeated with 40 d. d. precision, among the corresponding numbers $a_m \in S_m$. Suppose that two numbers $\alpha_n, \beta_n \in S_n$, when multiplied, generate $\gamma_n^1$ with finite precision error (f. p. e.) $x_n^1 \ge 0$, while $\gamma_m^1$ is generated with f. p. e. $x_m^1 \ge 0$. These errors have been computed via Definition 2 and Theorems 5 and 6 introduced in Section 5. More specifically:
(i) We have restricted $\gamma_m^1$ to $n = 16$ d. d., thus obtaining the number $\tilde{\gamma}_n^1$.
(ii) According to Theorem 6, $\tilde{\gamma}_n^1$ is a correct representation of product $\gamma$ having $n = 16$ decimal digits in its mantissa.
(iii) We have compared $\gamma_n^1$ and $\tilde{\gamma}_n^1$ using Definitions 1 and 2, i.e., by forming their difference $\gamma_n^1 - \tilde{\gamma}_n^1$. In this way, we have obtained the exact number of erroneous decimal digits (e. d. d.) with which quantity $\gamma_n^1$ has been evaluated.
By collecting the obtained products $\gamma_n^1$ and $\gamma_m^1$ into two distinct ensembles, we have formed two new sets, $S_n^1$ and $S_m^1$, which are in a natural biunivocal (one-to-one) relation $(\gamma_n^1, \gamma_m^1)$.
Moreover, for the same value $F_g$, we have performed $F_g \cdot 10^9$ multiplications between $\alpha_n^1, \beta_n^1 \in S_n^1$ satisfying (3.16), as well as $(1 - F_g) \cdot 10^9$ multiplications for which (3.4) holds, obtaining $10^9$ products $\gamma_n^2$. Again, during this process, no multiplication was repeated. The very same multiplications have been performed with 40 decimal digits precision, between the corresponding elements of set $S_m^1$, obtaining products $\gamma_m^2$. The erroneous d. d. of $\gamma_n^2$ have been computed using $\gamma_m^2$, as described above, based on the results of Section 2 and Section 5. We let products $\gamma_n^2$ and $\gamma_m^2$ form sets $S_n^2$ and $S_m^2$ respectively, maintaining the natural biunivocal relation $(\gamma_n^2, \gamma_m^2)$.
We continued in this way, forming sets $(S_n^3, S_m^3), \ldots, (S_n^i, S_m^i)$, etc. with the same factor $F_g$. In all these cases we evaluated the number of e. d. d. with which products $\gamma_n^i$ are computed, as previously described in connection with $\gamma_n^1$ and $\gamma_m^1$. In addition, whenever an exponent exceeded a large absolute value (e.g., 50) during the previous process, it was set to zero, since the factor $10^\tau$, $\tau \in \mathbb{Z}$, of the scientific form plays no role in the f. p. e. generation and accumulation in the multiplication process in general. We have taken this action in order to avoid possible effects of overflow or underflow in consecutive multiplications, since these easily spotted problems have nothing to do with the present study. However, we have kept track of the overall exponent of each product by simple recursive additions.
We have repeated the aforementioned experiment for various values of $F_g$, where always $F_g \in [0, 1]$. At this point, we have distinguished two additional sub-cases: (a) $F_g < 0.5$ and (b) $F_g > 0.5$.
Sub-case (a) is quite analogous to Case 1, for which inequality (3.4) holds permanently. Specifically, the obtained products $\gamma_n^i = \alpha_n^i \beta_n^i$ have been calculated with all digits erroneous after a relatively small number of iterations, as shown in Table 7. The smaller the fraction $F_g$, the more serious the f. p. error is.
On the contrary, Sub-case (b) is quite similar to Case 2, in the sense that products $\gamma_n^i = \alpha_n^i \beta_n^i$ manifested substantially smaller f. p. e. accumulation, as Table 8 shows. In full accordance with the theoretical analysis, the larger $F_g$, the smaller the accumulated f. p. e. in $\gamma_n^i$ is.

6.2. A Second Class of Experiments for Testing the Theoretical Analysis Concerning Successive Multiplications

Case 1. All Successive Multiplications Satisfy Inequality (3.4), $\mathrm{man}(\alpha_n^i)\,\mathrm{man}(\beta_n^i) < 10$ ($F_g \to 0$).
In connection with this case, we have performed the following experiment: we have implemented an artificial algorithm, which forces all successive multiplications to satisfy (3.4). The flow chart of this algorithm is the following:
Starting from an arbitrary number $\beta_0 \in [1, 10)$, we express it with a certain number $n$ of decimal digits (d. d.), as well as with $m = 2n + 10$ d. d. We then multiply $\beta_0$ by itself in both precisions. In case $\beta_0^2$ exceeds ten, we subtract a properly selected positive integer $c_0$ from $\beta_0$ in both precisions, so that $1 \le \beta_0^2 < 10$ now holds. We stress that $c_0$ is selected to be an integer so that its subtraction from the initial $\beta_0$ does not add any e. d. d. to $\beta_0$; in all performed experiments, we ensured this by checking the number of e. d. d. of the difference $\beta_0 - c_0$ via Definition 2. By comparing product $\beta_0^2$ in both precisions, we calculate the erroneous decimal digits (e. d. d.) of $\beta_0^2$ in the $n$ digits precision. Next, we set the exponent of $\beta_0^2$ equal to zero, in order to avoid overflow or underflow, and we let the obtained mantissa of $\beta_0^2$ be a new number, $\beta_1$, expressed in both precisions. Then, we repeat the previous actions by letting $\beta_1$ play the role of $\beta_0$ and we evaluate and store the number of e. d. d. with which $\beta_1^2$ is computed, after ensuring that $\beta_1^2 \in [1, 10)$ via a proper subtraction $\beta_1 - c_1$, as before. We continue this process until the obtained $\beta_i^2$ is calculated in the $n$ digits precision with all its digits erroneous, while we have ensured that $\beta_i^2 \in [1, 10)$.
We have executed this algorithm for 1000 different initial values of $\beta_0$, always belonging to the interval $[1, 10)$. The maximum number of iterations required for $\beta_i^2$ to become totally erroneous is shown in Table 9 for various values of precision $n$. Thus, we obtain the particularly important result that $\beta_i^2$ is totally erroneous after an impressively small number of iterations, in comparison to the employed precision, in full accordance with the theory and in particular with Theorem 1 of Section 4.
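The flow chart of Case 1 can be sketched in Python with the `decimal` module. This is an illustrative sketch, not the authors' implementation: the names `case1_experiment` and `erroneous_digits` are hypothetical, the digit count is a simplified proxy for Definition 2 (digits of the difference in units of the last place), and the integer $c_i$ is chosen here as $\lfloor \beta_i \rfloor - 1$, which is only one possible "properly selected" value.

```python
from decimal import Context, Decimal


def erroneous_digits(low: Decimal, high: Decimal, n: int) -> int:
    """Proxy for Definition 2: digits of |low - high|, in units of the last place."""
    ref = Context(prec=n).plus(high)         # projection of the m-digit value
    if low == ref:
        return 0
    if ref == 0:
        return n
    wide = Context(prec=4 * n + 10)
    diff = wide.abs(wide.subtract(low, ref))
    ulps = int(wide.scaleb(diff, (n - 1) - ref.adjusted()))
    return min(n, max(1, len(str(ulps))))


def case1_experiment(beta0: str, n: int, max_iter: int = 500) -> list[int]:
    """Square repeatedly in n and m = 2n + 10 digits, keeping every square below 10."""
    lo, hi = Context(prec=n), Context(prec=2 * n + 10)
    b_lo, b_hi = lo.create_decimal(beta0), hi.create_decimal(beta0)
    history = []
    for _ in range(max_iter):
        # If the square would reach 10, subtract the same integer c in both
        # precisions (assumed choice: c = floor(beta) - 1, so beta - c is in [1, 2)).
        if hi.multiply(b_hi, b_hi) >= 10:
            c = int(b_hi) - 1
            b_lo, b_hi = lo.subtract(b_lo, c), hi.subtract(b_hi, c)
        p_lo, p_hi = lo.multiply(b_lo, b_lo), hi.multiply(b_hi, b_hi)
        edd = erroneous_digits(p_lo, p_hi, n)
        history.append(edd)
        if edd >= n:                         # all n digits are erroneous
            break
        # Reset the exponent, using the same shift in both precisions.
        shift = -p_hi.adjusted()
        b_lo, b_hi = lo.scaleb(p_lo, shift), hi.scaleb(p_hi, shift)
    return history


if __name__ == "__main__":
    print(case1_experiment("1.12", n=8))
```

Because every step squares an operand of magnitude at least one, the absolute error roughly doubles or more per iteration, so with this proxy one would expect the loop to stop after a number of iterations comparable to a few times $n$; the exact counts of Table 9 depend on the authors' Definition 2 and are not reproduced here.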
Case 2. All Successive Multiplications Satisfy $\mathrm{man}(\alpha_n^i)\,\mathrm{man}(\beta_n^i) \ge 10$ ($F_g \to 1$).
We have performed an additional experiment, in which we have written an artificial algorithm that forces all successive multiplications to satisfy $\mathrm{man}(\alpha_n^i)\,\mathrm{man}(\beta_n^i) \ge 10$. This algorithm is quite similar to the one described in connection with Case 1 above and it has the following flow chart:
Starting, again, from an arbitrary number $\beta_0 \in [1, 10)$, we express it in both $n$ and $m = 2n + 10$ d. d. precision. We then execute $\beta_0 \cdot \beta_0$ in both precisions. In case $\beta_0^2$ is smaller than ten, we add a properly selected positive integer $c_0$ to $\beta_0$ in both precisions, so that $\beta_0^2 \ge 10$. We stress that $\beta_0 + c_0$ never manifests any e. d. d. By comparing product $\beta_0^2$ in both precisions, we calculate and store the $n$-precision number's e. d. d., again by means of Definition 2 and Theorem 5. Next, we set the exponent of $\beta_0^2$ to zero, once more to avoid overflow or underflow, and we let the obtained mantissa of $\beta_0^2$ be a new number $\beta_1$ expressed in both precisions. Next, we repeat the previous actions by letting $\beta_1$ play the role of $\beta_0$ and we evaluate and store the number of e. d. d. with which $\beta_1^2$ is computed, after ensuring that $\beta_1^2 \notin [1, 10)$ by adding a proper $c_1$ to $\beta_1$, if necessary. We repeated this process for an arbitrarily large number of iterations, while monitoring the f. p. error of $\beta_i^2$.
We have executed this algorithm $10^{10}$ times in 16 and 42 d. d. precision for 1000 initial values of $\beta_0$, always belonging to the interval $[1, 10)$. The experiment has shown that the number of erroneous decimal digits with which $\beta_i^2$ has been calculated never exceeded two (2), while the mean value of these e. d. d. always remained close to zero, even for the largest numbers of iterations of the algorithm, in full accordance with Theorem 3.

6.3. Description of a Third Experiment That Fully Supports the Theoretical Results regarding the Case of Successive Multiplications with a Varying Word Length

We have experimentally tested the correctness of Theorem 7 of Section 5, by performing $M$ successive multiplications as described in Section 4. However, now, each multiplication has been executed three times, with 16, 40 and 128 d. d. in the mantissa. In this way, for each product $\gamma$ we have obtained three representations, $\gamma_{16}$, $\gamma_{40}$ and $\gamma_{128}$, in parallel. Next, we have restricted $\gamma_{40}$ to 16 d. d., obtaining representation $\tilde{\gamma}_{16}$ as described before. Similarly, we have projected $\gamma_{128}$ to both 16 and 40 d. d., obtaining the corresponding representations $\hat{\gamma}_{16}$ and $\hat{\gamma}_{40}$. Finally, we have compared $\gamma_{16}$ with $\tilde{\gamma}_{16}$ and $\hat{\gamma}_{16}$ by means of Definitions 1 and 2; we have also compared $\tilde{\gamma}_{16}$ with $\hat{\gamma}_{16}$ and $\gamma_{40}$ with $\hat{\gamma}_{40}$ via the same method. The obtained results are shown in Table 10 and fully justify the aforementioned Theorems of Section 5, as well as those of Section 4.

7. Eventual Applications Associated with the Present Work

In the present section, we present and highlight an ensemble of possible and probable applications based on the analysis and methodology introduced here. Thus:
  • In certain applications, like the ones that will be described below, it is preferable and/or necessary to use finite element methods, which employ polynomials of high order to approximate the considered function on each element, usually called "higher order basis functions". In this approach, if the higher order basis functions are of order $n$, then one must use elements consisting of $n + 1$ nodes; moreover, one frequently uses the following basis functions ([20]):
    $$\hat{\psi}_i(\xi) = \frac{(\xi - \xi_1) \cdots (\xi - \xi_j) \cdots (\xi - \xi_{i-1})(\xi - \xi_{i+1}) \cdots (\xi - \xi_{n+1})}{(\xi_i - \xi_1) \cdots (\xi_i - \xi_j) \cdots (\xi_i - \xi_{i-1})(\xi_i - \xi_{i+1}) \cdots (\xi_i - \xi_{n+1})}, \qquad (7.1)$$
    where (a) $i$ is the cardinal number of the node in hand, $i = 1, 2, \ldots, n + 1$, (b) $j$ represents the cardinal number of the other nodes of the specific element, hence $j = 1, 2, 3, \ldots, n + 1$, $j \ne i$, (c) $\xi$ is the independent variable of the polynomial basis function and (d) evidently $\xi_j$, $j = 1, 2, 3, \ldots, n + 1$, is the value that this variable acquires on the $j$-th node of the element in hand.
It is rather clear that both the numerator and the denominator in relation (7.1) are results of successive multiplications.
However, even in the case of second order basis functions, one employs the basis functions:
$$\hat{\psi}_1(\xi) = \tfrac{1}{2}\,\xi(\xi - 1), \quad \hat{\psi}_2(\xi) = 1 - \xi^2, \quad \hat{\psi}_3(\xi) = \tfrac{1}{2}\,\xi(\xi + 1),$$
which include multiplications. Consequently, the entire previous analysis may be applied immediately, so that together with the computation of $\hat{\psi}_i(\xi)$, $i = 1, 2, \ldots, n + 1$, the user may know the exact number of erroneous decimal digits with which this quantity has been evaluated each time. Clearly, in case the numerical value of such a basis function for a certain $\xi$ is highly or even totally "contaminated", the user may immediately receive a corresponding signal.
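To make the chain of multiplications in (7.1) explicit, the basis functions can be evaluated as a ratio of two running products. The sketch below is illustrative: the function name `lagrange_basis` is hypothetical, indices are 0-based (unlike the 1-based indices of the text), and ordinary binary floating point is used instead of a decimal word length.

```python
from math import prod


def lagrange_basis(i: int, xi: float, nodes: list[float]) -> float:
    """Evaluate the i-th (0-based) basis function of (7.1) at xi."""
    # Numerator and denominator are each a product of len(nodes) - 1 factors,
    # i.e., a sequence of successive multiplications.
    numerator = prod(xi - x_j for j, x_j in enumerate(nodes) if j != i)
    denominator = prod(nodes[i] - x_j for j, x_j in enumerate(nodes) if j != i)
    return numerator / denominator


if __name__ == "__main__":
    nodes = [-1.0, 0.0, 1.0]  # nodes of the second order element
    xi = 0.5
    # Compare with the closed forms xi(xi - 1)/2, 1 - xi^2 and xi(xi + 1)/2.
    print([lagrange_basis(i, xi, nodes) for i in range(3)])
    print([0.5 * xi * (xi - 1), 1 - xi**2, 0.5 * xi * (xi + 1)])
```

For the nodes $-1, 0, 1$ the general formula reduces to the three second order basis functions given above, which offers a quick consistency check of the implementation.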
Therefore, more specifically, this method can be applied to the subsequent applications:
  • In research associated with the modelling of the fatigue of materials employed in the rail-wheel system ([21,22]).
  • In the study of rail corrugation ([23,24]).
  • In the study of the influence of bending on the value of friction coefficient ([25]).
  • In tackling important classes of contact problems in elastostatics ([26,27]).
  • In the investigation and analysis of the spatial stress-strain states of a pipe with respect to its corrosion damage, taking into account various types of complex loading ([28]).
  • In real time analysis of local damage in wear-and-fatigue tests, whenever finite element methods are required/applied ([29]).
It is worth noting that the models involved in many of the aforementioned studies frequently include multiplications; consequently, the approach introduced in the present manuscript may also prove helpful in the associated numerical experiments.
B.
The sequence of powers of a real number.
Consider a single real number, say $\beta > 0$. Moreover, consider the sequence of powers of $\beta$, usually computed recursively by means of the following succession of multiplications:
$$z_0 = \beta \cdot \beta, \quad z_1 = z_0 \cdot z_0, \quad z_2 = z_1 \cdot z_1, \quad \ldots, \quad z_n = z_{n-1} \cdot z_{n-1}, \quad n \in \mathbb{N}.$$
Suppose that the numerical value of $\beta > 0$ is such that, statistically, multiplication $z_{n-1} \cdot z_{n-1}$, $n \in \mathbb{N}$, satisfies inequality (3.4)
$$\mathrm{man}(z_{n-1}) \cdot \mathrm{man}(z_{n-1}) < 10,$$
more frequently than the opposite one (3.16), namely
$$\mathrm{man}(z_{n-1}) \cdot \mathrm{man}(z_{n-1}) \ge 10.$$
Then, according to the previous analysis, one expects that $z_n$ will be evaluated with a continually increasing number of erroneous decimal digits (e. d. d.) as $n$ grows. To verify and demonstrate this, we have employed $\beta = 1.12$, we have generated the sequence $z_n$ of the powers of $\beta$ by means of the aforementioned sequence of successive multiplications and we have evaluated the exact number of e. d. d. with which $z_n$ is calculated each time; the determination of the exact number of erroneous d. d. has been made as described in Section 5 and Section 6, using $n = 16$ decimal digits word length and $m = 40$ decimal digits precision. The associated results are depicted in Figure 4, from which it is evident that after the impressively small number of 55 iterations, the power $z_n$ is computed with all its digits erroneous.
We must emphasize that, in order to circumvent the effects of overflow, each time we have multiplied the mantissae of $z_{n-1}$ only and not the entire number $z_{n-1}$. Equivalently, whenever the exponent of $z_n$ exceeded a rather large number, say $E(z_n) = 50$, we have divided by $10^{E(z_n)}$. However, we have registered the power's exponent each time by simple recursive additions. It is important to stress that, in both these approaches, the number of erroneous decimal digits accumulated in $z_n$ was identical.
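The repeated squaring of $\beta = 1.12$ with 16 and 40 decimal digits can be sketched as follows. The sketch rests on assumptions: the function names are hypothetical, the digit count is a simplified proxy for the authors' definitions (digits of the difference in units of the last place), and the mantissa-only multiplication is emulated by applying the same exponent shift to both representations. The iteration at which all digits are reported as erroneous may therefore differ from the value read from Figure 4.

```python
from decimal import Context, Decimal


def erroneous_digits(low: Decimal, high: Decimal, n: int) -> int:
    """Proxy count: digits of |low - high| measured in units of the last place."""
    ref = Context(prec=n).plus(high)         # projection of the m-digit value
    if low == ref:
        return 0
    if ref == 0:
        return n
    wide = Context(prec=4 * n + 10)
    diff = wide.abs(wide.subtract(low, ref))
    ulps = int(wide.scaleb(diff, (n - 1) - ref.adjusted()))
    return min(n, max(1, len(str(ulps))))


def power_sequence_edd(beta: str, n: int = 16, m: int = 40,
                       max_iter: int = 200) -> list[int]:
    """Track the proxy e. d. d. of z_k = z_{k-1} * z_{k-1}, starting from beta."""
    lo, hi = Context(prec=n), Context(prec=m)
    z_lo, z_hi = lo.create_decimal(beta), hi.create_decimal(beta)
    history = []
    for _ in range(max_iter):
        z_lo, z_hi = lo.multiply(z_lo, z_lo), hi.multiply(z_hi, z_hi)
        edd = erroneous_digits(z_lo, z_hi, n)
        history.append(edd)
        if edd >= n:                         # every digit is erroneous
            break
        # Multiply mantissae only: remove the common power of ten from both.
        shift = -z_hi.adjusted()
        z_lo, z_hi = lo.scaleb(z_lo, shift), hi.scaleb(z_hi, shift)
    return history


if __name__ == "__main__":
    for k, edd in enumerate(power_sequence_edd("1.12")):
        print(k, edd)
```

The first three squarings of 1.12 agree to 16 digits in both precisions, and the first discrepancy appears at the fourth squaring; from there on the relative error roughly doubles at every step, which is the mechanism behind the rapid growth described in the text.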
C.
Continual multiplication of contaminated numbers.
Exactly the same analysis holds true in the case where, instead of multiplying $z_{n-1}$ by itself to produce $z_n$, we perform the sequence of multiplications:
$$z_0 = x_0 \cdot y_0, \quad z_1 = z_0 \cdot y_1, \quad \ldots, \quad z_n = z_{n-1} \cdot y_{n-1}, \quad n \in \mathbb{N},$$
where $y_0, y_1, \ldots, y_{n-1}$, $n \in \mathbb{N}$, is an arbitrary sequence of contaminated numbers, in which erroneous digits have accumulated, probably due to another procedure. We believe that application (D), which follows, will clarify the content of these statements.
D.
Finite Precision Error Accumulated in Various Fast Kalman Algorithms.
One of the most widely used filtering procedures is the Kalman one [30]. In many of these algorithmic schemes a certain scalar quantity, say $\alpha_m^b(n+1)$, $m, n \in \mathbb{N}$, is updated at the $(n+1)$-th time instant by means of a formula of the type
$$\alpha_m^b(n+1) = \lambda \cdot \alpha_m^b(n) \cdot J_m(n+1), \qquad (7.3)$$
where (i) $\lambda$ is the so-called "forgetting factor", almost always belonging to the interval $[0.97, 0.99]$, and (ii) $J_m(n+1)$ is another quantity of the algorithm, which is also computed recursively. In many applications [30], quantities $\alpha_m^b(n)$ and $J_m(n+1)$ have values such that inequality (3.4)
$$\mathrm{man}(z_{n-1}) \cdot \mathrm{man}(z_{n-1}) < 10,$$
holds very frequently, statistically. Hence, every formula of the type (7.3) tends to generate one additional erroneous decimal digit in a relatively small number of recursions; this erroneous digit is added to the value of $\alpha_m^b(n+1)$. Subsequently, since $\alpha_m^b(n+1)$ enters, directly or indirectly, in all other formulae of the corresponding Kalman algorithms, including $J_m(n+1)$, it follows that these schemes are very frequently destroyed by this successive-multiplication-based finite precision error in an impressively small number of iterations [30].
Thus, for example, the fastest existing Kalman algorithm (the FAEST [31]) can never converge in practice due to this type of f. p. e.
In general, the methodology introduced here allows for both the evaluation of the number of erroneous decimal digits with which all quantities in any fast Kalman algorithm are computed, as well as for finding methods of stabilizing various algorithms of this class ([32]).

8. Conclusions

In this paper, we have presented a new approach to the study of the finite precision error generation and accumulation in the multiplication process. We have initially given a strict mathematical definition of the number of correct digits of a real quantity expressed in any finite word length. We emphasize that although the analysis introduced here is made in the decimal radix, it offers accurate results and prediction of the f. p. e. generated and accumulated in any computing machine that performs an arbitrary number of multiplications, successively.
Along this new approach, we have shown the following fundamental result: suppose that one executes an arbitrary multiplication $\gamma_n = \alpha_n \beta_n$ in a computing environment employing the equivalent of $n$ decimal digits in the mantissa. Moreover, let operands $\alpha_n$ and $\beta_n$ have at most $\lambda$ erroneous decimal digits in their mantissae. Then, the number of e. d. d. with which product $\gamma_n$ is calculated depends on the value of $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n)$. In fact, if inequality $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) < 10$ holds, then product $\gamma_n$ is calculated with at most $\lambda + 2$ erroneous d. d. or with $\lambda$, $\lambda - 1$, $\lambda - 2$, $\lambda - 3$ e. d. d. In case the complementary inequality holds, then product $\gamma_n$ may be calculated with up to $\lambda + 1$, or with $\lambda$, $\lambda - \kappa$, $\kappa = 1, 2, 3, 4$ e. d. d.
We have also shown that the chance of encountering one of the aforementioned cases heavily depends on the exponent of quantity $\alpha_n x + \beta_n y \cdot 10^{\delta}$, where $x$ and $y$ are the multiplication operands' f. p. e. mantissae and $\delta = \#\mathrm{edd}(\beta_n) - \#\mathrm{edd}(\alpha_n)$.
In order to calculate the probabilities that each one of the aforementioned cases holds, we have introduced the rectangular shaped set of points of Figure 1 and we have defined the sub-domains to which the values of the random variables $x$ and $y$ must belong, in order that product $\gamma_n$ is computed with a specific number of e. d. d. Then, by integration on the corresponding sub-domains, we have calculated the associated probabilities.
We have also given exact formulae for the mean value and standard deviation of the number of e. d. d. accumulated in the results of successive multiplications.
Moreover, we have established that if we perform the exact same set of successive multiplications using $n$ and $m > 2n + 7$ d. d., then we may easily track the number of e. d. d. accumulated in the $n$-precision results.
Finally, in order to test the validity of the introduced theoretical analysis, we have performed a number of specially developed experiments. The results of these experiments fully supported the theoretical analysis introduced here.
We emphasize that the developed novel methodology is expandable, so as to tackle the finite precision error generation and accumulation in any arithmetic operation; this will be the subject of forthcoming manuscripts.

Author Contributions

Conceptualization, C.P., D.A., F.G., C.C. and A.R.M.; Funding acquisition; Investigation, A.R.M. and C.C.; Project administration, C.P.; Resources, F.G. and C.C.; Software, C.P., C.C., A.R.M., F.G. and D.A.; Supervision, C.P. and D.A.; Validation, C.P., D.A., F.G., A.R.M. and C.C.; Writing—original draft, C.P. and F.G.; Writing—review & editing, A.R.M., C.P., D.A. and C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The study did not employ any data sets.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Caraiscos, C.; Liu, B. A roundoff error analysis of the LMS adaptive algorithm. IEEE Trans. Acoust. Speech Signal Process. 1984, 32, 34–41.
  2. Moustakides, G.V. Correcting the instability due to finite precision of the fast Kalman identification algorithms. Signal Process. 1989, 18, 33–42.
  3. Steele, G.L.; White, J.L. How to Print Floating-Point Numbers Accurately. In Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation, New York, NY, USA, 20–22 June 1990; pp. 112–126.
  4. Bai, Z. Error Analysis of the Lanczos Algorithm for the Nonsymmetric Eigenvalue Problem. Math. Comput. 1994, 62, 209–226.
  5. Arioli, M.; Fassino, C. Roundoff error analysis of algorithms based on Krylov subspace methods. Bit Numer. Math. 1996, 36, 189–205.
  6. Lowenstein, J.H.; Vivaldi, F. Anomalous transport in a model of Hamiltonian round-off. Nonlinearity 1998, 11, 1321–1350.
  7. Allen, E.; Burns, J.; Gilliam, D.; Hill, J.; Shubov, V. The impact of finite precision arithmetic and sensitivity on the numerical solution of partial differential equations. Math. Comput. Model. 2002, 35, 1165–1195.
  8. Gelb, A. Parameter Optimization and Reduction of Round Off Error for the Gegenbauer Reconstruction Method. J. Sci. Comput. 2004, 20, 433–459.
  9. Martel, M. Semantics of roundoff error propagation in finite precision calculations. High. Order Symb. Comput. 2006, 19, 7–30.
  10. Wang, P.; Huang, G.; Wang, Z. Analysis and application of multiple-precision computation and round-off error for nonlinear dynamical systems. Adv. Atmos. Sci. 2006, 23, 758–766.
  11. Papakostas, G.; Karras, D.; Boutalis, Y.; Mertzios, B. Fast numerically stable computation of orthogonal Fourier–Mellin moments. IET Comput. Vis. 2007, 1, 11–16.
  12. Kountouris, A. A randomized algorithm for controlling the round-off error accumulation in recursive digital frequency synthesis (DFS). Digit. Signal Process. 2009, 19, 534–544.
  13. Linderman, M.D.; Ho, M.; Dill, D.L.; Meng, T.H.; Nolan, G.P. Towards program optimization through automated analysis of numerical precision. In Proceedings of the 8th annual IEEE/ACM International Symposium on Code Generation and Optimization, Toronto, ON, Canada, 24–28 April 2010; pp. 230–237.
  14. Turchetti, G.; Vaienti, S.; Zanlungo, F. Relaxation to the asymptotic distribution of global errors due to round off. EPL Europhys. Lett. 2010, 89, 40006.
  15. Cheng, A.-D. Multiquadric and its shape parameter: A numerical investigation of error estimate, condition number, and round-off error by arbitrary precision computation. Eng. Anal. Bound. Elem. 2012, 36, 220–239.
  16. Deng, A.-W.; Wei, C.-H.; Gwo, C.-Y. Stable, fast computation of high-order Zernike moments using a recursive method. Pattern Recognit. 2016, 56, 16–25.
  17. Das, A.; Briggs, I.; Gopalakrishnan, G.; Krishnamoorthy, S.; Panchekha, P. Scalable yet Rigorous Floating-Point Error Analysis. In Proceedings of the SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, Atlanta, GA, USA, 9–19 November 2020; pp. 1–14.
  18. Papaodysseus, C.; Koukoutsis, E.; Vassilatos, C. Error propagation and methods of error correction in LS FIR filtering and l-step ahead linear prediction. IEEE Trans. Signal Process. 1994, 42, 1097–1108.
  19. Papaodysseus, C.; Koukoutsis, E.; Triantafyllou, C. Error sources and error propagation in the Levinson-Durbin algorithm. IEEE Trans. Signal Process. 1993, 41, 1635–1651.
  20. Becker, E.B.; Carey, G.F.; Oden, J.T.; Belytschko, T. Finite Elements, An Introduction. J. Appl. Mech. 1982, 49, 682.
  21. Bendikiene, R.; Bahdanovich, A.; Cesnavicius, R.; Ciuplys, A.; Grigas, V.; Jutas, A.; Marmysh, D.; Nasan, A.; Shemet, L.; Sherbakov, S.; et al. Tribo-fatigue Behavior of Austempered Ductile Iron MoNiCa as New Structural Material for Rail-wheel System. Mater. Sci. 2020, 26, 432–437.
  22. Iannitti, G.; Ruggiero, A.; Bonora, N.; Masaggia, S.; Veneri, F. Micromechanical modelling of constitutive behavior of austempered ductile iron (ADI) at high strain rate. Appl. Fract. Mech. 2017, 92, 351–359.
  23. Liu, Q.; Zhang, B.; Zhou, Z. An experimental study of rail corrugation. Wear 2003, 255, 1121–1126.
  24. Ahlbeck, D.R.; Daniels, L.E. Investigation of rail corrugations on the Baltimore Metro. Wear 1991, 144, 197–210.
  25. Trzepiecinski, T.; Lemu, H.G. Effect of Lubrication on Friction in Bending under Tension Test-Experimental and Numerical Approach. Metals 2020, 10, 544.
  26. Campos, L.; Oden, J.; Kikuchi, N. A numerical analysis of a class of contact problems with friction in elastostatics. Comput. Methods Appl. Mech. Eng. 1982, 34, 821–845.
  27. Migórski, S.; Gamorski, P. A new class of quasistatic frictional contact problems governed by a variational–hemivariational inequality. Nonlinear Anal. Real World Appl. 2019, 50, 583–602.
  28. Sherbakov, S. Three-Dimensional Stress-Strain State of a Pipe with Corrosion Damage Under Complex Loading. Tribol. lubr. Lubr. 2011.
  29. Sosnovskiy, L.; Bogdanovich, A.; Yelovoy, O.; Tyurin, S.; Komissarov, V.; Sherbakov, S. Methods and main results of Tribo-Fatigue tests. Int. J. Fatigue 2014, 66, 207–219.
  30. Papaodysseus, C.; Koukoutsis, E.; Stavrakakis, G.; Halkias, C. Exact analysis of the finite precision error generation and propagation in the FAEST and the fast transversal algorithms: A general methodology for developing robust RLS schemes. Math. Comput. Simul. 1997, 44, 29–41.
  31. Carayannis, G.; Manolakis, D.; Kalouptsidis, N. A fast sequential algorithm for least-squares filtering and prediction. IEEE Trans. Acoust. Speech Signal Process. 1983, 31, 1394–1402.
  32. Boutalis, Y.; Papaodysseus, C.; Koukoutsis, E. A New Multichannel Recursive Least Squares Algorithm for Very Robust and Efficient Adaptive Filtering. J. Algorithms 2000, 37, 283–308.
Figure 1. Geometric representation of all pairs of finite precision error mantissae $(x, y)$. Since these pairs do not belong to the cross $\Lambda_1\Lambda_2\mathrm{T}_2\Lambda_3\Lambda_4\mathrm{T}_3\Lambda_5\Lambda_6\mathrm{T}_4\Lambda_7\Lambda_8\mathrm{T}_1\Lambda_1$, the corresponding joint probability function is restricted within $J = \mathrm{A}\Lambda_1\mathrm{T}_1\Lambda_8\mathrm{A} \cup \mathrm{B}\Lambda_3\mathrm{T}_2\Lambda_2\mathrm{B} \cup \Gamma\Lambda_5\mathrm{T}_3\Lambda_4\Gamma \cup \Delta\Lambda_7\mathrm{T}_4\Lambda_6\Delta$.
Figure 2. Depiction of the various sub-domains of $J$, which give rise to different numbers of erroneous decimal digits of $\gamma_5 = \alpha_5 \beta_5$, where $\mathrm{man}(\alpha_5)\,\mathrm{man}(\beta_5) < 10$. In this example, we have selected $\alpha_5 = 2.3912$ and $\beta_5 = 3.2578$; consequently: (a) the sub-region generating one additional e. d. d. is depicted in magenta, (b) the sub-domain that does not increase the f. p. error is shown in cyan, (c) the sub-region relaxing the e. d. d. number by one is depicted in green and (d) the one relaxing the number of e. d. d. by two is shown in yellow. Sub-domains that represent an even greater relaxation of the f. p. e. are too small to appear.
Figure 3. Depiction of the various sub-domains of $J$, which give rise to different numbers of erroneous decimal digits of $\gamma_5 = \alpha_5 \beta_5$, where $\mathrm{man}(\alpha_5)\,\mathrm{man}(\beta_5) \ge 10$. In this example, we have selected $\alpha_5 = 4.6812$ and $\beta_5 = 6.3178$; therefore: (a) the sub-domain generating one additional e. d. d. is depicted in magenta, (b) the sub-region that does not increase the f. p. error is shown in cyan, (c) the sub-domain relaxing the e. d. d. number by one is depicted in green, (d) the one relaxing the number of e. d. d. by two is shown in yellow, while (e) the one relaxing the number of e. d. d. by three is depicted in blue. The sub-domains that represent an even greater relaxation of the f. p. e. are too small to appear.
Figure 4. The evolution of the number of erroneous decimal digits (e. d. d.) accumulated in the power $\beta^{2^n}$, $\beta = 1.12$, due to finite precision error. The abscissa represents the recursion index, while the y-axis represents the number of e. d. d. The number $\beta$ has been chosen so that the inequality $\mathrm{man}(z_{n-1}) \cdot \mathrm{man}(z_{n-1}) < 10$ holds more frequently than the dual one (3.16). As a consequence, the number of e. d. d. grows rapidly, in full accordance with the analysis and the results of Section 4 and Section 6.
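The behaviour shown in Figure 4 can be reproduced qualitatively with a short experiment. The following is only a minimal sketch in Python, under stated assumptions: the names `erroneous_digits` and `track_repeated_squaring` are hypothetical, the reference value is simply a 100-digit computation, and the e. d. d. count is a relative-error proxy rather than the rigorous definitions given in the paper. The iteration counts it produces therefore need not coincide with those reported by the authors.

```python
from decimal import Context, Decimal


def erroneous_digits(approx, exact, prec):
    """Proxy for the number of e. d. d. of `approx` (an assumption, not the
    paper's definition): digits not covered by the relative error."""
    if approx == exact:
        return 0
    wide = Context(prec=120)
    rel = wide.divide(wide.subtract(approx, exact), exact)
    # rel.adjusted() equals floor(log10(|rel|)); clip the count to [0, prec].
    return max(0, min(prec, prec + rel.adjusted() + 1))


def track_repeated_squaring(beta, prec=16, ref_prec=100, max_iter=200):
    """Square `beta` repeatedly with `prec` and `ref_prec` decimal digits and
    return the proxy e. d. d. count after each squaring."""
    low_ctx, ref_ctx = Context(prec=prec), Context(prec=ref_prec)
    low = ref = Decimal(beta)
    history = []
    for _ in range(max_iter):
        low = low_ctx.multiply(low, low)
        ref = ref_ctx.multiply(ref, ref)
        # Only mantissas matter here: rescale both values by the same power
        # of ten so that the exponents never overflow.
        shift = -ref.adjusted()
        low = low.scaleb(shift, low_ctx)
        ref = ref.scaleb(shift, ref_ctx)
        edd = erroneous_digits(low, ref, prec)
        history.append(edd)
        if edd >= prec:  # every digit is erroneous under this proxy
            break
    return history


if __name__ == "__main__":
    for n, edd in enumerate(track_repeated_squaring("1.12"), start=1):
        print(n, edd)
```

The loop stops as soon as the proxy reports that all 16 digits are erroneous, which also keeps the low-precision value close enough to the reference for the shared rescaling to remain meaningful.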
Table 1. Case 1, with $\lambda_\alpha^c = \lambda_\beta^c = \lambda^c$. The first column, under the title "Sub-Cases", lists the possible values of $M_1 = \alpha_n x + \beta_n y$. The header row, to the right of the same title, lists the possible values of the product $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n)$. Each cell gives the resulting number of correct decimal digits (c. d. d.) of the product $\gamma_n = \alpha_n \beta_n$.
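Case 1: Number of c. d. d. of product $\gamma_n$ when $\lambda_\alpha^c = \lambda_\beta^c = \lambda^c$. Let $M_1 = \alpha_n x + \beta_n y$.

| Sub-Cases | (1.i) $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) < 10$ | (1.ii) $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) \geq 10$ |
|---|---|---|
| (a) $100 \leq M_1 < \mathrm{UB}$ | $\lambda^c - 2$ | $\lambda^c - 1$ |
| (b) $10 \leq M_1 < 100$ | $\lambda^c - 1$ | $\lambda^c$ |
| (c) $1 \leq M_1 < 10$ | $\lambda^c$ | $\lambda^c + 1$ |
| (d) $10^{-1} \leq M_1 < 1$ | $\lambda^c + 1$ | $\lambda^c + 2$ |
| (e) $10^{-k} \leq M_1 < 10^{-k+1}$, $k = 2, 3, 4$ | $\lambda^c + k$ | $\lambda^c + k + 1$ |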
Table 2. Case 2, where $\lambda_\alpha^c \neq \lambda_\beta^c$ and in particular $\lambda_\alpha^c > \lambda_\beta^c$, without any loss of generality. The first column, under the title "Sub-Cases", lists the possible values of $M_2 = \alpha_n x + \beta_n y \cdot 10^{\lambda_\alpha^c - \lambda_\beta^c}$. The header row, to the right of the same title, lists the possible values of the product $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n)$. Each cell gives the resulting number of correct decimal digits (c. d. d.) of the product $\gamma_n = \alpha_n \beta_n$.
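Case 2: Number of c. d. d. of product $\gamma_n$ when $\lambda_\alpha^c > \lambda_\beta^c$. Let $M_2 = \alpha_n x + \beta_n y \cdot 10^{\lambda_\alpha^c - \lambda_\beta^c}$.

| Sub-Cases | (2.i) $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) < 10$ | (2.ii) $\mathrm{man}(\alpha_n)\,\mathrm{man}(\beta_n) \geq 10$ |
|---|---|---|
| (a) $100 \leq M_2$ | $\lambda_\beta^c - 2$ | $\lambda_\beta^c - 1$ |
| (b) $10 \leq M_2 < 100$ | $\lambda_\beta^c - 1$ | $\lambda_\beta^c$ |
| (c) $1 \leq M_2 < 10$ | $\lambda_\beta^c$ | $\lambda_\beta^c + 1$ |
| (d) $10^{-1} \leq M_2 < 1$ | $\lambda_\beta^c + 1$ | $\lambda_\beta^c + 2$ |
| (e) $10^{-k} \leq M_2 < 10^{-k+1}$, $k = 2, 3, 4$ | $\lambda_\beta^c + k$ | $\lambda_\beta^c + k + 1$ |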
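Tables 1 and 2 share one pattern: each decade by which $M$ decreases adds one correct digit to the product, and a mantissa product of at least 10 adds one more. The lookup can be sketched in Python as follows. This is an illustration only: the function name `correct_digits_of_product` and its parameters are hypothetical, $M$ is assumed to be strictly positive, and every $M \geq 100$ is treated as row (a), which ignores the upper bound UB of Table 1.

```python
import math


def correct_digits_of_product(lam_c, m, mantissa_product):
    """Number of c. d. d. of the product, read off Table 1 or Table 2.

    lam_c            -- lambda^c in Case 1, lambda_beta^c in Case 2
    m                -- the quantity M_1 (Case 1) or M_2 (Case 2), assumed > 0
    mantissa_product -- man(alpha_n) * man(beta_n)
    """
    if m <= 0:
        raise ValueError("this sketch assumes M > 0")
    # Rows (a)-(e): the decade of M sets the row; row (a) covers all M >= 100.
    decade = min(math.floor(math.log10(m)), 2)
    digits = lam_c - decade
    # Column (ii): a mantissa product of at least 10 gives one more digit.
    if mantissa_product >= 10:
        digits += 1
    return digits
```

For instance, with `lam_c = 10`, `m = 5` and a mantissa product of 7.79, the function returns 10, which is the entry of row (c), column (i).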
Table 3. Four arbitrarily chosen pairs $(\alpha_{16}^i, \beta_{16}^i)$.
| $i$ | $\alpha_{16}^i$ | $\beta_{16}^i$ |
|---|---|---|
| 1 | 1.505791937075619 | 5.526986816293506 |
| 2 | 2.675404049994100 | 2.778498218867048 |
| 3 | 7.946881519204984 | 4.557506835434298 |
| 4 | 5.557116785741900 | 5.549129305868777 |
Table 4. The numerically evaluated theoretical probabilities $P_{EQ}(\xi; \tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$, compared with the experimentally observed frequencies of $\xi = \#edd(\tilde{\gamma}_{16}^i) - \#edd(\tilde{\alpha}_{16}^i)$.
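$P_{EQ}(\xi; \tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$ verification. Normally distributed contamination, $\sigma = 2.2$.

| Outcome | Quantity | $(\tilde{\alpha}_{16}^1, \tilde{\beta}_{16}^1)$ | $(\tilde{\alpha}_{16}^2, \tilde{\beta}_{16}^2)$ | $(\tilde{\alpha}_{16}^3, \tilde{\beta}_{16}^3)$ | $(\tilde{\alpha}_{16}^4, \tilde{\beta}_{16}^4)$ |
|---|---|---|---|---|---|
| Generation of 2 additional e. d. d. | Theoretical probability | 0 | 0 | 0 | 0 |
| | Experimental frequency | 0 | 0 | 0 | 0 |
| Generation of 1 additional e. d. d. | Theoretical probability | 0.648 | 0.373 | $10^{-5}$ | $3.74 \cdot 10^{-9}$ |
| | Experimental frequency | 0.647 | 0.389 | $10^{-5}$ | 0 |
| Generation of no additional e. d. d. | Theoretical probability | 0.338 | 0.517 | 0.702 | 0.622 |
| | Experimental frequency | 0.339 | 0.503 | 0.704 | 0.623 |
| Relaxation of product's e. d. d. by 1 | Theoretical probability | 0.012 | 0.099 | 0.264 | 0.325 |
| | Experimental frequency | 0.012 | 0.096 | 0.262 | 0.321 |
| Relaxation of product's e. d. d. by 2 | Theoretical probability | 0.001 | 0.009 | 0.031 | 0.046 |
| | Experimental frequency | 0.001 | 0.010 | 0.030 | 0.049 |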
Table 5. The numerically evaluated theoretical probabilities $P_{UN}(\xi, 1; \tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$, namely for the case where $\#edd(\tilde{\beta}_{16}^i) - \#edd(\tilde{\alpha}_{16}^i) = 1$, compared with the experimentally observed frequencies of $\xi = \#edd(\tilde{\gamma}_{16}^i) - \#edd(\tilde{\beta}_{16}^i)$.
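$P_{UN}(\xi, 1; \tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$ verification. Normally distributed contamination, $\sigma = 2.2$.

| Outcome | Quantity | $(\tilde{\alpha}_{16}^1, \tilde{\beta}_{16}^1)$ | $(\tilde{\alpha}_{16}^2, \tilde{\beta}_{16}^2)$ | $(\tilde{\alpha}_{16}^3, \tilde{\beta}_{16}^3)$ | $(\tilde{\alpha}_{16}^4, \tilde{\beta}_{16}^4)$ |
|---|---|---|---|---|---|
| Generation of 2 additional e. d. d. | Theoretical probability | 0 | 0 | 0 | 0 |
| | Experimental frequency | 0 | 0 | 0 | 0 |
| Generation of 1 additional e. d. d. | Theoretical probability | 0.008 | 0.127 | 0 | 0 |
| | Experimental frequency | 0.008 | 0.147 | 0 | 0 |
| Generation of no additional e. d. d. | Theoretical probability | 0.870 | 0.873 | 0.867 | 0.631 |
| | Experimental frequency | 0.872 | 0.853 | 0.870 | 0.633 |
| Relaxation of product's e. d. d. by 1 | Theoretical probability | 0.112 | $2 \cdot 10^{-4}$ | 0.133 | 0.369 |
| | Experimental frequency | 0.110 | $1 \cdot 10^{-4}$ | 0.130 | 0.367 |
| Relaxation of product's e. d. d. by 2 | Theoretical probability | 0.009 | 0 | 0 | $4.23 \cdot 10^{-6}$ |
| | Experimental frequency | 0.009 | 0 | 0 | 0 |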
Table 6. The numerically evaluated theoretical probabilities $P_{UN}(\xi, 2; \tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$, namely for the case where $\#edd(\tilde{\beta}_{16}^i) - \#edd(\tilde{\alpha}_{16}^i) = 2$, compared with the experimentally observed frequencies of $\xi = \#edd(\tilde{\gamma}_{16}^i) - \#edd(\tilde{\beta}_{16}^i)$.
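$P_{UN}(\xi, 2; \tilde{\alpha}_{16}^i, \tilde{\beta}_{16}^i)$ verification. Normally distributed contamination, $\sigma = 2.2$.

| Outcome | Quantity | $(\tilde{\alpha}_{16}^1, \tilde{\beta}_{16}^1)$ | $(\tilde{\alpha}_{16}^2, \tilde{\beta}_{16}^2)$ | $(\tilde{\alpha}_{16}^3, \tilde{\beta}_{16}^3)$ | $(\tilde{\alpha}_{16}^4, \tilde{\beta}_{16}^4)$ |
|---|---|---|---|---|---|
| Generation of 2 additional e. d. d. | Theoretical probability | 0 | 0 | 0 | 0 |
| | Experimental frequency | 0 | 0 | 0 | 0 |
| Generation of 1 additional e. d. d. | Theoretical probability | 0.004 | 0.123 | 0 | 0 |
| | Experimental frequency | 0.003 | 0.134 | 0 | 0 |
| Generation of no additional e. d. d. | Theoretical probability | 0.996 | 0.877 | 0.865 | 0.626 |
| | Experimental frequency | 0.997 | 0.866 | 0.872 | 0.637 |
| Relaxation of product's e. d. d. by 1 | Theoretical probability | 0 | 0 | 0.135 | 0.374 |
| | Experimental frequency | 0 | 0 | 0.128 | 0.363 |
| Relaxation of product's e. d. d. by 2 | Theoretical probability | 0 | 0 | 0 | 0 |
| | Experimental frequency | 0 | 0 | 0 | 0 |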
Table 7. $F_g$ is the percentage of multiplications in which inequality (3.16) holds. When $F_g < 50\%$, $\gamma_n^i$ becomes completely erroneous after a relatively small number of multiplications, in full accordance with the theoretical predictions. These predictions, based on Theorems 1, 3 and 5, are lower bounds on the expected number of e. d. d. for each $F_g$; they are presented in the last column for confidence level $1 - 10^{-5}$. For each $F_g$, the experimental and theoretical results are in excellent agreement.
| Percentage of successive multiplications satisfying $\mathrm{man}(\alpha_n^i)\,\mathrm{man}(\beta_n^i) \geq 10$ | Number of successive multiplications after which product $\gamma_n^i = \alpha_n^i \cdot \beta_n^i$ was computed with all 16 d. d. erroneous | Theoretical lower bounds for $\#edd(\gamma_n^i)$ obtained via Theorems 1, 3 and 5 |
|---|---|---|
| 0.0578 | 92 | 15.85 |
| 0.0718 | 89 | 14.00 |
| 0.0841 | 88 | 12.92 |
| 0.0926 | 90 | 12.93 |
| 0.1160 | 104 | 15.15 |
| 0.1360 | 107 | 14.40 |
| 0.1862 | 108 | 10.66 |
| 0.2310 | 131 | 11.58 |
| 0.2734 | 175 | 15.22 |
| 0.4053 | 328 | 12.90 |
| 0.4562 | 459 | 10.12 |
| 0.4591 | 489 | 11.46 |
| 0.4826 | 577 | 6.39 |
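The quantity $F_g$ used in Tables 7 and 8 is simply the share of multiplications whose operand mantissas satisfy $\mathrm{man}(\alpha)\,\mathrm{man}(\beta) \geq 10$. A minimal sketch of how it may be measured for a list of operand pairs is given below in Python; the helper names `mantissa` and `fraction_ge_10` are hypothetical, and the operands are assumed to be non-zero decimals with at most 28 significant digits (the default precision of Python's `decimal` module).

```python
from decimal import Decimal


def mantissa(x):
    """Return man(x), the decimal mantissa of a non-zero x, scaled to [1, 10)."""
    if x == 0:
        raise ValueError("zero has no normalized mantissa")
    # adjusted() is the exponent of the leading digit; shift it away.
    return x.copy_abs().scaleb(-x.adjusted())


def fraction_ge_10(pairs):
    """Fraction of (alpha, beta) pairs with man(alpha) * man(beta) >= 10."""
    hits = sum(1 for a, b in pairs if mantissa(a) * mantissa(b) >= 10)
    return hits / len(pairs)


if __name__ == "__main__":
    # The four operand pairs listed in Table 3.
    table_3 = [
        (Decimal("1.505791937075619"), Decimal("5.526986816293506")),
        (Decimal("2.675404049994100"), Decimal("2.778498218867048")),
        (Decimal("7.946881519204984"), Decimal("4.557506835434298")),
        (Decimal("5.557116785741900"), Decimal("5.549129305868777")),
    ]
    print(fraction_ge_10(table_3))
```

For the four pairs of Table 3, the first two mantissa products are below 10 and the last two are above it, so the measured fraction is 0.5.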
Table 8. Results of the experiment associated with Case 3, described in Section 4: $3 \times 10^5$ successive multiplications $\gamma_n^i = \alpha_n^i \cdot \beta_n^i$ have been performed for various percentages $F_g$ of them satisfying $\mathrm{man}(\alpha_n^i)\,\mathrm{man}(\beta_n^i) \geq 10$. The obtained maximum and average numbers of e. d. d. are in full accordance with Theorems 1, 3 and 5. In fact, when $F_g > 50\%$, the evaluated products manifest considerable resistance to finite precision error. The closer $F_g$ is to 1, the smaller the number of erroneous digits with which all $\gamma_n^i$ are computed, exactly as predicted by the theoretical analysis.
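| Percentage of successive multiplications satisfying $\mathrm{man}(\alpha_n^i)\,\mathrm{man}(\beta_n^i) \geq 10$ | Maximum number of e. d. d. with which product $\gamma_n^i = \alpha_n^i \cdot \beta_n^i$ has been computed, $i \leq 3 \cdot 10^5$ | Average number of e. d. d. accumulated in all products $\gamma_n^i = \alpha_n^i \cdot \beta_n^i$, $i = 1, 2, \ldots, 3 \cdot 10^5$ |
|---|---|---|
| 0.5014 | 13 | 7.7761 |
| 0.5029 | 12 | 7.2217 |
| 0.5051 | 11 | 6.9437 |
| 0.5063 | 11 | 6.8764 |
| 0.5076 | 11 | 6.6435 |
| 0.5131 | 10 | 6.1244 |
| 0.5143 | 10 | 6.0928 |
| 0.5154 | 10 | 6.0165 |
| 0.5171 | 10 | 5.9189 |
| 0.5204 | 10 | 5.7784 |
| 0.5260 | 9 | 5.5140 |
| 0.5300 | 9 | 5.4266 |
| 0.5404 | 9 | 5.1647 |
| 0.5481 | 9 | 5.0070 |
| 0.5588 | 8 | 4.8098 |
| 0.5899 | 8 | 4.4619 |
| 0.6182 | 8 | 4.1856 |
| 0.7742 | 6 | 3.4101 |
| 0.8422 | 6 | 3.0055 |
| 0.8702 | 5 | 2.9013 |
| 0.8913 | 5 | 2.8306 |
| 0.9195 | 5 | 2.7292 |
| 0.9412 | 5 | 2.6463 |
| 0.9519 | 4 | 2.6105 |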
Table 9. Number of iterations after which the output of the algorithm described in Case 1 of the present Section gave totally erroneous results, for various finite word lengths $n$. The results are in full accordance with the theoretical analysis presented in the previous sub-sections, and in excellent agreement with Theorem 1 of Section 4.
| Employed precision in decimal digits | Number of iterations after which all digits of $\beta_i^2$ were erroneous, independently of the choice of $\beta_0$ |
|---|---|
| 16 | 61 |
| 64 | 249 |
| 128 | 484 |
| 256 | 967 |
| 512 | 1915 |
| 1024 | 3843 |
| 2048 | 7670 |
| 4096 | 15,285 |
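The entries of Table 9 grow almost linearly with the employed precision. The short Python check below divides each iteration count by the corresponding number of decimal digits. It is only an arithmetic observation on the tabulated values, not a restatement of Theorem 1, and the names `TABLE_9` and `iterations_per_digit` are hypothetical.

```python
# (precision in decimal digits, iterations until all digits are erroneous),
# copied from Table 9.
TABLE_9 = [
    (16, 61),
    (64, 249),
    (128, 484),
    (256, 967),
    (512, 1915),
    (1024, 3843),
    (2048, 7670),
    (4096, 15285),
]


def iterations_per_digit(table):
    """Return (precision, iterations / precision) for every row of the table."""
    return [(prec, iters / prec) for prec, iters in table]


if __name__ == "__main__":
    for prec, ratio in iterations_per_digit(TABLE_9):
        print(f"{prec:5d} digits: {ratio:.3f} iterations per digit")
```

Every ratio lies between 3.7 and 3.9, so doubling the word length roughly doubles the number of iterations that can be performed before the result becomes totally erroneous.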
Table 10. Comparison of the number of erroneous decimal digits (e. d. d.) accumulated in all the intermediate results of 108 successive multiplications. All these multiplications have been executed in parallel, with 16, 40 and 128 d. d. in the mantissa. All obtained experimental results fully support the theoretical analysis introduced in Section 5 and in particular the content of Theorems 6 and 7.
| Quantity | Value |
|---|---|
| Minimum erroneous decimal digits difference between 16 and 40 decimal digits representation | −3 |
| Maximum erroneous decimal digits difference between 16 and 40 decimal digits representation | 3 |
| Mean erroneous decimal digits difference between 16 and 40 decimal digits representation | −0.0945 |
| Maximum number of erroneous decimal digits in the 16 decimal digits representation | 12 |
| Maximum number of erroneous decimal digits in the 40 decimal digits representation | 32 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Papaodysseus, C.; Arabadjis, D.; Giannopoulos, F.; Mamatsis, A.R.; Chalatsis, C. Analysis, Evaluation and Exact Tracking of the Finite Precision Error Generated in Arbitrary Number of Multiplications. Mathematics 2021, 9, 1199. https://doi.org/10.3390/math9111199
