Article

Accurate Goertzel Algorithm: Error Analysis, Validations and Applications

Chuanying Li, Peibing Du, Kuan Li, Yu Liu, Hao Jiang and Zhe Quan
1 College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
2 Northwest Institute of Nuclear Technology, Xi’an 710024, China
3 School of Cyberspace Security, Dongguan University of Technology, Dongguan 523106, China
4 College of Computer, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(11), 1788; https://doi.org/10.3390/math10111788
Submission received: 13 April 2022 / Revised: 4 May 2022 / Accepted: 18 May 2022 / Published: 24 May 2022
(This article belongs to the Special Issue Numerical Analysis and Scientific Computing II)

Abstract

The Horner and Goertzel algorithms are frequently used in polynomial evaluation. Each of them can be less expensive than the other in special cases. In this paper, we present a new compensated algorithm to improve the accuracy of the Goertzel algorithm by using error-free transformations. We derive the forward round-off error bound for our algorithm, which implies that our algorithm yields full precision accuracy for polynomials that are not too ill-conditioned. A dynamic error estimate, obtained by running round-off error analysis, is also presented for our algorithm. Moreover, we show the cases in which our algorithms are less expensive than the compensated Horner algorithm for evaluating polynomials. Numerical experiments indicate that our algorithms run faster than the compensated Horner algorithm in those cases while producing equally accurate results, and our algorithm is absolutely stable when the condition number is smaller than $10^{16}$. An application is given to illustrate that our algorithm is more accurate than MATLAB's fft function. The results show that the relative error of our algorithm is from $10^{-15}$ to $10^{-17}$, and that of the fft is from $10^{-12}$ to $10^{-15}$.

1. Introduction

Polynomial evaluation is ubiquitous in computational sciences and their applications, such as interpolation, approximation and signal processing. This article investigates the following general polynomial evaluation problem:
$$\omega(z) = \sum_{n=0}^{N} a_n z^n,$$
where $z, a_0, a_1, \ldots, a_N \in \mathbb{C}$. Nested-type algorithms are usually used to evaluate polynomials. The Horner algorithm is the most widely used polynomial evaluation algorithm [1]. In special cases, such as $z \in \mathbb{C}$ and $a_0, a_1, \ldots, a_N \in \mathbb{R}$, the Goertzel algorithm, which can be applied to compute the discrete Fourier transform (DFT) of specific indices in a vector [2,3], is less expensive than the Horner algorithm. The numerical stability of the Horner and Goertzel algorithms was given by Wilkinson [4] and Smoktunowicz [5]. Due to the round-off errors in floating-point arithmetic, the computed results from these algorithms can be arbitrarily less accurate than the working precision $u$ when the polynomial is ill-conditioned. The relative accuracy of these algorithms satisfies the following a priori bound:
$$\frac{|\omega(z) - \widehat{\omega}(z)|}{|\omega(z)|} \leq \mathrm{cond}(\omega, z) \times O(u),$$
where $\widehat{\omega}(z)$ is the computed result and $\mathrm{cond}(\omega, z) = \sum_{n=0}^{N} |a_n||z|^n \big/ \big|\sum_{n=0}^{N} a_n z^n\big|$ is the condition number.
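As an illustration (not part of the original paper), the condition number above can be estimated with a few lines of Python; for severely ill-conditioned inputs the denominator should itself be computed in higher precision, so the sketch below is only indicative.

```python
def cond(a, z):
    """cond(w, z) = sum(|a_n| |z|^n) / |sum(a_n z^n)| for coefficients a[0..N]."""
    num = sum(abs(an) * abs(z) ** n for n, an in enumerate(a))
    den = abs(sum(an * z ** n for n, an in enumerate(a)))   # naive; use higher precision if needed
    return num / den
```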
In order to improve on the accuracy of double precision, Bailey [6] proposed a well-known library for double-double and quad-double arithmetic. However, this library needs to normalize floating-point numbers in every operation, which limits instruction-level parallelism [7,8]. Compensated algorithms, built on the development and application of error-free transformations [9], were proposed to overcome this problem. The relative accuracy of compensated algorithms satisfies the following a priori bound:
$$\frac{|\omega(z) - \overline{\omega}(z)|}{|\omega(z)|} \leq u + \mathrm{cond}(\omega, z) \times O(u^2),$$
where ω ¯ ( z ) is the computed result of a compensated algorithm.
Recently, compensated algorithms have been widely studied for evaluating polynomials. Graillat [10] proposed a compensated Horner algorithm that achieves full precision accuracy for polynomials that are not excessively ill-conditioned. He also extended the error-free transformations and the compensated Horner algorithm to complex floating-point arithmetic [11,12] and applied a compensated Horner algorithm to evaluate rational functions [13] and to compute all polynomial roots [14]. Polynomial series represented in other bases were also considered, such as the Chebyshev form evaluated by a compensated Clenshaw algorithm [15], and the Bernstein form evaluated by a compensated de Casteljau algorithm [16] and a compensated Volk and Schumaker (VS) algorithm [17]. Furthermore, the compensated idea has also been applied to matrix multiplication to obtain more accurate results [18,19,20].
With the wide application of floating-point numbers and floating-point operations in numerical computing, the analysis of rounding errors has become a focus [21,22]. Running round-off error analysis has been carried out and applied to many polynomial evaluation algorithms [23]. Delgado [17] proposed an adaptive evaluation algorithm using the de Casteljau and compensated VS algorithms with a dynamic error estimate. Jiang [24] presented a running round-off error analysis for evaluating elementary symmetric functions in real and complex floating-point arithmetic. Barrio [25] developed a more complete compensated algorithm library to evaluate orthogonal polynomial series with dynamic error estimates. In addition, error analysis can also be used in machine learning and the numerical solution of differential equations [26].
In this paper, our contributions are as follows:
  • We design a new compensated Goertzel algorithm and prove that our algorithm can almost yield full working precision when evaluating the polynomial in Equation (1);
  • We propose dynamic error estimates, which can offer a sharper bound for our approach without considerably increasing its computing complexity;
  • Numerical experiments show that our algorithm runs faster than the compensated Horner algorithm in some cases while producing results of similar accuracy;
  • An application is given to illustrate that our algorithm outperforms MATLAB’s fft when dealing with the DFT.
The rest of this paper is organized as follows. Section 2 introduces our compensated Goertzel algorithm. Section 3 presents the round-off error analysis, a dynamic error estimate and the complexity comparison. Section 4 analyzes the numerical experiment results and gives an application illustrating that our algorithm outperforms MATLAB's fft. Finally, the paper is summarized in Section 5.

2. Goertzel Compensated Algorithm

We assume working with the IEEE-754 floating-point standard [27] with rounding to nearest throughout this paper. Let $\mathbb{F}$ be the set of floating-point numbers, $\mathbb{C}$ the set of complex numbers, and $\circledast \in \{\oplus, \ominus, \otimes, \oslash\}$ a floating-point operation. This section presents how to design the compensated Goertzel algorithm. First, the Goertzel algorithm and its relationship with the Clenshaw algorithm are reviewed. Then, the error-free transformations and the sum of squares algorithm are recalled. Finally, we present the compensated Goertzel algorithm.

2.1. Goertzel Algorithm

Assuming $\lambda, z \in \mathbb{C}$ and $z = x + iy$, we have the quadratic polynomial
$$(\lambda - z)(\lambda - \bar{z}) = \lambda^2 - p\lambda + q,$$
where $p = 2x$ and $q = |z|^2$. By dividing the polynomial in Equation (1) by that in Equation (4), we obtain
$$\omega(\lambda) = b_0 + b_1\lambda + (\lambda - z)(\lambda - \bar{z})\sum_{n=2}^{N} b_n\lambda^{n-2},$$
where
$$a_0 = b_0 + q b_2, \qquad a_1 = b_1 - p b_2 + q b_3, \qquad a_n = b_n - p b_{n+1} + q b_{n+2}.$$
Thus, the evaluated result of the polynomial in Equation (1) is
$$\omega(z) = b_0 + b_1 z.$$
From Equations (5)–(7), we obtain the Goertzel algorithm [2], shown as Algorithm 1. Note that the last two steps of Algorithm 1 combine Equations (6) and (7): since $a_0 = b_0 + q b_2$ and $z = x + iy$, we have $\omega(z) = b_0 + b_1 z = (a_0 + x b_1 - q b_2) + i y b_1$, which is exactly what the algorithm returns.
Algorithm 1 Polynomial evaluation by the Goertzel algorithm
Function: $\omega(z) = \mathsf{Goertzel}((a_n)_{n=0}^{N}, z)$
Require: $z = x + iy \in \mathbb{C}$, $(a_n)_{n=0}^{N} \in \mathbb{C}$
Ensure: $\omega(z) = \sum_{n=0}^{N} a_n z^n$
    $p = 2x$,  $q = x^2 + y^2$
    $b_{N+1} = b_{N+2} = 0$
    for $n = N, N-1, \ldots, 1$
        $b_n = a_n + p\,b_{n+1} - q\,b_{n+2}$
    end
    $b_0 = a_0 + x\,b_1 - q\,b_2$
    $\omega(z) = b_0 + i\,y\,b_1$
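For readers who want to experiment, the following Python sketch transcribes Algorithm 1 directly (the function name and variable names are ours, and $z$ may be any Python complex number):

```python
def goertzel(a, z):
    """Evaluate sum(a[n] * z**n, n = 0..N) by the Goertzel recurrence (Algorithm 1)."""
    N = len(a) - 1
    x, y = z.real, z.imag
    p, q = 2.0 * x, x * x + y * y
    b1, b2 = 0j, 0j                      # b_{n+1}, b_{n+2}
    for n in range(N, 0, -1):
        b1, b2 = a[n] + p * b1 - q * b2, b1
    b0 = a[0] + x * b1 - q * b2          # last step uses x instead of p
    return b0 + 1j * y * b1
```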
In floating-point arithmetic, a backward error bound for the computed result of the polynomial evaluation by Algorithm 1 is presented by Smoktunowicz [5] as Theorem 1:
Theorem 1.
Assume $z, a_n \in \mathbb{F} + i\mathbb{F}$ for $n = 0, \ldots, N$. Let $10N^2 u \leq 0.1$. Then the Goertzel algorithm for evaluating the polynomial in Equation (1) is componentwise backward stable such that
$$\widehat{\omega}(z) = \sum_{n=0}^{N} a_n(1 + \Delta_n)z^n,$$
where
$$|\Delta_n| \leq 10N^2 u + O(u^2).$$
In fact, Algorithm 1 is a special case of the Clenshaw algorithm [2,5]. When we let $t = x/|z|$ and $B_n = b_n|z|^n$, Algorithm 1 can be represented in Clenshaw form [28]:
$$B_n = a_n|z|^n + 2t B_{n+1} - B_{n+2}.$$
According to the properties of the Chebyshev polynomial series [29] evaluated by the Clenshaw algorithm, we have
$$b_n = \sum_{k=n}^{N} a_k|z|^{k-n}U_{k-n}(t), \quad n = 1, 2, \ldots, N, \qquad b_0 = \sum_{k=0}^{N} a_k|z|^{k}T_k(t),$$
where $T_k(t)$ and $U_k(t)$ are Chebyshev polynomials of the first and second kinds, respectively. They satisfy
$$|T_k(t)| \leq 1, \qquad |U_k(t)| \leq k + 1,$$
and
$$\frac{z^k}{|z|^k} = T_k(t) + i\,\frac{y}{|z|}\,U_{k-1}(t) \qquad \mathrm{for}\ k = 0, 1, \ldots, N.$$

2.2. Error-Free Transformations and Sum of Squares Algorithm

The basic algorithms of error-free transformations are TwoSum and TwoProd , which were presented by Knuth [30] and Dekker [31], respectively. Graillat [11] extended the addition and multiplication to complex number cases, which are TwoSumCplx and TwoProdCplx , respectively. In this paper, we shall use a new product error-free transformation of one real and one complex floating-point number, which is called TwoProdRC in Algorithm 2.
Algorithm 2 Error-free transformation of the product of real and complex floating-point numbers
Function:  [ x , y ] = TwoProdRC ( a , b )
Require: $a \in \mathbb{F}$, $b = c + id \in \mathbb{F} + i\mathbb{F}$
Ensure: $x + y = a \times b$
    [ p , e ] = TwoProd ( a , c )
    [ f , g ] = TwoProd ( a , d )
    x = p + i f
    y = e + i g
The details of the error-free transformations above are presented in Table 1.
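As a concrete illustration, the error-free transformations used in this paper can be realized in Python as follows; TwoSum is Knuth's algorithm and TwoProd uses Dekker's method with Veltkamp splitting (on hardware with a fused multiply-add, TwoProd can instead be written with fma). The function names are ours.

```python
_SPLIT = 134217729.0                       # 2**27 + 1, Veltkamp splitting constant for binary64

def two_sum(a, b):                         # TwoSum: x + y = a + b exactly
    x = a + b
    t = x - a
    return x, (a - (x - t)) + (b - t)

def _split(a):
    c = _SPLIT * a
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):                        # TwoProd: x + y = a * b exactly (barring over/underflow)
    x = a * b
    ah, al = _split(a)
    bh, bl = _split(b)
    return x, ((ah * bh - x) + ah * bl + al * bh) + al * bl

def two_sum_cplx(a, b):                    # TwoSumCplx: componentwise TwoSum
    xr, yr = two_sum(a.real, b.real)
    xi, yi = two_sum(a.imag, b.imag)
    return complex(xr, xi), complex(yr, yi)

def two_prod_rc(a, b):                     # TwoProdRC (Algorithm 2): a real, b complex
    pr, er = two_prod(a, b.real)
    pi, ei = two_prod(a, b.imag)
    return complex(pr, pi), complex(er, ei)
```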
Furthermore, the sum of squares algorithm [32] is given in Algorithm 3. It requires 42 flops.
Algorithm 3 Sum of squares by two floating-point numbers
Function: $[x, y] = \mathsf{SumOfSquares}(a, b)$
Require: $a, b \in \mathbb{F}$
Ensure: $x + y \approx a^2 + b^2$
    $[p, f] = \mathsf{TwoProd}(a, a)$
    $[e, g] = \mathsf{TwoProd}(b, b)$
    $[x, h] = \mathsf{TwoSum}(p, e)$
    $y = f \oplus g \oplus h$
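A corresponding Python sketch of Algorithm 3, reusing the helpers above, is:

```python
def sum_of_squares(a, b):                  # SumOfSquares (Algorithm 3): x + y ~= a**2 + b**2
    p, f = two_prod(a, a)
    e, g = two_prod(b, b)
    x, h = two_sum(p, e)
    return x, f + g + h
```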

2.3. Compensated Goertzel Algorithm

Although Algorithm 1 is a special Clenshaw algorithm, computation via Equation (10) is more expensive. Thus, using the compensated Clenshaw algorithm [28] to express the compensated Goertzel algorithm is not a good idea. We design a compensated Goertzel algorithm by using SumOfSquares , TwoSumCplx and  TwoProdRC algorithms to record the round-off errors.
Assume $\hat{a} \in \mathbb{F}$ is a computed result in floating-point arithmetic, and denote its perturbation by $\epsilon_a$, such that
$$\hat{a} = a + \epsilon_a.$$
SumOfSquares finds the approximate round-off error $\epsilon_{\hat{q}}$ of $q$. The other round-off errors in Algorithm 1 can be computed exactly by the error-free transformations TwoSumCplx and TwoProdRC. In each loop iteration, we combine all these round-off errors into $\hat{w}_n$ and obtain the approximate perturbation $\epsilon_{\hat{b}_n}$ of $b_n$ for $n = N-1, N-2, \ldots, 1$. After the loop, TwoProdRC and TwoSumCplx are used to calculate $\hat{b}_0$, and all round-off errors are again combined to obtain its approximate perturbation. The round-off error of $y \otimes \hat{b}_1$ is also captured by TwoProdRC. The compensated Goertzel algorithm is presented in Algorithm 4, and Figure 1 shows the flow chart of this algorithm.
Algorithm 4 Polynomial evaluation by the compensated Goertzel algorithm
Function: $\overline{\omega}(z) = \mathsf{CompGoertzel}((a_n)_{n=0}^{N}, z)$
Require: $z = x + iy \in \mathbb{F} + i\mathbb{F}$, $(a_n)_{n=0}^{N} \in \mathbb{F} + i\mathbb{F}$
Ensure: $\overline{\omega}(z) \approx \sum_{n=0}^{N} a_n z^n$
    $p = 2 \otimes x$
    $[\hat{q}, \epsilon_{\hat{q}}] = \mathsf{SumOfSquares}(x, y)$
    $\hat{b}_N = a_N$,  $\hat{b}_{N+1} = \epsilon_{\hat{b}_N} = \epsilon_{\hat{b}_{N+1}} = 0$
    for $n = N-1, N-2, \ldots, 1$
        $[r_n, \pi_n] = \mathsf{TwoProdRC}(p, \hat{b}_{n+1})$
        $[s_n, \sigma_n] = \mathsf{TwoProdRC}(-\hat{q}, \hat{b}_{n+2})$
        $[t_n, \eta_n] = \mathsf{TwoSumCplx}(r_n, s_n)$
        $[\hat{b}_n, \xi_n] = \mathsf{TwoSumCplx}(t_n, a_n)$
        $\hat{w}_n = \pi_n \oplus \sigma_n \oplus \eta_n \oplus \xi_n \ominus \epsilon_{\hat{q}} \otimes \hat{b}_{n+2}$
        $\epsilon_{\hat{b}_n} = \hat{w}_n \oplus p \otimes \epsilon_{\hat{b}_{n+1}} \ominus \hat{q} \otimes \epsilon_{\hat{b}_{n+2}}$
    end
    $[r_0, \pi_0] = \mathsf{TwoProdRC}(x, \hat{b}_1)$
    $[s_0, \sigma_0] = \mathsf{TwoProdRC}(-\hat{q}, \hat{b}_2)$
    $[t_0, \eta_0] = \mathsf{TwoSumCplx}(r_0, s_0)$
    $[\hat{b}_0, \xi_0] = \mathsf{TwoSumCplx}(t_0, a_0)$
    $\hat{w}_0 = \pi_0 \oplus \sigma_0 \oplus \eta_0 \oplus \xi_0 \ominus \epsilon_{\hat{q}} \otimes \hat{b}_2$
    $\epsilon_{\hat{b}_0} = \hat{w}_0 \oplus x \otimes \epsilon_{\hat{b}_1} \ominus \hat{q} \otimes \epsilon_{\hat{b}_2}$
    $[\phi, \psi] = \mathsf{TwoProdRC}(y, \hat{b}_1)$
    $\hat{e} = \epsilon_{\hat{b}_0} \oplus i\,(\epsilon_{\hat{b}_1} \otimes y \oplus \psi)$
    $\widehat{\omega}(z) = \hat{b}_0 \oplus i\phi$
    $\overline{\omega}(z) = \widehat{\omega}(z) \oplus \hat{e}$
We remark that if $(a_n)_{n=0}^{N} \in \mathbb{F}$, then we shall replace TwoSumCplx and TwoProdRC with TwoSum and TwoProd in Algorithm 4, respectively.
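The following Python sketch follows the structure of Algorithm 4, using the helpers introduced above. It assumes $N \geq 1$ and the sign conventions reconstructed in the listing (the subtraction of the $\hat{q}$ term is realized by passing $-\hat{q}$ to TwoProdRC), so it should be read as an illustration rather than a verbatim reference implementation.

```python
def comp_goertzel(a, z):
    """Compensated Goertzel evaluation of sum(a[n] * z**n) (sketch of Algorithm 4)."""
    N = len(a) - 1                                   # assumes N >= 1
    x, y = z.real, z.imag
    p = 2.0 * x
    q, eps_q = sum_of_squares(x, y)                  # q + eps_q ~= x**2 + y**2
    b1, b2 = complex(a[N]), 0j                       # hat{b}_{n+1}, hat{b}_{n+2}
    c1, c2 = 0j, 0j                                  # approximate perturbations of b_{n+1}, b_{n+2}
    for n in range(N - 1, 0, -1):
        r, pi_n = two_prod_rc(p, b1)
        s, sg_n = two_prod_rc(-q, b2)
        t, eta_n = two_sum_cplx(r, s)
        bn, xi_n = two_sum_cplx(t, a[n])
        w = pi_n + sg_n + eta_n + xi_n - eps_q * b2  # combined local rounding error
        cn = w + p * c1 - q * c2
        b1, b2, c1, c2 = bn, b1, cn, c1
    r, pi_0 = two_prod_rc(x, b1)                     # final step uses x instead of p
    s, sg_0 = two_prod_rc(-q, b2)
    t, eta_0 = two_sum_cplx(r, s)
    b0, xi_0 = two_sum_cplx(t, a[0])
    w0 = pi_0 + sg_0 + eta_0 + xi_0 - eps_q * b2
    c0 = w0 + x * c1 - q * c2
    phi, psi = two_prod_rc(y, b1)
    e_hat = c0 + 1j * (c1 * y + psi)                 # approximate global error
    return (b0 + 1j * phi) + e_hat
```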

3. Round-Off Error and Complexity Analysis

In this section, we consider the error bound and complexity of Algorithm 4 through Higham’s theories [33]. In our analysis, we assume that there is no computational overflow or underflow. First, we present the priori bound by forward round-off error analysis. Then, we show a dynamic error estimate by running round-off error analysis. At last, we compare the complexities of Horner, Goertzel, their compensated algorithms and the compensated Goertzel algorithm with a dynamic error estimate to evaluate the polynomial in Equation (1) in real and complex coefficients.

3.1. Forward Round-Off Error Analysis

Let $\circledast \in \{\oplus, \ominus, \otimes, \oslash\}$ denote the floating-point counterpart of $\circ \in \{+, -, \times, \div\}$, and let $a, b \in \mathbb{F}$. Then, a floating-point computation obeys the model
$$a \circledast b = (a \circ b)(1 + \varepsilon_1) = \frac{a \circ b}{1 + \varepsilon_2},$$
where $|\varepsilon_1|, |\varepsilon_2| \leq u$. We define
$$\langle N\rangle := 1 + \theta_N = \prod_{n=1}^{N}(1 + \varepsilon_n)^{\rho_n},$$
where $|\varepsilon_n| \leq u$, $\rho_n = \pm 1$ and
$$|\theta_n| \leq \gamma_n := \frac{nu}{1 - nu},$$
for $n = 1, 2, \ldots, N$ and $Nu < 1$. We also write $\tilde{c} = \hat{a} \circ \hat{b}$ for the result of the operation applied to $\hat{a}$ and $\hat{b}$ in exact (real) arithmetic, which will be used in our later analysis.
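For later reference, the constants used in the bounds can be written down directly (a small helper we add for illustration; $u = 2^{-53}$ is the unit roundoff of IEEE-754 binary64):

```python
u = 2.0 ** -53                 # unit roundoff in double precision

def gamma(n):
    """gamma_n = n*u / (1 - n*u); requires n*u < 1."""
    return n * u / (1.0 - n * u)
```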
Lemma 1 summarizes the properties of the error-free transformations in Table 1:
Lemma 1.
For $a, b, x, y \in \mathbb{F}$, $[x, y] = \mathsf{TwoSum}(a, b)$ verifies
$$|y| \leq u|x|, \qquad |y| \leq u|a + b|.$$
In addition, $[x, y] = \mathsf{TwoProd}(a, b)$ verifies
$$|y| \leq u|x|, \qquad |y| \leq u|a \times b|.$$
For $a, b, x, y \in \mathbb{F} + i\mathbb{F}$, $[x, y] = \mathsf{TwoSumCplx}(a, b)$ verifies
$$|y| \leq u|x|, \qquad |y| \leq u|a + b|.$$
Additionally, for $a \in \mathbb{F}$ and $b, x, y \in \mathbb{F} + i\mathbb{F}$, $[x, y] = \mathsf{TwoProdRC}(a, b)$ verifies
$$|y| \leq u|x|, \qquad |y| \leq u|a \times b|.$$
For $a, b, p, e, f, g \in \mathbb{F} + i\mathbb{F}$, $[p, e, f, g] = \mathsf{TwoProdCplx}(a, b)$ verifies
$$|e + f + g| \leq 2\gamma_2|a \times b|.$$
Proof. 
Equations (18)–(20) and (22) are presented in [9,11]. In Algorithm 2, it is easy to obtain Equation (21) from Equation (19).    □
Lemma 2 shows the property of Algorithm 3:
Lemma 2.
For $a, b, x, y \in \mathbb{F}$, $[x, y] = \mathsf{SumOfSquares}(a, b)$ verifies
$$|x| \leq (1 + \gamma_2)|a^2 + b^2|, \qquad |y| \leq \gamma_2|x|, \qquad |x + y - (a^2 + b^2)| \leq u\gamma_3|a^2 + b^2|.$$
Proof. 
According to Algorithm 3 and Table 1, $x = a \otimes a \oplus b \otimes b$. With Lemma 1, we obtain $|f| \leq u|p|$, $|g| \leq u|e|$ and $|h| \leq u|x|$. From Equations (15)–(17), we have
$$|x| = |a^2\langle 2\rangle + b^2\langle 2\rangle| \leq (1 + \gamma_2)|a^2 + b^2|, \qquad |y| = |f\langle 2\rangle + g\langle 2\rangle + h\langle 1\rangle| \leq (1 + \gamma_2)(|f| + |g| + |h|) \leq u(1 + \gamma_2)(|p| + |e| + |x|) \leq 2u(1 + \gamma_2)|x| = \gamma_2|x|.$$
From Theorem 4.2 in [32], $|x + y - (a^2 + b^2)| \leq \frac{3u^2}{1 - 3u^2}|a^2 + b^2| \leq u\gamma_3|a^2 + b^2|$.    □
Theorem 2 presents a priori error bound of Algorithm 4 to evaluate the polynomial in Equation (1) in complex floating-point arithmetic.
Theorem 2.
Assume $z, a_n \in \mathbb{F} + i\mathbb{F}$ for $n = 0, \ldots, N$. Then, the relative forward round-off error bound of the compensated Goertzel algorithm for evaluating $\omega(z) = \sum_{n=0}^{N} a_n z^n$ in floating-point arithmetic satisfies
$$\frac{|\mathsf{CompGoertzel}((a_n)_{n=0}^{N}, z) - \omega(z)|}{|\omega(z)|} \leq u + 3N^2\gamma_{15}\gamma_{3N+1}\,\mathrm{cond}(\omega, z).$$
Proof. 
Assume the error of $\widehat{\omega}(z)$ is $e$ (i.e., $\widehat{\omega}(z) + e = \omega(z)$). Then, we have
$$|\overline{\omega}(z) - \omega(z)| = |(\widehat{\omega}(z) \oplus \hat{e}) - \omega(z)| = |(1 + \varepsilon)(\widehat{\omega}(z) + \hat{e}) - \omega(z)| = |(1 + \varepsilon)(\omega(z) - e + \hat{e}) - \omega(z)| \leq u|\omega(z)| + (1 + u)|e - \hat{e}|,$$
and
$$e = \epsilon_{b_0} + i(\epsilon_{b_1} y + \psi),$$
where $\epsilon_{b_0}$ and $\epsilon_{b_1}$ are the errors of $b_0$ and $b_1$, respectively, such that
$$\epsilon_{b_n} = w_n + p\,\epsilon_{b_{n+1}} - q\,\epsilon_{b_{n+2}}, \qquad \epsilon_{b_0} = w_0 + x\,\epsilon_{b_1} - q\,\epsilon_{b_2},$$
where $w_n = \pi_n + \sigma_n + \eta_n + \xi_n - \epsilon_q b_{n+2}$ for $n = N-1, N-2, \ldots, 0$. Similarly, let
$$\tilde{e} = \epsilon_{\tilde{b}_0} + i(\epsilon_{\tilde{b}_1} y + \psi), \qquad \hat{e} = \epsilon_{\hat{b}_0} \oplus i\,(\epsilon_{\hat{b}_1} \otimes y \oplus \psi),$$
where
$$\epsilon_{\tilde{b}_n} = \hat{w}_n + p\,\epsilon_{\tilde{b}_{n+1}} - \hat{q}\,\epsilon_{\tilde{b}_{n+2}}, \qquad \epsilon_{\tilde{b}_0} = \hat{w}_0 + x\,\epsilon_{\tilde{b}_1} - \hat{q}\,\epsilon_{\tilde{b}_2},$$
and
$$\epsilon_{\hat{b}_n} = \hat{w}_n \oplus p \otimes \epsilon_{\hat{b}_{n+1}} \ominus \hat{q} \otimes \epsilon_{\hat{b}_{n+2}}, \qquad \epsilon_{\hat{b}_0} = \hat{w}_0 \oplus x \otimes \epsilon_{\hat{b}_1} \ominus \hat{q} \otimes \epsilon_{\hat{b}_2},$$
where $\hat{w}_n = \pi_n \oplus \sigma_n \oplus \eta_n \oplus \xi_n \ominus \epsilon_{\hat{q}} \otimes \hat{b}_{n+2}$ for $n = N-1, N-2, \ldots, 0$. Then, Equation (26) can be simplified as
$$|\overline{\omega}(z) - \omega(z)| \leq u|\omega(z)| + (1 + u)(|e - \tilde{e}| + |\tilde{e} - \hat{e}|).$$
First, we consider the bound of $|e - \tilde{e}|$. From Equations (11), (28) and (30), we have
$$\epsilon_{b_1} = \sum_{n=1}^{N} w_n|z|^{n-1}U_{n-1}(t), \quad \epsilon_{b_0} = \sum_{n=0}^{N} w_n|z|^{n}T_n(t), \quad \epsilon_{\tilde{b}_1} = \sum_{n=1}^{N} \hat{w}_n|z|^{n-1}U_{n-1}(t), \quad \epsilon_{\tilde{b}_0} = \sum_{n=0}^{N} \hat{w}_n|z|^{n}T_n(t).$$
Then, by Equations (13), (27) and (29), we deduce
$$|e - \tilde{e}| = |(\epsilon_{b_0} - \epsilon_{\tilde{b}_0}) + i(\epsilon_{b_1} - \epsilon_{\tilde{b}_1})y| = \Big|\sum_{n=0}^{N-1}(w_n - \hat{w}_n)z^n\Big| \leq \sum_{n=0}^{N-1}|w_n - \hat{w}_n||z|^n.$$
In Algorithm 4, according to Lemmas 1 and 2, we obtain
$$\begin{aligned}
|\pi_n| &\leq u|p \otimes \hat{b}_{n+1}| \leq 2u|z||\hat{b}_{n+1}|, \\
|\sigma_n| &\leq u|\hat{q} \otimes \hat{b}_{n+2}| \leq u(1+\gamma_2)|z|^2|\hat{b}_{n+2}|, \\
|\eta_n| &\leq u|p \otimes \hat{b}_{n+1} \ominus \hat{q} \otimes \hat{b}_{n+2}| \leq u(1+\gamma_2)(2|z||\hat{b}_{n+1}| + |z|^2|\hat{b}_{n+2}|), \\
|\xi_n| &\leq u|a_n \oplus p \otimes \hat{b}_{n+1} \ominus \hat{q} \otimes \hat{b}_{n+2}| \leq u(1+\gamma_2)(|a_n| + 2|z||\hat{b}_{n+1}| + |z|^2|\hat{b}_{n+2}|), \\
|\epsilon_{\hat{q}}| &\leq \gamma_2|\hat{q}| \leq \gamma_2(1+\gamma_2)|z|^2,
\end{aligned}$$
for $n = 0, 1, \ldots, N$, and then
$$|\pi_n| + |\sigma_n| + |\eta_n| + |\xi_n| + |\epsilon_{\hat{q}}||\hat{b}_{n+2}| \leq (3u + \gamma_2)(1 + \gamma_2)(|a_n| + 2|z||\hat{b}_{n+1}| + |z|^2|\hat{b}_{n+2}|).$$
In Algorithm 4, from Equations (15)–(17), we have
$$\begin{aligned}
\hat{b}_{N-1} &= p a_N\langle 2\rangle + a_{N-1}\langle 1\rangle, \\
\hat{b}_{N-2} &= p\hat{b}_{N-1}\langle 3\rangle - \hat{q} a_N\langle 5\rangle + a_{N-2}\langle 1\rangle = a_N(p^2 - \hat{q})\langle 5\rangle + p a_{N-1}\langle 3\rangle + a_{N-2}\langle 1\rangle, \\
&\;\;\vdots \\
\hat{b}_1 &= p\hat{b}_2\langle 3\rangle - \hat{q}\hat{b}_3\langle 5\rangle + a_1\langle 1\rangle = a_N(p^{N-1} + \cdots)\langle 3N-4\rangle + a_{N-1}(p^{N-2} + \cdots)\langle 3N-5\rangle + \cdots, \\
\hat{b}_0 &= x\hat{b}_1\langle 3\rangle - \hat{q}\hat{b}_2\langle 5\rangle + a_0\langle 1\rangle = a_N(p^{N} + \cdots)\langle 3N-1\rangle + a_{N-1}(p^{N-1} + \cdots)\langle 3N-2\rangle + \cdots.
\end{aligned}$$
Then, by induction, we get
$$|b_n - \hat{b}_n| \leq \gamma_{3(N-n)-1}|b_n|,$$
and
$$|\hat{b}_n| \leq (1 + \gamma_{3(N-n)-1})|b_n|.$$
Assume that
$$g_k = \sum_{n=k}^{N}|a_n||z|^{n-k}, \qquad k = 0, 1, \ldots, N.$$
Then
$$\sum_{n=0}^{N} g_n|z|^n \leq (N+1)g_0.$$
By combining Equations (11), (12) and (40), we have
$$|b_n| \leq (N - n + 1)g_n, \quad n = 1, \ldots, N, \qquad |b_0| \leq g_0.$$
It is easy to obtain that
$$\hat{w}_n = \pi_n\langle 4\rangle + \sigma_n\langle 4\rangle + \eta_n\langle 3\rangle + \xi_n\langle 2\rangle - \epsilon_{\hat{q}}\hat{b}_{n+2}\langle 2\rangle.$$
Through Lemma 2 and Equations (36)–(42), we deduce
$$\begin{aligned}
|w_n - \hat{w}_n| &\leq \gamma_4(|\pi_n| + |\sigma_n| + |\eta_n| + |\xi_n| + |\epsilon_{\hat{q}}||\hat{b}_{n+2}|) + |\epsilon_q b_{n+2} - \epsilon_{\hat{q}}\hat{b}_{n+2}| \\
&\leq \gamma_4(3u + \gamma_2)(1 + \gamma_2)(|a_n| + 2|z||\hat{b}_{n+1}| + |z|^2|\hat{b}_{n+2}|) + |\epsilon_q||b_{n+2} - \hat{b}_{n+2}| + |\epsilon_q - \epsilon_{\hat{q}}||\hat{b}_{n+2}| \\
&\leq \gamma_4\gamma_5(|a_n| + 2|z||\hat{b}_{n+1}| + |z|^2|\hat{b}_{n+2}|) + \gamma_2\gamma_{3(N-n)-7}|z|^2|b_{n+2}| + u\gamma_3|z|^2|\hat{b}_{n+2}| \\
&\leq \big[\gamma_5^2(1 + \gamma_{3(N-n)-4}) + \gamma_2\gamma_{3(N-n)-7}\big](|a_n| + 2|z||b_{n+1}| + |z|^2|b_{n+2}|) \\
&\leq \gamma_5\gamma_{3(N-n)+1}(|a_n| + 2|z||b_{n+1}| + |z|^2|b_{n+2}|) \\
&\leq \gamma_5\gamma_{3(N-n)+1}(|a_n| + 2(N-n)|z|g_{n+1} + (N-n-1)|z|^2 g_{n+2}) \\
&\leq 3\gamma_5\gamma_{3(N-n)+1}(N-n)g_n.
\end{aligned}$$
Thus, from Equations (34), (41) and (43), we obtain
$$|e - \tilde{e}| \leq \sum_{n=0}^{N-1}|w_n - \hat{w}_n||z|^n \leq 3N^2\gamma_5\gamma_{3N+1}g_0.$$
Second, we consider the bound of $|\tilde{e} - \hat{e}|$. In Algorithm 4, according to Equations (15)–(17), we have
$$\begin{aligned}
\epsilon_{\hat{b}_{N-1}} &= \hat{w}_{N-1}, \\
\epsilon_{\hat{b}_{N-2}} &= p\hat{w}_{N-1}\langle 2\rangle + \hat{w}_{N-2}\langle 1\rangle, \\
\epsilon_{\hat{b}_{N-3}} &= p\epsilon_{\hat{b}_{N-2}}\langle 3\rangle - \hat{q}\hat{w}_{N-1}\langle 3\rangle + \hat{w}_{N-3}\langle 1\rangle = \hat{w}_{N-1}(p^2 - \hat{q})\langle 5\rangle + p\hat{w}_{N-2}\langle 3\rangle + \hat{w}_{N-3}\langle 1\rangle, \\
&\;\;\vdots \\
\epsilon_{\hat{b}_1} &= p\epsilon_{\hat{b}_2}\langle 3\rangle - \hat{q}\epsilon_{\hat{b}_3}\langle 3\rangle + \hat{w}_1\langle 1\rangle = \hat{w}_{N-1}(p^{N-2} + \cdots)\langle 3N-7\rangle + \hat{w}_{N-2}(p^{N-3} + \cdots)\langle 3N-8\rangle + \cdots, \\
\epsilon_{\hat{b}_0} &= x\epsilon_{\hat{b}_1}\langle 3\rangle - \hat{q}\epsilon_{\hat{b}_2}\langle 3\rangle + \hat{w}_0\langle 1\rangle = \hat{w}_{N-1}(p^{N-1} + \cdots)\langle 3N-4\rangle + \hat{w}_{N-2}(p^{N-2} + \cdots)\langle 3N-5\rangle + \cdots.
\end{aligned}$$
Then, by induction, from Equations (13) and (33) as well as Lemma 1, we deduce
$$|(\epsilon_{\tilde{b}_0} - \epsilon_{\hat{b}_0}) + i(\epsilon_{\tilde{b}_1} - \epsilon_{\hat{b}_1})y| \leq \gamma_{3N-4}\Big|\sum_{n=0}^{N-1}\hat{w}_n z^n\Big| \leq \gamma_{3N-4}\sum_{n=0}^{N-1}|\hat{w}_n||z|^n,$$
and
$$|\epsilon_{\hat{b}_0} + i(\epsilon_{\hat{b}_1}y + \psi)| \leq |\epsilon_{\hat{b}_0} + i\epsilon_{\hat{b}_1}y| + |\psi| \leq (1 + \gamma_{3N-4})\Big|\sum_{n=0}^{N-1}\hat{w}_n z^n\Big| + u|y\,\hat{b}_1| \leq (1 + \gamma_{3N-4})\sum_{n=0}^{N-1}|\hat{w}_n||z|^n + u|z||\hat{b}_1|.$$
Thus, through Equations (29), (30), (46) and (47), we obtain
$$|\tilde{e} - \hat{e}| = \big|\epsilon_{\tilde{b}_0} + i(\epsilon_{\tilde{b}_1}y + \psi) - \epsilon_{\hat{b}_0}\langle 2\rangle - i(\epsilon_{\hat{b}_1}y + \psi)\langle 3\rangle\big| \leq |(\epsilon_{\tilde{b}_0} - \epsilon_{\hat{b}_0}) + i(\epsilon_{\tilde{b}_1} - \epsilon_{\hat{b}_1})y| + \gamma_3|\epsilon_{\hat{b}_0} + i(\epsilon_{\hat{b}_1}y + \psi)| \leq \gamma_{3N-1}\sum_{n=0}^{N-1}|\hat{w}_n||z|^n + u\gamma_3|z||\hat{b}_1|.$$
According to Equations (36), (39) and (42), we get
$$\begin{aligned}
|\hat{w}_n| &\leq (1 + \gamma_5)(|\pi_n| + |\sigma_n| + |\eta_n| + |\xi_n| + |\epsilon_{\hat{q}}||\hat{b}_{n+2}|) \leq \gamma_9(|a_n| + 2|z||\hat{b}_{n+1}| + |z|^2|\hat{b}_{n+2}|) \\
&\leq \gamma_9(1 + \gamma_{3(N-n)-4})(|a_n| + 2|z||b_{n+1}| + |z|^2|b_{n+2}|) \\
&\leq \gamma_9(1 + \gamma_{3(N-n)-4})(|a_n| + 2(N-n)|z|g_{n+1} + (N-n-1)|z|^2 g_{n+2}) \\
&\leq 3\gamma_9(1 + \gamma_{3(N-n)-4})(N-n)g_n.
\end{aligned}$$
Then, by Equations (39), (41), (48) and (49), we obtain
$$|\tilde{e} - \hat{e}| \leq 3\gamma_9\gamma_{3N-1}\sum_{n=0}^{N-1}(1 + \gamma_{3(N-n)-4})(N-n)g_n + u\gamma_3(1 + \gamma_{3N-4})N|z|g_1 \leq (1 + \gamma_{3N-4})N(3N\gamma_9\gamma_{3N-1} + u\gamma_3)g_0 \leq 3N^2\gamma_{10}\gamma_{3N-1}g_0.$$
Hence, when combining Equations (32), (44) and (50), we have
$$|\overline{\omega}(z) - \omega(z)| \leq u|\omega(z)| + (1 + u)\,3N^2(\gamma_5\gamma_{3N+1} + \gamma_{10}\gamma_{3N-1})g_0 \leq u|\omega(z)| + 3N^2\gamma_{15}\gamma_{3N+1}g_0.$$
Thus, with $\mathrm{cond}(\omega, z) = g_0/|\omega(z)|$, we deduce Equation (25).    □

3.2. Running Round-Off Error Analysis

Theorem 3 gives a running error bound of Algorithm 4 to evaluate the polynomial in Equation (1) in complex floating-point arithmetic:
Theorem 3.
Assume $z, a_n \in \mathbb{F} + i\mathbb{F}$ for $n = 0, \ldots, N$. Then, the running round-off error bound of the compensated Goertzel algorithm for evaluating $\omega(z) = \sum_{n=0}^{N} a_n z^n$ in floating-point arithmetic satisfies
$$|\overline{\omega}(z) - \omega(z)| \leq (|c| \oplus \hat{\alpha}) \oslash (1 - 2u),$$
where
$$\begin{aligned}
c &= \widehat{\omega}(z) \oplus \hat{e} \ominus \overline{\omega}(z), \qquad \hat{\alpha} = (\hat{\gamma}_{3N+1} \otimes \hat{E}) \oslash (1 - 6(N-1)u), \qquad \hat{E} = E_{\hat{b}_0} \oplus i\,E_{\hat{b}_1} \otimes |y|, \\
E_{\hat{b}_n} &= |\hat{w}_n| \oplus |p| \otimes E_{\hat{b}_{n+1}} \oplus |\hat{q}| \otimes E_{\hat{b}_{n+2}}, \qquad E_{\hat{b}_0} = |\hat{w}_0| \oplus |x| \otimes E_{\hat{b}_1} \oplus |\hat{q}| \otimes E_{\hat{b}_2}, \qquad n = N-1, N-2, \ldots, 1.
\end{aligned}$$
Proof. 
Assume $e$ is the error of $\widehat{\omega}(z)$, i.e., $\widehat{\omega}(z) + e = \omega(z)$. Let $\overline{\omega}(z) + c = \widehat{\omega}(z) + \hat{e}$. Then, we have
$$|\overline{\omega}(z) - \omega(z)| = |\overline{\omega}(z) - (\widehat{\omega}(z) + e)| = |\overline{\omega}(z) - (\widehat{\omega}(z) + \hat{e} - \hat{e} + e)| \leq |c| + |e - \hat{e}|.$$
Considering the round-off error $\epsilon_q$ in Algorithm 4, from Lemma 2, we have $\epsilon_q = \epsilon_{\hat{q}}\langle 2\rangle$, and thus
$$\begin{aligned}
\hat{w}_{N-1} &= \pi_{N-1}\langle 3\rangle + \sigma_{N-1}\langle 3\rangle + \eta_{N-1}\langle 2\rangle + \xi_{N-1}\langle 1\rangle, \\
\hat{w}_{N-2} &= \pi_{N-2}\langle 4\rangle + \sigma_{N-2}\langle 4\rangle + \eta_{N-2}\langle 3\rangle + \xi_{N-2}\langle 2\rangle - \epsilon_q\hat{b}_N\langle 4\rangle, \\
&\;\;\vdots \\
\hat{w}_0 &= \pi_0\langle 4\rangle + \sigma_0\langle 4\rangle + \eta_0\langle 3\rangle + \xi_0\langle 2\rangle - \epsilon_q\hat{b}_2\langle 4\rangle.
\end{aligned}$$
Combined with Equation (45), we can deduce that
$$|\epsilon_{b_0} - \epsilon_{\hat{b}_0}| = |\epsilon_{b_0}\langle 3N-1\rangle - \epsilon_{b_0}| \leq \gamma_{3N-1}E_{b_0}, \qquad |\epsilon_{b_1} - \epsilon_{\hat{b}_1}| = |\epsilon_{b_1}\langle 3N-4\rangle - \epsilon_{b_1}| \leq \gamma_{3N-4}E_{b_1},$$
and
$$|\epsilon_{\hat{b}_0}| = |\langle 3N-1\rangle\,\epsilon_{b_0}| \leq (1 + \gamma_{3N-1})E_{b_0}, \qquad |\epsilon_{\hat{b}_1}| = |\langle 3N-4\rangle\,\epsilon_{b_1}| \leq (1 + \gamma_{3N-4})E_{b_1},$$
where $E_{b_n}$ can be derived from
$$\begin{aligned}
E_{b_n} &= |w_n| + |p|E_{b_{n+1}} + |\hat{q}|E_{b_{n+2}}, \qquad E_{b_0} = |w_0| + |x|E_{b_1} + |\hat{q}|E_{b_2}, \\
w_n &= \pi_n + \sigma_n + \eta_n + \xi_n - \epsilon_q\hat{b}_{n+2}, \qquad w_0 = \pi_0 + \sigma_0 + \eta_0 + \xi_0 - \epsilon_q\hat{b}_2, \qquad n = N-1, N-2, \ldots, 1.
\end{aligned}$$
Furthermore, through Equation (15), we have
$$\begin{aligned}
|\hat{w}_n| &= \Big|\Big\{\Big[(\pi_n + \sigma_n)\tfrac{1}{1+\varepsilon_1} + \eta_n\Big]\tfrac{1}{1+\varepsilon_2} + \xi_n\Big\}\tfrac{1}{1+\varepsilon_3} - \epsilon_q\tfrac{1}{1+\varepsilon_4}\tfrac{1}{1+\varepsilon_5}\,\hat{b}_{n+2}\tfrac{1}{1+\varepsilon_6}\Big|\,\tfrac{1}{1+\varepsilon_7}, \\
E_{\hat{b}_n} &= \Big[\Big(|\hat{w}_n| + |p|E_{\hat{b}_{n+1}}\tfrac{1}{1+\varepsilon_8}\Big)\tfrac{1}{1+\varepsilon_9} + |\hat{q}|E_{\hat{b}_{n+2}}\tfrac{1}{1+\varepsilon_{10}}\Big]\tfrac{1}{1+\varepsilon_{11}},
\end{aligned}$$
where $|\varepsilon_k| \leq u$ for $k = 1, 2, \ldots, 11$. Then, we obtain
$$|w_n| + |p|E_{\hat{b}_{n+1}} + |\hat{q}|E_{\hat{b}_{n+2}} \leq (1 + u)^6 E_{\hat{b}_n}.$$
By induction, we get
$$E_{b_0} \leq (1 + u)^{6(N-1)}E_{\hat{b}_0}.$$
Assuming that $E = E_{b_0} + i\,E_{b_1}|y|$, we have
$$E \leq (1 + u)^{6(N-1)}(E_{\hat{b}_0} + i\,E_{\hat{b}_1}|y|) \leq (1 + u)^{6N-8}\hat{E}.$$
From Equations (15)–(17) and (29), we get $\hat{e} = \epsilon_{\hat{b}_0}\langle 2\rangle + i\psi\langle 2\rangle + i\epsilon_{\hat{b}_1}y\langle 2\rangle = \langle 2\rangle\,\tilde{e}$. Then, by Equations (56), (57) and (62), due to $\gamma_k \leq (1+u)\hat{\gamma}_k$ and $(1+u)^n\hat{E} \leq \hat{E} \oslash (1 - (n+1)u)$, we deduce
$$\begin{aligned}
|e - \hat{e}| &\leq |e - \tilde{e}| + \gamma_2|\tilde{e}| = \big||\epsilon_{b_0} - \epsilon_{\hat{b}_0}| + i\,|\epsilon_{b_1} - \epsilon_{\hat{b}_1}|\,|y|\big| + \gamma_2|\epsilon_{\hat{b}_0} + i\epsilon_{\hat{b}_1}y| \\
&\leq \gamma_{3N-1}(E_{b_0} + i\,E_{b_1}|y|) + \gamma_2(1 + \gamma_{3N-1})(E_{b_0} + i\,E_{b_1}|y|) \leq \gamma_{3N+1}E \\
&\leq \gamma_{3N+1}(1+u)^{6N-8}\hat{E} \leq \hat{\gamma}_{3N+1}\hat{E} \oslash (1 - 6(N-1)u) := \hat{\alpha}.
\end{aligned}$$
Thus, by combining Equations (54) and (63), we obtain
$$|\overline{\omega}(z) - \omega(z)| \leq |c| + |e - \hat{e}| \leq (|c| \oplus \hat{\alpha}) \oslash (1 - 2u).$$
   □
Based on Theorem 3, we can extend Algorithm 4 with a dynamic error estimate, obtaining Algorithm 5, whose flowchart is shown in Figure 2. In Algorithm 5, the quantities $E_{\hat{b}_n}$ accumulate, in each iteration, bounds on all the rounding errors committed so far, so that a validated error bound $\mu$ for the compensated result is obtained together with $\overline{\omega}(z)$ at little extra cost.

3.3. Computational Complexity

The Horner algorithm and its compensated version CompHorner [12] are recalled in Algorithms 6 and 7. We use TwoSum and TwoProd instead of TwoSumCplx and TwoProdCplx when the inputs of Algorithm 7 are real floating-point numbers. A comparison of the computational costs of Algorithms 1 and 4–7 is shown in Table 2. As we can see, each of the compensated algorithms (i.e., CompHorner or CompGoertzel) can be less expensive than the other in different cases. For example, CompGoertzel is about half as expensive as CompHorner when $z \in \mathbb{C}$, $z \neq \pm 1$ and $a_n \in \mathbb{R}$. For $z \in \mathbb{C}$, $|z| = 1$ and $z \neq \pm 1$, the cost of CompGoertzel is also less than that of CompHorner. Even in these cases, CompGoertzelwErr costs less than CompHorner. However, for $z \in \mathbb{R}$, CompGoertzel is about twice as expensive as CompHorner regardless of $a_n$.
Algorithm 5 Polynomial evaluation by compensated Goertzel algorithm with a dynamic error estimate
Function: $[\overline{\omega}(z), \mu] = \mathsf{CompGoertzelwErr}((a_n)_{n=0}^{N}, z)$
Require: $z = x + iy \in \mathbb{F} + i\mathbb{F}$, $(a_n)_{n=0}^{N} \in \mathbb{F} + i\mathbb{F}$
Ensure: $\overline{\omega}(z) \approx \sum_{n=0}^{N} a_n z^n$, $|\omega(z) - \overline{\omega}(z)| \leq \mu$
    $p = 2 \otimes x$
    $[\hat{q}, \epsilon_{\hat{q}}] = \mathsf{SumOfSquares}(x, y)$
    $\hat{b}_N = a_N$,  $\hat{b}_{N+1} = \epsilon_{\hat{b}_N} = \epsilon_{\hat{b}_{N+1}} = E_{\hat{b}_N} = E_{\hat{b}_{N+1}} = 0$
    for $n = N-1, N-2, \ldots, 1$
        $[r_n, \pi_n] = \mathsf{TwoProdRC}(p, \hat{b}_{n+1})$
        $[s_n, \sigma_n] = \mathsf{TwoProdRC}(-\hat{q}, \hat{b}_{n+2})$
        $[t_n, \eta_n] = \mathsf{TwoSumCplx}(r_n, s_n)$
        $[\hat{b}_n, \xi_n] = \mathsf{TwoSumCplx}(t_n, a_n)$
        $\hat{w}_n = \pi_n \oplus \sigma_n \oplus \eta_n \oplus \xi_n \ominus \epsilon_{\hat{q}} \otimes \hat{b}_{n+2}$
        $\epsilon_{\hat{b}_n} = \hat{w}_n \oplus p \otimes \epsilon_{\hat{b}_{n+1}} \ominus \hat{q} \otimes \epsilon_{\hat{b}_{n+2}}$
        $E_{\hat{b}_n} = |\hat{w}_n| \oplus |p| \otimes E_{\hat{b}_{n+1}} \oplus |\hat{q}| \otimes E_{\hat{b}_{n+2}}$
    end
    $[r_0, \pi_0] = \mathsf{TwoProdRC}(x, \hat{b}_1)$
    $[s_0, \sigma_0] = \mathsf{TwoProdRC}(-\hat{q}, \hat{b}_2)$
    $[t_0, \eta_0] = \mathsf{TwoSumCplx}(r_0, s_0)$
    $[\hat{b}_0, \xi_0] = \mathsf{TwoSumCplx}(t_0, a_0)$
    $\hat{w}_0 = \pi_0 \oplus \sigma_0 \oplus \eta_0 \oplus \xi_0 \ominus \epsilon_{\hat{q}} \otimes \hat{b}_2$
    $\epsilon_{\hat{b}_0} = \hat{w}_0 \oplus x \otimes \epsilon_{\hat{b}_1} \ominus \hat{q} \otimes \epsilon_{\hat{b}_2}$
    $E_{\hat{b}_0} = |\hat{w}_0| \oplus |x| \otimes E_{\hat{b}_1} \oplus |\hat{q}| \otimes E_{\hat{b}_2}$
    $[\phi, \psi] = \mathsf{TwoProdRC}(y, \hat{b}_1)$
    $\hat{e} = \epsilon_{\hat{b}_0} \oplus i\,(\epsilon_{\hat{b}_1} \otimes y \oplus \psi)$
    $\hat{E} = E_{\hat{b}_0} \oplus i\,E_{\hat{b}_1} \otimes |y|$
    $\widehat{\omega}(z) = \hat{b}_0 \oplus i\phi$
    $\overline{\omega}(z) = \widehat{\omega}(z) \oplus \hat{e}$
    $c = \widehat{\omega}(z) \oplus \hat{e} \ominus \overline{\omega}(z)$
    $\hat{\alpha} = (\hat{\gamma}_{3N+1} \otimes \hat{E}) \oslash (1 \ominus 6(N-1)u)$
    $\mu = (|c| \oplus \hat{\alpha}) \oslash (1 \ominus 2u)$
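Given the degree $N$, the quantity $c$ and the accumulated $\hat{E}$ from Algorithm 5, the validated bound $\mu$ of Theorem 3 can be evaluated as in the following sketch (we simply take the modulus of the complex $\hat{E}$, and the helper name is ours):

```python
def running_error_bound(N, c, E_hat, u=2.0 ** -53):
    """Evaluate mu = (|c| + alpha_hat) / (1 - 2u) from Theorem 3 / Algorithm 5."""
    g = (3 * N + 1) * u / (1.0 - (3 * N + 1) * u)    # gamma_hat_{3N+1}
    alpha_hat = g * abs(E_hat) / (1.0 - 6 * (N - 1) * u)
    return (abs(c) + alpha_hat) / (1.0 - 2 * u)
```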
Algorithm 6 Polynomial evaluation by the Horner algorithm
Function: $\widehat{\omega}(z) = \mathsf{Horner}((a_n)_{n=0}^{N}, z)$
Require: $z = x + iy \in \mathbb{F} + i\mathbb{F}$, $(a_n)_{n=0}^{N} \in \mathbb{F} + i\mathbb{F}$
Ensure: $\widehat{\omega}(z) \approx \sum_{n=0}^{N} a_n z^n$
    $\hat{b}_{N+1} = 0$
    for $n = N, N-1, \ldots, 0$
        $\hat{b}_n = \hat{b}_{n+1} \otimes z \oplus a_n$
    end
    $\widehat{\omega}(z) = \hat{b}_0$
Algorithm 7 Polynomial evaluation by the compensated Horner algorithm
Function: $\overline{\omega}(z) = \mathsf{CompHorner}((a_n)_{n=0}^{N}, z)$
Require: $z = x + iy \in \mathbb{F} + i\mathbb{F}$, $(a_n)_{n=0}^{N} \in \mathbb{F} + i\mathbb{F}$
Ensure: $\overline{\omega}(z) \approx \sum_{n=0}^{N} a_n z^n$
    $\hat{b}_{N+1} = \epsilon_{\hat{b}_{N+1}} = 0$
    for $n = N, N-1, \ldots, 0$
        $[r_n, \pi_n] = \mathsf{TwoProdCplx}(\hat{b}_{n+1}, z)$
        $[\hat{b}_n, \sigma_n] = \mathsf{TwoSumCplx}(r_n, a_n)$
        $\epsilon_{\hat{b}_n} = \epsilon_{\hat{b}_{n+1}} \otimes z \oplus \pi_n \oplus \sigma_n$
    end
    $\overline{\omega}(z) = \hat{b}_0 \oplus \epsilon_{\hat{b}_0}$

4. Numerical Experiments

In this section, we test the accuracy, performance and application of our algorithm. All numerical experiments are performed in IEEE-754 double precision as working precision.

4.1. Accuracy

The accuracy measurements in this part were performed in MATLAB R2019b, and the exact results were obtained with the Symbolic Math Toolbox in order to compute the relative errors. As in [12], we considered the expanded form of the polynomial
$$\omega(z) = (z - 1 - i)^n$$
at $z = 1.333 + 1.333i$ for $n = 3{:}42$, while the condition number varied from $10^{3}$ to $10^{33}$. The relative accuracy of the Horner, Goertzel, CompHorner and CompGoertzel algorithms, as well as the theoretical bounds of CompGoertzel from Theorems 2 and 3, is exhibited in Figure 3. We can observe that the CompGoertzel algorithm, which had almost the same accuracy as the CompHorner algorithm, was absolutely stable when the condition number was smaller than $10^{16}$. Moreover, the numerical results and the error bounds were in good agreement; in particular, the running error bound was almost the same as the actual relative errors while the condition number was smaller than $10^{13}$.
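The ill-conditioned test polynomials can be reproduced, for instance, as follows (a sketch using the goertzel and comp_goertzel functions from Section 2; math.comb and the helper below are our additions):

```python
from math import comb

def expanded_coeffs(n, r):
    """Coefficients of (z - r)**n in ascending powers of z."""
    return [comb(n, k) * (-r) ** (n - k) for k in range(n + 1)]

a = expanded_coeffs(20, 1.0 + 1.0j)      # expanded form of (z - 1 - i)^20
z = 1.333 + 1.333j
print(goertzel(a, z))                    # plain Goertzel: loses digits as cond(w, z) grows
print(comp_goertzel(a, z))               # compensated Goertzel: close to full working precision
```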

4.2. Running Time

In this part, we show the practical performance of the Goertzel , CompGoertzel , CompHorner and CompGoertzelwErr algorithms in terms of measured computing time. The tests were performed in the following environments:
  • Env1: Laptop with an Intel Core i7-7700 CPU, 4 cores each at 3.6 GHz, using Microsoft Visual C++ 2012 with the default compiler option /Od on Windows 7;
  • Env2: Node of a workstation with an Intel Xeon E5-2697A CPU, 16 cores each at 2.6 GHz, using gcc 7.4.0 with the default compiler option -O0 on x86_64 Ubuntu Linux 18.04.
We generated test polynomials with random coefficients in the interval $[-1, 1]$, whose degrees varied from 50 to 10,000 in steps of 50. The average time ratios CompGoertzel/Goertzel, CompGoertzel/CompHorner, CompGoertzelwErr/CompGoertzel and CompGoertzelwErr/CompHorner are reported in Table 3 and Table 4 for coefficients $a_n \in \mathbb{R}$ and $a_n \in \mathbb{C}$, respectively. As we can see, the measured ratio of CompGoertzel to Goertzel was clearly smaller than the theoretical one, especially when testing in Linux. We note the good agreement of the numerical and theoretical results except for CompGoertzelwErr/CompHorner when $z \in \mathbb{C}$, $|z| = 1$, $z \neq \pm 1$ and $a_n \in \mathbb{C}$ in Env2. This is because CompHorner benefits more than CompGoertzelwErr from the fused multiply-add instruction [34,35] and instruction-level parallelism [7,8]. However, CompGoertzelwErr was still faster than CompHorner in this case in Env1.

4.3. Application

The test environment in this part was the same as for the accuracy measurements. We considered the polynomial in Equation (1) with $a_0, a_1, \ldots, a_N \in \mathbb{R}$ and $z_k = e^{2\pi k i/(N+1)}$. Then, the DFT, which can be computed from the polynomial's coefficients by the function fft in MATLAB, returns $\omega(z_k)$ for $k = 0, 1, \ldots, N$. The Goertzel algorithm can also compute the DFT of the polynomial's coefficients at specific indices $z_k$ in a vector. Figure 4 shows the relative errors of Goertzel, CompGoertzel and fft applied to polynomials whose degrees varied from 50 to 1000 in steps of 10 with random coefficients in the interval $[-1, 1]$, where the relative error is defined as $\|\mathrm{res}_{\mathrm{comput}} - \mathrm{res}_{\mathrm{exact}}\|_2 / \|\mathrm{res}_{\mathrm{exact}}\|_2$. In Figure 4, although fft was more accurate than Goertzel, the relative errors of both Goertzel and fft increased as the degree of the polynomial grew. However, CompGoertzel always obtained full-precision accurate results in this test: the relative error of our algorithm was between $10^{-17}$ and $10^{-15}$, while that of the fft ranged from $10^{-15}$ to $10^{-12}$.
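A small experiment in this spirit can be set up with NumPy as below; we use the convention $z_k = e^{-2\pi k i/(N+1)}$ so that $\omega(z_k)$ matches numpy.fft.fft, which may differ in sign from the convention above, and the comp_goertzel sketch from Section 2 is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 1.0, 101)          # random real coefficients a_0, ..., a_N in [-1, 1]
M = len(a)                               # M = N + 1
dft_comp = np.array([comp_goertzel(list(a), np.exp(-2j * np.pi * k / M)) for k in range(M)])
dft_fft = np.fft.fft(a)
print(np.linalg.norm(dft_comp - dft_fft) / np.linalg.norm(dft_fft))   # agreement of the two DFTs
```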

5. Conclusions

In this paper, we presented a compensated Goertzel algorithm with dynamic error estimation to evaluate polynomials in complex floating-point arithmetic. The forward error analysis and numerical experiments show that the algorithm can yield full working precision accuracy. Furthermore, although the algorithm is as precise as the compensated Horner algorithm, it is quicker in certain situations. The algorithm also performed well in the application of computing the DFT of specific indices.

Author Contributions

Conceptualization, C.L., P.D. and K.L.; methodology, C.L., P.D., Y.L. and H.J.; validation, C.L., K.L. and Z.Q.; formal analysis, K.L., Y.L. and H.J.; investigation, C.L., P.D. and H.J.; resources, Y.L. and Z.Q.; data curation, C.L. and P.D.; writing—original draft preparation, C.L., P.D., K.L. and Y.L.; writing—review and editing, C.L., H.J. and Z.Q.; supervision, H.J. and Z.Q.; project administration, P.D. and H.J.; funding acquisition, P.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by National Natural Science Foundation of China (No. 61907034), the 173 program of China (2020-JCJQ-ZD-029), the National Key Research and Development Program of China (2020YFA0709803), the Science Challenge Project of China (TZ2016002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank the reviewers for providing valuable comments about our article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DFT: discrete Fourier transform
VS: Volk and Schumaker

References

1. Peña, J.M.; Sauer, T. On the multivariate Horner scheme. SIAM J. Numer. Anal. 2000, 37, 1186–1197.
2. Gentleman, W.M. An error analysis of Goertzel’s (Watt’s) method for computing Fourier coefficients. Comput. J. 1969, 12, 160–165.
3. Newbery, A.C.R. Error analysis for Fourier series evaluation. Math. Comput. 1973, 27, 639–644.
4. Wilkinson, J.H. Rounding Errors in Algebraic Processes; Courier Corporation: Englewood Cliffs, NJ, USA, 1994.
5. Smoktunowicz, A.; Wróbel, I. On improving the accuracy of Horner’s and Goertzel’s algorithms. Numer. Algorithms 2005, 38, 243–258.
6. Bailey, D.H. Library for Double-Double and Quad-Double Arithmetic. Available online: http://www.nersc.gov/dhbailey/mpdist/mpdist.html (accessed on 18 February 2021).
7. Louvet, N. Compensated Algorithms in Floating-Point Arithmetic: Accuracy, Validation, Performances; Université de Perpignan Via Domitia: Perpignan, France, 2007.
8. Langlois, P.; Louvet, N. More Instruction Level Parallelism Explains the Actual Efficiency of Compensated Algorithms; Technical Report hal-00165020; DALI Research Team, University of Perpignan: Perpignan, France, 2007.
9. Ogita, T.; Rump, S.M.; Oishi, S. Accurate sum and dot product. SIAM J. Sci. Comput. 2005, 26, 1955–1988.
10. Graillat, S.; Langlois, P.; Louvet, N. Algorithms for accurate, validated and fast polynomial evaluation. Jpn. J. Ind. Appl. Math. 2009, 26, 191–214.
11. Graillat, S.; Morain, V. Error-free transformations in real and complex floating-point arithmetic. In Proceedings of the International Symposium on Nonlinear Theory and Its Applications, Vancouver, BC, Canada, 16–19 September 2007; pp. 341–344.
12. Graillat, S.; Morain, V. Accurate summation, dot product and polynomial evaluation in complex floating-point arithmetic. Inf. Comput. 2012, 216, 57–71.
13. Graillat, S. An accurate algorithm for evaluating rational functions. Appl. Math. Comput. 2018, 337, 494–503.
14. Cameron, T.; Graillat, S. On a Compensated Ehrlich-Aberth Method for the Accurate Computation of All Polynomial Roots. Available online: https://hal.archives-ouvertes.fr/hal-03335604 (accessed on 16 March 2021).
15. Jiang, H.; Barrio, R.; Li, H.; Liao, X.; Cheng, L.; Su, F. Accurate evaluation of a polynomial in Chebyshev form. Appl. Math. Comput. 2011, 217, 9702–9716.
16. Jiang, H.; Li, S.; Cheng, L.; Su, F. Accurate evaluation of a polynomial and its derivative in Bernstein form. Comput. Math. Appl. 2010, 60, 744–755.
17. Delgado, J.; Peña, J.M. Algorithm 960: POLYNOMIAL: An Object-Oriented Matlab Library of Fast and Efficient Algorithms for Polynomials. ACM Trans. Math. Softw. 2016, 42, 1–19.
18. Kazal, N.Y.; Mukhlash, I.; Sanjoyo, B.A.; Hidayat, N.; Ozaki, K. Extended use of error-free transformation for real matrix multiplication to complex matrix multiplication. J. Phys. Conf. Ser. 2021, 1821, 012022.
19. Ozaki, K. Error-free transformation of matrix multiplication for multi-precision computations. In Proceedings of the 19th International Symposium on Scientific Computing, Computer Arithmetic, and Verified Numerical Computations, Szeged, Hungary, 13–15 September 2021; Volume 33.
20. Ozaki, K. An Error-Free Transformation for Matrix Multiplication with Reproducible Algorithms and Divide and Conquer Methods. J. Phys. Conf. Ser. 2020, 1490, 012062.
21. Ozaki, K.; Ogita, T. The essentials of verified numerical computations, rounding error analyses, interval arithmetic, and error-free transformations. Nonlinear Theory Its Appl. 2020, 11, 279–302.
22. Blanchard, P.; Higham, D.J.; Higham, N.J. Accurately computing the log-sum-exp and softmax functions. IMA J. Numer. Anal. 2021, 41, 2311–2330.
23. Delgado, J.; Peña, J.M. Running relative error for the evaluation of polynomials. SIAM J. Sci. Comput. 2009, 31, 3905–3921.
24. Jiang, H.; Graillat, S.; Barrio, R.; Yang, C. Accurate, validated and fast evaluation of elementary symmetric functions and its application. Appl. Math. Comput. 2016, 273, 1160–1178.
25. Barrio, R.; Du, P.; Jiang, H.; Serrano, S. ORTHOPOLY: A library for accurate evaluation of series of classical orthogonal polynomials and their derivatives. Comput. Phys. Commun. 2018, 231, 146–162.
26. Croci, M.; Fasi, M.; Higham, N.J.; Mary, T.; Mikaitis, M. Stochastic rounding: Implementation, error analysis and applications. R. Soc. Open Sci. 2022, 9, 211631.
27. IEEE Standard 754-2008; Standard for Binary Floating Point Arithmetic. ANSI: New York, NY, USA, 2008.
28. Clenshaw, C.W. A note on the summation of Chebyshev series. Math. Comput. 1955, 9, 118–120.
29. Szegö, G. Orthogonal Polynomials; American Mathematical Society: Providence, RI, USA, 1939.
30. Knuth, D.E. The Art of Computer Programming: Seminumerical Algorithms, 3rd ed.; Addison-Wesley: Boston, MA, USA, 1998.
31. Dekker, T.J. A floating-point technique for extending the available precision. Numer. Math. 1971, 18, 224–242.
32. Graillat, S.; Lauter, C.; Tang, P.T.; Yamanaka, N.; Oishi, S. Efficient Calculations of Faithfully Rounded l2-Norms of n-Vectors. ACM Trans. Math. Softw. 2015, 41, 1–20.
33. Higham, N.J. Accuracy and Stability of Numerical Algorithms, 2nd ed.; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 2002.
34. Markstein, P. IA-64 and Elementary Functions: Speed and Precision; Prentice-Hall: Englewood Cliffs, NJ, USA, 2000.
35. Nievergelt, Y. Scalar fused multiply-add instructions produce floating-point matrix arithmetic provably accurate to the penultimate digit. ACM Trans. Math. Softw. 2003, 29, 27–48.
Figure 1. The flowchart of the compensated Goertzel algorithm.
Figure 2. The flowchart of the compensated Goertzel algorithm with dynamic error estimates.
Figure 3. Accuracy of evaluation of $\omega(z) = (z - 1 - i)^n$ at $z = 1.333 + 1.333i$ for $n = 3{:}42$.
Figure 4. The relative errors of the DFT for polynomials with random coefficients.
Table 1. Error-free transformations, their properties and operation costs.

| Algorithm | Properties | Flops |
|---|---|---|
| [x, y] = TwoSum(a, b) | x = a ⊕ b, x + y = a + b | 6 |
| [x, y] = TwoProd(a, b) | x = a ⊗ b, x + y = a × b | 17 |
| [x, y] = TwoProdRC(a, b) | x = a ⊗ b, x + y = a × b | 34 |
| [x, y] = TwoSumCplx(a, b) | x = a ⊕ b, x + y = a + b | 12 |
| [p, e, f, g] = TwoProdCplx(a, b) | p = a ⊗ b, p + e + f + g = a × b | 80 |

{⊕, ⊗} represents {+, ×} in floating-point operations.
Table 2. Comparison of computational costs of the Horner, Goertzel, CompHorner, CompGoertzel and CompGoertzelwErr algorithms.

| Variates | Coefficients | Horner | Goertzel | CompHorner | CompGoertzel | CompGoertzelwErr |
|---|---|---|---|---|---|---|
| z ∈ R | a_n ∈ R | 2N | 4N + 4 | 26N + 3 | 55N + 45 | 59N + 60 |
|  | a_n ∈ C | 4N | 8N + 6 | 52N + 6 | 110N + 66 | 114N + 81 |
| z ∈ C and \|z\| ≠ 1 | a_n ∈ R | 7N − 4 | 4N + 7 | 90N + 6 | 55N + 91 | 59N + 106 |
|  | a_n ∈ C | 8N | 8N + 12 | 97N + 6 | 110N + 150 | 114N + 165 |
| z ∈ C, \|z\| = 1 and z ≠ ±1 | a_n ∈ R | 7N − 4 | 3N + 4 | 90N + 6 | 34N + 26 | 39N + 41 |
|  | a_n ∈ C | 8N | 6N + 9 | 97N + 6 | 68N + 90 | 72N + 105 |
Table 3. Theoretical computational complexity and measured running time ratios for a_n ∈ R.

| Variates |  | CompGoertzel/Goertzel | CompGoertzel/CompHorner | CompGoertzelwErr/CompGoertzel | CompGoertzelwErr/CompHorner |
|---|---|---|---|---|---|
| z ∈ R | Theoretical | 13.75 | 2.12 | 1.07 | 2.27 |
|  | Env1 | 9.15 | 1.42 | 1.13 | 1.59 |
|  | Env2 | 2.76 | 1.35 | 1.11 | 1.49 |
| z ∈ C and \|z\| ≠ 1 | Theoretical | 13.75 | 61.14% | 1.07 | 65.59% |
|  | Env1 | 9.29 | 64.19% | 1.13 | 72.22% |
|  | Env2 | 2.81 | 67.71% | 1.1 | 74.14% |
| z ∈ C, \|z\| = 1 and z ≠ ±1 | Theoretical | 11.33 | 37.79% | 1.15 | 43.35% |
|  | Env1 | 6.53 | 44.09% | 1.1 | 48.43% |
|  | Env2 | 2.22 | 53.66% | 1.09 | 58.38% |
Table 4. Theoretical computational complexity and measured running time ratios for a_n ∈ C.

| Variates |  | CompGoertzel/Goertzel | CompGoertzel/CompHorner | CompGoertzelwErr/CompGoertzel | CompGoertzelwErr/CompHorner |
|---|---|---|---|---|---|
| z ∈ R | Theoretical | 13.75 | 2.12 | 1.04 | 2.19 |
|  | Env1 | 10.32 | 1.24 | 1.17 | 1.46 |
|  | Env2 | 4.8 | 1.16 | 1.35 | 1.57 |
| z ∈ C and \|z\| ≠ 1 | Theoretical | 13.75 | 1.13 | 1.04 | 1.18 |
|  | Env1 | 10.37 | 1.25 | 1.19 | 1.48 |
|  | Env2 | 4.8 | 1.17 | 1.35 | 1.58 |
| z ∈ C, \|z\| = 1 and z ≠ ±1 | Theoretical | 11.33 | 70.13% | 1.06 | 74.26% |
|  | Env1 | 7.21 | 84.61% | 1.17 | 98.54% |
|  | Env2 | 3.67 | 89.22% | 1.33 | 1.18 |