Article

The Development of Fast DST-II Algorithms for Short-Length Input Sequences

by Krystian Bielak 1, Aleksandr Cariow 1,* and Mateusz Raciborski 2,*
1 Faculty of Computer Science and Information Technology, West Pomeranian University of Technology in Szczecin, Żołnierska 49, 71-210 Szczecin, Poland
2 Faculty of Computer Science and Telecommunications, Maritime University of Szczecin, Wały Chrobrego 1-2, 70-500 Szczecin, Poland
* Authors to whom correspondence should be addressed.
Electronics 2024, 13(12), 2301; https://doi.org/10.3390/electronics13122301
Submission received: 7 May 2024 / Revised: 5 June 2024 / Accepted: 10 June 2024 / Published: 12 June 2024

Abstract

This work develops fast algorithms for the type-II discrete sine transform (DST-II) for short input sequences of length N = 2, 3, 4, 5, 6, 7, 8. The starting point is the well-known possibility of representing any discrete transform as a matrix–vector product. Owing to the structural properties of the DST-II matrices, these matrices can be factorized, which reduces the computational complexity of the procedure as a whole. Matrices can be factorized in different ways; the art of designing fast algorithms lies in finding the factorization that yields the greatest reduction. We justified the correctness of the obtained algorithms theoretically, with a strict mathematical derivation of each of them. The developed algorithms were then tested using MATLAB R2023b to confirm their performance. Finally, we present estimates of the computational complexity of each solution and compare them with the direct computation of the matrix–vector products.

1. Introduction

Discrete trigonometric transforms are widely used in modern computing systems for digital signal and image processing, including filtering and denoising, noisy speech enhancement, interpolation, and video coding [1,2,3,4,5,6,7,8,9,10]. There are eight types of discrete cosine transform and eight types of discrete sine transform [11]. The popularity of the discrete cosine transform rests on the fact that it closely approximates the optimal Karhunen–Loève transform (KLT) under a stationary first-order Markov condition with strong inter-pixel correlation. However, for low-correlation input signals, the discrete sine transform (DST) provides lower data rates [12,13,14]. Like other orthogonal transforms, the DST is time-consuming to compute, so reducing the computation time is an important task. This problem can be approached in several ways. One direction is the hardware implementation of the calculations [15,16,17,18]; another is the reduction of the number of arithmetic operations necessary to implement the transform. Many papers are devoted to effective algorithms for various discrete cosine transforms (DCTs) and DSTs, but most of them seek universal solutions that reduce the number of arithmetic operations for arbitrary lengths of input sequences [19,20,21,22,23,24,25]. A third direction is the approximation of discrete trigonometric transforms, and a large number of algorithms based on approximations of the DCT/DST have been developed to date.
Approximation algorithms for sequences of standard lengths N = 4, 8, and 16 are known [26,27,28,29,30]. However, the development of reduced-complexity algorithms for conventional small-size DCT/DST transforms remains relevant. Some applications require conventional DCT/DST transforms for various short input sequences, and algorithms for small-size transforms can also serve as kernels for the synthesis of larger algorithms [30,31]. A fairly large number of works have been devoted to small-size DCT algorithms, whereas much less attention has been paid to similar algorithms for the DST. Among the types of discrete trigonometric transforms, DCT-II/DST-II plays an important role [29,32,33,34,35]. Algorithms for the small-size DCT-II were presented in one of our previous papers [36]. We did not find any algorithms for the small-size DST-II in the sources known to us. To fill this gap, we develop fast algorithms for the DST-II and thereby extend the collection of fast algorithms for small-size discrete trigonometric transforms. This paper is devoted to reduced-complexity DST-II algorithms for input sequences of length N = 2, 3, 4, 5, 6, 7, 8.

2. Short Background

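The discrete sine transform is one of the orthogonal transforms used, among other things, for the analysis and processing of sounds and signals. DST-II can be represented by the following expression:
$$y_k = \sqrt{\frac{2}{N}}\,\epsilon_k \sum_{n=0}^{N-1} x_n \sin\left[\left(n+\frac{1}{2}\right)\frac{(k+1)\pi}{N}\right]$$
where
  • $k = 0, \ldots, N-1$;
  • $\epsilon_k$ equals $1/\sqrt{2}$ for $k = N-1$ and equals 1 for the remaining $k$;
  • $y_k$ is the output sequence after the DST-II operation is performed;
  • $x_n$ is the sequence of input data;
  • $N$ is the number of signal samples.
In matrix notation, DST-II can be represented as follows:
$$Y_{N\times 1} = C_N X_{N\times 1}$$
where
$$Y_{N\times 1} = [y_0, y_1, \ldots, y_{N-1}]^{T},\quad X_{N\times 1} = [x_0, x_1, \ldots, x_{N-1}]^{T},$$
and the entries of $C_N$ are
$$c_{k,l} = \sqrt{\frac{2}{N}}\,\epsilon_k \sin\left[\left(l+\frac{1}{2}\right)\frac{(k+1)\pi}{N}\right]$$
where
  • $k, l = 0, \ldots, N-1$;
  • $\epsilon_k$ equals $1/\sqrt{2}$ for $k = N-1$ and equals 1 for the remaining $k$.
Written out in full, DST-II in matrix notation is as follows:
$$\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{bmatrix} = \begin{bmatrix} c_{0,0} & c_{0,1} & \cdots & c_{0,N-1} \\ c_{1,0} & c_{1,1} & \cdots & c_{1,N-1} \\ \vdots & \vdots & \ddots & \vdots \\ c_{N-1,0} & c_{N-1,1} & \cdots & c_{N-1,N-1} \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{bmatrix}.$$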
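The definition above can be checked numerically. The following is a minimal Python sketch, not the authors' MATLAB code; the function names `dst2_matrix` and `dst2_direct` are illustrative assumptions. It builds the matrix of coefficients from the formula for the entries and evaluates the direct matrix–vector product that the fast algorithms are compared against.

```python
import math

def dst2_matrix(n):
    """Return the n x n DST-II matrix as a list of rows (hypothetical helper)."""
    scale = math.sqrt(2.0 / n)
    rows = []
    for k in range(n):
        # epsilon_k is 1/sqrt(2) only for the last row
        eps = 1.0 / math.sqrt(2.0) if k == n - 1 else 1.0
        rows.append([scale * eps * math.sin((l + 0.5) * (k + 1) * math.pi / n)
                     for l in range(n)])
    return rows

def dst2_direct(x):
    """Direct method: plain matrix-vector product with the DST-II matrix."""
    return [sum(c * xl for c, xl in zip(row, x)) for row in dst2_matrix(len(x))]

# The 2 x 2 matrix has entries of magnitude 0.7071
print([[round(v, 4) for v in row] for row in dst2_matrix(2)])
```

With this normalization the rows of the matrix are orthonormal, which gives a quick sanity check for any N.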
In this paper, we use the following notation:
  • $I_N$ is an order $N$ identity matrix;
  • $H_2$ is a 2 × 2 Hadamard matrix;
  • ⊗ is the Kronecker product of two matrices;
  • ⊕ is the direct sum of two matrices.
An empty cell in a matrix means it contains zero. We denote the multipliers as $s_m^{(N)}$, but we omit the superscript in the data flow graphs to keep them readable.
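To make the two matrix operators concrete, here is a small Python sketch with matrices stored as lists of rows. The helper names `kron` and `direct_sum` are assumptions made for illustration only.

```python
def kron(a, b):
    """Kronecker product: every entry a[i][j] is replaced by the block a[i][j] * b."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]

def direct_sum(a, b):
    """Direct sum: block-diagonal matrix with a in the top-left and b in the bottom-right."""
    top = [row + [0] * len(b[0]) for row in a]
    bottom = [[0] * len(a[0]) + row for row in b]
    return top + bottom

H2 = [[1, 1], [1, -1]]   # 2 x 2 Hadamard matrix
I2 = [[1, 0], [0, 1]]    # 2 x 2 identity matrix

print(kron(H2, I2))        # butterflies between elements two positions apart
print(direct_sum(H2, I2))  # butterfly on the first pair, the rest passed through
```

The factor matrices in the following sections are assembled from small blocks in exactly this way.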

3. Algorithm for 2-Point DST-II

The expression for the two-point DST-II is as follows:
$$Y_{2\times 1} = C_2 X_{2\times 1}$$
where
$$Y_{2\times 1} = [y_0, y_1]^{T},\quad X_{2\times 1} = [x_0, x_1]^{T},\quad C_2 = \begin{bmatrix} a_2 & a_2 \\ a_2 & -a_2 \end{bmatrix},\quad a_2 = 0.7071.$$
The expression for DST-II for N = 2 can be presented as follows:
$$Y_{2\times 1} = H_2 D_2 X_{2\times 1}$$
where
$$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},\quad D_2 = \operatorname{diag}\left(s_0^{(2)}, s_1^{(2)}\right),\quad s_0^{(2)} = s_1^{(2)} = a_2.$$
Figure 1 shows a data flow graph of the synthesized algorithm for the two-point DST-II. As can be seen, the number of multiplications is reduced from 4 to 2, while the number of additions remains 2, the same as in the direct method.
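The two-point procedure is small enough to write out directly. The sketch below is an illustrative Python transcription of the factorization into a diagonal scaling followed by a butterfly; the function name `dst2_2pt` is an assumption.

```python
import math

A2 = math.sqrt(0.5)  # a_2 = 0.7071...

def dst2_2pt(x):
    """Two-point DST-II: diagonal scaling, then one butterfly."""
    m0 = A2 * x[0]             # first multiplication
    m1 = A2 * x[1]             # second multiplication
    return [m0 + m1, m0 - m1]  # two additions

print(dst2_2pt([1.0, 2.0]))
```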

4. Algorithm for 3-Point DST-II

The expression for the three-point DST-II is as follows:
$$Y_{3\times 1} = C_3 X_{3\times 1}$$
where
$$Y_{3\times 1} = [y_0, y_1, y_2]^{T},\quad X_{3\times 1} = [x_0, x_1, x_2]^{T},\quad C_3 = \begin{bmatrix} a_3 & d_3 & a_3 \\ b_3 & 0 & -b_3 \\ c_3 & -c_3 & c_3 \end{bmatrix},$$
$$a_3 = 0.4082,\quad b_3 = 0.7071,\quad c_3 = 0.5774,\quad d_3 = 0.8165.$$
Now, we decompose the matrix $C_3$ into two components:
$$C_3 = C_3^{(a)} + C_3^{(b)}$$
where
$$C_3^{(a)} = \begin{bmatrix} & d_3 & \\ & & \\ c_3 & -c_3 & c_3 \end{bmatrix},\quad C_3^{(b)} = \begin{bmatrix} a_3 & & a_3 \\ b_3 & & -b_3 \\ & & \end{bmatrix}.$$
After eliminating redundancy in matrix $C_3^{(b)}$ and eliminating rows and columns containing only zero entries, we obtain matrix $C_2$:
$$C_2 = \begin{bmatrix} a_3 & a_3 \\ b_3 & -b_3 \end{bmatrix}.$$
Exploiting the structure of these matrices, the final computational procedure for the three-point DST-II takes the following form:
$$Y_{3\times 1} = P_{3\times 4} D_4^{(0)} W_4^{(0)} P_{4\times 3} X_{3\times 1}$$
where
$$P_{4\times 3} = \begin{bmatrix} 1 & & \\ & & 1 \\ 1 & -1 & 1 \\ & 1 & \end{bmatrix},\quad W_4^{(0)} = H_2 \oplus I_2,\quad D_4^{(0)} = \operatorname{diag}\left(s_0^{(3)}, s_1^{(3)}, s_2^{(3)}, s_3^{(3)}\right),$$
$$s_0^{(3)} = a_3,\quad s_1^{(3)} = b_3,\quad s_2^{(3)} = c_3,\quad s_3^{(3)} = d_3,\quad P_{3\times 4} = \begin{bmatrix} 1 & & & 1 \\ & 1 & & \\ & & 1 & \end{bmatrix}.$$
Figure 2 shows a data flow graph of the synthesized algorithm for the three-point DST-II. As can be seen, the number of multiplications is reduced from 9 to 4, while the number of additions remains 5, the same as in the direct method.
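The three-point procedure can be sketched as follows. This is an illustrative Python version with one stage per matrix factor; the names `dst2_3pt` and `dst2_direct` are assumptions, the constants are computed from the definition of the transform rather than rounded, and the row order of the pre-addition stage is one consistent choice.

```python
import math

S = math.sqrt(2.0 / 3.0)
A3 = S * math.sin(math.pi / 6)   # 0.4082
B3 = S * math.sin(math.pi / 3)   # 0.7071
C3 = S / math.sqrt(2.0)          # 0.5774
D3 = S                           # 0.8165

def dst2_3pt(x):
    x0, x1, x2 = x
    # pre-additions: select x0, x2, the alternating sum and x1 (2 additions)
    v0, v1, v2, v3 = x0, x2, x0 - x1 + x2, x1
    # butterfly on the first pair (2 additions)
    w0, w1 = v0 + v1, v0 - v1
    # four multiplications by a_3, b_3, c_3, d_3
    m0, m1, m2, m3 = A3 * w0, B3 * w1, C3 * v2, D3 * v3
    # post-addition (1 addition)
    return [m0 + m3, m1, m2]

def dst2_direct(x):
    """Reference: direct evaluation of the DST-II definition (hypothetical helper)."""
    n = len(x)
    return [math.sqrt(2.0 / n) * (math.sqrt(0.5) if k == n - 1 else 1.0)
            * sum(x[l] * math.sin((l + 0.5) * (k + 1) * math.pi / n) for l in range(n))
            for k in range(n)]

print(dst2_3pt([1.0, 2.0, 3.0]))
```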

5. Algorithm for 4-Point DST-II

The expression for the four-point DST-II is as follows:
$$Y_{4\times 1} = C_4 X_{4\times 1}$$
where
$$Y_{4\times 1} = [y_0, y_1, y_2, y_3]^{T},\quad X_{4\times 1} = [x_0, x_1, x_2, x_3]^{T},$$
$$C_4 = \begin{bmatrix} a_4 & c_4 & c_4 & a_4 \\ b_4 & b_4 & -b_4 & -b_4 \\ c_4 & -a_4 & -a_4 & c_4 \\ b_4 & -b_4 & b_4 & -b_4 \end{bmatrix},\quad a_4 = 0.2706,\quad b_4 = 0.5,\quad c_4 = 0.6533.$$
Now, we need to change the order of columns and rows.
Let us define the permutations $\pi_4^{(0)}$ and $\pi_4^{(1)}$ in the following form:
$$\pi_4^{(0)} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix},\quad \pi_4^{(1)} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 2 & 4 \end{pmatrix}.$$
We permute the columns of $C_4$ according to $\pi_4^{(0)}$ and the rows according to $\pi_4^{(1)}$. After the permutations, the matrix acquires the following structure:
$$\begin{bmatrix} A_2 & A_2 \\ B_2 & -B_2 \end{bmatrix} \quad\text{where}\quad A_2 = \begin{bmatrix} a_4 & c_4 \\ c_4 & -a_4 \end{bmatrix},\quad B_2 = \begin{bmatrix} b_4 & b_4 \\ b_4 & -b_4 \end{bmatrix}.$$
Matrices with such a structure allow effective factorization, which reduces the number of arithmetic operations needed to calculate matrix–vector products [37]. In this work, we preserve the designations of the matrices $T_{2\times 3}^{(4)}$, $T_{3\times 2}^{(3)}$ and $T_{2\times 3}^{(3)}$ taken from [37]. The matrices $A_2$ and $B_2$ also have structures that reduce the computational complexity.
Taking this into account, we can derive the final expression:
$$Y_{4\times 1} = P_4^{(1)} W_{4\times 5} D_5 W_{5\times 4} W_4^{(1)} P_4^{(0)} X_{4\times 1}$$
where
$$P_4^{(0)} = \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & & 1 \\ & & 1 & \end{bmatrix},\quad W_4^{(1)} = H_2 \otimes I_2,\quad W_{5\times 4} = T_{3\times 2}^{(5)} \oplus H_2,\quad T_{3\times 2}^{(5)} = \begin{bmatrix} 1 & \\ & -1 \\ 1 & 1 \end{bmatrix},$$
$$D_5 = \operatorname{diag}\left(s_0^{(4)}, s_1^{(4)}, \ldots, s_4^{(4)}\right),\quad s_0^{(4)} = a_4 - c_4,\quad s_1^{(4)} = a_4 + c_4,\quad s_2^{(4)} = c_4,\quad s_3^{(4)} = s_4^{(4)} = b_4,$$
$$W_{4\times 5} = T_{2\times 3}^{(4)} \oplus I_2,\quad T_{2\times 3}^{(4)} = \begin{bmatrix} 1 & & 1 \\ & 1 & 1 \end{bmatrix},\quad P_4^{(1)} = \begin{bmatrix} 1 & & & \\ & & 1 & \\ & 1 & & \\ & & & 1 \end{bmatrix}.$$
Figure 3 shows a data flow graph of the synthesized algorithm for the four-point DST-II. As can be seen, the number of multiplications is reduced from 8 to 3 and the number of additions from 12 to 9.
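A possible Python transcription of the four-point procedure is sketched below. The function names are assumptions, and the placement of the sign in the three-multiplication stage is one consistent choice that reproduces the multipliers a_4 − c_4, a_4 + c_4 and c_4 named in the text. The two multiplications by b_4 = 0.5 are written as ordinary products here, although in fixed-point hardware they would be shifts.

```python
import math

S = math.sqrt(0.5)
A4 = S * math.sin(math.pi / 8)      # 0.2706
C4 = S * math.sin(3 * math.pi / 8)  # 0.6533
B4 = 0.5

def dst2_4pt(x):
    x0, x1, x2, x3 = x
    # input permutation to (x0, x1, x3, x2), then butterflies (4 additions)
    u0, u1 = x0 + x3, x1 + x2
    u2, u3 = x0 - x3, x1 - x2
    # pre-additions: three-term stage for A_2 and a butterfly for B_2 (3 additions)
    w0, w1, w2 = u0, -u1, u0 + u1
    w3, w4 = u2 + u3, u2 - u3
    # multiplications: three proper ones, two by 0.5
    m0, m1, m2 = (A4 - C4) * w0, (A4 + C4) * w1, C4 * w2
    m3, m4 = B4 * w3, B4 * w4
    # post-additions (2 additions)
    t0, t1 = m0 + m2, m1 + m2
    # output permutation
    return [t0, m3, t1, m4]

def dst2_direct(x):
    """Reference: direct evaluation of the DST-II definition (hypothetical helper)."""
    n = len(x)
    return [math.sqrt(2.0 / n) * (math.sqrt(0.5) if k == n - 1 else 1.0)
            * sum(x[l] * math.sin((l + 0.5) * (k + 1) * math.pi / n) for l in range(n))
            for k in range(n)]

print(dst2_4pt([1.0, 2.0, 3.0, 4.0]))
```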

6. Algorithm for 5-Point DST-II

The expression for the five-point DST-II is as follows:
$$Y_{5\times 1} = C_5 X_{5\times 1}$$
where
$$Y_{5\times 1} = [y_0, y_1, y_2, y_3, y_4]^{T},\quad X_{5\times 1} = [x_0, x_1, x_2, x_3, x_4]^{T},$$
$$C_5 = \begin{bmatrix} a_5 & c_5 & f_5 & c_5 & a_5 \\ b_5 & d_5 & 0 & -d_5 & -b_5 \\ c_5 & a_5 & -f_5 & a_5 & c_5 \\ d_5 & -b_5 & 0 & b_5 & -d_5 \\ e_5 & -e_5 & e_5 & -e_5 & e_5 \end{bmatrix},$$
$$a_5 = 0.1954,\quad b_5 = 0.3717,\quad c_5 = 0.5117,\quad d_5 = 0.6015,\quad e_5 = 0.4472,\quad f_5 = 0.6325.$$
Now, we decompose the matrix $C_5$ into two components:
$$C_5 = C_5^{(a)} + C_5^{(b)}$$
where
$$C_5^{(a)} = \begin{bmatrix} & & f_5 & & \\ & & & & \\ & & -f_5 & & \\ & & & & \\ e_5 & -e_5 & e_5 & -e_5 & e_5 \end{bmatrix},\quad C_5^{(b)} = \begin{bmatrix} a_5 & c_5 & & c_5 & a_5 \\ b_5 & d_5 & & -d_5 & -b_5 \\ c_5 & a_5 & & a_5 & c_5 \\ d_5 & -b_5 & & b_5 & -d_5 \\ & & & & \end{bmatrix}.$$
The matrix $C_5^{(a)}$ has one entry in the first and third rows and five entries of the same magnitude in the fifth row, so it requires few operations and needs no further transformation.
After eliminating redundancy in matrix $C_5^{(b)}$ and eliminating rows and columns containing only zero entries, we obtain matrix $C_4$:
$$C_4 = \begin{bmatrix} a_5 & c_5 & c_5 & a_5 \\ b_5 & d_5 & -d_5 & -b_5 \\ c_5 & a_5 & a_5 & c_5 \\ d_5 & -b_5 & b_5 & -d_5 \end{bmatrix}.$$
We permute the columns of $C_4$ according to $\pi_4^{(0)}$ and the rows according to $\pi_4^{(1)}$. After the permutations, the matrix matches the pattern
$$\begin{bmatrix} A_2 & A_2 \\ B_2 & -B_2 \end{bmatrix}$$
where
$$A_2 = \begin{bmatrix} a_5 & c_5 \\ c_5 & a_5 \end{bmatrix},\quad B_2 = \begin{bmatrix} b_5 & d_5 \\ d_5 & -b_5 \end{bmatrix}.$$
Considering the structures of the resulting matrices, the final computational procedure can be derived as
$$Y_{5\times 1} = P_{5\times 6} W_{6\times 7} D_7 W_{7\times 6} W_6^{(0)} P_{6\times 5} X_{5\times 1}$$
where
$$P_{6\times 5} = \begin{bmatrix} & & 1 & & \\ 1 & & & & \\ & 1 & & & \\ & & & & 1 \\ & & & 1 & \\ 1 & -1 & 1 & -1 & 1 \end{bmatrix},\quad W_6^{(0)} = 1 \oplus W_4^{(1)} \oplus 1 = 1 \oplus (H_2 \otimes I_2) \oplus 1,\quad W_{7\times 6} = 1 \oplus H_2 \oplus T_{3\times 2}^{(5)} \oplus 1,$$
$$D_7 = \operatorname{diag}\left(s_0^{(5)}, s_1^{(5)}, \ldots, s_6^{(5)}\right),\quad s_0^{(5)} = f_5,\quad s_1^{(5)} = \frac{a_5 + c_5}{2},\quad s_2^{(5)} = \frac{a_5 - c_5}{2},$$
$$s_3^{(5)} = b_5 - d_5,\quad s_4^{(5)} = b_5 + d_5,\quad s_5^{(5)} = d_5,\quad s_6^{(5)} = e_5,\quad W_{6\times 7} = 1 \oplus H_2 \oplus T_{2\times 3}^{(4)} \oplus 1,$$
$$P_{5\times 6} = \begin{bmatrix} 1 & 1 & & & & \\ & & & 1 & & \\ -1 & & 1 & & & \\ & & & & 1 & \\ & & & & & 1 \end{bmatrix}.$$
Figure 4 shows a data flow graph of the synthesized algorithm for the five-point DST-II. As can be seen, the number of multiplications is reduced from 23 to 7 and the number of additions from 18 to 17.
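The five-point procedure can be sketched in Python as below. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the constants are computed from the transform definition, and the combined multipliers such as (a_5 + c_5)/2 would be precomputed in practice.

```python
import math

S = math.sqrt(2.0 / 5.0)
A5 = S * math.sin(math.pi / 10)      # 0.1954
B5 = S * math.sin(math.pi / 5)       # 0.3717
C5 = S * math.sin(3 * math.pi / 10)  # 0.5117
D5 = S * math.sin(2 * math.pi / 5)   # 0.6015
E5 = S / math.sqrt(2.0)              # 0.4472
F5 = S                               # 0.6325

def dst2_5pt(x):
    x0, x1, x2, x3, x4 = x
    # pre-additions: alternating sum for the last output (4 additions)
    v5 = x0 - x1 + x2 - x3 + x4
    # butterflies on the outer and inner pairs (4 additions)
    p0, p1 = x0 + x4, x1 + x3
    q0, q1 = x0 - x4, x1 - x3
    # seven multiplications (3 additions inside the arguments)
    m0 = F5 * x2
    m1 = 0.5 * (A5 + C5) * (p0 + p1)
    m2 = 0.5 * (A5 - C5) * (p0 - p1)
    m3 = (B5 - D5) * q0
    m4 = (B5 + D5) * (-q1)
    m5 = D5 * (q0 + q1)
    m6 = E5 * v5
    # post-additions (4 + 2 additions)
    r1, r2 = m1 + m2, m1 - m2
    r3, r4 = m3 + m5, m4 + m5
    return [r1 + m0, r3, r2 - m0, r4, m6]

def dst2_direct(x):
    """Reference: direct evaluation of the DST-II definition (hypothetical helper)."""
    n = len(x)
    return [math.sqrt(2.0 / n) * (math.sqrt(0.5) if k == n - 1 else 1.0)
            * sum(x[l] * math.sin((l + 0.5) * (k + 1) * math.pi / n) for l in range(n))
            for k in range(n)]

print(dst2_5pt([1.0, 2.0, 3.0, 4.0, 5.0]))
```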

7. Algorithm for 6-Point DST-II

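The expression for the six-point DST-II is as follows:
$$Y_{6\times 1} = C_6 X_{6\times 1}$$
where
$$Y_{6\times 1} = [y_0, y_1, y_2, y_3, y_4, y_5]^{T},\quad X_{6\times 1} = [x_0, x_1, x_2, x_3, x_4, x_5]^{T},$$
$$C_6 = \begin{bmatrix} a_6 & c_6 & e_6 & e_6 & c_6 & a_6 \\ b_6 & f_6 & b_6 & -b_6 & -f_6 & -b_6 \\ c_6 & c_6 & -c_6 & -c_6 & c_6 & c_6 \\ d_6 & 0 & -d_6 & d_6 & 0 & -d_6 \\ e_6 & -c_6 & a_6 & a_6 & -c_6 & e_6 \\ c_6 & -c_6 & c_6 & -c_6 & c_6 & -c_6 \end{bmatrix},$$
$$a_6 = 0.1494,\quad b_6 = 0.2887,\quad c_6 = 0.4082,\quad d_6 = 0.5,\quad e_6 = 0.5577,\quad f_6 = 0.5774.$$
Now, we decompose the matrix $C_6$ into two components:
$$C_6 = C_6^{(a)} + C_6^{(b)}$$
where
$$C_6^{(a)} = \begin{bmatrix} & c_6 & & & c_6 & \\ & f_6 & & & -f_6 & \\ c_6 & c_6 & -c_6 & -c_6 & c_6 & c_6 \\ & & & & & \\ & -c_6 & & & -c_6 & \\ c_6 & -c_6 & c_6 & -c_6 & c_6 & -c_6 \end{bmatrix},\quad C_6^{(b)} = \begin{bmatrix} a_6 & & e_6 & e_6 & & a_6 \\ b_6 & & b_6 & -b_6 & & -b_6 \\ & & & & & \\ d_6 & & -d_6 & d_6 & & -d_6 \\ e_6 & & a_6 & a_6 & & e_6 \\ & & & & & \end{bmatrix}.$$
The matrix $C_6^{(a)}$ has two entries of the same magnitude in the first, second, and fifth rows and six entries of the same magnitude in the third and sixth rows, which allows us to reduce the number of operations without further transformations.
After eliminating redundancy in matrix $C_6^{(b)}$ and eliminating rows and columns containing only zero entries, we obtain matrix $C_4$:
$$C_4 = \begin{bmatrix} a_6 & e_6 & e_6 & a_6 \\ b_6 & b_6 & -b_6 & -b_6 \\ d_6 & -d_6 & d_6 & -d_6 \\ e_6 & a_6 & a_6 & e_6 \end{bmatrix}.$$
Let us define the permutation $\pi_4^{(2)}$ in the following form:
$$\pi_4^{(2)} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 2 & 3 \end{pmatrix}.$$
We permute the columns of $C_4$ according to $\pi_4^{(0)}$ and the rows according to $\pi_4^{(2)}$. After the permutations, the matrix matches the pattern
$$C_4 = \begin{bmatrix} A_2 & A_2 \\ B_2 & -B_2 \end{bmatrix}$$
where
$$A_2 = \begin{bmatrix} a_6 & e_6 \\ e_6 & a_6 \end{bmatrix},\quad B_2 = \begin{bmatrix} b_6 & b_6 \\ d_6 & -d_6 \end{bmatrix}.$$
Taking this into account, we can derive the final expression:
$$Y_{6\times 1} = P_{6\times 8} W_8^{(2)} D_8 W_8^{(1)} W_8^{(0)} P_{8\times 6} X_{6\times 1}$$
where
$$P_{8\times 6} = \begin{bmatrix} & 1 & & & -1 & \\ 1 & & & & & \\ & & 1 & & & \\ & & & & & 1 \\ & & & 1 & & \\ & 1 & & & 1 & \\ 1 & 1 & -1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 & 1 & -1 \end{bmatrix},\quad W_8^{(0)} = 1 \oplus W_4^{(1)} \oplus I_3,\quad W_8^{(1)} = 1 \oplus H_2 \oplus H_2 \oplus I_3,$$
$$D_8 = \operatorname{diag}\left(s_0^{(6)}, s_1^{(6)}, \ldots, s_7^{(6)}\right),\quad s_0^{(6)} = f_6,\quad s_1^{(6)} = \frac{a_6 + e_6}{2},\quad s_2^{(6)} = \frac{a_6 - e_6}{2},\quad s_3^{(6)} = b_6,\quad s_4^{(6)} = d_6,$$
$$s_5^{(6)} = s_6^{(6)} = s_7^{(6)} = c_6,\quad W_8^{(2)} = 1 \oplus H_2 \oplus I_5,$$
$$P_{6\times 8} = \begin{bmatrix} & 1 & & & & 1 & & \\ 1 & & & 1 & & & & \\ & & & & & & 1 & \\ & & & & 1 & & & \\ & & 1 & & & -1 & & \\ & & & & & & & 1 \end{bmatrix}.$$
Figure 5 shows a data flow graph of the synthesized algorithm for the six-point DST-II. As can be seen, the number of multiplications is reduced from 30 to 7 and the number of additions from 28 to 25.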
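A Python sketch of the six-point procedure follows. As before, it is an illustration with hypothetical function names; the order in which the three products with c_6 are formed is a free choice and is an assumption here. The product with d_6 = 0.5 is written as an ordinary multiplication.

```python
import math

S = math.sqrt(1.0 / 3.0)
A6 = S * math.sin(math.pi / 12)      # 0.1494
B6 = S * 0.5                         # 0.2887
C6 = S * math.sqrt(0.5)              # 0.4082
D6 = 0.5
E6 = S * math.sin(5 * math.pi / 12)  # 0.5577
F6 = S                               # 0.5774

def dst2_6pt(x):
    x0, x1, x2, x3, x4, x5 = x
    # pre-additions (12 additions)
    v0 = x1 - x4
    v5 = x1 + x4
    v6 = x0 + x1 - x2 - x3 + x4 + x5
    v7 = x0 - x1 + x2 - x3 + x4 - x5
    # butterflies on (x0, x5) and (x2, x3) (4 additions)
    p0, p1 = x0 + x5, x2 + x3
    q0, q1 = x0 - x5, x2 - x3
    # multiplications (4 additions inside the arguments)
    m0 = F6 * v0
    m1 = 0.5 * (A6 + E6) * (p0 + p1)
    m2 = 0.5 * (A6 - E6) * (p0 - p1)
    m3 = B6 * (q0 + q1)
    m4 = D6 * (q0 - q1)
    m5, m6, m7 = C6 * v5, C6 * v6, C6 * v7
    # post-additions (2 + 3 additions)
    r1, r2 = m1 + m2, m1 - m2
    return [r1 + m5, m0 + m3, m6, m4, r2 - m5, m7]

def dst2_direct(x):
    """Reference: direct evaluation of the DST-II definition (hypothetical helper)."""
    n = len(x)
    return [math.sqrt(2.0 / n) * (math.sqrt(0.5) if k == n - 1 else 1.0)
            * sum(x[l] * math.sin((l + 0.5) * (k + 1) * math.pi / n) for l in range(n))
            for k in range(n)]

print(dst2_6pt([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```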

8. Algorithm for 7-Point DST-II

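The expression for the seven-point DST-II is as follows:
$$Y_{7\times 1} = C_7 X_{7\times 1}$$
where
$$Y_{7\times 1} = [y_0, y_1, y_2, y_3, y_4, y_5, y_6]^{T},\quad X_{7\times 1} = [x_0, x_1, x_2, x_3, x_4, x_5, x_6]^{T},$$
$$C_7 = \begin{bmatrix} a_7 & c_7 & e_7 & h_7 & e_7 & c_7 & a_7 \\ b_7 & f_7 & d_7 & 0 & -d_7 & -f_7 & -b_7 \\ c_7 & e_7 & -a_7 & -h_7 & -a_7 & e_7 & c_7 \\ d_7 & b_7 & -f_7 & 0 & f_7 & -b_7 & -d_7 \\ e_7 & -a_7 & -c_7 & h_7 & -c_7 & -a_7 & e_7 \\ f_7 & -d_7 & b_7 & 0 & -b_7 & d_7 & -f_7 \\ g_7 & -g_7 & g_7 & -g_7 & g_7 & -g_7 & g_7 \end{bmatrix},$$
$$a_7 = 0.1189,\quad b_7 = 0.2319,\quad c_7 = 0.3333,\quad d_7 = 0.4179,\quad e_7 = 0.4816,\quad f_7 = 0.5211,\quad g_7 = 0.3780,\quad h_7 = 0.5345.$$
Now, we decompose the matrix $C_7$ into two components:
$$C_7 = C_7^{(a)} + C_7^{(b)}$$
where
$$C_7^{(a)} = \begin{bmatrix} & & & h_7 & & & \\ & & & & & & \\ & & & -h_7 & & & \\ & & & & & & \\ & & & h_7 & & & \\ & & & & & & \\ g_7 & -g_7 & g_7 & -g_7 & g_7 & -g_7 & g_7 \end{bmatrix},$$
$$C_7^{(b)} = \begin{bmatrix} a_7 & c_7 & e_7 & & e_7 & c_7 & a_7 \\ b_7 & f_7 & d_7 & & -d_7 & -f_7 & -b_7 \\ c_7 & e_7 & -a_7 & & -a_7 & e_7 & c_7 \\ d_7 & b_7 & -f_7 & & f_7 & -b_7 & -d_7 \\ e_7 & -a_7 & -c_7 & & -c_7 & -a_7 & e_7 \\ f_7 & -d_7 & b_7 & & -b_7 & d_7 & -f_7 \\ & & & & & & \end{bmatrix}.$$
The matrix $C_7^{(a)}$ has one entry in the first, third, and fifth rows and seven entries of the same magnitude in the seventh row, which allows us to reduce the number of operations without further transformations. After eliminating redundancy in matrix $C_7^{(b)}$ and eliminating rows and columns containing only zero entries, we obtain matrix $C_6$:
$$C_6 = \begin{bmatrix} a_7 & c_7 & e_7 & e_7 & c_7 & a_7 \\ b_7 & f_7 & d_7 & -d_7 & -f_7 & -b_7 \\ c_7 & e_7 & -a_7 & -a_7 & e_7 & c_7 \\ d_7 & b_7 & -f_7 & f_7 & -b_7 & -d_7 \\ e_7 & -a_7 & -c_7 & -c_7 & -a_7 & e_7 \\ f_7 & -d_7 & b_7 & -b_7 & d_7 & -f_7 \end{bmatrix}.$$
Let us define the permutations $\pi_6^{(0)}$ and $\pi_6^{(1)}$ in the following form:
$$\pi_6^{(0)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1 & 5 & 3 & 2 & 4 & 6 \end{pmatrix},\quad \pi_6^{(1)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1 & 2 & 3 & 6 & 5 & 4 \end{pmatrix}.$$
We permute the columns of $C_6$ according to $\pi_6^{(1)}$ and the rows according to $\pi_6^{(0)}$. After the permutations, the matrix matches the pattern
$$C_6 = \begin{bmatrix} A_3 & A_3 \\ B_3 & -B_3 \end{bmatrix} \quad\text{where}\quad A_3 = \begin{bmatrix} a_7 & c_7 & e_7 \\ e_7 & -a_7 & -c_7 \\ c_7 & e_7 & -a_7 \end{bmatrix},\quad B_3 = \begin{bmatrix} b_7 & f_7 & d_7 \\ d_7 & b_7 & -f_7 \\ f_7 & -d_7 & b_7 \end{bmatrix}.$$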
Now, we move on to the matrices $A_3$ and $B_3$. Here, a circular convolution matrix is used [38]. The circular convolution matrix for N = 3 and the expressions for its multipliers are as follows:
$$H_3 = \begin{bmatrix} h_0 & h_2 & h_1 \\ h_1 & h_0 & h_2 \\ h_2 & h_1 & h_0 \end{bmatrix},\quad s_0 = \frac{h_0 + h_1 + h_2}{3},\quad s_1 = h_0 - h_2,\quad s_2 = h_1 - h_2,\quad s_3 = \frac{h_0 + h_1 - 2h_2}{3}.$$
The calculation procedure for the circular convolution matrix for N = 3 is presented below:
$$H_3 = T_3^{(1)} T_{3\times 4} D_4^{(1)} T_{4\times 3} T_3^{(0)}$$
where
$$T_3^{(0)} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & & -1 \\ & 1 & -1 \end{bmatrix},\quad T_{4\times 3} = \begin{bmatrix} 1 & & \\ & 1 & \\ & & 1 \\ & 1 & 1 \end{bmatrix},\quad D_4^{(1)} = \operatorname{diag}(s_0, s_1, s_2, s_3),$$
$$T_{3\times 4} = \begin{bmatrix} 1 & & & \\ & 1 & & -1 \\ & & 1 & -1 \end{bmatrix},\quad T_3^{(1)} = \begin{bmatrix} 1 & 1 & \\ 1 & -1 & -1 \\ 1 & & 1 \end{bmatrix}.$$
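The three-point circular convolution with four multiplications can be sketched in Python as below, with one stage per matrix factor. The function names `circ3` and `circ3_direct` are assumptions; the multipliers are recomputed on every call for clarity, whereas in a fixed transform they are constants computed once.

```python
def circ3(h, x):
    """Product of the 3 x 3 circular convolution matrix built from h with x."""
    h0, h1, h2 = h
    x0, x1, x2 = x
    # multipliers (precomputed constants in a fixed transform)
    s0 = (h0 + h1 + h2) / 3.0
    s1 = h0 - h2
    s2 = h1 - h2
    s3 = (h0 + h1 - 2.0 * h2) / 3.0
    # first pre-addition stage (4 additions)
    t0, t1, t2 = x0 + x1 + x2, x0 - x2, x1 - x2
    # second pre-addition stage (1 addition)
    t3 = t1 + t2
    # four multiplications
    m0, m1, m2, m3 = s0 * t0, s1 * t1, s2 * t2, s3 * t3
    # first post-addition stage (2 additions)
    n1, n2 = m1 - m3, m2 - m3
    # second post-addition stage (4 additions)
    return [m0 + n1, m0 - n1 - n2, m0 + n2]

def circ3_direct(h, x):
    """Reference: direct product with the circulant matrix (9 multiplications)."""
    h0, h1, h2 = h
    x0, x1, x2 = x
    return [h0 * x0 + h2 * x1 + h1 * x2,
            h1 * x0 + h0 * x1 + h2 * x2,
            h2 * x0 + h1 * x1 + h0 * x2]

print(circ3((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))
```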
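To make the matrices $A_3$ and $B_3$ consistent with the circular convolution expression, we need to modify them. In the matrix $A_3$, we change the sign of all terms in the second column and in the third row. In the matrix $B_3$, we change the sign of all terms in the first row and in the first column. An entry that lies in both a negated row and a negated column keeps its sign. In this way, we obtain the matrices
$$A_3 = \begin{bmatrix} a_7 & -c_7 & e_7 \\ e_7 & a_7 & -c_7 \\ -c_7 & e_7 & a_7 \end{bmatrix},\quad B_3 = \begin{bmatrix} b_7 & -f_7 & -d_7 \\ -d_7 & b_7 & -f_7 \\ -f_7 & -d_7 & b_7 \end{bmatrix}.$$
Using the three-point convolution algorithm, the values $s_i$ for the matrices $A_3$ and $B_3$ take the following form:
$$s_1^{(7)} = \frac{a_7 + e_7 - c_7}{3},\quad s_2^{(7)} = a_7 + c_7,\quad s_3^{(7)} = e_7 + c_7,\quad s_4^{(7)} = \frac{a_7 + e_7 + 2c_7}{3};$$
$$s_5^{(7)} = \frac{b_7 - d_7 - f_7}{3},\quad s_6^{(7)} = b_7 + f_7,\quad s_7^{(7)} = -d_7 + f_7,\quad s_8^{(7)} = \frac{b_7 - d_7 + 2f_7}{3}.$$
Considering the presented derivations, the final computational procedure for the seven-point DST-II is as follows:
$$Y_{7\times 1} = P_{7\times 8} W_8^{(7)} W_8^{(6)} W_{8\times 10}^{(0)} D_{10} W_{10\times 8}^{(0)} W_8^{(5)} W_8^{(4)} W_8^{(3)} P_{8\times 7} X_{7\times 1}$$
where
$$P_{8\times 7} = \begin{bmatrix} & & & 1 & & & \\ 1 & & & & & & \\ & 1 & & & & & \\ & & 1 & & & & \\ & & & & & & 1 \\ & & & & & 1 & \\ & & & & 1 & & \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 \end{bmatrix},\quad W_8^{(3)} = 1 \oplus W_6^{(1)} \oplus 1,\quad W_6^{(1)} = H_2 \otimes I_3,$$
$$W_8^{(4)} = I_2 \oplus (-1) \oplus 1 \oplus (-1) \oplus I_3,\quad W_8^{(5)} = 1 \oplus T_3^{(0)} \oplus T_3^{(0)} \oplus 1,$$
$$W_{10\times 8}^{(0)} = 1 \oplus T_{4\times 3} \oplus T_{4\times 3} \oplus 1,\quad D_{10} = \operatorname{diag}\left(s_0^{(7)}, s_1^{(7)}, \ldots, s_9^{(7)}\right),\quad s_0^{(7)} = h_7,\quad s_9^{(7)} = g_7,$$
$$W_{8\times 10}^{(0)} = 1 \oplus T_{3\times 4} \oplus T_{3\times 4} \oplus 1,\quad W_8^{(6)} = 1 \oplus T_3^{(1)} \oplus T_3^{(1)} \oplus 1,\quad W_8^{(7)} = I_3 \oplus (-I_2) \oplus I_3,$$
$$P_{7\times 8} = \begin{bmatrix} 1 & 1 & & & & & & \\ & & & & 1 & & & \\ -1 & & & 1 & & & & \\ & & & & & 1 & & \\ 1 & & 1 & & & & & \\ & & & & & & 1 & \\ & & & & & & & 1 \end{bmatrix}.$$
Figure 6 shows a data flow graph of the synthesized algorithm for the seven-point DST-II. As can be seen, the number of multiplications is reduced from 46 to 10 and the number of additions from 39 to 37.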
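The seven-point procedure can be sketched in Python as two three-point circular convolutions wrapped in sign changes, plus the two scalar products with h_7 and g_7. This is an illustration with hypothetical function names, not the authors' code; the convolution multipliers are recomputed inside `circ3` for readability, although they are constants of the transform.

```python
import math

S = math.sqrt(2.0 / 7.0)
A7, B7, C7, D7, E7, F7 = (S * math.sin(m * math.pi / 14) for m in range(1, 7))
H7 = S                    # 0.5345
G7 = S / math.sqrt(2.0)   # 0.3780

def circ3(h, x):
    """Three-point circular convolution with four multiplications."""
    h0, h1, h2 = h
    s0, s1 = (h0 + h1 + h2) / 3.0, h0 - h2
    s2, s3 = h1 - h2, (h0 + h1 - 2.0 * h2) / 3.0
    t0, t1, t2 = x[0] + x[1] + x[2], x[0] - x[2], x[1] - x[2]
    m0, m1, m2, m3 = s0 * t0, s1 * t1, s2 * t2, s3 * (t1 + t2)
    n1, n2 = m1 - m3, m2 - m3
    return [m0 + n1, m0 - n1 - n2, m0 + n2]

def dst2_7pt(x):
    x0, x1, x2, x3, x4, x5, x6 = x
    # alternating sum for the last output (6 additions)
    v7 = x0 - x1 + x2 - x3 + x4 - x5 + x6
    # butterflies on mirrored pairs (6 additions)
    u = [x0 + x6, x1 + x5, x2 + x4]
    d = [x0 - x6, x1 - x5, x2 - x4]
    # sign changes on inputs, circular convolutions, sign changes on outputs
    za = circ3((A7, E7, -C7), (u[0], -u[1], u[2]))
    zb = circ3((B7, -D7, -F7), (-d[0], d[1], d[2]))
    r1, r2, r3 = za[0], za[1], -za[2]
    r4, r5, r6 = -zb[0], zb[1], zb[2]
    m0 = H7 * x3
    # post-additions with the middle-sample term (3 additions)
    return [r1 + m0, r4, r3 - m0, r5, r2 + m0, r6, G7 * v7]

def dst2_direct(x):
    """Reference: direct evaluation of the DST-II definition (hypothetical helper)."""
    n = len(x)
    return [math.sqrt(2.0 / n) * (math.sqrt(0.5) if k == n - 1 else 1.0)
            * sum(x[l] * math.sin((l + 0.5) * (k + 1) * math.pi / n) for l in range(n))
            for k in range(n)]

print(dst2_7pt([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]))
```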

9. Algorithm for 8-Point DST-II

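The expression for the eight-point DST-II is as follows:
$$Y_{8\times 1} = C_8 X_{8\times 1}$$
where
$$Y_{8\times 1} = [y_0, y_1, \ldots, y_7]^{T},\quad X_{8\times 1} = [x_0, x_1, \ldots, x_7]^{T},$$
$$C_8 = \begin{bmatrix} a_8 & c_8 & e_8 & g_8 & g_8 & e_8 & c_8 & a_8 \\ b_8 & f_8 & f_8 & b_8 & -b_8 & -f_8 & -f_8 & -b_8 \\ c_8 & g_8 & a_8 & -e_8 & -e_8 & a_8 & g_8 & c_8 \\ d_8 & d_8 & -d_8 & -d_8 & d_8 & d_8 & -d_8 & -d_8 \\ e_8 & a_8 & -g_8 & c_8 & c_8 & -g_8 & a_8 & e_8 \\ f_8 & -b_8 & -b_8 & f_8 & -f_8 & b_8 & b_8 & -f_8 \\ g_8 & -e_8 & c_8 & -a_8 & -a_8 & c_8 & -e_8 & g_8 \\ d_8 & -d_8 & d_8 & -d_8 & d_8 & -d_8 & d_8 & -d_8 \end{bmatrix},$$
$$a_8 = 0.0975,\quad b_8 = 0.1913,\quad c_8 = 0.2778,\quad d_8 = 0.3536,\quad e_8 = 0.4157,\quad f_8 = 0.4619,\quad g_8 = 0.4904.$$
Let us define the permutations $\pi_8^{(0)}$ and $\pi_8^{(1)}$ in the following form:
$$\pi_8^{(0)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 1 & 2 & 3 & 4 & 8 & 7 & 6 & 5 \end{pmatrix},\quad \pi_8^{(1)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 1 & 5 & 3 & 7 & 2 & 6 & 4 & 8 \end{pmatrix}.$$
We permute the columns of $C_8$ according to $\pi_8^{(0)}$ and the rows according to $\pi_8^{(1)}$. After the permutations, the matrix matches the pattern
$$\begin{bmatrix} A_4 & A_4 \\ B_4 & -B_4 \end{bmatrix}$$
where
$$A_4 = \begin{bmatrix} a_8 & c_8 & e_8 & g_8 \\ e_8 & a_8 & -g_8 & c_8 \\ c_8 & g_8 & a_8 & -e_8 \\ g_8 & -e_8 & c_8 & -a_8 \end{bmatrix},\quad B_4 = \begin{bmatrix} b_8 & f_8 & f_8 & b_8 \\ f_8 & -b_8 & -b_8 & f_8 \\ d_8 & d_8 & -d_8 & -d_8 \\ d_8 & -d_8 & d_8 & -d_8 \end{bmatrix}.$$
After this operation, the calculation procedure is as follows:
$$Y_{8\times 1} = P_8^{(1)} D_8 W_8^{(8)} P_8^{(0)} X_{8\times 1}$$
where
$$P_8^{(0)} = \begin{bmatrix} 1 & & & & & & & \\ & 1 & & & & & & \\ & & 1 & & & & & \\ & & & 1 & & & & \\ & & & & & & & 1 \\ & & & & & & 1 & \\ & & & & & 1 & & \\ & & & & 1 & & & \end{bmatrix},\quad D_8 = A_4 \oplus B_4,\quad W_8^{(8)} = H_2 \otimes I_4,\quad P_8^{(1)} = \begin{bmatrix} 1 & & & & & & & \\ & & & & 1 & & & \\ & & 1 & & & & & \\ & & & & & & 1 & \\ & 1 & & & & & & \\ & & & & & 1 & & \\ & & & 1 & & & & \\ & & & & & & & 1 \end{bmatrix}.$$
Now, we deal with the matrices $A_4$ and $B_4$. The matrix $A_4$ does not fit any pattern, so we need to modify it. We do this by changing the sign of the third column. In this way, we obtain the matrix
$$A_4 = \begin{bmatrix} a_8 & c_8 & -e_8 & g_8 \\ e_8 & a_8 & g_8 & c_8 \\ c_8 & g_8 & -a_8 & -e_8 \\ g_8 & -e_8 & -c_8 & -a_8 \end{bmatrix}.$$
Let us define the permutation $\pi_4^{(3)}$ in the following form:
$$\pi_4^{(3)} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 4 \end{pmatrix}.$$
We permute the columns of $A_4$ according to $\pi_4^{(0)}$ and the rows according to $\pi_4^{(3)}$. Now, $A_4$ fits the pattern
$$A_4 = \begin{bmatrix} A_2 & B_2 \\ B_2 & -A_2 \end{bmatrix} \quad\text{where}\quad A_2 = \begin{bmatrix} e_8 & a_8 \\ a_8 & c_8 \end{bmatrix},\quad B_2 = \begin{bmatrix} c_8 & g_8 \\ g_8 & -e_8 \end{bmatrix}.$$
Let us permute the columns of $B_4$ according to $\pi_4^{(0)}$. Then, we are able to use the matrix pattern
$$B_4 = \begin{bmatrix} E_2 & E_2 \\ F_2 & -F_2 \end{bmatrix} \quad\text{where}\quad E_2 = \begin{bmatrix} b_8 & f_8 \\ f_8 & -b_8 \end{bmatrix},\quad F_2 = \begin{bmatrix} d_8 & d_8 \\ d_8 & -d_8 \end{bmatrix}.$$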
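After this operation, the calculation procedure is as follows:
$$Y_{8\times 1} = P_8^{(1)} P_8^{(3)} W_{8\times 10}^{(1)} D_{10} W_{10}^{(0)} W_{10\times 8}^{(1)} P_8^{(2)} W_8^{(9)} W_8^{(8)} P_8^{(0)} X_{8\times 1}$$
where
$$W_8^{(9)} = I_2 \oplus (-1) \oplus I_5,\quad P_8^{(2)} = P_4^{(0)} \oplus P_4^{(0)},\quad W_{10\times 8}^{(1)} = \left(T_{3\times 2}^{(3)} \otimes I_2\right) \oplus I_4,\quad T_{3\times 2}^{(3)} = \begin{bmatrix} 1 & \\ & 1 \\ 1 & 1 \end{bmatrix},$$
$$W_{10}^{(0)} = I_6 \oplus W_4^{(1)},\quad D_{10} = G_2 \oplus J_2 \oplus B_2 \oplus E_2 \oplus F_2,$$
$$G_2 = A_2 - B_2 = \begin{bmatrix} e_8 - c_8 & a_8 - g_8 \\ a_8 - g_8 & c_8 + e_8 \end{bmatrix},\quad J_2 = -A_2 - B_2 = \begin{bmatrix} -e_8 - c_8 & -a_8 - g_8 \\ -a_8 - g_8 & -c_8 + e_8 \end{bmatrix},$$
$$W_{8\times 10}^{(1)} = \left(T_{2\times 3}^{(4)} \otimes I_2\right) \oplus I_4,\quad P_8^{(3)} = P_4^{(2)} \oplus I_4,\quad P_4^{(2)} = \begin{bmatrix} & 1 & & \\ 1 & & & \\ & & 1 & \\ & & & 1 \end{bmatrix}.$$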
Finally, we deal with the five matrices of size 2. In the matrices $G_2$, $J_2$, and $B_2$, we swap the two rows. All three matrices then fit the pattern
$$\begin{bmatrix} a & b \\ c & a \end{bmatrix}.$$
The matrices $E_2$ and $F_2$ fit the pattern
$$\begin{bmatrix} a & b \\ b & -a \end{bmatrix}.$$
After a row swap, $E_2$ also fits the first pattern with $c = -b$, while $F_2 = d_8 H_2$ requires only a butterfly and two multiplications.
Taking into account the matrix structures described above, the final computational procedure for the eight-point DST-II can be written as follows:
$$Y_{8\times 1} = P_8^{(1)} P_8^{(3)} W_{8\times 10}^{(1)} P_{10} W_{10}^{(1)} W_{10\times 14} D_{14} W_{14\times 10} W_{10}^{(0)} W_{10\times 8}^{(1)} P_8^{(2)} W_8^{(9)} W_8^{(8)} P_8^{(0)} X_{8\times 1}$$
where
$$W_{14\times 10} = T_{3\times 2}^{(3)} \oplus T_{3\times 2}^{(3)} \oplus T_{3\times 2}^{(3)} \oplus T_{3\times 2}^{(3)} \oplus I_2,\quad D_{14} = \operatorname{diag}\left(s_0^{(8)}, s_1^{(8)}, \ldots, s_{13}^{(8)}\right),$$
$$s_0^{(8)} = e_8 - c_8 - a_8 + g_8,\quad s_1^{(8)} = c_8 + e_8 - a_8 + g_8,\quad s_2^{(8)} = a_8 - g_8,$$
$$s_3^{(8)} = -e_8 - c_8 + a_8 + g_8,\quad s_4^{(8)} = -c_8 + e_8 + a_8 + g_8,\quad s_5^{(8)} = -a_8 - g_8,$$
$$s_6^{(8)} = c_8 - g_8,\quad s_7^{(8)} = -e_8 - g_8,\quad s_8^{(8)} = g_8,$$
$$s_9^{(8)} = b_8 - f_8,\quad s_{10}^{(8)} = -b_8 - f_8,\quad s_{11}^{(8)} = f_8,\quad s_{12}^{(8)} = s_{13}^{(8)} = d_8,$$
$$W_{10\times 14} = T_{2\times 3}^{(3)} \oplus T_{2\times 3}^{(3)} \oplus T_{2\times 3}^{(3)} \oplus T_{2\times 3}^{(3)} \oplus I_2,\quad T_{2\times 3}^{(3)} = \begin{bmatrix} & 1 & 1 \\ 1 & & 1 \end{bmatrix},$$
$$W_{10}^{(1)} = I_8 \oplus H_2,\quad P_{10} = P_2 \oplus P_2 \oplus P_2 \oplus P_2 \oplus I_2,\quad P_2 = \begin{bmatrix} & 1 \\ 1 & \end{bmatrix}.$$
Figure 7 shows a data flow graph of the synthesized algorithm for the eight-point DST-II. As can be seen, the number of multiplications is reduced from 64 to 14 and the number of additions from 56 to 32.
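The structure of the eight-point procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription of the authors' data flow graph: the helper `sym3`, which multiplies a symmetric 2 × 2 matrix by a vector with three multiplications, is hypothetical, and it folds the row swap and the permutation back into a single step. The multipliers it forms coincide with the first twelve multipliers listed in the text, and they would be precomputed in practice.

```python
import math

A8, B8, C8, D8, E8, F8, G8 = (0.5 * math.sin(m * math.pi / 16) for m in range(1, 8))

def sym3(p, q, r, u0, u1):
    """[[p, q], [q, r]] times (u0, u1) with 3 multiplications and 3 additions."""
    m0 = (p - q) * u0          # p - q and r - q are constants of the transform
    m1 = (r - q) * u1
    m2 = q * (u0 + u1)
    return m0 + m2, m1 + m2

def dst2_8pt(x):
    # butterflies on mirrored pairs (8 additions)
    u = [x[i] + x[7 - i] for i in range(4)]
    v = [x[i] - x[7 - i] for i in range(4)]
    # A_4 part: sign change of the third input, then blocks U0, U1 and their sum
    U0, U1 = (u[0], u[1]), (u[3], -u[2])
    US = (U0[0] + U1[0], U0[1] + U1[1])                # 2 additions
    g = sym3(E8 - C8, A8 - G8, C8 + E8, U0[0], U0[1])     # G_2 = A_2 - B_2
    j = sym3(-E8 - C8, -A8 - G8, E8 - C8, U1[0], U1[1])   # J_2 = -A_2 - B_2
    b = sym3(C8, G8, -E8, US[0], US[1])                   # B_2
    y4, y0 = g[0] + b[0], g[1] + b[1]                  # 4 additions in total
    y2, y6 = j[0] + b[0], j[1] + b[1]
    # B_4 part: butterflies, then E_2 (3 multiplications) and F_2 (2 multiplications)
    w0, w1 = v[0] + v[3], v[1] + v[2]
    w2, w3 = v[0] - v[3], v[1] - v[2]
    y1, y5 = sym3(B8, F8, -B8, w0, w1)
    y3, y7 = D8 * (w2 + w3), D8 * (w2 - w3)
    return [y0, y1, y2, y3, y4, y5, y6, y7]

def dst2_direct(x):
    """Reference: direct evaluation of the DST-II definition (hypothetical helper)."""
    n = len(x)
    return [math.sqrt(2.0 / n) * (math.sqrt(0.5) if k == n - 1 else 1.0)
            * sum(x[l] * math.sin((l + 0.5) * (k + 1) * math.pi / n) for l in range(n))
            for k in range(n)]

print(dst2_8pt([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
```

Counting the operations in this sketch gives 14 multiplications and 32 additions, in line with the figures quoted above.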

10. Discussion of Computational Complexity

First, we explain how the numbers of multiplications and additions are counted for the direct DST-II computation and for the proposed solutions. A multiplication by a power of two can be replaced by a shift, so it is not counted as a multiplication. If a matrix entry is zero, neither the multiplication nor the addition associated with it is counted.
Such values appear in the following matrices: $C_3$, one zero; $C_4$, eight values of 0.5; $C_5$, two zeros; $C_6$, two zeros and four values of 0.5; $C_7$, three zeros. In the proposed solutions, the diagonal matrices contain the following: $D_5$, two values of 0.5; $D_8$, one value of 0.5.
This work shows how the number of multiplications in DST-II algorithms of sizes 2 to 8 can be reduced. At the same time, the number of additions was slightly reduced. The number of additions was reduced by an average of 20%, and the number of multiplications by an average of 74%. The achieved results are presented in Table 1, which was compiled according to the above rules.
This allows for a significant reduction in the resources used on a signal processor, while speeding up the computation and making real-time operation easier. The reduction in multiplications contributes most to this, because a multiplication is more expensive to implement than an addition.
Each proposed algorithm has been implemented and tested in the MATLAB environment.

11. Conclusions

To date, many papers have been published on fast algorithms for implementing discrete trigonometric transforms [1,2,11,20,27]. These studies have not lost their relevance, and this article continues them. We have focused on fast algorithms for short input sequences because small-size fast discrete trigonometric transform algorithms are of particular interest: they are subsequently used as building blocks for larger algorithms. We plan to collect a library of fast short-length algorithms for all types of discrete trigonometric transforms. For some types, such as DCT-I, DCT-II, and DCT-IV (as well as some others), such algorithms have already been developed [32,36,37]. The subject of this research is fast algorithms for small-size DST-II transforms. The solutions presented here are intended to extend the collection of fast discrete trigonometric transform algorithms that many researchers have been working on for several decades. We present the new algorithms we have obtained without claiming that they are optimal, and we share them with the scientific community. If someone manages to achieve better results, we will welcome it.

Author Contributions

Conceptualization, A.C.; methodology, A.C., K.B. and M.R.; software, K.B. and M.R.; validation, M.R.; formal analysis, A.C. and K.B.; investigation, K.B. and M.R.; writing—original draft preparation, M.R. and A.C.; writing—review and editing, A.C. and M.R.; supervision, A.C.; All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Britanak, V.; Yip, P.C.; Rao, K.R. Discrete Cosine and Sine Transforms: General Properties, Fast Algorithms and Integer Approximations; Academic Press: Amsterdam, The Netherlands; Boston, MA, USA, 2007.
  2. Britanak, V.; Rao, K.R. Cosine-Sine-Modulated Filter Banks: General Properties, Fast Algorithms and Integer Approximations; Springer International Publishing: Cham, Switzerland, 2018.
  3. Mathur, S.; Mathur, A.; Agarwal, N. A Review of basic Mathematical Transformation used in Image Processing. Int. J. Eng. Res. Technol. 2016, 4, 12.
  4. Martucci, S.A.; Mersereau, R.M. New approaches to block filtering of images using symmetric convolution and the DST or DCT. In Proceedings of the 1993 IEEE International Symposium on Circuits and Systems, Chicago, IL, USA, 3–6 May 1993; Volume 1, pp. 259–262.
  5. Malini, S.; Moni, R.S. Use of Discrete Sine Transform for A Novel Image Denoising Technique. Int. J. Image Process. 2014, 8, 204–213.
  6. Dhamija, S.; Jain, P. Comparative Analysis for Discrete Sine Transform as a suitable method for noise estimation. Int. J. Comput. Sci. Issues 2011, 8, 162–164.
  7. Li, X.; Xie, H.; Cheng, B. Noisy Speech Enhancement Based on Discrete Sine Transform. In Proceedings of the First International Multi-Symposiums on Computer and Computational Sciences (IMSCCS’06), Hangzhou, China, 20–24 June 2006; Volume 1, pp. 199–202.
  8. Wang, Z.; Jullien, G.; Miller, W. Interpolation using the discrete sine transform with increased accuracy. Electron. Lett. 1993, 29, 1918–1920.
  9. Agarwal, N.; Solanki, R.; Khan, A. Application of Discrete Sine Transform in Image Processing. Int. J. Eng. Res. Technol. 2015, 3, 23.
  10. Joshi, R.; Reznik, Y.A.; Karczewicz, M. Efficient large size transforms for high-performance video coding. In Proceedings of the Applications of Digital Image Processing XXXIII, San Diego, CA, USA, 7 September 2010; Tescher, A.G., Ed.; SPIE: Paris, France, 2010.
  11. Reznik, Y.A. Relationship between DCT-II, DCT-VI, and DST-VII transforms. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 5642–5646.
  12. Jain, P.; Jain, A. Regressive Structures for Computation of DST-II and Its Inverse. ISRN Electron. 2012, 2012, 1–4.
  13. Clarke, R. Relation between the Karhunen–Loeve and sine transforms. Electron. Lett. 1984, 20, 12.
  14. Chiper, D.; Swamy, M.; Ahmad, M.; Stouraitis, T. Systolic algorithms and a memory-based design approach for a unified architecture for the computation of DCT/DST/IDCT/IDST. IEEE Trans. Circuits Syst. Regul. Pap. 2005, 52, 1125–1137.
  15. Britanak, V.; Rao, R. Two-dimensional DCT/DST universal computational structure for 2m × 2n block sizes. IEEE Trans. Signal Process. 2000, 48, 3250–3255.
  16. Chiper, D.; Swamy, M.; Ahmad, M.; Stouraitis, T. A systolic array architecture for the discrete sine transform. IEEE Trans. Signal Process. 2002, 50, 2347–2354.
  17. Meher, P.K.; Vinod, A.P.; Patra, J.C.; Swamy, M.N.S. Reduced-Complexity Concurrent Systolic Implementation of the Discrete Sine Transform. In Proceedings of the APCCAS 2006—2006 IEEE Asia Pacific Conference on Circuits and Systems, Singapore, 4–7 December 2006; pp. 1535–1538.
  18. Meher, P.K.; Swamy, M.N.S. New Systolic Algorithm and Array Architecture for Prime-Length Discrete Sine Transform. IEEE Trans. Circuits Syst. II Express Briefs 2007, 54, 262–266.
  19. Wang, Z. Fast discrete sine transform algorithms. Signal Process. 1990, 19, 91–102.
  20. Murthy, N.; Swamy, M. On the computation of running discrete cosine and sine transform. IEEE Trans. Signal Process. 1992, 40, 1430–1437.
  21. Nikara, J.A.; Takala, J.H.; Astola, J.T. Discrete cosine and sine transforms—regular algorithms and pipeline architectures. Signal Process. 2006, 86, 230–249.
  22. Murty, M.N. Realization of Prime-Length Discrete Sine Transform Using Cyclic Convolution. Int. J. Eng. Sci. Technol. 2013, 5, 583–589.
  23. Shao, X.; Johnson, S.G. Type-II/III DCT/DST algorithms with reduced number of arithmetic operations. Signal Process. 2008, 88, 1553–1564.
  24. Gupta, A.; Rao, K. A fast recursive algorithm for the discrete sine transform. IEEE Trans. Acoust. Speech, Signal Process. 1990, 38, 553–557.
  25. Perera, S.M.; Lingsch, L.E. Sparse Matrix Based Low-Complexity, Recursive, and Radix-2 Algorithms for Discrete Sine Transforms. IEEE Access 2021, 9, 141181–141198.
  26. Gnativ, L.A.; Shevchuk, E.S. Methods of Synthesis of Efficient Orthogonal Transforms of High and Low Correlation and Their Fast Algorithms for Coding and Compressing Digital Images. Cybern. Syst. Anal. 2002, 38, 879–890.
  27. Hnativ, L.O.; Luts, V.K. Integer Modified Sine-Cosine Transforms Type VII. A construction Method and Separable Directional Adaptive Transforms for Intra Prediction with 8 × 8 Chroma Blocks in Image/Video Coding. Cybern. Syst. Anal. 2021, 57, 155–164.
  28. Hnativ, L.O.; Luts, V.K. Algorithms for Fast Implementation of 4-Point Integer Sine Type VII Transforms without Multiplication and Separable Directional Adaptive Transforms for Intra Prediction in Image/Video Coding. Cybern. Syst. Anal. 2020, 56, 159–170.
  29. Hnativ, L.O. Integer Cosine Transforms for High-Efficiency Image and Video Coding. Cybern. Syst. Anal. 2016, 52, 802–816.
  30. Cintra, R.J.; Bayer, F.M.; Madanayake, A.; Potluri, U.S.; Edirisuriya, A. Fast Algorithms and Architectures for 8-Point DST-II/DST-VII Approximations. J. Circuits Syst. Comput. 2016, 26, 1750045.
  31. Cariow, A.; Lesiecki, L. Small-Size Algorithms for Type-IV Discrete Cosine Transform with Reduced Multiplicative Complexity. Radioelectron. Commun. Syst. 2020, 63, 465–487.
  32. Kolenderski, M.; Cariow, A. Small-Size Algorithms for the Type-I Discrete Cosine Transform with Reduced Complexity. Electronics 2022, 11, 2411.
  33. Murty, M.N. Radix-2 Algorithms for Implementation of Type-II Discrete Cosine Transform and Discrete Sine Transform. Int. J. Eng. Res. Appl. 2013, 3, 602–608.
  34. Murty, M.N.; Padhy, B. Radix-3 Algorithm for Realization of Type-II Discrete Sine Transform. Int. J. Eng. Res. Appl. 2015, 5, 9–15.
  35. Wu, Y.; Zhu, Z. A new radix-3 fast algorithm for computing the DST-II. In Proceedings of the IEEE 1995 National Aerospace and Electronics Conference, NAECON 1995, Dayton, OH, USA, 22–26 May 1995; Volume 1, pp. 324–327.
  36. Cariow, A.; Makowska, M.; Strzelec, P. Small-Size FDCT/IDCT Algorithms with Reduced Multiplicative Complexity. Radioelectron. Commun. Syst. 2019, 62, 559–576.
  37. Cariow, A. Strategies for the Synthesis of Fast Algorithms for the Computation of the Matrix-vector Products. J. Signal Process. Theory Appl. 2014, 3, 1–19.
  38. Blahut, R.E. Fast Algorithms for Signal Processing; Cambridge University Press: Cambridge, MA, USA, 2010.
Figure 1. The data flow graph of the proposed algorithm for computation of two-point DST-II.
Figure 2. The data flow graph of the proposed algorithm for computation of three-point DST-II.
Figure 3. The data flow graph of the proposed algorithm for computation of four-point DST-II.
Figure 4. The data flow graph of the proposed algorithm for computation of five-point DST-II.
Figure 5. The data flow graph of the proposed algorithm for computation of six-point DST-II.
Figure 6. The data flow graph of the proposed algorithm for computation of seven-point DST-II.
Figure 7. The data flow graph of the proposed algorithm for computation of eight-point DST-II.
Table 1. Comparison of the direct method with the proposed solutions.
N | Direct method: additions | Direct method: multiplications | Proposed solutions: additions | Proposed solutions: multiplications
2 | 2 | 4 | 2 | 2
3 | 5 | 8 | 5 | 4
4 | 12 | 8 | 9 | 3
5 | 18 | 23 | 17 | 7
6 | 28 | 30 | 25 | 7
7 | 39 | 46 | 37 | 10
8 | 56 | 64 | 32 | 14
Share and Cite

MDPI and ACS Style

Bielak, K.; Cariow, A.; Raciborski, M. The Development of Fast DST-II Algorithms for Short-Length Input Sequences. Electronics 2024, 13, 2301. https://doi.org/10.3390/electronics13122301
