Brief Report

Invariant Pattern Recognition with Log-Polar Transform and Dual-Tree Complex Wavelet-Fourier Features

Department of Computer Science and Software Engineering, Concordia University, Montreal, QC H3G 1M8, Canada
*
Author to whom correspondence should be addressed.
Sensors 2023, 23(8), 3842; https://doi.org/10.3390/s23083842
Submission received: 11 March 2023 / Revised: 29 March 2023 / Accepted: 7 April 2023 / Published: 9 April 2023
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)

Abstract

In this paper, we propose a novel method for 2D pattern recognition by extracting features with the log-polar transform, the dual-tree complex wavelet transform (DTCWT), and the 2D fast Fourier transform (FFT2). Our new method is invariant to translation, rotation, and scaling of the input 2D pattern images in a multiresolution way, which is very important for invariant pattern recognition. We know that very low-resolution sub-bands lose important features in the pattern images, and very high-resolution sub-bands contain significant amounts of noise. Therefore, intermediate-resolution sub-bands are good for invariant pattern recognition. Experiments on one printed Chinese character dataset and one 2D aircraft dataset show that our new method is better than two existing methods for a combination of rotation angles, scaling factors, and different noise levels in the input pattern images in most testing cases.

1. Introduction

Pattern recognition is a very important topic in computer vision. It is extremely useful in optical character recognition (OCR), face recognition, iris recognition, fingerprint recognition, palmprint recognition, etc. Furthermore, feature extraction from 2D pattern images is a crucial step in invariant pattern recognition [1]. Most existing methods lack the invariant property, which is undesirable in real-life applications. For example, translation invariance, rotation invariance, and scaling invariance are very important in invariant pattern recognition. Pattern recognition can automatically recognize patterns in data, which can be anything from text and images to sounds or other definable qualities. It can recognize 2D pattern images quickly and precisely.
In this paper, we propose to extract invariant features by the log-polar transform [2], the dual-tree complex wavelet transform (DTCWT [3]), and the 2D fast Fourier transform (FFT2 [4]). The DTCWT decomposes the pattern image in a multiresolution way, and it is invariant to spatial shift, which is very important in pattern recognition. We know that very low-resolution sub-bands lose fine features in the pattern images, and very high-resolution sub-bands contain a significant amount of noise [5]. Hence, intermediate-resolution sub-bands are well suited to invariant pattern recognition. Our extracted features are invariant to translation, rotation, and scaling. Experiments show that our new method is better than the log-polar-FFT2 method and the log-polar discrete wavelet transform (DWT [6])-FFT2 method for recognizing printed Chinese characters and 2D aircraft in most testing cases. This demonstrates that our new method is very useful in many real-life applications.
The organization of this paper is as follows. Section 2 proposes a novel method for invariant pattern recognition. Section 3 describes an experiment conducted to test the effectiveness of our proposed method. Finally, Section 4 draws the conclusion of the paper and introduces future research directions.

2. Proposed Method

The log-polar transform converts rotation and scaling into spatial shifts. Mathematically speaking, it is a coordinate system in two dimensions, where a point is identified by two numbers, one for the logarithm of the distance to a certain point, and one for an angle. Let image F2 be the translated, rotated, and scaled version of image F1; then they are correlated as
$$F_2(r, \theta) = F_1\left(\frac{r}{a},\ \theta - \theta_0\right)$$
$$F_2(\log r, \theta) = F_1(\log r - \log a,\ \theta - \theta_0)$$
$$F_2(\xi, \theta) = F_1(\xi - d,\ \theta - \theta_0)$$
where
$$\xi = \log r, \qquad d = \log a.$$
As a result, their Fourier spectra will be invariant to translation, rotation, and scaling of the input pattern images. This is because the magnitudes of the Fourier coefficients are invariant to spatial shifts.
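The relations above can be checked numerically for a single point. The following is a minimal sketch, not the authors' implementation; the helper names `to_log_polar` and `rotate_and_scale` and the sample values of a, θ0, and the point are assumptions chosen for illustration. It shows that scaling by a and rotating by θ0 moves a point by d = log a along the log-radius axis and by θ0 along the angle axis.

```python
import math

def to_log_polar(x, y):
    """Map a Cartesian point (relative to the pattern centre) to (xi, theta)."""
    r = math.hypot(x, y)
    return math.log(r), math.atan2(y, x)

def rotate_and_scale(x, y, a, theta0):
    """Scale a point by factor a and rotate it by theta0 radians about the origin."""
    c, s = math.cos(theta0), math.sin(theta0)
    return a * (c * x - s * y), a * (s * x + c * y)

# Illustrative values only (assumptions, not taken from the experiments).
a, theta0 = 0.5, math.radians(30)
x, y = 3.0, 4.0

xi1, th1 = to_log_polar(x, y)
xi2, th2 = to_log_polar(*rotate_and_scale(x, y, a, theta0))

shift_xi = xi2 - xi1      # shift along the log-radius axis, equals d = log a
shift_theta = th2 - th1   # shift along the angle axis, equals theta0 here
```

The angular difference equals θ0 only up to a multiple of 2π in general; the sample point is chosen so that no wrap-around occurs.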
The DTCWT can decompose the image into multiresolution scales in a translation-invariant way. It computes the complex transform of an image by using two separate DWT decompositions, namely, tree a and tree b. If the filters used in one tree are specifically designed to differ from those in the other, it is possible for one DWT to produce the real coefficients and the other the imaginary coefficients. Taking the FFT2 of each DTCWT sub-band will result in translation-invariant sub-band coefficients, so that we can recognize each pattern image effectively. In this paper, we use the 3rd, 4th, and 5th sub-band coefficients of the DTCWT for the recognition of printed Chinese characters and 2D aircraft. It is clear that very low-resolution sub-bands lose important features in the pattern images and very high-resolution sub-bands contain a lot of noise. As a result, intermediate-resolution sub-bands are desirable for invariant pattern recognition.
Our newly proposed method in this paper can be summarized as follows:
  1. Translate the input pattern image to the centroid of the pattern.
  2. Convert the pattern from Cartesian coordinates to log-polar coordinates of 128 × 128 pixels in size. Let us denote it as LP.
  3. Perform the DTCWT on the log-polar image LP for K = 5 decomposition scales.
  4. Construct complex sub-bands: COMPLEX(k) = Tree a(k) + i × Tree b(k), k ∈ [1, K].
  5. Conduct the FFT2 for each COMPLEX sub-band, and take their spectra.
  6. Recognize the pattern image as one known class by using the 3rd, 4th, and 5th sub-bands of the computed transform coefficients.
  7. Compute the correct recognition rate for the testing dataset by using the nearest neighbor (NN) classifier.
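Some of the steps above can be sketched in plain Python. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: it covers only centroid computation, log-polar resampling, and nearest-neighbor matching, and it omits the DTCWT and FFT2 stages entirely. The function names, the nearest-neighbor interpolation, the choice of outer radius, and the tiny 9 × 9 test patterns are all hypothetical.

```python
import math

def centroid(img):
    """Intensity-weighted centroid (row, col) of a 2D list of pixel values."""
    total = sum(sum(row) for row in img)
    cy = sum(i * v for i, row in enumerate(img) for v in row) / total
    cx = sum(j * v for row in img for j, v in enumerate(row)) / total
    return cy, cx

def log_polar(img, size=128):
    """Resample img onto a size x size log-polar grid centred at the centroid.

    Rows index log-radius and columns index angle. Nearest-neighbour lookup
    is used, and samples falling outside the image are set to 0.
    """
    h, w = len(img), len(img[0])
    cy, cx = centroid(img)
    r_max = math.hypot(h, w) / 2.0   # assumed outer radius (half the diagonal)
    out = [[0] * size for _ in range(size)]
    for u in range(size):
        r = math.exp(math.log(r_max) * (u + 1) / size)  # log-spaced radii up to r_max
        for v in range(size):
            theta = 2.0 * math.pi * v / size
            i = int(round(cy + r * math.sin(theta)))
            j = int(round(cx + r * math.cos(theta)))
            if 0 <= i < h and 0 <= j < w:
                out[u][v] = img[i][j]
    return out

def nearest_neighbour(feature, templates):
    """Return the label whose template vector is closest in Euclidean distance."""
    return min(templates, key=lambda label: math.dist(feature, templates[label]))

def flatten(grid):
    """Turn a 2D list into a flat feature vector."""
    return [v for row in grid for v in row]

# Illustrative 9 x 9 binary patterns (not taken from the paper's datasets).
square = [[1 if 3 <= i <= 5 and 3 <= j <= 5 else 0 for j in range(9)] for i in range(9)]
bar = [[1 if i == 4 and 1 <= j <= 7 else 0 for j in range(9)] for i in range(9)]

templates = {"square": flatten(log_polar(square, size=16)),
             "bar": flatten(log_polar(bar, size=16))}
label = nearest_neighbour(flatten(log_polar(square, size=16)), templates)
```

In the full method the feature vectors passed to the classifier would be the Fourier spectra of the selected DTCWT sub-bands rather than the raw log-polar pixels used here.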
The FFT2 [4] converts an image from its original domain in space to a representation in the frequency domain. It can reduce the complexity of computing the discrete Fourier transform (DFT) of an image from O((M × N)^2) to O(M × N log (M × N)), where M and N are the row and column numbers of the pattern image. The Fourier spectra are invariant to spatial shifts, which is very important for invariant pattern recognition.
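The shift invariance of the Fourier spectrum can be verified on a small example. The sketch below uses a naive 2D DFT written with the standard library (an FFT computes the same coefficients faster); the function names and the 4 × 4 test image are assumptions for illustration. The invariance is exact for circular shifts: the angular axis of a log-polar image is periodic, whereas a shift along the log-radius axis is only approximately circular.

```python
import cmath

def dft2(img):
    """Naive 2D DFT of a 2D list, O((M*N)^2) operations."""
    m, n = len(img), len(img[0])
    return [[sum(img[x][y] * cmath.exp(-2j * cmath.pi * (u * x / m + v * y / n))
                 for x in range(m) for y in range(n))
             for v in range(n)]
            for u in range(m)]

def circular_shift(img, dx, dy):
    """Cyclically shift rows by dx and columns by dy."""
    m, n = len(img), len(img[0])
    return [[img[(x - dx) % m][(y - dy) % n] for y in range(n)] for x in range(m)]

def spectrum(img):
    """Magnitudes of the 2D DFT coefficients."""
    return [[abs(c) for c in row] for row in dft2(img)]

# Illustrative 4 x 4 image (an assumption, not data from the paper).
img = [[1, 2, 0, 0],
       [0, 3, 1, 0],
       [0, 0, 4, 0],
       [5, 0, 0, 2]]
shifted = circular_shift(img, 1, 2)

# Largest difference between the two magnitude spectra (zero up to rounding).
max_diff = max(abs(a - b)
               for ra, rb in zip(spectrum(img), spectrum(shifted))
               for a, b in zip(ra, rb))
```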
We also compare our new method with two existing methods: log-polar FFT2 and log-polar DWT-FFT2. The first method computes the log-polar transform and then takes the FFT2 to obtain invariant features. The second method performs the log-polar transform, the DWT [6], and the FFT2 to obtain invariant features.
Our invariant pattern recognition method is still useful and competitive when compared to convolutional neural networks (CNN). It can extract invariant features from pattern images quickly, instead of training a CNN for many hours. Our new method achieves very high classification accuracies for two datasets and for different combinations of deformations in the input pattern images, as demonstrated in the experimental section in the paper.
The computational complexity of the proposed method can be given as follows. Let the pattern image be of size M × N. The log-polar transform is a linear operation with complexity O(M × N). The DTCWT is a linear operation with complexity O(M × N). The FFT2 has a complexity of O(M × N log(M × N)). As a result, the total complexity of our new method is O(M × N log(M × N)).
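As a rough worked example, assuming the 128 × 128 log-polar image used in the method and base-2 logarithms, the dominant FFT2 term can be compared with a direct DFT. Constant factors are ignored, so these are order-of-magnitude operation counts only:

$$M \times N = 128 \times 128 = 16384 = 2^{14}$$
$$(M \times N)^2 = 2^{28} = 268{,}435{,}456$$
$$M \times N \log_2(M \times N) = 16384 \times 14 = 229{,}376$$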
The major contribution of this paper is that we have successfully extracted very stable features in multiresolution from the pattern images, which are invariant to translation, rotation, and scaling. It is not common to see published papers that achieve all three invariant properties in a multiresolution way. It is well-known that very low-resolution sub-bands lose important features in the pattern images, and very high-resolution sub-bands contain a lot of noise. Consequently, intermediate-resolution sub-bands are very good for invariant pattern recognition. Experimental results demonstrate that our new method proposed in this paper is better than the log-polar FFT2 method and the log-polar DWT-FFT2 method in most testing cases for recognizing printed Chinese characters and 2D aircraft.

3. Experiments

We conducted experiments with one printed Chinese character dataset (Figure 1) and one 2D aircraft dataset (Figure 2), which contain 85 characters and 20 aircraft, respectively. Both datasets consist of binary images. We performed experiments with the proposed method, the log-polar-FFT2 method, and the log-polar DWT-FFT2 method. We deformed the input pattern images with scaling factors 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0. We rotated the input pattern images by 30°, 60°, 90°, 120°, 150°, 180°, 210°, 240°, and 270°. We also added noise to the input pattern images with signal-to-noise ratio (SNR) = 20, 15, 10, 5, 4, 3, 2, 1, and 0.5 (Figure 3). The SNR is defined as:
$$\mathrm{SNR} = \frac{\sum_{i,j} \left(F_{i,j} - \mathrm{avg}(F)\right)^2}{\sum_{i,j} \left(n_{i,j} - \mathrm{avg}(n)\right)^2}$$
where F is the noise-free image, n is the added Gaussian white noise, and avg(F) and avg(n) are the average values of the image F and image n, respectively. A combination of rotation angles and scaling factors for an aircraft is shown in Figure 4.
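The noise generation can be sketched as follows, assuming the SNR is the ratio of the summed squared deviations of the noise-free image to those of the noise image. This is an illustrative sketch, not the authors' code; the function names, the fixed random seed, and the 8 × 8 binary test image are assumptions.

```python
import random

def snr(clean, noise):
    """Summed squared deviation of the clean image divided by that of the noise."""
    f = [v for row in clean for v in row]
    n = [v for row in noise for v in row]
    f_avg, n_avg = sum(f) / len(f), sum(n) / len(n)
    return (sum((v - f_avg) ** 2 for v in f) /
            sum((v - n_avg) ** 2 for v in n))

def gaussian_noise_for_snr(clean, target_snr, rng):
    """Draw white Gaussian noise and rescale it so that snr(clean, noise) hits the target."""
    noise = [[rng.gauss(0.0, 1.0) for _ in row] for row in clean]
    # Scaling the noise by s divides the SNR by s**2.
    scale = (snr(clean, noise) / target_snr) ** 0.5
    return [[scale * v for v in row] for row in noise]

# Illustrative 8 x 8 binary image (an assumption, not from the datasets).
clean = [[1 if 2 <= i <= 5 and 2 <= j <= 5 else 0 for j in range(8)] for i in range(8)]
rng = random.Random(0)  # fixed seed so the sketch is reproducible

noise = gaussian_noise_for_snr(clean, 5.0, rng)
noisy = [[c + n for c, n in zip(cr, nr)] for cr, nr in zip(clean, noise)]
achieved = snr(clean, noise)
```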
Our experimental results are as follows. Tables 1–6 report the correct recognition rates for combinations of rotation angles and scaling factors: Table 1, Table 2, and Table 3 give the results of the proposed method, the log-polar-FFT2 method, and the log-polar DWT-FFT2 method, respectively, for the printed Chinese character dataset, and Table 4, Table 5, and Table 6 give the corresponding results for the 2D aircraft dataset. Tables 7–12 report the correct recognition rates for combinations of rotation angles and noise levels: Table 7, Table 8, and Table 9 give the results of the proposed method, the log-polar-FFT2 method, and the log-polar DWT-FFT2 method, respectively, for the printed Chinese character dataset, and Table 10, Table 11, and Table 12 give the corresponding results for the 2D aircraft dataset.
From Tables 1–12, we can see that the proposed method performs the best in most testing cases. It is better than both the log-polar-FFT2 method and the log-polar DWT-FFT2 method for combinations of rotation angles and scaling factors and for different noise levels in most testing cases. In rare cases, our new method is not as good as the existing methods; for example, at scaling factor 0.3 on the 2D aircraft dataset, the log-polar-FFT2 method reaches 95–100% (Table 5), compared with 90–100% for the proposed method (Table 4). Furthermore, our new method is fast in terms of CPU computational time for invariant pattern recognition.

4. Conclusions

Invariant pattern recognition is an extremely important topic in today’s computer vision applications. For example, it is very useful in OCR and biometrics such as face recognition, iris recognition, palmprint recognition, fingerprint recognition, and so forth. Furthermore, extracting invariant features from 2D pattern images is very useful for many real-life applications.
In this paper, we have proposed a novel method for pattern recognition by using the log-polar transform, the DTCWT transform, and the FFT2 transform. Our extracted features are invariant to translation, rotation, and scaling in a multiresolution way. It is well-known that very low-resolution sub-bands lose fine features in the pattern images, and very high-resolution sub-bands contain significant amounts of noise. Hence, intermediate-resolution sub-bands are very good for invariant pattern recognition. Experiments demonstrate that our new method is better than the log-polar-FFT2 method and the log-polar DWT-FFT2 method for recognizing printed Chinese characters and 2D aircraft in most testing cases.
Future research will be conducted by introducing denoising to the pattern images so that better recognition results can be obtained. For instance, we can use our previously published image denoising methods [7,8,9,10] to preprocess the input pattern images. We will also study deep convolutional neural networks (CNNs) for invariant pattern recognition, which have achieved impressive results in recent years in real-life applications.

Author Contributions

Conceptualization, G.C.; Software, G.C.; Writing—original draft, G.C.; Writing—review & editing, G.C. and A.K.; Supervision, A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive suggestions and comments, which improved the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Trier, Ø.D.; Jain, A.K.; Taxt, T. Feature extraction methods for character recognition: A survey. Pattern Recognit. 1996, 29, 641–662.
  2. Wolberg, G.; Zokai, S. Robust image registration using log-polar transform. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Vancouver, BC, Canada, 10–13 September 2000; pp. 493–496.
  3. Kingsbury, N. Complex Wavelets for Shift Invariant Analysis and Filtering of Signals. Appl. Comput. Harmon. Anal. 2001, 10, 234–253.
  4. Heideman, M.T.; Johnson, D.H.; Burrus, C.S. Gauss and the history of the fast Fourier transform. IEEE ASSP Mag. 1984, 1, 14–21.
  5. Chen, G.; Bui, T.D. Invariant Fourier-wavelet descriptor for pattern recognition. Pattern Recognit. 1999, 32, 1083–1088.
  6. Akansu, A.N.; Medley, M.J. (Eds.) Wavelet, Subband and Block Transforms in Communications and Multimedia; Springer: Berlin/Heidelberg, Germany, 1999.
  7. Chen, G.; Zhu, W.P.; Xie, W. Wavelet-based image denoising using three scales of dependency. IET Image Process. 2012, 6, 756–760.
  8. Chen, G.Y.; Kégl, B. Image denoising with complex ridgelets. Pattern Recognit. 2007, 40, 578–585.
  9. Chen, G.Y.; Bui, T.; Krzyżak, A. Image denoising with neighbour dependency and customized wavelet and threshold. Pattern Recognit. 2005, 38, 115–124.
  10. Chen, G.Y.; Bui, T.; Krzyzak, A. Image denoising using neighbouring wavelet coefficients. Integr. Comput. Eng. 2005, 12, 99–107.
Figure 1. The printed Chinese character dataset.
Figure 2. The 2D aircraft dataset.
Figure 3. The noisy images with SNR = 20, 15, 10, 5, 4, 3, 2, 1, and 0.5, respectively.
Figure 4. A combination of scaling factors (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0) and rotation angles (30°, 60°, 90°, 120°, 150°, 180°, 210°, 240°, and 270°) for an aircraft image.
Table 1. The correct recognition rates (%) for a combination of rotation angles and scaling factors for the proposed method for the printed Chinese character dataset.

| Scaling factor / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 7.06 | 10.59 | 9.41 | 7.06 | 11.76 | 9.41 | 7.06 | 11.76 | 9.41 |
| 0.3 | 27.06 | 23.93 | 50.59 | 24.71 | 22.35 | 42.35 | 23.53 | 22.35 | 40.00 |
| 0.4 | 57.65 | 60.00 | 88.24 | 57.65 | 60.00 | 88.24 | 57.65 | 60.00 | 88.24 |
| 0.5 | 76.47 | 84.71 | 88.24 | 80.00 | 81.18 | 94.12 | 83.53 | 82.35 | 95.29 |
| 0.6 | 97.65 | 96.47 | 97.65 | 97.65 | 96.47 | 97.65 | 97.65 | 96.47 | 97.65 |
| 0.7 | 100 | 100 | 100 | 100 | 98.82 | 100 | 100 | 98.82 | 100 |
| 0.8 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.9 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1.0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 2. The correct recognition rates (%) for a combination of rotation angles and scaling factors for the log-polar-FFT2 method for the printed Chinese character dataset.

| Scaling factor / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 5.88 | 5.88 | 8.24 | 5.88 | 5.88 | 8.24 | 7.06 | 7.09 | 8.24 |
| 0.3 | 24.71 | 21.18 | 35.29 | 25.88 | 18.82 | 30.59 | 25.88 | 20.00 | 27.06 |
| 0.4 | 43.53 | 47.06 | 64.71 | 43.53 | 48.24 | 64.71 | 43.53 | 47.06 | 64.71 |
| 0.5 | 64.71 | 71.76 | 76.47 | 72.94 | 75.29 | 76.47 | 70.59 | 67.06 | 78.82 |
| 0.6 | 97.65 | 98.82 | 95.29 | 97.65 | 98.82 | 95.29 | 97.65 | 98.82 | 95.29 |
| 0.7 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.8 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.9 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1.0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 3. The correct recognition rates (%) for a combination of rotation angles and scaling factors for the log-polar DWT-FFT2 method for the printed Chinese character dataset.

| Scaling factor / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 4.71 | 9.41 | 5.88 | 4.71 | 9.41 | 5.88 | 4.71 | 9.41 | 5.88 |
| 0.3 | 21.18 | 18.82 | 48.24 | 20.00 | 20.00 | 45.88 | 20.00 | 20.00 | 37.65 |
| 0.4 | 40.00 | 44.71 | 84.71 | 40.00 | 44.71 | 84.71 | 40.00 | 44.71 | 84.71 |
| 0.5 | 58.82 | 70.59 | 88.24 | 69.41 | 62.35 | 92.94 | 62.35 | 55.29 | 95.29 |
| 0.6 | 85.88 | 88.24 | 98.82 | 87.06 | 87.06 | 98.82 | 85.88 | 87.06 | 98.82 |
| 0.7 | 96.47 | 94.12 | 98.82 | 94.12 | 92.94 | 98.82 | 96.47 | 91.76 | 100 |
| 0.8 | 98.82 | 95.29 | 100 | 98.82 | 95.29 | 100 | 98.82 | 95.29 | 100 |
| 0.9 | 98.82 | 100 | 100 | 98.82 | 100 | 100 | 100 | 100 | 100 |
| 1.0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 4. The correct recognition rates (%) for a combination of rotation angles and scaling factors for the proposed method for the 2D aircraft dataset.

| Scaling factor / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 65 | 55 | 70 | 65 | 55 | 70 | 65 | 55 | 70 |
| 0.3 | 90 | 95 | 100 | 90 | 95 | 95 | 95 | 90 | 100 |
| 0.4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.6 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.7 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.8 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.9 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1.0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 5. The correct recognition rates (%) for a combination of rotation angles and scaling factors for the log-polar-FFT2 method for the 2D aircraft dataset.

| Scaling factor / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 65 | 60 | 65 | 65 | 60 | 65 | 65 | 60 | 65 |
| 0.3 | 95 | 100 | 100 | 100 | 100 | 95 | 100 | 100 | 100 |
| 0.4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.6 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.7 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.8 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.9 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1.0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 6. The correct recognition rates (%) for a combination of rotation angles and scaling factors for the log-polar DWT-FFT2 method for the 2D aircraft dataset.

| Scaling factor / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 30 | 30 | 60 | 30 | 30 | 60 | 30 | 30 | 60 |
| 0.3 | 80 | 75 | 100 | 60 | 75 | 95 | 70 | 75 | 100 |
| 0.4 | 90 | 95 | 100 | 90 | 95 | 100 | 90 | 95 | 100 |
| 0.5 | 95 | 100 | 100 | 95 | 95 | 100 | 100 | 95 | 100 |
| 0.6 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.7 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.8 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 0.9 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1.0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 7. The correct recognition rates (%) for a combination of rotation angles and noise levels for the proposed method for the printed Chinese character dataset.

| SNR / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 15 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 10 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 2 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1 | 96.47 | 97.65 | 100 | 94.12 | 96.47 | 98.82 | 98.82 | 94.12 | 98.82 |
| 0.5 | 14.12 | 17.65 | 34.12 | 10.59 | 14.12 | 34.12 | 20.00 | 10.59 | 40.00 |

Table 8. The correct recognition rates (%) for a combination of rotation angles and noise levels for the log-polar-FFT2 method for the printed Chinese character dataset.

| SNR / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 15 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 10 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 2 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1 | 84.71 | 80.00 | 91.76 | 83.53 | 81.18 | 92.94 | 84.71 | 75.29 | 95.29 |
| 0.5 | 7.06 | 4.71 | 9.41 | 1.18 | 4.71 | 7.06 | 4.71 | 4.71 | 10.59 |

Table 9. The correct recognition rates (%) for a combination of rotation angles and noise levels for the log-polar DWT-FFT2 method for the printed Chinese character dataset.

| SNR / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 15 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 10 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 4 | 100 | 100 | 100 | 98.82 | 100 | 100 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 100 | 100 | 100 | 98.82 | 100 | 100 |
| 2 | 98.82 | 100 | 100 | 98.82 | 100 | 100 | 97.65 | 100 | 100 |
| 1 | 84.71 | 84.71 | 100 | 82.35 | 87.06 | 100 | 82.35 | 85.88 | 100 |
| 0.5 | 11.76 | 11.76 | 32.94 | 11.76 | 10.59 | 40.00 | 7.06 | 12.94 | 32.94 |

Table 10. The correct recognition rates (%) for a combination of rotation angles and noise levels for the proposed method for the 2D aircraft dataset.

| SNR / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 15 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 10 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 2 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1 | 90 | 95 | 100 | 95 | 100 | 100 | 80 | 90 | 100 |
| 0.5 | 15 | 10 | 10 | 15 | 10 | 20 | 5 | 10 | 20 |

Table 11. The correct recognition rates (%) for a combination of rotation angles and noise levels for the log-polar-FFT2 method for the 2D aircraft dataset.

| SNR / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 15 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 10 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 2 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 1 | 70 | 65 | 60 | 55 | 45 | 45 | 50 | 50 | 70 |
| 0.5 | 15 | 15 | 15 | 15 | 10 | 10 | 15 | 15 | 15 |

Table 12. The correct recognition rates (%) for a combination of rotation angles and noise levels for the log-polar DWT-FFT2 method for the 2D aircraft dataset.

| SNR / Rotation | 30° | 60° | 90° | 120° | 150° | 180° | 210° | 240° | 270° |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 15 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 10 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 3 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 2 | 100 | 100 | 100 | 95 | 100 | 100 | 95 | 100 | 100 |
| 1 | 60 | 80 | 100 | 55 | 75 | 95 | 65 | 80 | 100 |
| 0.5 | 10 | 5 | 20 | 5 | 10 | 25 | 5 | 5 | 20 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, G.; Krzyzak, A. Invariant Pattern Recognition with Log-Polar Transform and Dual-Tree Complex Wavelet-Fourier Features. Sensors 2023, 23, 3842. https://doi.org/10.3390/s23083842
