Computer Vision, Pattern Recognition, Machine Learning, and Symmetry

A special issue of Symmetry (ISSN 2073-8994). This special issue belongs to the section "Computer".

Deadline for manuscript submissions: closed (15 June 2023) | Viewed by 49104

Special Issue Editor


Prof. Dr. Longfei Zhang
Guest Editor
Key Laboratory of Digital Performance and Simulation Technology, Beijing Institute of Technology, Beijing 100081, China
Interests: multimedia retrieval; computer vision; machine learning; digital performance

Special Issue Information

Dear Colleagues,

We would like to invite you to submit your work to the Special Issue, "Computer Vision, Pattern Recognition, Machine Learning, and Symmetry", on the topic of symmetry/asymmetry. This Special Issue seeks high-quality contributions on computer vision, pattern recognition, machine learning, and symmetry, covering both theory and applications that solve practical problems.

This Special Issue of Symmetry will collect articles on solving real-world problems with data- and learning-centric technologies, including computer vision and pattern recognition, and on the correlation between machine learning and symmetry. We are soliciting contributions covering all related topics, including but not limited to vision, multimedia, biometrics, behavior analysis, adversarial learning, simulation, network security, Internet of Things, and performance. The main criterion for submission is an innovative theoretical or application-centric method that solves a real-world problem. There is no limit on the number of pages, but submissions must demonstrate an understanding of the theme and a contribution to the topic.

Prof. Dr. Longfei Zhang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Symmetry is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Computer vision
  • Applied statistics
  • Pattern recognition
  • Behavior analysis
  • Artificial intelligence
  • Machine learning
  • Adversarial learning
  • Reinforcement learning
  • Deep learning
  • Emerging technologies (telecommunications, blockchain, Internet of Things, cyber security, digital performance, smart creativity)

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (19 papers)

Research

16 pages, 1629 KiB  
Article
Transformer-Based Recognition Model for Ground-Glass Nodules from the View of Global 3D Asymmetry Feature Representation
by Jun Miao, Maoxuan Zhang, Yiru Chang and Yuanhua Qiao
Symmetry 2023, 15(12), 2192; https://doi.org/10.3390/sym15122192 - 12 Dec 2023
Cited by 1 | Viewed by 1079
Abstract
Ground-glass nodules (GGN) are the main manifestation of early lung cancer, and accurate and efficient identification of ground-glass pulmonary nodules is of great significance for the treatment of lung diseases. In response to the problem of traditional machine learning requiring manual feature extraction, and most deep learning models applied to 2D image classification, this paper proposes a Transformer-based recognition model for ground-glass nodules from the view of global 3D asymmetry feature representation. Firstly, a 3D convolutional neural network is used as the backbone to extract the features of the three-dimensional CT-image block of pulmonary nodules automatically; secondly, positional encoding information is added to the extracted feature map and input into the Transformer encoder layer for further extraction of global 3D asymmetry features, which can preserve more spatial information and obtain higher-order asymmetry feature representation; finally, the extracted asymmetry features are entered into a support vector machine or ELM-KNN model to further improve the recognition ability of the model. The experimental results show that the recognition accuracy of the proposed method reaches 95.89%, which is 4.79, 2.05, 4.11, and 2.74 percentage points higher than the common deep learning models of AlexNet, DenseNet121, GoogLeNet, and VGG19, respectively; compared with the latest models proposed in the field of pulmonary nodule classification, the accuracy has been improved by 2.05, 2.05, and 0.68 percentage points, respectively, which can effectively improve the recognition accuracy of ground-glass nodules. Full article
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)

18 pages, 7192 KiB  
Article
Symmetry-Based Fusion Algorithm for Bone Age Detection with YOLOv5 and ResNet34
by Wenshun Sheng, Jiahui Shen, Qiming Huang, Zhixuan Liu, Jiayan Lin, Qi Zhu and Lan Zhou
Symmetry 2023, 15(7), 1377; https://doi.org/10.3390/sym15071377 - 6 Jul 2023
Cited by 3 | Viewed by 1493
Abstract
Bone age is the chronological age of human bones, which serves as a key indicator of the maturity of bone development and can more objectively reflect the extent of human growth and development. The prevalent viewpoint and research development direction now favor the employment of deep learning-based bone age detection algorithms to determine bone age. Although bone age detection accuracy has increased when compared to more established methods, more work needs to be conducted to raise it because bone age detection is primarily used in clinical medicine, forensic identification, and other critical and rigorous fields. Due to the symmetry of human hand bones, bone age detection can be performed on either the left hand or the right hand, and the results are the same. In other words, the bone age detection results of both hands are universal. In this regard, the left hand is chosen as the target of bone age detection in this paper. To accomplish this, the You Only Look Once-v5 (YOLOv5) and Residual Network-34 (ResNet34) integration techniques are combined in this paper to create an innovative bone age detection model (YARN), which is then combined with the RUS-CHN scoring method that applies to Chinese adolescent children to comprehensively assess bone age at multiple levels. In this study, the images in the hand bone dataset are first preprocessed with number enhancement, then YOLOv5 is used to train the hand bone dataset to identify and filter out the main 13 joints in the hand bone, and finally, ResNet34 is used to complete the classification of local joints and achieve the determination of the developmental level of the detected region, followed by the calculation of the bone age by combining with the RUS-CHN method. 
The bone age detection model based on YOLOv5 and ResNet34 can significantly improve the accuracy and efficiency of bone age detection, and the model has significant advantages in the deep feature extraction of key regions of hand bone joints, which can efficiently complete the task of bone age detection. This was discovered through experiments on the public dataset of Flying Paddle AI Studio. Full article

15 pages, 2030 KiB  
Article
Oracle-Preserving Latent Flows
by Alexander Roman, Roy T. Forestano, Konstantin T. Matchev, Katia Matcheva and Eyup B. Unlu
Symmetry 2023, 15(7), 1352; https://doi.org/10.3390/sym15071352 - 3 Jul 2023
Cited by 4 | Viewed by 1584
Abstract
A fundamental task in data science is the discovery, description, and identification of any symmetries present in the data. We developed a deep learning methodology for the simultaneous discovery of multiple non-trivial continuous symmetries across an entire labeled dataset. The symmetry transformations and the corresponding generators are modeled with fully connected neural networks trained with a specially constructed loss function, ensuring the desired symmetry properties. The two new elements in this work are the use of a reduced-dimensionality latent space and the generalization to invariant transformations with respect to high-dimensional oracles. The method is demonstrated with several examples on the MNIST digit dataset, where the oracle is provided by the 10-dimensional vector of logits of a trained classifier. We find classes of symmetries that transform each image from the dataset into new synthetic images while conserving the values of the logits. We illustrate these transformations as lines of equal probability (“flows”) in the reduced latent space. These results show that symmetries in the data can be successfully searched for and identified as interpretable non-trivial transformations in the equivalent latent space. Full article
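The invariance condition described in this abstract can be illustrated with a toy check. The sketch below is not the authors' method, which trains neural networks against MNIST logits. It only applies a small step along a candidate generator and measures how much a scalar oracle changes. The oracle (squared norm), both generators, and all function names are assumptions chosen for illustration.

```python
def oracle(x):
    """Hypothetical scalar oracle: squared Euclidean norm of a 2D point."""
    return x[0] ** 2 + x[1] ** 2


def rotation_generator(x):
    """Candidate generator: infinitesimal rotation about the origin."""
    return (-x[1], x[0])


def scaling_generator(x):
    """Candidate generator that does not preserve the oracle (dilation)."""
    return (x[0], x[1])


def invariance_loss(generator, points, eps=1e-3):
    """Mean squared change of the oracle under the step x -> x + eps * g(x)."""
    total = 0.0
    for x in points:
        g = generator(x)
        moved = (x[0] + eps * g[0], x[1] + eps * g[1])
        total += (oracle(moved) - oracle(x)) ** 2
    return total / len(points)


points = [(1.0, 0.0), (0.5, -2.0), (-1.5, 0.3)]
rot_loss = invariance_loss(rotation_generator, points)
scale_loss = invariance_loss(scaling_generator, points)
```

In this toy setting the rotation leaves the oracle unchanged to first order in `eps`, so `rot_loss` is several orders of magnitude smaller than `scale_loss`. A loss of this kind is one plausible way to reward oracle-preserving transformations.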

15 pages, 24572 KiB  
Article
A Deep-Learning Approach for Identifying and Classifying Digestive Diseases
by J. V. Thomas Abraham, A. Muralidhar, Kamsundher Sathyarajasekaran and N. Ilakiyaselvan
Symmetry 2023, 15(2), 379; https://doi.org/10.3390/sym15020379 - 31 Jan 2023
Cited by 10 | Viewed by 2469
Abstract
The digestive tract, often known as the gastrointestinal (GI) tract or the gastrointestinal system, is affected by digestive ailments. The stomach, large and small intestines, liver, pancreas and gallbladder are all components of the digestive tract. A digestive disease is any illness that affects the digestive system. Serious to moderate conditions can exist. Heartburn, cancer, irritable bowel syndrome (IBS) and lactose intolerance are only a few of the frequent issues. The digestive system may be treated with many different surgical treatments. Laparoscopy, open surgery and endoscopy are a few examples of these techniques. This paper proposes transfer-learning models with different pre-trained models to identify and classify digestive diseases. The proposed systems showed an increase in metrics, such as the accuracy, precision and recall, when compared with other state-of-the-art methods, and EfficientNetB0 achieved the best performance results of 98.01% accuracy, 98% precision and 98% recall. Full article

17 pages, 807 KiB  
Article
PointMapNet: Point Cloud Feature Map Network for 3D Human Action Recognition
by Xing Li, Qian Huang, Yunfei Zhang, Tianjin Yang and Zhijian Wang
Symmetry 2023, 15(2), 363; https://doi.org/10.3390/sym15020363 - 30 Jan 2023
Cited by 4 | Viewed by 2184
Abstract
3D human action recognition is crucial in broad industrial application scenarios such as robotics, video surveillance, autonomous driving, or intellectual education, etc. In this paper, we present a new point cloud sequence network called PointMapNet for 3D human action recognition. In PointMapNet, two point cloud feature maps symmetrical to depth feature maps are proposed to summarize appearance and motion representations from point cloud sequences. Specifically, we first convert the point cloud frames to virtual action frames using static point cloud techniques. The virtual action frame is a 1D vector used to characterize the structural details in the point cloud frame. Then, inspired by feature map-based human action recognition on depth sequences, two point cloud feature maps are symmetrically constructed to recognize human action from the point cloud sequence, i.e., Point Cloud Appearance Map (PCAM) and Point Cloud Motion Map (PCMM). To construct PCAM, an MLP-like network architecture is designed and used to capture the spatio-temporal appearance feature of the human action in a virtual action sequence. To construct PCMM, the MLP-like network architecture is used to capture the motion feature of the human action in a virtual action difference sequence. Finally, the two point cloud feature map descriptors are concatenated and fed to a fully connected classifier for human action recognition. In order to evaluate the performance of the proposed approach, extensive experiments are conducted. The proposed method achieves impressive results on three benchmark datasets, namely NTU RGB+D 60 (89.4% cross-subject and 96.7% cross-view), UTD-MHAD (91.61%), and MSR Action3D (91.91%). The experimental results outperform existing state-of-the-art point cloud sequence classification networks, demonstrating the effectiveness of our method. Full article

15 pages, 903 KiB  
Article
Absolute 3D Human Pose Estimation Using Noise-Aware Radial Distance Predictions
by Inho Chang, Min-Gyu Park, Je Woo Kim and Ju Hong Yoon
Symmetry 2023, 15(1), 25; https://doi.org/10.3390/sym15010025 - 22 Dec 2022
Viewed by 2211
Abstract
We present a simple yet effective pipeline for absolute three-dimensional (3D) human pose estimation from two-dimensional (2D) joint keypoints, namely, the 2D-to-3D human pose lifting problem. Our method comprises two simple baseline networks, a 3D conversion function, and a correction network. The former two networks predict the root distance and the root-relative joint distance simultaneously. Given the input and predicted distances, the 3D conversion function recovers the absolute 3D pose, and the correction network reduces 3D pose noise caused by input uncertainties. Furthermore, to cope with input noise implicitly, we adopt a Siamese architecture that enforces the consistency of features between two training inputs, i.e., ground truth 2D joint keypoints and detected 2D joint keypoints. Finally, we experimentally validate the advantages of the proposed method and demonstrate its competitive performance over state-of-the-art absolute 2D-to-3D pose-lifting methods. Full article
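The following sketch shows one way a 3D conversion step of this kind could work. The abstract does not give the formula, so everything here is an assumption: a pinhole camera with intrinsics `fx, fy, cx, cy`, a joint distance obtained by adding the root distance and the root-relative distance, and the hypothetical helper names `backproject` and `lift_pose`. It is not the authors' implementation.

```python
import math


def backproject(u, v, distance, fx, fy, cx, cy):
    """Place a 2D keypoint at a given radial distance along its camera ray."""
    # Ray direction through pixel (u, v) for an assumed pinhole camera.
    dx, dy, dz = (u - cx) / fx, (v - cy) / fy, 1.0
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    scale = distance / norm
    return (dx * scale, dy * scale, dz * scale)


def lift_pose(keypoints_2d, root_distance, relative_distances, intrinsics):
    """Recover absolute 3D joints from 2D keypoints and predicted distances."""
    fx, fy, cx, cy = intrinsics
    pose = []
    for (u, v), rel in zip(keypoints_2d, relative_distances):
        # Assumption: absolute joint distance = root distance + relative distance.
        pose.append(backproject(u, v, root_distance + rel, fx, fy, cx, cy))
    return pose


intrinsics = (1000.0, 1000.0, 500.0, 500.0)  # illustrative values only
keypoints = [(500.0, 500.0), (600.0, 450.0)]
pose = lift_pose(keypoints, 3.0, [0.0, 0.2], intrinsics)
```

Each recovered joint lies on the viewing ray of its 2D keypoint, at the predicted distance from the camera center, so noise in the 2D input shifts the ray while noise in the distance moves the joint along it.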

14 pages, 1326 KiB  
Article
Detecting Compressed Deepfake Images Using Two-Branch Convolutional Networks with Similarity and Classifier
by Ping Chen, Ming Xu and Xiaodong Wang
Symmetry 2022, 14(12), 2691; https://doi.org/10.3390/sym14122691 - 19 Dec 2022
Cited by 1 | Viewed by 3434
Abstract
As a popular technique for swapping faces with someone else’s in images or videos through deep neural networks, deepfake causes a serious threat to the security of multimedia content today. However, because counterfeit images are usually compressed when propagating over the Internet, and because the compression factor used is unknown, most of the existing deepfake detection models have poor robustness for the detection of compressed images with unknown compression factors. To solve this problem, we notice that an image has a high similarity with its compressed image based on symmetry, and this similarity is not easily affected by the compression factor, so this similarity feature can be used as an important clue for compressed deepfake detection. A TCNSC (Two-branch Convolutional Networks with Similarity and Classifier) method that combines compression factor independence is proposed in this paper. The TCNSC method learns two feature representations from the deepfake image, i.e., similarity of the image and its compressed counterpart and authenticity of the deepfake image. A joint training strategy is then utilized for feature extraction, in which the similarity characteristics are obtained by similarity learning while obtaining authenticity characteristics, so the proposed TCNSC model is trained for robust feature learning. Experimental results on the FaceForensics++ (FF++) dataset show that the proposed method significantly outperforms all competing methods under three compression settings of high-quality (HQ), medium-quality (MQ), and low-quality (LQ). For the LQ, MQ, and HQ settings, TCNSC achieves 91.8%, 93.4%, and 95.3% in accuracy, and outperforms the state-of-art method (Xception-RAW) by 16.9%, 10.1%, and 4.1%, respectively. Full article

19 pages, 20756 KiB  
Article
A Recognition Method of Ancient Architectures Based on the Improved Inception V3 Model
by Xinyang Wang, Jiaxun Li, Jin Tao, Ling Wu, Chao Mou, Weihua Bai, Xiaotian Zheng, Zirui Zhu and Zhuohong Deng
Symmetry 2022, 14(12), 2679; https://doi.org/10.3390/sym14122679 - 18 Dec 2022
Cited by 8 | Viewed by 3626
Abstract
Traditional ancient architecture is a symbolic product of cultural development and inheritance, with high social and cultural value. An automatic recognition model of ancient building types is one possible application of asymmetric systems, and it will be of great significance to be able to identify ancient building types via machine vision. In the context of Chinese traditional ancient buildings, this paper proposes a recognition method of ancient buildings, based on the improved asymmetric Inception V3 model. Firstly, the improved Inception V3 model adds a dropout layer between the global average pooling layer and the SoftMax classification layer to solve the overfitting problem caused by the small sample size of the ancient building data set. Secondly, migration learning and the ImageNet dataset are integrated into model training, which improves the speed of network training while solving the problems of the small scale of the ancient building dataset and insufficient model training. Thirdly, through ablation experiments, the effects of different data preprocessing methods and different dropout rates on the accuracy of model recognition were compared, to obtain the optimized model parameters. To verify the effectiveness of the model, this paper takes the ancient building dataset that was independently constructed by the South China University of Technology team as the experimental data and compares the recognition effect of the improved Inception V3 model proposed in this paper with several classical models. The experimental results show that when the data preprocessing method is based on filling and the dropout rate is 0.3, the recognition accuracy of the model is the highest; the accuracy rate of identifying ancient buildings using our proposed improved Inception V3 model can reach up to 98.64%. 
Compared with other classical models, the model accuracy rate has increased by 17.32%, and the average training time has accelerated by 2.29 times, reflecting the advantages of the model proposed in this paper. Finally, the improved Inception V3 model was loaded into the ancient building identification system to prove the practical application value of this research. Full article

33 pages, 2144 KiB  
Article
Distributed Network of Adaptive and Self-Reconfigurable Active Vision Systems
by Shashank and Indu Sreedevi
Symmetry 2022, 14(11), 2281; https://doi.org/10.3390/sym14112281 - 31 Oct 2022
Cited by 4 | Viewed by 1900
Abstract
The performance of a computer vision system depends on the accuracy of visual information extracted by the sensors and the system’s visual-processing capabilities. To derive optimum information from the sensed data, the system must be capable of identifying objects of interest (OOIs) and activities in the scene. Active vision systems intend to capture OOIs with the highest possible resolution to extract the optimum visual information by calibrating the configuration spaces of the cameras. As the data processing and reconfiguration of cameras are interdependent, it becomes very challenging for advanced active vision systems to perform in real time. Due to limited computational resources, model-based asymmetric active vision systems only work in known conditions and fail miserably in unforeseen conditions. Symmetric/asymmetric systems employing artificial intelligence, while they manage to tackle unforeseen environments, require iterative training and thus are not reliable for real-time applications. Thus, the contemporary symmetric/asymmetric reconfiguration systems proposed to obtain optimum configuration spaces of sensors for accurate activity tracking and scene understanding may not be adequate to tackle unforeseen conditions in real time. To address this problem, this article presents an adaptive self-reconfiguration (ASR) framework for active vision systems operating co-operatively in a distributed blockchain network. The ASR framework enables active vision systems to share their derived learning about an activity or an unforeseen environment, which learning can be utilized by other active vision systems in the network, thus lowering the time needed for learning and adaptation to new conditions. Further, as the learning duration is reduced, the duration of the reconfiguration of the cameras is also reduced, yielding better performance in terms of understanding of a scene. 
The ASR framework enables resource and data sharing in a distributed network of active vision systems and outperforms state-of-the-art active vision systems in terms of accuracy and latency, making it ideal for real-time applications. Full article

18 pages, 8483 KiB  
Article
Hand Gesture Recognition with Symmetric Pattern under Diverse Illuminated Conditions Using Artificial Neural Network
by Muhammad Haroon, Saud Altaf, Shafiq Ahmad, Mazen Zaindin, Shamsul Huda and Sofia Iqbal
Symmetry 2022, 14(10), 2045; https://doi.org/10.3390/sym14102045 - 30 Sep 2022
Cited by 8 | Viewed by 2802
Abstract
This paper investigated the effects of variant lighting conditions on the recognition process. A framework is proposed to improve the performance of gesture recognition under variant illumination using the luminosity method. To prove the concept, a workable testbed has been developed in the laboratory by using a Microsoft Kinect sensor to capture the depth images for the purpose of acquiring diverse resolution data. For this, a case study was formulated to achieve an improved accuracy rate in gesture recognition under diverse illuminated conditions. For data preparation, American Sign Language (ASL) was used to create a dataset of all twenty-six signs, evaluated in real-time under diverse lighting conditions. The proposed method uses a set of symmetric patterns as a feature set in order to identify human hands and recognize gestures extracted through hand perimeter feature-extraction methods. A Scale-Invariant Feature Transform (SIFT) is used in the identification of significant key points of ASL-based images with their relevant features. Finally, an Artificial Neural Network (ANN) trained on symmetric patterns under different lighting environments was used to classify hand gestures utilizing selected features for validation. The experimental results showed that the proposed system performed well in diverse lighting effects with multiple pixel sizes. A total aggregate 97.3% recognition accuracy rate is achieved across 26 alphabet datasets with only a 2.7% error rate, which shows the overall efficiency of the ANN architecture in terms of processing time. Full article

15 pages, 9394 KiB  
Article
Introducing Urdu Digits Dataset with Demonstration of an Efficient and Robust Noisy Decoder-Based Pseudo Example Generator
by Wisal Khan, Kislay Raj, Teerath Kumar, Arunabha M. Roy and Bin Luo
Symmetry 2022, 14(10), 1976; https://doi.org/10.3390/sym14101976 - 21 Sep 2022
Cited by 40 | Viewed by 3573
Abstract
In the present work, we propose a novel method utilizing only a decoder for generation of pseudo-examples, which has shown great success in image classification tasks. The proposed method is particularly constructive when the data are in a limited quantity used for semi-supervised learning (SSL) or few-shot learning (FSL). While most of the previous works have used an autoencoder to improve the classification performance for SSL, using a single autoencoder may generate confusing pseudo-examples that could degrade the classifier’s performance. On the other hand, various models that utilize encoder–decoder architecture for sample generation can significantly increase computational overhead. To address the issues mentioned above, we propose an efficient means of generating pseudo-examples by using only the generator (decoder) network separately for each class that has shown to be effective for both SSL and FSL. In our approach, the decoder is trained for each class sample using random noise, and multiple samples are generated using the trained decoder. Our generator-based approach outperforms previous state-of-the-art SSL and FSL approaches. In addition, we released the Urdu digits dataset consisting of 10,000 images, including 8000 training and 2000 test images collected through three different methods for purposes of diversity. Furthermore, we explored the effectiveness of our proposed method on the Urdu digits dataset by using both SSL and FSL, which demonstrated improvement of 3.04% and 1.50% in terms of average accuracy, respectively, illustrating the superiority of the proposed method compared to the current state-of-the-art models. Full article

14 pages, 4690 KiB  
Article
Application of Wavelet Characteristics and GMDH Neural Networks for Precise Estimation of Oil Product Types and Volume Fractions
by Abdulilah Mohammad Mayet, Seyed Mehdi Alizadeh, Karwan Mohammad Hamakarim, Ali Awadh Al-Qahtani, Abdullah K. Alanazi, John William Grimaldo Guerrero, Hala H. Alhashim and Ehsan Eftekhari-Zadeh
Symmetry 2022, 14(9), 1797; https://doi.org/10.3390/sym14091797 - 30 Aug 2022
Cited by 6 | Viewed by 1594
Abstract
Given that one of the most critical operations in the oil and gas industry is to instantly determine the volume and type of product passing through the pipelines, in this research, a detection system for monitoring oil pipelines is proposed. The proposed system works in such a way that the radiation from the dual-energy source which symmetrically emits radiation, was received by the NaI detector after passing through the shield window and test pipeline. In the test pipe, four petroleum products—ethylene glycol, crude oil, gasoil, and gasoline—were simulated in pairs in different volume fractions. A total of 118 simulations were performed, and their signals were categorized. Then, feature extraction operations were started to reduce the volume of data, increase accuracy, increase the learning speed of the neural network, and better interpret the data. Wavelet features were extracted from the recorded signal and used as GMDH neural network input. The signals of each test were divided into details and approximation sections and characteristics with the names STD of A3, D3, D2 and were extracted. This described structure is modelled in the Monte Carlo N Particle code (MCNP). In fact, precise estimation of oil product types and volume fractions were done using a combination of symmetrical source and asymmetrical neural network. Four GMDH neural networks were trained to estimate the volumetric ratio of each product, and the maximum RMSE was 0.63. In addition to this high accuracy, the low implementation and computational cost compared to previous detection methods are among the advantages of present investigation, which increases its application in the oil industry. Full article
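The wavelet features named in this abstract (standard deviations of A3, D3, and D2) can be sketched with a three-level decomposition. The abstract does not state which wavelet family was used, so the Haar wavelet below is an assumption, as are the function names and the synthetic test signal. The sketch uses only the Python standard library.

```python
import math
import statistics


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_std_features(signal):
    """Standard deviations of A3, D3 and D2 from a 3-level Haar split."""
    approx, details = list(signal), {}
    for level in (1, 2, 3):
        approx, details[level] = haar_step(approx)
    return {
        "std_A3": statistics.pstdev(approx),
        "std_D3": statistics.pstdev(details[3]),
        "std_D2": statistics.pstdev(details[2]),
    }


# Synthetic stand-in for a recorded detector signal (64 samples).
signal = [math.sin(0.3 * n) + 0.1 * (n % 5) for n in range(64)]
features = wavelet_std_features(signal)
```

Three numbers per signal replace the full recording, which is the kind of data reduction the abstract cites as the motivation for feature extraction before training the network.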
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)
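The wavelet feature step mentioned in the abstract above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the abstract does not name the wavelet family, so a Haar wavelet is assumed here, and the function names `haar_step` and `wavelet_std_features` are hypothetical. The sketch decomposes a signal to three levels and returns the standard deviations of the level-3 approximation (A3) and the level-3 and level-2 details (D3, D2).

```python
import math
import statistics


def haar_step(signal):
    """One Haar decomposition level: returns (approximation, detail).

    Assumption: Haar wavelet. A trailing unpaired sample is dropped.
    """
    scale = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def wavelet_std_features(signal):
    """Return the standard deviation of A3, D3, and D2 for a 3-level decomposition."""
    if len(signal) < 8:
        raise ValueError("need at least 8 samples for a 3-level decomposition")
    approx = list(signal)
    details = {}
    for level in (1, 2, 3):
        approx, details[level] = haar_step(approx)
    return {
        "std_A3": statistics.pstdev(approx),
        "std_D3": statistics.pstdev(details[3]),
        "std_D2": statistics.pstdev(details[2]),
    }
```

In a pipeline like the one described, the three returned values would form the input vector of the regression network; the network itself is outside the scope of this sketch.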

19 pages, 3146 KiB  
Article
A Graph Skeleton Transformer Network for Action Recognition
by Yujian Jiang, Zhaoneng Sun, Saisai Yu, Shuang Wang and Yang Song
Symmetry 2022, 14(8), 1547; https://doi.org/10.3390/sym14081547 - 28 Jul 2022
Cited by 7 | Viewed by 2850
Abstract
Skeleton-based action recognition is a research hotspot in the field of computer vision. Currently, the mainstream method is based on Graph Convolutional Networks (GCNs). Although GCNs have many advantages, they rely mainly on graph topologies to draw dependencies between joints, which limits their ability to capture long-distance dependencies. Meanwhile, Transformer-based methods have been applied to skeleton-based action recognition because they effectively capture long-distance dependencies. However, existing Transformer-based methods lose the inherent connection information of human skeleton joints because they do not yet focus on initial graph structure information. To improve the accuracy of skeleton-based action recognition, this paper proposes a Graph Skeleton Transformer network (GSTN), which uses the Transformer architecture to extract global features and the undirected graph information represented by a symmetric matrix to extract local features. Two encodings are utilized in feature processing to improve the joints’ semantic and centrality features. In the multi-stream fusion strategy, a grid-search-based method assigns a weight to each input stream to optimize the fusion results. We tested our method on three action recognition datasets: NTU RGB+D 60, NTU RGB+D 120, and NW-UCLA. The experimental results show that our model’s accuracy is comparable to that of state-of-the-art approaches.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)
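The grid-search fusion idea in the abstract above can be sketched as follows. This is a generic illustration and not the authors' implementation: the candidate grid, the function names (`fuse`, `accuracy`, `grid_search_weights`), and the choice of validation accuracy as the selection criterion are all assumptions. Each sample carries one class-score vector per input stream, and every weight combination on the grid is scored.

```python
from itertools import product


def fuse(stream_scores, weights):
    """Weighted sum of the per-stream class scores for one sample."""
    num_classes = len(stream_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, stream_scores))
        for c in range(num_classes)
    ]


def accuracy(samples, labels, weights):
    """Fraction of samples whose fused arg-max class matches the label."""
    correct = 0
    for stream_scores, label in zip(samples, labels):
        fused = fuse(stream_scores, weights)
        if fused.index(max(fused)) == label:
            correct += 1
    return correct / len(labels)


def grid_search_weights(samples, labels, num_streams, grid=(0.0, 0.5, 1.0)):
    """Try every weight combination on the grid; keep the first best one."""
    best_weights, best_acc = None, -1.0
    for weights in product(grid, repeat=num_streams):
        acc = accuracy(samples, labels, weights)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc
```

The search cost grows exponentially with the number of streams, so a coarse grid like the one assumed here is only practical for a handful of streams.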

19 pages, 16662 KiB  
Article
Learning Augmented Memory Joint Aberrance Repressed Correlation Filters for Visual Tracking
by Yuanfa Ji, Jianzhong He, Xiyan Sun, Yang Bai, Zhaochuan Wei and Kamarul Hawari bin Ghazali
Symmetry 2022, 14(8), 1502; https://doi.org/10.3390/sym14081502 - 22 Jul 2022
Cited by 2 | Viewed by 1460
Abstract
With their outstanding performance and tracking speed, discriminative correlation filters (DCF) have gained much attention in visual object tracking, where time-consuming correlation operations can be computed efficiently using the discrete Fourier transform (DFT) with its symmetric properties. Nevertheless, the inherent issues of boundary effects and filter degradation, as well as occlusion and background clutter, degrade the tracking performance. In this work, we proposed an augmented memory joint aberrance repressed correlation filter (AMRCF) for visual tracking. Based on the background-aware correlation filter (BACF), we introduced adaptive spatial regularity to mitigate the boundary effect. Several historical views and the current view are exploited to train the model together as a way to reinforce the memory. Furthermore, aberrance repression regularization was introduced to suppress response anomalies due to occlusion and deformation, while a dynamic updating strategy was adopted to reduce the impact of anomalies on the appearance model. Finally, extensive experimental results on four well-known tracking benchmarks indicate that the proposed AMRCF tracker achieved tracking performance comparable to that of most state-of-the-art (SOTA) trackers.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)
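The remark in the abstract above that correlation can be computed through the DFT rests on the correlation theorem: for real signals, circular cross-correlation equals the inverse DFT of one spectrum multiplied by the complex conjugate of the other. The sketch below checks this on a 1-D signal. It is not the AMRCF tracker; the function names are hypothetical, and a naive O(N²) DFT is used only to keep the example dependency-free (a practical tracker would use an FFT).

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); the inverse is scaled by 1/N."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        total = sum(
            x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def circular_correlation_direct(signal, template):
    """r[k] = sum_t template[t] * signal[(t + k) mod N], computed directly."""
    n = len(signal)
    return [
        sum(template[t] * signal[(t + k) % n] for t in range(n)) for k in range(n)
    ]


def circular_correlation_dft(signal, template):
    """Same result via the correlation theorem: IDFT(DFT(signal) * conj(DFT(template)))."""
    spectrum_s = dft(signal)
    spectrum_t = dft(template)
    cross = [s * t.conjugate() for s, t in zip(spectrum_s, spectrum_t)]
    return [value.real for value in dft(cross, inverse=True)]
```

The index of the largest response gives the circular shift at which the template best matches the signal, which is the quantity a correlation-filter tracker reads off its response map.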

14 pages, 3610 KiB  
Article
Detection of Key Points in Mice at Different Scales via Convolutional Neural Network
by Zhengyang Xu, Ruiqing Liu, Zhizhong Wang, Songwei Wang and Juncai Zhu
Symmetry 2022, 14(7), 1437; https://doi.org/10.3390/sym14071437 - 13 Jul 2022
Cited by 3 | Viewed by 1926
Abstract
In this work, we propose a symmetry approach and design a convolutional neural network for mouse pose estimation under scale variation. The backbone adopts the UNet structure and uses a residual network to extract features. The ASPP module is added to the appropriate residual units to expand the receptive field, and deep and shallow features are fused to process features at multiple scales, capturing the spatial relationships between body parts and improving the recognition accuracy of the model. Finally, a set of prediction results based on a heat map and coordinate offsets is generated. We used a self-built mouse dataset and obtained state-of-the-art results on it.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)
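The abstract above does not say how the heat map and coordinate offsets are combined, so the following is only a sketch of one common decoding scheme, stated as an assumption: take the highest-scoring heat-map cell, add the sub-cell offset predicted at that cell, and scale by the output stride. The function name `decode_keypoint`, the offset layout, and the default stride of 4 are all hypothetical.

```python
def decode_keypoint(heatmap, offset_x, offset_y, stride=4):
    """Decode one keypoint from a heat map plus per-cell offsets.

    Assumptions: heatmap, offset_x, and offset_y are equally sized 2-D lists;
    offsets are fractions of a cell; stride maps cells to input pixels.
    Returns (x, y, score) in input-image coordinates.
    """
    best_row, best_col, best_score = 0, 0, float("-inf")
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_row, best_col, best_score = r, c, score
    x = (best_col + offset_x[best_row][best_col]) * stride
    y = (best_row + offset_y[best_row][best_col]) * stride
    return x, y, best_score
```

The offset term recovers the precision lost when keypoint locations are quantized to the coarser heat-map grid.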

26 pages, 5936 KiB  
Article
Recognition of Car Front Facing Style for Machine-Learning Data Annotation: A Quantitative Approach
by Lisha Ma, Yu Wu, Qingnan Li and Xiaofang Yuan
Symmetry 2022, 14(6), 1181; https://doi.org/10.3390/sym14061181 - 8 Jun 2022
Cited by 5 | Viewed by 2512
Abstract
Car front facing style (CFFS) recognition is crucial to enhancing a company’s market competitiveness and brand image. However, one problem impedes its development: with the sudden increase in style design information, traditional methods based on feature calculation cannot quickly handle style analysis on a large volume of data. Therefore, we introduced a deep feature-based machine learning approach to solve the problem. Datasets are the basis of machine learning, but there is a lack of references for car style data annotation, which can make the annotations unreliable. Therefore, a CFFS recognition method was proposed for machine-learning data annotation. Specifically, this study proposes a hierarchical model for analyzing CFFS from the morphological perspective of layout, surface, graphics, and line. Based on the quantitative percentage of the three elements of style, this paper categorizes CFFS into eight basic types of style and distinguishes the styles by expert analysis to summarize the characteristics of each layout, shape surface, and graphics. We use imagery diagrams, typical CFFS examples, and the characteristic laws of each style as annotation references to guide manual data annotation. This investigation established a CFFS dataset with eight types of style. The method was evaluated from a design perspective; we found that the accuracy obtained when using this method for CFFS data annotation exceeded that obtained without it by 32.03%. We also classified the dataset with five classic classifiers, VGG19, ResNet, ViT, MAE, and MLP-Mixer; the average accuracy rates were 76.75%, 78.47%, 78.07%, 75.80%, and 81.06%, respectively. This method effectively transforms human design knowledge into machine-understandable structured knowledge. There is a symmetric transformation of knowledge in the computer-aided design process, providing a reference for machine learning to deal with abstract style problems.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)

11 pages, 982 KiB  
Article
Image Translation for Oracle Bone Character Interpretation
by Feng Gao, Jingping Zhang, Yongge Liu and Yahong Han
Symmetry 2022, 14(4), 743; https://doi.org/10.3390/sym14040743 - 4 Apr 2022
Cited by 5 | Viewed by 6157
Abstract
The Oracle Bone Characters are the earliest known ancient Chinese characters and are an important record of the civilization of ancient China. The interpretation of the Oracle Bone Characters is challenging and requires professional knowledge from ancient Chinese language experts. Although some works have utilized deep learning to perform image detection and recognition on the Oracle Bone Characters, these methods have proven difficult to use for interpreting uninterpreted Oracle Bone Character images. Inspired by the prior knowledge that a relation exists between glyphs in Oracle Bone Character images and images of modern Chinese characters, we proposed a method of image translation from Oracle Bone Characters to modern Chinese characters that uses a generative adversarial network to capture the implicit relationship between the two sets of glyphs. The image translation process between Oracle Bone Characters and modern Chinese characters forms a symmetrical structure, comprising an encoder and a decoder. To our knowledge, our symmetrical image translation method is the first of its kind used for the task of interpreting Oracle Bone Characters. Our experiments indicated that our image translation method can provide glyph information to aid in the interpretation of Oracle Bone Characters.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)

14 pages, 1618 KiB  
Article
Hybrid Domain Attention Network for Efficient Super-Resolution
by Qian Zhang, Linxia Feng, Hong Liang and Ying Yang
Symmetry 2022, 14(4), 697; https://doi.org/10.3390/sym14040697 - 28 Mar 2022
Cited by 3 | Viewed by 2149
Abstract
Image SR reconstruction methods focus on recovering the lost details in the image, that is, high-frequency information, which exists in the region of edges and textures. Consequently, the low-frequency information of an image often requires few computational resources. Most recent CNN-based image SR reconstruction methods nevertheless allocate computational resources uniformly and treat all features equally, which inevitably wastes computational resources and increases computational effort, and the limited computational resources of mobile devices can hardly afford this cost. This paper proposes a symmetric CNN (HDANet), which is based on the Transformer’s self-attention mechanism and uses symmetric convolution to capture the dependencies of image features in two dimensions, spatial and channel. Specifically, the spatial self-attention module identifies important regions in the image, and the channel self-attention module adaptively emphasizes important channels. The outputs of the two symmetric modules are summed to further enhance the feature representation and selectively emphasize important feature information, which enables the network architecture to precisely locate and bypass low-frequency information and reduce computational cost. Extensive experimental results on the Set5, Set14, B100, and Urban100 datasets show that HDANet achieves advanced SR reconstruction performance while reducing computational complexity. HDANet reduces FLOPs by nearly 40% compared to the original model. For ×2 SR reconstruction on the Set5 test set, it achieves a PSNR of 37.94 dB.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)

15 pages, 3422 KiB  
Article
Frame Identification of Object-Based Video Tampering Using Symmetrically Overlapped Motion Residual
by Tae Hyung Kim, Cheol Woo Park and Il Kyu Eom
Symmetry 2022, 14(2), 364; https://doi.org/10.3390/sym14020364 - 12 Feb 2022
Cited by 2 | Viewed by 2001
Abstract
Image and video manipulation has been actively used in recent years with the development of multimedia editing technologies. However, object-based video tampering, which adds or removes objects within a video frame, is posing challenges because it is difficult to verify the authenticity of videos. In this paper, we present a novel object-based frame identification network. The proposed method uses symmetrically overlapped motion residuals to enhance the discernment of video frames. Since the proposed motion residual features are generated on the basis of overlapped temporal windows, temporal variations in the video sequence can be exploited in the deep neural network. In addition, this paper introduces an asymmetric network structure for training and testing a single basic convolutional neural network. In the training process, two networks with an identical structure are used, each of which has a different input pair. In the testing step, two types of testing methods corresponding to two- and three-class frame identifications are proposed. We compare the identification accuracy of the proposed method with that of the existing methods. The experimental results demonstrate that the proposed method generates reasonable identification results for both two- and three-class forged frame identifications.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry)