Computer Vision, Image Processing Technologies and Artificial Intelligence, 2nd Edition

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 30 September 2024

Special Issue Editors


Guest Editor
Institute of Computing Technology, University of Chinese Academy of Sciences, Beijing 100049, China
Interests: video coding; computer vision; deep learning

Guest Editor
School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
Interests: image processing; signal processing; artificial intelligence

Guest Editor
School of Computer Science, Beijing Information Science and Technology University, Beijing 100101, China
Interests: neural networks; machine learning; computer vision and developmental robotics

Guest Editor
Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
Interests: artificial intelligence; information security

Special Issue Information

Dear Colleagues,

Computer vision has expanded into a wide range of research fields in which information is extracted from visual data such as images and video. Computer vision technology now pervades modern life, with billions of people using applications built on it, including image recognition, image processing, and object detection, which demonstrates both the necessity and the potential of research in computer vision and its applications. Advances in artificial intelligence have equipped these techniques to outperform humans on some tasks. Nevertheless, many valuable open problems remain in the research and application of computer vision, image processing, and artificial intelligence.

This Special Issue on “Computer Vision, Image Processing Technologies and Artificial Intelligence” aims to gather a collection of original articles advancing theoretical and practical research in the domains of computer vision, image processing, and artificial intelligence, including but not limited to the following aspects and tasks:

  • Image augmentation;
  • Image restoration;
  • Image encoding;
  • Image segmentation;
  • Image recognition;
  • Image classification;
  • Image and video retrieval;
  • Image and video synthesis;
  • Object detection;
  • Image depiction;
  • Image-to-image translation;
  • Image forensics;
  • Artificial intelligence applied in information security;
  • Large-scale models for computer vision.

Prof. Dr. Honggang Qi
Dr. Yan Liu
Dr. Jun Miao
Prof. Dr. Lijuan Duan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • artificial intelligence
  • deep learning
  • machine learning
  • neural networks
  • image processing
  • vision information
  • large-scale models for computer vision

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

23 pages, 1980 KiB  
Article
GaitSTAR: Spatial–Temporal Attention-Based Feature-Reweighting Architecture for Human Gait Recognition
by Muhammad Bilal, He Jianbiao, Husnain Mushtaq, Muhammad Asim, Gauhar Ali and Mohammed ElAffendi
Mathematics 2024, 12(16), 2458; https://doi.org/10.3390/math12162458 - 8 Aug 2024
Abstract
Human gait recognition (HGR) leverages unique gait patterns to identify individuals, but the effectiveness of this technique can be hindered by various factors such as carrying conditions, foot shadows, clothing variations, and changes in viewing angle. Traditional silhouette-based systems often neglect the critical role of instantaneous gait motion, which is essential for distinguishing individuals with similar features. We introduce the “Enhanced Gait Feature Extraction Framework (GaitSTAR)”, a novel method that incorporates dynamic feature weighting through the discriminant analysis of temporal and spatial features within a channel-wise architecture. Key innovations in GaitSTAR include a dynamic stride flow representation (DSFR) to address silhouette distortion, a transformer-based feature set transformation (FST) for integrating image-level features into set-level features, and dynamic feature reweighting (DFR) for capturing long-range interactions. DFR enhances contextual understanding and improves detection accuracy by computing attention distributions across channel dimensions. Empirical evaluations show that GaitSTAR achieves accuracies of 98.5%, 98.0%, and 92.7% under the NM, BG, and CL conditions, respectively, on the CASIA-B dataset; 67.3% on the CASIA-C dataset; and 54.21% on the Gait3D dataset. Despite its complexity, GaitSTAR demonstrates a favorable balance between accuracy and computational efficiency, making it a powerful tool for biometric identification based on gait patterns.
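
The core mechanism named in this abstract, computing an attention distribution across channel dimensions and using it to reweight features, can be illustrated with a short, generic PyTorch layer. This is a minimal sketch in the spirit of DFR, not the authors' GaitSTAR implementation; the module name, reduction ratio, and tensor shapes are illustrative assumptions.

```python
# Minimal sketch of channel-wise attention reweighting (DFR-style).
# All names and sizes are illustrative assumptions, not GaitSTAR's code.
import torch
import torch.nn as nn

class ChannelReweight(nn.Module):
    """Compute an attention distribution over channels and rescale features."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze spatial dims
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, c)                  # per-channel statistics
        w = torch.softmax(self.fc(w), dim=1)         # attention over channels
        return x * (c * w).view(b, c, 1, 1)          # reweight feature maps

# Usage: reweight a batch of silhouette feature maps.
feats = torch.randn(4, 64, 32, 16)                   # (batch, C, H, W)
out = ChannelReweight(64)(feats)
print(out.shape)                                      # torch.Size([4, 64, 32, 16])
```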

20 pages, 15016 KiB  
Article
Masked Feature Compression for Object Detection
by Chengjie Dai, Tiantian Song, Yuxuan Jin, Yixiang Ren, Bowei Yang and Guanghua Song
Mathematics 2024, 12(12), 1848; https://doi.org/10.3390/math12121848 - 14 Jun 2024
Abstract
Deploying high-accuracy detection models on lightweight edge devices (e.g., drones) is challenging due to hardware constraints. To achieve satisfactory detection results, a common solution is to compress and transmit the images to a cloud server where powerful models can be used. However, the image compression process for transmission may reduce detection accuracy. In this paper, we propose a feature compression method tailored for object detection tasks that can be easily integrated with existing learned image compression models. In the method, the encoding process consists of two steps. Firstly, we use a feature extractor to obtain the low-level feature and a mask generator to obtain an object mask that selects regions containing objects. Secondly, we use a neural network encoder to compress the masked feature. For decoding, a neural network decoder restores the compressed representation into a feature that can be fed directly into the object detection model. The experimental results demonstrate that our method surpasses existing compression techniques. Specifically, compared to one of the leading methods, TCM2023, our approach achieves a 25.3% reduction in compressed file size and a 6.9% increase in mAP@0.5.
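
The two-step encoding pipeline the abstract describes (extract a low-level feature, mask it with a generated object mask, then compress the masked feature for transmission and decode it back for detection) can be sketched with simple stand-in networks. The class name, layer choices, and channel sizes below are assumptions for illustration; the paper's learned codec and mask generator are more elaborate.

```python
# Schematic sketch of masked feature compression with stand-in networks.
# Architecture details here are assumptions, not the paper's models.
import torch
import torch.nn as nn

class MaskedFeatureCodec(nn.Module):
    def __init__(self, ch: int = 64):
        super().__init__()
        self.extractor = nn.Conv2d(3, ch, 3, stride=2, padding=1)       # low-level feature
        self.mask_gen = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())  # object mask in [0, 1]
        self.encoder = nn.Conv2d(ch, ch // 4, 3, stride=2, padding=1)   # compress masked feature
        self.decoder = nn.ConvTranspose2d(ch // 4, ch, 4, stride=2, padding=1)  # restore feature

    def forward(self, img: torch.Tensor):
        feat = self.extractor(img)
        mask = self.mask_gen(feat)
        masked = feat * mask           # keep regions likely to contain objects
        code = self.encoder(masked)    # compact representation to transmit
        restored = self.decoder(code)  # feature handed to the detection model
        return restored, code, mask

codec = MaskedFeatureCodec()
restored, code, mask = codec(torch.randn(1, 3, 256, 256))
print(restored.shape, code.shape)      # restored feature vs. compressed code
```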

23 pages, 8070 KiB  
Article
Enhancing Emergency Vehicle Detection: A Deep Learning Approach with Multimodal Fusion
by Muhammad Zohaib, Muhammad Asim and Mohammed ElAffendi
Mathematics 2024, 12(10), 1514; https://doi.org/10.3390/math12101514 - 13 May 2024
Abstract
Emergency vehicle detection plays a critical role in ensuring timely responses and reducing accidents in modern urban environments. However, traditional methods that rely solely on visual cues face challenges, particularly in adverse conditions. The objective of this research is to enhance emergency vehicle detection by leveraging the synergies between acoustic and visual information. By incorporating advanced deep learning techniques for both acoustic and visual data, we aim to significantly improve accuracy and response times. To achieve this goal, we developed an attention-based temporal spectrum network (ATSN) with an attention mechanism specifically designed for ambulance siren sound detection. In parallel, we enhanced visual detection by implementing a Multi-Level Spatial Fusion YOLO (MLSF-YOLO) architecture. To combine the acoustic and visual information effectively, we employed a stacking ensemble learning technique, creating a robust framework for emergency vehicle detection. This approach capitalizes on the strengths of both modalities, allowing for a comprehensive analysis that surpasses existing methods. Our experiments achieved a misdetection rate of only 3.81% and an accuracy of 96.19% on visual data containing emergency vehicles. These findings represent significant progress toward real-world applications, demonstrating the effectiveness of our approach in improving emergency vehicle detection systems.
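
The stacking ensemble the abstract describes, fusing the confidences of an acoustic siren detector and a visual detector with a meta-learner, can be sketched as follows. The base scores here are synthetic placeholders standing in for ATSN and MLSF-YOLO outputs, and the logistic-regression meta-learner is one common choice, not necessarily the authors'.

```python
# Minimal sketch of stacking two modality-specific detectors.
# Base scores are synthetic stand-ins, not ATSN / MLSF-YOLO outputs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
y = rng.integers(0, 2, n)                          # 1 = emergency vehicle present

# Stand-ins for base-model confidences on held-out data:
audio_score = np.clip(y * 0.7 + rng.normal(0.2, 0.2, n), 0, 1)    # siren detector
visual_score = np.clip(y * 0.6 + rng.normal(0.25, 0.2, n), 0, 1)  # visual detector

X = np.column_stack([audio_score, visual_score])   # meta-features for stacking
meta = LogisticRegression().fit(X[:400], y[:400])  # train meta-learner on one fold
acc = meta.score(X[400:], y[400:])                 # evaluate the fused predictor
print(f"fused accuracy: {acc:.3f}")
```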