Submit to Special Issue Submit Abstract to Special Issue Review for AI Propose a Special Issue

Journal Menu

Journal Browser

Artificial Intelligence-Based Image Processing and Computer Vision

Special Issue Editors
Special Issue Information
Keywords
Published Papers

A special issue of AI (ISSN 2673-2688). This special issue belongs to the section "AI Systems: Theory and Applications".

Deadline for manuscript submissions: 31 July 2024 | Viewed by 8627

Share This Special Issue

Special Issue Editor

Dr. Arslan Munir

E-Mail Website
Guest Editor

Department of Computer Science, Kansas State University, Manhattan, KS 66506, USA
Interests: artificial intelligence; computer vision; parallel computing; embedded systems; secure and trustworthy systems
Special Issues, Collections and Topics in MDPI journals

Special Issue Information

Dear Colleagues,

Modern image processing is a process of transforming an image into a digital form and using computing systems to process, manipulate, and/or enhance digital images through various algorithms. Image processing is also a requisite for many computer vision tasks as it helps to preprocess images and prepares data in a form suitable for various computer vision models. Computer vision generally refers to techniques that enable computers to understand and make sense of images. Computer vision enables machines to extract latent information from visual data and to mimic the human perception of sight with computational algorithms. Active research is ongoing on developing novel image processing and computer vision algorithms including artificial intelligence (AI), in particular, deep-learning-based algorithms to enable new and fascinating applications. Advances in AI-based image processing and computer vision have enabled many exciting new applications, such as autonomous vehicles, unmanned aerial vehicles, computational photography, augmented reality, surveillance, optical character recognition, machine inspection, autonomous package delivery, photogrammetry, biometrics, computer-aided inspection of medical images, and remote patient monitoring. Image processing and computer vision have applications in various domains including healthcare, transportation, retail, agriculture, business, manufacturing, construction, space, and military.

This Special Issue will explore algorithms and applications of image processing and computer vision inspired by AI. For this Special Issue, we welcome the submission of original research articles and reviews that relate to computing, architecture, algorithms, security, and applications of image processing and computer vision. Topics of interest include but are not limited to the following:

Image interpretation
Object detection and recognition
Spatial artificial intelligence
Event detection and activity recognition
Image segmentation
Video classification and analysis
Face and gesture recognition
Pose estimation
Computational photography
Image security
Vision hardware and/or software architectures
Image/vision acceleration techniques
Monitoring and surveillance
Situational awareness.

Dr. Arslan Munir
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. AI is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

image processing
computer vision
image fusion
vision algorithms
deep learning
stereo vision
activity recognition
image/video analysis
image encryption
computational photography
vision hardware/software
monitoring and surveillance
biometrics
robotics
augmented reality

Published Papers (6 papers)

Download All Papers

Research

Jump to: Review

13 pages, 9978 KiB

Open AccessArticle

The Eye in the Sky—A Method to Obtain On-Field Locations of Australian Rules Football Athletes

by Zachery Born, Marion Mundt, Ajmal Mian, Jason Weber and Jacqueline Alderson

AI 2024, 5(2), 733-745; https://doi.org/10.3390/ai5020038 - 16 May 2024

Viewed by 338

Abstract

The ability to overcome an opposition in team sports is reliant upon an understanding of the tactical behaviour of the opposing team members. Recent research is limited to a performance analysts’ own playing team members, as the required opposing team athletes’ geolocation (GPS) data are unavailable. However, in professional Australian rules Football (AF), animations of athlete GPS data from all teams are commercially available. The purpose of this technical study was to obtain the on-field location of AF athletes from animations of the 2019 Australian Football League season to enable the examination of the tactical behaviour of any team. The pre-trained object detection model YOLOv4 was fine-tuned to detect players, and a custom convolutional neural network was trained to track numbers in the animations. The object detection and the athlete tracking achieved an accuracy of 0.94 and 0.98, respectively. Subsequent scaling and translation coefficients were determined through solving an optimisation problem to transform the pixel coordinate positions of a tracked player number to field-relative Cartesian coordinates. The derived equations achieved an average Euclidean distance from the athletes’ raw GPS data of 2.63 m. The proposed athlete detection and tracking approach is a novel methodology to obtain the on-field positions of AF athletes in the absence of direct measures, which may be used for the analysis of opposition collective team behaviour and in the development of interactive play sketching AF tools. Full article

(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

► Show Figures

Figure 1

18 pages, 1201 KiB

Open AccessArticle

Efficient Paddy Grain Quality Assessment Approach Utilizing Affordable Sensors

by Aditya Singh, Kislay Raj, Teerath Meghwar and Arunabha M. Roy

AI 2024, 5(2), 686-703; https://doi.org/10.3390/ai5020036 - 14 May 2024

Viewed by 217

Abstract

Paddy (Oryza sativa) is one of the most consumed food grains in the world. The process from its sowing to consumption via harvesting, processing, storage and management require much effort and expertise. The grain quality of the product is heavily affected by the weather conditions, irrigation frequency, and many other factors. However, quality control is of immense importance, and thus, the evaluation of grain quality is necessary. Since it is necessary and arduous, we try to overcome the limitations and shortcomings of grain quality evaluation using image processing and machine learning (ML) techniques. Most existing methods are designed for rice grain quality assessment, noting that the key characteristics of paddy and rice are different. In addition, they have complex and expensive setups and utilize black-box ML models. To handle these issues, in this paper, we propose a reliable ML-based IoT paddy grain quality assessment system utilizing affordable sensors. It involves a specific data collection procedure followed by image processing with an ML-based model to predict the quality. Different explainable features are used for classifying the grain quality of paddy grain, like the shape, size, moisture, and maturity of the grain. The precision of the system was tested in real-world scenarios. To our knowledge, it is the first automated system to precisely provide an overall quality metric. The main feature of our system is its explainability in terms of utilized features and fuzzy rules, which increases the confidence and trustworthiness of the public toward its use. The grain variety used for experiments majorly belonged to the Indian Subcontinent, but it covered a significant variation in the shape and size of the grain. Full article

(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

► Show Figures

Figure 1

20 pages, 6807 KiB

Open AccessArticle

Single Image Super Resolution Using Deep Residual Learning

by Moiz Hassan, Kandasamy Illanko and Xavier N. Fernando

AI 2024, 5(1), 426-445; https://doi.org/10.3390/ai5010021 - 21 Mar 2024

Viewed by 1563

Abstract

Single Image Super Resolution (SSIR) is an intriguing research topic in computer vision where the goal is to create high-resolution images from low-resolution ones using innovative techniques. SSIR has numerous applications in fields such as medical/satellite imaging, remote target identification and autonomous vehicles. Compared to interpolation based traditional approaches, deep learning techniques have recently gained attention in SISR due to their superior performance and computational efficiency. This article proposes an Autoencoder based Deep Learning Model for SSIR. The down-sampling part of the Autoencoder mainly uses 3 by 3 convolution and has no subsampling layers. The up-sampling part uses transpose convolution and residual connections from the down sampling part. The model is trained using a subset of the VILRC ImageNet database as well as the RealSR database. Quantitative metrics such as PSNR and SSIM are found to be as high as 76.06 and 0.93 in our testing. We also used qualitative measures such as perceptual quality. Full article

(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

► Show Figures

Figure 1

31 pages, 4849 KiB

Open AccessArticle

MultiWave-Net: An Optimized Spatiotemporal Network for Abnormal Action Recognition Using Wavelet-Based Channel Augmentation

by Ramez M. Elmasry, Mohamed A. Abd El Ghany, Mohammed A.-M. Salem and Omar M. Fahmy

AI 2024, 5(1), 259-289; https://doi.org/10.3390/ai5010014 - 24 Jan 2024

Viewed by 1213

Abstract

Human behavior is regarded as one of the most complex notions present nowadays, due to the large magnitude of possibilities. These behaviors and actions can be distinguished as normal and abnormal. However, abnormal behavior is a vast spectrum, so in this work, abnormal behavior is regarded as human aggression or in another context when car accidents occur on the road. As this behavior can negatively affect the surrounding traffic participants, such as vehicles and other pedestrians, it is crucial to monitor such behavior. Given the current prevalent spread of cameras everywhere with different types, they can be used to classify and monitor such behavior. Accordingly, this work proposes a new optimized model based on a novel integrated wavelet-based channel augmentation unit for classifying human behavior in various scenes, having a total number of trainable parameters of 5.3 m with an average inference time of 0.09 s. The model has been trained and evaluated on four public datasets: Real Live Violence Situations (RLVS), Highway Incident Detection (HWID), Movie Fights, and Hockey Fights. The proposed technique achieved accuracies in the range of

92 %

99.5 %

across the used benchmark datasets. Comprehensive analysis and comparisons between different versions of the model and the state-of-the-art have been performed to confirm the model’s performance in terms of accuracy and efficiency. The proposed model has higher accuracy with an average of

4.97 %

, and higher efficiency by reducing the number of parameters by around 139.1 m compared to other models trained and tested on the same benchmark datasets. Full article

(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

► Show Figures

Figure 1

23 pages, 9079 KiB

Open AccessArticle

Deep Learning Performance Characterization on GPUs for Various Quantization Frameworks

by Muhammad Ali Shafique, Arslan Munir and Joonho Kong

AI 2023, 4(4), 926-948; https://doi.org/10.3390/ai4040047 - 18 Oct 2023

Viewed by 2128

Abstract

Deep learning is employed in many applications, such as computer vision, natural language processing, robotics, and recommender systems. Large and complex neural networks lead to high accuracy; however, they adversely affect many aspects of deep learning performance, such as training time, latency, throughput, energy consumption, and memory usage in the training and inference stages. To solve these challenges, various optimization techniques and frameworks have been developed for the efficient performance of deep learning models in the training and inference stages. Although optimization techniques such as quantization have been studied thoroughly in the past, less work has been done to study the performance of frameworks that provide quantization techniques. In this paper, we have used different performance metrics to study the performance of various quantization frameworks, including TensorFlow automatic mixed precision and TensorRT. These performance metrics include training time and memory utilization in the training stage along with latency and throughput for graphics processing units (GPUs) in the inference stage. We have applied the automatic mixed precision (AMP) technique during the training stage using the TensorFlow framework, while for inference we have utilized the TensorRT framework for the post-training quantization technique using the TensorFlow TensorRT (TF-TRT) application programming interface (API).We performed model profiling for different deep learning models, datasets, image sizes, and batch sizes for both the training and inference stages, the results of which can help developers and researchers to devise and deploy efficient deep learning models for GPUs. Full article

(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

► Show Figures

Figure 1

Review

Jump to: Research

21 pages, 683 KiB

Open AccessReview

Few-Shot Fine-Grained Image Classification: A Comprehensive Review

by Jie Ren, Changmiao Li, Yaohui An, Weichuan Zhang and Changming Sun

AI 2024, 5(1), 405-425; https://doi.org/10.3390/ai5010020 - 6 Mar 2024

Viewed by 1639

Abstract

Few-shot fine-grained image classification (FSFGIC) methods refer to the classification of images (e.g., birds, flowers, and airplanes) belonging to different subclasses of the same species by a small number of labeled samples. Through feature representation learning, FSFGIC methods can make better use of limited sample information, learn more discriminative feature representations, greatly improve the classification accuracy and generalization ability, and thus achieve better results in FSFGIC tasks. In this paper, starting from the definition of FSFGIC, a taxonomy of feature representation learning for FSFGIC is proposed. According to this taxonomy, we discuss key issues on FSFGIC (including data augmentation, local and/or global deep feature representation learning, class representation learning, and task-specific feature representation learning). In addition, the existing popular datasets, current challenges and future development trends of feature representation learning on FSFGIC are also described. Full article

(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

► Show Figures