Applications and Challenges of Image Processing in Smart Environment

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Electronic Multimedia".

Deadline for manuscript submissions: 15 May 2025 | Viewed by 7021

Special Issue Editors


Guest Editor
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
Interests: deep learning; image processing; power electronics applications

Special Issue Information

Dear Colleagues,

The purpose of this Special Issue is to explore the latest advancements, applications, and challenges in the field of image processing in the context of smart environments. With the rapid development of digital technology, image processing has become an essential component in various smart systems and applications, revolutionizing industries such as healthcare, transportation, surveillance, and automation.

This Special Issue aims to gather original research papers, reviews, and case studies that address the diverse applications of image processing techniques in smart environments. Topics of interest include, but are not limited to, the following areas:

  1. Smart healthcare systems: the use of image processing in medical imaging, disease diagnosis, remote patient monitoring, and personalized treatment;
  2. Intelligent transportation systems: the application of image processing for traffic monitoring, vehicle detection and classification, object tracking, and driver assistance systems;
  3. Smart surveillance: techniques used in image and video analysis in surveillance applications, including object detection, tracking, behavior recognition, and anomaly detection;
  4. Automation and robotics: image processing for object recognition, localization, manipulation, and navigation in autonomous robots and industrial automation;
  5. Smart home and Internet of Things (IoT): image processing integration in smart home devices and systems, enabling functions such as facial recognition, activity monitoring, and security systems;
  6. Augmented reality and virtual reality: image processing techniques for enhancing the visual experience in AR/VR applications, including object recognition, scene reconstruction, and motion tracking;
  7. Multimedia forensics in smart environments: image processing techniques for information security in smart environments, including information hiding, camera model identification, manipulation detection, and manipulation localization;
  8. Deep learning technologies in smart environments: lightweight model development, model pruning, and model deployment technologies for deep learning-based approaches on mobile devices with restricted hardware resources.

This Special Issue will provide a platform for researchers and practitioners to share their innovative work, discuss challenges, and propose future directions in the field of image processing in smart environments. It is expected to contribute to the development of intelligent systems that can perceive and interpret visual information to improve decision-making processes and enhance user experiences in various domains.

Dr. Xinshan Zhu
Dr. Bin Pan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)


Research

12 pages, 71888 KiB  
Article
Power Grid Violation Action Recognition via Few-Shot Adaptive Network
by Lingwen Meng, Lan Zhang, Guobang Ban, Shasha Luo and Jiangang Liu
Electronics 2025, 14(1), 112; https://doi.org/10.3390/electronics14010112 - 30 Dec 2024
Viewed by 549
Abstract
To address the performance degradation of violation action recognition models due to changing operational scenes in power grid operations, this paper proposes a Few-shot Adaptive Network (FSA-Net). The method incorporates few-shot learning into the network design by adding a parameter mapping layer to the classification network and developing a task-adaptive module that adjusts the network parameters for changing scenes. A task-specific linear classifier is added after the backbone, allowing classifier weights to be generated adaptively for the changing task scene and enhancing the model's generalizability. Additionally, the model freezes the backbone network and iteratively updates only certain module parameters during training in order to minimize training costs. This approach addresses the difficulty of iteratively updating the original model caused by the limited image data available after scene changes. In this paper, 2000 samples from power grid scenarios are used as the experimental dataset; the average recognition accuracy for violation actions in images after scene changes is 81.77%, a 4.58% improvement over the ResNet-50 classification network, and the model's training efficiency is improved by 40%. The experimental results show that the method enhances the performance of the violation action recognition model before and after scene changes and improves iterative updating efficiency through a smaller sample size, lower model design cost, and lower training cost. Full article
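The paper's exact parameter-mapping layer is not reproduced here, but the core idea of generating task-specific classifier weights from a handful of samples can be illustrated with a minimal prototype-style sketch (all function names are ours, not the paper's): frozen-backbone features from a few labeled support images define per-class weight vectors, so only the lightweight classifier adapts to a new scene.

```python
import numpy as np

def adaptive_classifier_weights(support_feats, support_labels, n_classes):
    """Generate linear-classifier weights as class prototypes (mean support
    embeddings), so the classifier adapts to a new scene from few samples."""
    d = support_feats.shape[1]
    weights = np.zeros((n_classes, d))
    for c in range(n_classes):
        weights[c] = support_feats[support_labels == c].mean(axis=0)
    # L2-normalize so classification reduces to cosine similarity
    return weights / np.linalg.norm(weights, axis=1, keepdims=True)

def classify(query_feats, weights):
    """Assign each query embedding to the most similar class prototype."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    return (q @ weights.T).argmax(axis=1)
```

Because the backbone is frozen, adapting to a new scene only requires recomputing these prototype weights, which mirrors the paper's goal of updating with a small sample size and low training cost.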
(This article belongs to the Special Issue Applications and Challenges of Image Processing in Smart Environment)

18 pages, 10487 KiB  
Article
BGI-YOLO: Background Image-Assisted Object Detection for Stationary Cameras
by Youn Joo Lee, Ho Gi Jung and Jae Kyu Suhr
Electronics 2025, 14(1), 60; https://doi.org/10.3390/electronics14010060 - 26 Dec 2024
Viewed by 710
Abstract
This paper proposes a method for enhancing the accuracy of object detectors in stationary camera systems by utilizing background images. Object detection with stationary cameras is highly valuable across various applications, such as traffic control, crime prevention, and abnormal behavior detection. The deep learning-based object detectors mainly used in such cases are developed for general purposes and do not take advantage of the stationary camera setting at all. Previously, cascade-based object detection methods utilizing background images have been studied for stationary camera systems. These methods typically consist of two stages: background subtraction followed by object classification. However, their detection performance depends heavily on the accuracy of the background subtraction results, and numerous parameters must be adjusted during background subtraction to adapt to varying conditions. This paper proposes an end-to-end object detection method named BGI-YOLO, which uses a background image simply by combining it with the input image before feeding both into the object detection network. In our experiments, five methods are compared: three candidate ways of combining input and background images, baseline YOLOv7, and a traditional cascade method. BGI-YOLO, which combines the input and background images at the image level, improved detection performance (mAP) over baseline YOLOv7 by 5.6%p on the WITHROBOT S1 dataset and 2.5%p on the LLVIP dataset, while increasing computational cost (GFLOPs) by only 0.19%. The experimental results demonstrate that the proposed method is highly effective at improving detection accuracy with almost no increase in computational cost. Full article
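One plausible reading of "combining at the image level" is channel-wise concatenation of the frame and the static background before the first convolution; whether this matches the paper's exact scheme is an assumption, and the function name is ours. A minimal sketch:

```python
import numpy as np

def stack_input_and_background(frame, background):
    """Image-level combination: concatenate the current frame and the static
    background image along the channel axis, yielding a 6-channel input.
    The detector's first conv layer would need in_channels=6 to accept it."""
    assert frame.shape == background.shape  # both (H, W, 3)
    return np.concatenate([frame, background], axis=-1)
```

The appeal of this design, as the abstract argues, is that it is end-to-end: the network itself learns how to exploit the background cue, so no hand-tuned background-subtraction thresholds are needed.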
(This article belongs to the Special Issue Applications and Challenges of Image Processing in Smart Environment)

13 pages, 1949 KiB  
Article
Feature Weighted Cycle Generative Adversarial Network with Facial Landmark Recognition and Perceptual Color Distance for Enhanced Face Animation Generation
by Shih-Lun Lo, Hsu-Yung Cheng and Chih-Chang Yu
Electronics 2024, 13(23), 4761; https://doi.org/10.3390/electronics13234761 - 2 Dec 2024
Viewed by 755
Abstract
We propose an anime style transfer model to generate anime faces from human face images. We improve the model by modifying the normalization function to retain more feature information. To make the facial feature positions of the anime face similar to those of the human face, we propose a facial landmark loss that measures the error between the generated image and the real human face image. To avoid obvious color deviation in the generated images, we introduce a perceptual color loss into the loss function. In addition, because reasonable metrics for evaluating the quality of animated images are lacking, we propose the Fréchet anime inception distance (FAID), which measures the distance between the distributions of generated and real animated images in a high-dimensional space. In a user survey, 74.46% of users judged the images produced by the proposed method to be the best among the compared models, and the proposed method reaches a FAID score of 126.05. Our model performs best in both the user study and FAID, showing better performance in human visual perception and model distribution. According to the experimental results and user feedback, the proposed method generates higher-quality results than existing methods. Full article
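The paper's exact loss formulation is not given in the abstract; a common way to realize a facial landmark loss is the mean Euclidean distance between corresponding landmark coordinates of the generated and source faces, which is the hedged sketch below (function name ours, landmark extraction assumed to be done upstream by a detector):

```python
import numpy as np

def facial_landmark_loss(pred_landmarks, ref_landmarks):
    """Mean Euclidean distance between corresponding (x, y) landmark
    coordinates of the generated anime face and the source human face,
    encouraging the generator to keep facial features in similar positions."""
    return np.linalg.norm(pred_landmarks - ref_landmarks, axis=1).mean()
```

In a full training loop this term would be weighted and summed with the adversarial, cycle-consistency, and perceptual color losses.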
(This article belongs to the Special Issue Applications and Challenges of Image Processing in Smart Environment)

19 pages, 36582 KiB  
Article
Optimum Pitch of Volumetric Computational Reconstruction in Integral Imaging
by Youngjun Kim, Jiyong Park, Jungsik Koo, Min-Chul Lee and Myungjin Cho
Electronics 2024, 13(23), 4595; https://doi.org/10.3390/electronics13234595 - 21 Nov 2024
Viewed by 736
Abstract
In this paper, we propose a method for finding the optimum pitch of volumetric computational reconstruction (VCR) in integral imaging. In conventional VCR, the pixel shifts between elemental images are quantized due to pixel-based processing. As a result, quantization errors may occur during three-dimensional (3D) reconstruction, degrading the visual quality and depth resolution of the reconstructed 3D image. To overcome this problem, we propose a method that minimizes the quantization error by constraining the pixel shift to be a natural number; this condition yields the optimum pitch of VCR in integral imaging. To demonstrate the feasibility of our method, we conducted simulations and optical experiments using performance metrics such as the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM). Full article
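The paper's shift formula is not given in the abstract, so as a purely illustrative sketch, assume the per-elemental-image pixel shift scales linearly with the reconstruction pitch (shift = k · pitch for some system constant k determined by the optics and depth plane). Pitches that avoid quantization error are then those for which the shift is a natural number:

```python
def optimum_pitches(k, candidates, tol=1e-9):
    """Return the candidate pitches whose resulting pixel shift k * pitch is
    a natural number, so VCR needs no rounding (no quantization error).
    k is a placeholder system constant, not the paper's actual relation."""
    good = []
    for p in candidates:
        shift = k * p
        if shift >= 1 and abs(shift - round(shift)) < tol:
            good.append(p)
    return good
```

For example, with k = 0.5 only even candidate pitches produce integer shifts.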
(This article belongs to the Special Issue Applications and Challenges of Image Processing in Smart Environment)

21 pages, 7973 KiB  
Article
Research on Target Hybrid Recognition and Localization Methods Based on an Industrial Camera and a Depth Camera in Complex Scenes
by Mingxin Yuan, Jie Li, Borui Cao, Shihao Bao, Li Sun and Xiangbin Li
Electronics 2024, 13(22), 4381; https://doi.org/10.3390/electronics13224381 - 8 Nov 2024
Viewed by 802
Abstract
In order to improve the target visual recognition and localization accuracy of robotic arms in complex scenes with similar targets, hybrid recognition and localization methods based on an industrial camera and a depth camera are proposed. First, according to the speed and accuracy requirements of target recognition and localization, YOLOv5s is introduced as the basic algorithm model. Then, to improve the accuracy of target recognition and coarse localization based on the industrial camera (eye-to-hand), the AFPN feature fusion module, the simple and parameter-free attention module (SimAM), and soft non-maximum suppression (Soft-NMS) are introduced. To improve the accuracy of target recognition and fine localization based on the depth camera (eye-in-hand), the SENetV2 backbone network structure, a dynamic head module, a deformable attention mechanism, and a chain-of-thought prompted adaptive enhancer network are introduced. A dual-camera platform for target hybrid recognition and localization is then constructed, and the hand–eye calibration and the collection and preparation of the image datasets required for model training are completed. Finally, hybrid recognition and localization experiments are carried out in sequence for the docking of an oil filling port. The test results show that in target recognition and coarse localization based on the industrial camera, the recognition accuracy of the designed model reaches 99%, and the average localization errors in the horizontal and vertical directions are 2.22 mm and 3.66 mm, respectively. In target recognition and fine localization based on the depth camera, the recognition accuracy reaches 98%, and the average errors in the depth, horizontal, and vertical directions are 0.12 mm, 0.28 mm, and 0.16 mm, respectively. These results not only verify the effectiveness of the dual-camera hybrid recognition and localization methods, but also demonstrate that they meet the high-precision recognition and localization requirements of complex scenes. Full article
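Of the components named in the abstract, Soft-NMS is a well-documented standalone algorithm (Bodla et al.); a minimal Gaussian-decay sketch of it, independent of the paper's full pipeline, is shown below. Instead of discarding every box that overlaps a higher-scoring box, Soft-NMS decays overlapping scores, which helps when similar targets sit close together.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores by exp(-iou^2 / sigma)
    rather than suppressing the boxes outright."""
    boxes, scores = boxes.astype(float).copy(), scores.astype(float).copy()
    keep, idxs = [], list(range(len(boxes)))
    while idxs:
        best = max(idxs, key=lambda i: scores[i])
        keep.append(best)
        idxs.remove(best)
        if idxs:
            rest = np.array(idxs)
            overlaps = iou(boxes[best], boxes[rest])
            scores[rest] *= np.exp(-(overlaps ** 2) / sigma)
            idxs = [i for i in rest if scores[i] > score_thresh]
    return keep, scores
```

Boxes far from any higher-scoring box keep their scores unchanged; heavily overlapping ones survive with reduced confidence instead of vanishing.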
(This article belongs to the Special Issue Applications and Challenges of Image Processing in Smart Environment)

37 pages, 5927 KiB  
Article
Object and Pedestrian Detection on Road in Foggy Weather Conditions by Hyperparameterized YOLOv8 Model
by Ahmad Esmaeil Abbasi, Agostino Marcello Mangini and Maria Pia Fanti
Electronics 2024, 13(18), 3661; https://doi.org/10.3390/electronics13183661 - 14 Sep 2024
Cited by 1 | Viewed by 2593
Abstract
Connected, cooperative, and automated (CAM) vehicles and self-driving cars need robust and accurate environment understanding. To this end, they are usually equipped with sensors and adopt multiple sensing strategies, often fused to exploit their complementary properties. In recent years, artificial intelligence approaches such as machine learning and deep learning have been applied to object and pedestrian detection and to quantifying prediction reliability. This paper proposes a procedure based on YOLOv8 (You Only Look Once) to detect objects on roads, such as cars, traffic lights, pedestrians, and street signs, in foggy weather conditions. YOLOv8 is a recent release of YOLO, a popular neural network model used for object detection and image classification. The obtained model is applied to a dataset of about 4000 foggy road images, and object detection accuracy is improved by tuning hyperparameters such as the number of epochs, the batch size, and the augmentation methods. To achieve good accuracy and few detection errors, the hyperparameters are optimized with four different methods, and several metrics are considered, namely accuracy, precision, recall, precision–recall, and loss. Full article
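The abstract names the tuned hyperparameters but not the search procedure; the simplest of the four optimization methods one might use is an exhaustive grid search, sketched below. The search-space values and the `train_and_eval` callback are hypothetical placeholders, not values from the paper.

```python
from itertools import product

# Hypothetical search space mirroring the hyperparameters named in the
# abstract; a real run would call the detector's training routine.
EPOCHS_OPTS = [50, 100]
BATCH_OPTS = [8, 16]
AUGMENT_OPTS = ["none", "mosaic", "hsv"]

def grid_search(train_and_eval):
    """Evaluate every (epochs, batch, augmentation) combination with the
    supplied callback and return the best-scoring configuration."""
    best_cfg, best_score = None, float("-inf")
    for cfg in product(EPOCHS_OPTS, BATCH_OPTS, AUGMENT_OPTS):
        score = train_and_eval(*cfg)  # e.g. validation mAP or recall
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

In practice each callback invocation is a full training run, so the grid is kept small or replaced by a smarter strategy (random or Bayesian search).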
(This article belongs to the Special Issue Applications and Challenges of Image Processing in Smart Environment)
