Search Results (141)

Search Parameters:
Keywords = false color image

23 pages, 2610 KiB  
Article
Feature-Level Fusion Network for Hyperspectral Object Tracking via Mixed Multi-Head Self-Attention Learning
by Long Gao, Langkun Chen, Yan Jiang, Bobo Xi, Weiying Xie and Yunsong Li
Remote Sens. 2025, 17(6), 997; https://doi.org/10.3390/rs17060997 - 12 Mar 2025
Viewed by 326
Abstract
Hyperspectral object tracking has emerged as a promising task in visual object tracking. The rich spectral information within hyperspectral images benefits accurate tracking in challenging scenarios. The performance of existing hyperspectral object tracking networks is constrained because they neglect the interactive information among bands within hyperspectral images. Moreover, designing an accurate deep learning-based algorithm for hyperspectral object tracking is challenging because of the substantial amount of training data required. To address these challenges, a new mixed multi-head attention-based feature fusion tracking (MMFT) algorithm for hyperspectral videos is proposed. Firstly, MMFT introduces a feature-level fusion module, mixed multi-head attention feature fusion (MMFF), which fuses false-color features and augments the fused feature with interactive information through a mixed multi-head attention (MMA) block, increasing the representational ability of the features for tracking. Specifically, MMA learns the interactive information across the bands in the false-color images and incorporates it into the fused feature, which is obtained by combining the features of the false-color images. Secondly, a new training procedure is introduced, in which the modules designed for hyperspectral object tracking are first pre-trained on a sufficient amount of modified RGB data to enhance generalization, and then fine-tuned on a limited amount of hyperspectral data for task adaptation. Extensive experiments verify the effectiveness of MMFT, demonstrating its state-of-the-art performance. Full article
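As a rough illustration of the attention-based fusion described in this abstract, the sketch below fuses token features from several false-color images and refines the fused feature with multi-head self-attention. Module names, dimensions, and the averaging step are assumptions for illustration; this is not the authors' MMFF/MMA implementation.

```python
# Minimal sketch (not the authors' MMFT code): fuse features from several
# false-color images and refine the fused feature with multi-head self-attention.
import torch
import torch.nn as nn

class ToyAttentionFusion(nn.Module):
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, band_feats):
        # band_feats: list of (B, N, C) token features, one per false-color image
        fused = torch.stack(band_feats, dim=0).mean(dim=0)   # simple fusion by averaging
        tokens = torch.cat(band_feats, dim=1)                 # cross-band token set
        # attend from the fused feature to tokens of all false-color images,
        # injecting interactive (cross-band) information into the fused feature
        interact, _ = self.attn(query=fused, key=tokens, value=tokens)
        return self.norm(fused + interact)                    # residual update

if __name__ == "__main__":
    feats = [torch.randn(2, 196, 256) for _ in range(3)]      # three false-color images
    out = ToyAttentionFusion()(feats)
    print(out.shape)  # torch.Size([2, 196, 256])
```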

19 pages, 3839 KiB  
Article
YOLO-YSTs: An Improved YOLOv10n-Based Method for Real-Time Field Pest Detection
by Yiqi Huang, Zhenhao Liu, Hehua Zhao, Chao Tang, Bo Liu, Zaiyuan Li, Fanghao Wan, Wanqiang Qian and Xi Qiao
Agronomy 2025, 15(3), 575; https://doi.org/10.3390/agronomy15030575 - 26 Feb 2025
Cited by 1 | Viewed by 596
Abstract
The use of yellow sticky traps is a green pest control method that utilizes the pests’ attraction to the color yellow. The use of yellow sticky traps not only controls pest populations but also enables monitoring, offering a more economical and environmentally friendly alternative to pesticides. However, the small size and dense distribution of pests on yellow sticky traps lead to lower detection accuracy when using lightweight models. On the other hand, large models suffer from longer training times and deployment difficulties, posing challenges for pest detection in the field using edge computing platforms. To address these issues, this paper proposes a lightweight detection method, YOLO-YSTs, based on an improved YOLOv10n model. The method aims to balance pest detection accuracy and model size and has been validated on edge computing platforms. This model incorporates SPD-Conv convolutional modules, the iRMB inverted residual block attention mechanism, and the Inner-SIoU loss function to improve the YOLOv10n network architecture, ultimately addressing the issues of missed and false detections for small and overlapping targets while balancing model speed and accuracy. Experimental results show that the YOLO-YSTs model achieved precision, recall, mAP50, and mAP50–95 values of 83.2%, 83.2%, 86.8%, and 41.3%, respectively, on the yellow sticky trap dataset. The detection speed reached 139 FPS, with GFLOPs at only 8.8. Compared with the YOLOv10n model, the mAP50 improved by 1.7%. Compared with other mainstream object detection models, YOLO-YSTs also achieved the best overall performance. Through improvements to the YOLOv10n model, the accuracy of pest detection on yellow sticky traps was effectively enhanced, and the model demonstrated good detection performance when deployed on edge mobile platforms. In conclusion, the proposed YOLO-YSTs model offers more balanced performance in the detection of pest images on yellow sticky traps. It performs well when deployed on edge mobile platforms, making it of significant importance for field pest monitoring and integrated pest management. Full article
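The SPD-Conv module mentioned above replaces strided downsampling with a lossless space-to-depth rearrangement followed by a non-strided convolution. Below is a minimal sketch of that idea; channel sizes and the SiLU activation are assumptions, not taken from the YOLO-YSTs code.

```python
# Rough sketch of an SPD-Conv style block (space-to-depth + non-strided conv);
# channel sizes and the SiLU activation are assumptions, not the YOLO-YSTs code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPDConv(nn.Module):
    def __init__(self, in_ch, out_ch, scale=2):
        super().__init__()
        self.scale = scale
        # after space-to-depth the channel count grows by scale**2
        self.conv = nn.Conv2d(in_ch * scale * scale, out_ch, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        x = F.pixel_unshuffle(x, self.scale)   # (B, C*s*s, H/s, W/s), no information loss
        return F.silu(self.bn(self.conv(x)))

if __name__ == "__main__":
    y = SPDConv(64, 128)(torch.randn(1, 64, 80, 80))
    print(y.shape)  # torch.Size([1, 128, 40, 40])
```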

15 pages, 8698 KiB  
Article
Geometric Self-Supervised Learning: A Novel AI Approach Towards Quantitative and Explainable Diabetic Retinopathy Detection
by Lucas Pu, Oliver Beale and Xin Meng
Bioengineering 2025, 12(2), 157; https://doi.org/10.3390/bioengineering12020157 - 6 Feb 2025
Viewed by 769
Abstract
Background: Diabetic retinopathy (DR) is the leading cause of blindness among working-age adults. Early detection is crucial to reducing DR-related vision loss risk but is fraught with challenges. Manual detection is labor-intensive and often misses tiny DR lesions, necessitating automated detection. Objective: We aimed to develop and validate an annotation-free deep learning strategy for the automatic detection of exudates and bleeding spots on color fundus photography (CFP) images and ultrawide field (UWF) retinal images. Materials and Methods: Three cohorts were created: two CFP cohorts (Kaggle-CFP and E-Ophtha) and one UWF cohort. Kaggle-CFP was used for algorithm development, while E-Ophtha, with manually annotated DR-related lesions, served as the independent test set. For additional independent testing, 50 DR-positive cases from both the Kaggle-CFP and UWF cohorts were manually outlined for bleeding and exudate spots. The remaining cases were used for algorithm training. A multiscale contrast-based shape descriptor transformed DR-verified retinal images into contrast fields. High-contrast regions were identified, and local image patches from abnormal and normal areas were extracted to train a U-Net model. Model performance was evaluated using sensitivity and false positive rates based on manual annotations in the independent test sets. Results: Our trained model on the independent CFP cohort achieved high sensitivities for detecting and segmenting DR lesions: microaneurysms (91.5%, 9.04 false positives per image), hemorrhages (92.6%, 2.26 false positives per image), hard exudates (92.3%, 7.72 false positives per image), and soft exudates (90.7%, 0.18 false positives per image). For UWF images, the model’s performance varied by lesion size. Bleeding detection sensitivity increased with lesion size, from 41.9% (6.48 false positives per image) for the smallest spots to 93.4% (5.80 false positives per image) for the largest. Exudate detection showed high sensitivity across all sizes, ranging from 86.9% (24.94 false positives per image) to 96.2% (6.40 false positives per image), though false positive rates were higher for smaller lesions. Conclusions: Our experiments demonstrate the feasibility of training a deep learning neural network for detecting and segmenting DR-related lesions without relying on their manual annotations. Full article
(This article belongs to the Special Issue Artificial Intelligence-Based Medical Imaging Processing)
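A rough illustration of the lesion-level evaluation reported above (sensitivity together with false positives per image against manual annotations); the overlap-based matching rule used here is an assumed simplification, not the paper's exact protocol.

```python
# Illustrative lesion-level evaluation: sensitivity and false positives per image.
# The matching rule (a detection hits an annotation if their masks overlap at all)
# is an assumed simplification, not the paper's exact protocol.
import numpy as np

def evaluate(detections, annotations):
    """detections, annotations: lists (one entry per image) of lists of boolean masks."""
    tp = fn = fp = 0
    for det_masks, ann_masks in zip(detections, annotations):
        for ann in ann_masks:
            hit = any(np.logical_and(ann, d).any() for d in det_masks)
            tp += hit
            fn += not hit
        for d in det_masks:
            fp += not any(np.logical_and(d, a).any() for a in ann_masks)
    sensitivity = tp / max(tp + fn, 1)
    fppi = fp / max(len(detections), 1)
    return sensitivity, fppi

if __name__ == "__main__":
    a = np.zeros((8, 8), bool); a[2:4, 2:4] = True
    d = np.zeros((8, 8), bool); d[3:5, 3:5] = True   # overlaps the annotation
    print(evaluate([[d]], [[a]]))                     # (1.0, 0.0)
```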

19 pages, 3375 KiB  
Article
Enhancing Cross-Modal Camera Image and LiDAR Data Registration Using Feature-Based Matching
by Jennifer Leahy, Shabnam Jabari, Derek Lichti and Abbas Salehitangrizi
Remote Sens. 2025, 17(3), 357; https://doi.org/10.3390/rs17030357 - 22 Jan 2025
Viewed by 1001
Abstract
Registering light detection and ranging (LiDAR) data with optical camera images enhances spatial awareness in autonomous driving, robotics, and geographic information systems. The current challenges in this field involve aligning 2D-3D data acquired from sources with distinct coordinate systems, orientations, and resolutions. This paper introduces a new pipeline for camera–LiDAR post-registration to produce colorized point clouds. Utilizing deep learning-based matching between 2D spherical projection LiDAR feature layers and camera images, we can map 3D LiDAR coordinates to image grey values. Various LiDAR feature layers, including intensity, bearing angle, depth, and different weighted combinations, are used to find correspondence with camera images utilizing state-of-the-art deep learning matching algorithms, i.e., SuperGlue and LoFTR. Registration is achieved using collinearity equations and RANSAC to remove false matches. The pipeline’s accuracy is tested using survey-grade terrestrial datasets from the TX5 scanner, as well as datasets from a custom-made, low-cost mobile mapping system (MMS) named Simultaneous Localization And Mapping Multi-sensor roBOT (SLAMM-BOT) across diverse scenes, in which both outperformed their baseline solutions. SuperGlue performed best in high-feature scenes, whereas LoFTR performed best in low-feature or sparse data scenes. The LiDAR intensity layer had the strongest matches, but combining feature layers improved matching and reduced errors. Full article
(This article belongs to the Special Issue Remote Sensing Satellites Calibration and Validation)
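The 2D spherical-projection LiDAR feature layers referred to above (depth, intensity, and similar) can be formed roughly as in this sketch; the image size and field-of-view limits are assumed values, not those of the paper's sensors, and this is not the authors' pipeline.

```python
# Rough sketch: project LiDAR points (x, y, z, intensity) onto a spherical image
# grid to build 2D feature layers (depth and intensity). Field-of-view limits and
# image size are assumed values, not those of the paper's sensors.
import numpy as np

def spherical_layers(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    x, y, z, intensity = points.T
    depth = np.linalg.norm(points[:, :3], axis=1)
    yaw = np.arctan2(y, x)                       # horizontal angle
    pitch = np.arcsin(z / np.maximum(depth, 1e-6))
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    u = ((1.0 - (yaw + np.pi) / (2 * np.pi)) * w).astype(int).clip(0, w - 1)
    v = ((fov_up - pitch) / (fov_up - fov_down) * h).astype(int).clip(0, h - 1)
    depth_img = np.zeros((h, w)); inten_img = np.zeros((h, w))
    depth_img[v, u] = depth                      # later points overwrite earlier ones
    inten_img[v, u] = intensity
    return depth_img, inten_img

if __name__ == "__main__":
    pts = np.random.randn(1000, 4)
    d_img, i_img = spherical_layers(pts)
    print(d_img.shape, i_img.shape)              # (64, 1024) (64, 1024)
```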

22 pages, 8514 KiB  
Article
Multi-Analytical Characterization of Illuminated Choirbooks from the Royal Audience of Quito
by Martha Romero-Bastidas, Katherine Guacho-Pachacama, Carlos Vásquez-Mora, Fernando Espinoza-Guerra, Rita Díaz-Benalcázar, Johanna Ramírez-Bustamante and Luis Ramos-Guerrero
Heritage 2024, 7(12), 6592-6613; https://doi.org/10.3390/heritage7120305 - 24 Nov 2024
Cited by 1 | Viewed by 908
Abstract
Choirbooks are historical heritage manuscripts used for the performance of vocal music in religious ceremonies in colonial times. This study aimed to understand the characteristics of choirbook manuscripts produced in the Real Audiencia de Quito during the 17th century. The methodology combined non-invasive techniques, such as infrared false-color imaging (IRFC) and X-ray fluorescence (XRF), together with spot analysis by scanning electron microscopy with energy-dispersive X-ray spectroscopy (SEM-EDX) and Fourier transform infrared spectroscopy with attenuated total reflection (FTIR-ATR). The analytical results revealed the use of pumice, chalk and lime carbonate as support materials in the manufacturing process and surface treatment of the parchment. In the illuminations, three pictorial techniques based on protein, polysaccharide and lipid binders were recognized, establishing that the pigments used with greater regularity in the illuminations were vermilion, minium, verdigris, orpiment, azurite, and indigo, preferably in a pure state. Materials used less regularly were also identified, such as yellow ochre, saffron, smalt, red ochre, and bone black, among others. Regarding the vulnerability of the pictorial materials, it was determined that, although most of the pigments exhibit chemical stability, they present some vulnerabilities associated with their intrinsic composition and the medium that contains them. Full article
(This article belongs to the Special Issue Analytical Chemistry for Archaeology and Cultural Heritage)

22 pages, 13050 KiB  
Article
A Deep Learning Model for Detecting Fake Medical Images to Mitigate Financial Insurance Fraud
by Muhammad Asad Arshed, Shahzad Mumtaz, Ștefan Cristian Gherghina, Neelam Urooj, Saeed Ahmed and Christine Dewi
Computation 2024, 12(9), 173; https://doi.org/10.3390/computation12090173 - 29 Aug 2024
Viewed by 2385
Abstract
Artificial Intelligence and Deepfake Technologies have brought a new dimension to the generation of fake data, making it easier and faster than ever before—this fake data could include text, images, sounds, videos, etc. This has brought new challenges that require the faster development of tools and techniques to avoid fraudulent activities at pace and scale. Our focus in this research study is to empirically evaluate the use and effectiveness of deep learning models such as Convolutional Neural Networks (CNNs) and Patch-based Neural Networks in the context of successful identification of real and fake images. We chose the healthcare domain as a potential case study where the fake medical data generation approach could be used to make false insurance claims. For this purpose, we obtained publicly available skin cancer data and used recently introduced stable diffusion approaches—a more effective technique than prior approaches such as Generative Adversarial Network (GAN)—to generate fake skin cancer images. To the best of our knowledge, and based on the literature review, this is one of the few research studies that uses images generated using stable diffusion along with real image data. As part of the exploratory analysis, we analyzed histograms of fake and real images using individual color channels and averaged across training and testing datasets. The histogram analysis demonstrated a clear shift in the mean and overall distribution of both real and fake images (more prominent in the blue and green channels) in the training data, whereas in the test data both means differed from those of the training data, so it appears non-trivial to set a threshold that could give better predictive capability. We also conducted a user study to observe whether the naked eye could identify any patterns for classifying real and fake images; the accuracy on the test data was observed to be 68%. The adoption of deep learning predictive approaches (i.e., patch-based and CNN-based) has demonstrated similar accuracy (~100%) in training and validation subsets of the data, and the same was observed for the test subset with and without StratifiedKFold (k = 3). Our analysis has demonstrated that state-of-the-art exploratory and deep-learning approaches are effective enough to distinguish images generated by stable diffusion from real images. Full article
(This article belongs to the Special Issue Computational Medical Image Analysis—2nd Edition)
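The per-channel histogram comparison described above can be reproduced in spirit with a few lines of NumPy; the bin count and the use of channel means are assumptions for illustration, not the study's exact analysis.

```python
# Illustrative per-channel histogram / mean comparison between two image sets
# (e.g., real vs. stable-diffusion images). 256 bins per channel is an assumed choice.
import numpy as np

def channel_stats(images):
    """images: array of shape (N, H, W, 3), uint8. Returns per-channel means and histograms."""
    means = images.reshape(-1, 3).mean(axis=0)
    hists = [np.histogram(images[..., c].ravel(), bins=256, range=(0, 256))[0]
             for c in range(3)]
    return means, hists

if __name__ == "__main__":
    real = np.random.randint(0, 256, (10, 64, 64, 3), dtype=np.uint8)
    fake = np.random.randint(0, 256, (10, 64, 64, 3), dtype=np.uint8)
    m_real, _ = channel_stats(real)
    m_fake, _ = channel_stats(fake)
    print("mean shift per channel (R, G, B):", m_fake - m_real)
```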

24 pages, 13634 KiB  
Article
Exploring Factors Affecting the Performance of Neural Network Algorithm for Detecting Clouds, Snow, and Lakes in Sentinel-2 Images
by Kaihong Huang, Zhangli Sun, Yi Xiong, Lin Tu, Chenxi Yang and Hangtong Wang
Remote Sens. 2024, 16(17), 3162; https://doi.org/10.3390/rs16173162 - 27 Aug 2024
Viewed by 1074
Abstract
Detecting clouds, snow, and lakes in remote sensing images is vital due to their propensity to obscure underlying surface information and hinder data extraction. In this study, we utilize Sentinel-2 images to implement a two-stage random forest (RF) algorithm for image labeling and delve into the factors influencing neural network performance across six aspects: model architecture, encoder, learning rate adjustment strategy, loss function, input image size, and different band combinations. Our findings indicate the Feature Pyramid Network (FPN) achieved the highest MIoU of 87.14%. The multi-head self-attention mechanism was less effective compared to convolutional methods for feature extraction with small datasets. Incorporating residual connections into convolutional blocks notably enhanced performance. Additionally, employing false-color images (bands 12-3-2) yielded a 4.86% improvement in MIoU compared to true-color images (bands 4-3-2). Notably, variations in model architecture, encoder structure, and input band combination had a substantial impact on performance, with parameter variations resulting in MIoU differences exceeding 5%. These results provide a reference for high-precision segmentation of clouds, snow, and lakes and offer valuable insights for applying deep learning techniques to the high-precision extraction of information from remote sensing images, thereby advancing research in deep neural networks for semantic segmentation. Full article
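A false-color composite such as the bands 12-3-2 input discussed above can be assembled as in the sketch below; the percentile stretch is an assumed normalization for display, not necessarily the paper's preprocessing.

```python
# Sketch: build a Sentinel-2 false-color composite from bands B12, B3, B2.
# The 2nd-98th percentile stretch is an assumed normalization for display,
# not necessarily the preprocessing used in the paper.
import numpy as np

def stretch(band, lo=2, hi=98):
    p_lo, p_hi = np.percentile(band, [lo, hi])
    return np.clip((band - p_lo) / max(p_hi - p_lo, 1e-6), 0, 1)

def false_color(b12, b3, b2):
    """Each band: 2D array of reflectances. Returns an (H, W, 3) composite."""
    return np.dstack([stretch(b12), stretch(b3), stretch(b2)])

if __name__ == "__main__":
    h, w = 128, 128
    img = false_color(*(np.random.rand(h, w) for _ in range(3)))
    print(img.shape, img.min() >= 0, img.max() <= 1)   # (128, 128, 3) True True
```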

17 pages, 4134 KiB  
Article
CPROS: A Multimodal Decision-Level Fusion Detection Method Based on Category Probability Sets
by Can Li, Zhen Zuo, Xiaozhong Tong, Honghe Huang, Shudong Yuan and Zhaoyang Dang
Remote Sens. 2024, 16(15), 2745; https://doi.org/10.3390/rs16152745 - 27 Jul 2024
Viewed by 1317
Abstract
Images acquired by different sensors exhibit different characteristics because of the varied imaging mechanisms of sensors. The fusion of visible and infrared images is valuable for specific image applications. While infrared images provide stronger object features under poor illumination and smoke interference, visible images have rich texture features and color information about the target. This study uses dual optical fusion as an example to explore fusion detection methods at different levels and proposes a multimodal decision-level fusion detection method based on category probability sets (CPROS). YOLOv8—a single-mode detector with good detection performance—was chosen as the benchmark. Next, we introduced an improved Yager formula and proposed a simple non-learning fusion strategy based on CPROS, which can combine the detection results of multiple modes and effectively improve target confidence. We validated the proposed algorithm using the VEDAI public dataset, which was captured from a drone perspective. The results showed that the mean average precision (mAP) of YOLOv8 using the CPROS method was 8.6% and 16.4% higher than that of single-mode YOLOv8 detection. The proposed method significantly reduces the missed detection rate (MR) and number of false detections per image (FPPI), and it can be generalized. Full article
(This article belongs to the Special Issue Multi-Sensor Systems and Data Fusion in Remote Sensing II)
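The decision-level fusion above combines per-class probabilities from two single-mode detectors. The sketch below uses a basic Yager-style combination (conflict mass assigned to ignorance) as a generic stand-in; the improved Yager formula proposed in the paper is not reproduced here, and the class layout is an assumption.

```python
# Sketch of a basic Yager-style combination of two detectors' category probability
# sets (singleton classes plus an "unknown" mass). This is a generic stand-in, not
# the improved Yager formula proposed in the CPROS paper.
import numpy as np

def yager_combine(m1, m2):
    """m1, m2: arrays of length K+1; first K entries are class masses, last is ignorance."""
    k = len(m1) - 1
    c1, t1 = m1[:k], m1[k]
    c2, t2 = m2[:k], m2[k]
    fused = c1 * c2 + c1 * t2 + t1 * c2               # agreement + one-sided ignorance
    conflict = c1.sum() * c2.sum() - (c1 * c2).sum()  # mass placed on different classes
    ignorance = t1 * t2 + conflict                    # Yager: conflict goes to ignorance
    return np.append(fused, ignorance)

if __name__ == "__main__":
    vis = np.array([0.7, 0.1, 0.2])   # visible detector: [car, person, unknown]
    ir = np.array([0.6, 0.2, 0.2])    # infrared detector
    print(yager_combine(vis, ir))     # combined masses; sums to 1
```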

29 pages, 9073 KiB  
Article
Color Histogram Contouring: A New Training-Less Approach to Object Detection
by Tamer Rabie, Mohammed Baziyad, Radhwan Sani, Talal Bonny and Raouf Fareh
Electronics 2024, 13(13), 2522; https://doi.org/10.3390/electronics13132522 - 27 Jun 2024
Cited by 6 | Viewed by 1387
Abstract
This paper introduces the Color Histogram Contouring (CHC) method, a new training-less approach to object detection that emphasizes the distinctive features in chrominance components. By building a chrominance-rich feature vector with a bin size of 1, the proposed CHC method exploits the precise information in chrominance features without increasing bin sizes, which can lead to false detections. This feature vector demonstrates invariance to lighting changes and is designed to mimic the opponent color axes used by the human visual system. The proposed CHC algorithm iterates over non-zero histogram bins of unique color features in the model, creating a feature vector for each, and emphasizes those matching in both the scene and model histograms. When both model and scene histograms for these unique features align, it ensures the presence of the model in the scene image. Extensive experiments across various scenarios show that the proposed CHC technique outperforms the benchmark training-less Swain and Ballard method and the algorithm of Viola and Jones. Additionally, a comparative experiment with the state-of-the-art You Only Look Once (YOLO) technique reveals that the proposed CHC technique surpasses YOLO in scenarios with limited training data, highlighting a significant advancement in training-less object detection. This approach offers a valuable addition to computer vision, providing an effective training-less solution for real-time autonomous robot localization and mapping in unknown environments. Full article
(This article belongs to the Special Issue Recent Advances in Image Processing and Computer Vision)
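A minimal sketch of a bin-size-1 chrominance histogram and a non-zero-bin presence test in the spirit of CHC; the opponent-style chrominance axes used here are an assumption, and the paper's exact feature vector construction may differ.

```python
# Sketch of a bin-size-1 chrominance histogram and a non-zero-bin match test,
# in the spirit of CHC. The opponent-style chrominance axes are an assumption;
# the paper's exact feature vector construction may differ.
import numpy as np

def chrominance_histogram(img_bgr):
    """img_bgr: (H, W, 3) uint8. Returns a 2D histogram over opponent-style axes."""
    b = img_bgr[..., 0].astype(int)
    g = img_bgr[..., 1].astype(int)
    r = img_bgr[..., 2].astype(int)
    rg = (r - g) // 2 + 128            # red-green opponent axis, mapped to 0..255
    by = (2 * b - r - g) // 4 + 128    # blue-yellow opponent axis, mapped to 0..255
    hist, _, _ = np.histogram2d(rg.ravel(), by.ravel(), bins=256, range=[[0, 256], [0, 256]])
    return hist

def model_present(model_hist, scene_hist):
    """True if every unique (non-zero) chrominance bin of the model also occurs in the scene."""
    model_bins = model_hist > 0
    return bool(np.all(scene_hist[model_bins] > 0))

if __name__ == "__main__":
    model = np.full((32, 32, 3), (200, 30, 30), dtype=np.uint8)   # uniform bluish patch
    scene = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
    print(model_present(chrominance_histogram(model), chrominance_histogram(scene)))
```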

16 pages, 3319 KiB  
Article
HDetect-VS: Tiny Human Object Enhancement and Detection Based on Visual Saliency for Maritime Search and Rescue
by Zhennan Fei, Yingjiang Xie, Da Deng, Lingshuai Meng, Fu Niu and Jinggong Sun
Appl. Sci. 2024, 14(12), 5260; https://doi.org/10.3390/app14125260 - 18 Jun 2024
Cited by 2 | Viewed by 1074
Abstract
Strong sun glint noise is an inevitable obstruction for tiny human object detection in maritime search and rescue (SAR) tasks, which can significantly deteriorate the performance of local contrast method (LCM)-based algorithms and cause high false alarm rates. For SAR tasks in noisy environments, it is more important to find tiny objects than localize them. Hence, considering background clutter and strong glint noise, in this study, a noise suppression methodology for maritime scenarios (HDetect-VS) is established to achieve tiny human object enhancement and detection based on visual saliency. To this end, the pixel intensity value distributions, color characteristics, and spatial distributions are thoroughly analyzed to separate objects from background and glint noise. Using unmanned aerial vehicles (UAVs), visible images with rich details, rather than infrared images, are applied to detect tiny objects in noisy environments. In this study, a grayscale model mapped from the HSV model (HSV-gray) is used to suppress glint noise based on color characteristic analysis, and large-scale Gaussian Convolution is utilized to obtain the pixel intensity surface and suppress background noise based on pixel intensity value distributions. Moreover, based on a thorough analysis of the spatial distribution of objects and noise, two-step clustering is employed to separate objects from noise in a salient point map. Experiments are conducted on the SeaDronesSee dataset; the results illustrate that HDetect-VS has more robust and effective performance in tiny object detection in noisy environments than other pixel-level algorithms. In particular, the performance of existing deep learning-based object detection algorithms can be significantly improved by taking the results of HDetect-VS as input. Full article
(This article belongs to the Special Issue Advanced Image Analysis and Processing Technologies and Applications)
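A coarse sketch of the two suppression steps described above: a grayscale map derived from the HSV model that downweights bright, unsaturated glint, and a large-kernel Gaussian blur as the pixel-intensity (background) surface. The channel weighting and kernel size are assumptions, not the paper's HSV-gray mapping or parameters.

```python
# Coarse sketch of glint and background suppression in the spirit of HDetect-VS:
# (1) map the image to a grayscale derived from HSV so that bright, low-saturation
#     glint pixels are downweighted, and (2) estimate the background as a large-scale
#     Gaussian-smoothed intensity surface and subtract it. The exact HSV-gray mapping
#     and kernel size are assumptions, not the paper's parameters.
import cv2
import numpy as np

def salient_map(img_bgr, ksize=101):
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    s = hsv[..., 1].astype(np.float32) / 255.0   # saturation
    v = hsv[..., 2].astype(np.float32) / 255.0   # value (brightness)
    hsv_gray = v * s                              # glint (bright but unsaturated) -> small values
    background = cv2.GaussianBlur(hsv_gray, (ksize, ksize), 0)  # pixel-intensity surface
    return np.clip(hsv_gray - background, 0, None)              # residual saliency

if __name__ == "__main__":
    img = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
    print(salient_map(img).shape)   # (480, 640)
```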

14 pages, 29541 KiB  
Article
A Type of Scale-Oriented Terrain Pattern Derived from Normalized Topographic Relief Layers and Its Interpretation
by Xi Nan, Ainong Li, Zhengwei He and Jinhu Bian
ISPRS Int. J. Geo-Inf. 2024, 13(6), 209; https://doi.org/10.3390/ijgi13060209 - 17 Jun 2024
Cited by 1 | Viewed by 1137
Abstract
Topographic scale characteristics contain valuable information for interpreting landform structures, which is crucial for understanding the spatial differentiation of landforms across large areas. However, the absence of parameters that specifically describe the topographic scale characteristics hinders the quantitative representation of regional topography from the perspective of spatial scales. In this study, false-color composite images were generated using normalized topographic relief data, showing a type of scale-oriented terrain pattern. Subsequent analysis indicated a direct correlation between the luminance of the patterns and the normalized topographic relief. Additionally, a linear correlation exists between the color of the patterns and the change rate in normalized topographic relief. Based on the analysis results, the issue of characterizing topographic scale effects was transformed into a problem of interpreting terrain patterns. The introduction of two parameters, flux and curl of topographic field, allowed for the interpretation of the terrain patterns. The assessment indicated that the calculated values of topographic field flux are equivalent to the luminance of the terrain patterns and the variations in the topographic field curl correspond with the spatial differentiation of colors in the terrain patterns. This study introduced a new approach to analyzing topographic scale characteristics, providing a pathway for quantitatively describing scale effects and automatically classifying landforms at a regional scale. Through exploratory analysis on artificially constructed simple DEMs and verification in four typical geomorphological regions of real terrain, it was shown that the terrain pattern method has better intuitiveness than the scale signature approach. It can reflect the scale characteristics of terrain in continuous space. Compared to the MTPCC image, the terrain parameters derived from the terrain pattern method further quantitatively describe the scale effects of the terrain. Full article
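The terrain pattern described above is a false-color composite of normalized topographic relief computed at several analysis scales. Below is a rough sketch; the window sizes and the max-minus-min relief definition are assumptions for illustration, not the paper's exact parameters.

```python
# Rough sketch: compute topographic relief (max - min elevation in a moving window)
# at three analysis scales, normalize each layer, and stack them as an RGB terrain
# pattern. The window sizes and normalization are assumptions for illustration.
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def relief(dem, size):
    return maximum_filter(dem, size) - minimum_filter(dem, size)

def terrain_pattern(dem, sizes=(3, 9, 27)):
    layers = []
    for s in sizes:
        r = relief(dem, s)
        layers.append((r - r.min()) / max(r.max() - r.min(), 1e-6))  # normalize to [0, 1]
    return np.dstack(layers)   # (H, W, 3) false-color composite

if __name__ == "__main__":
    x, y = np.meshgrid(np.linspace(0, 4 * np.pi, 200), np.linspace(0, 4 * np.pi, 200))
    dem = 50 * np.sin(x) * np.cos(y)      # simple synthetic DEM
    print(terrain_pattern(dem).shape)      # (200, 200, 3)
```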

18 pages, 7438 KiB  
Article
Autonomous Image-Based Corrosion Detection in Steel Structures Using Deep Learning
by Amrita Das, Sattar Dorafshan and Naima Kaabouch
Sensors 2024, 24(11), 3630; https://doi.org/10.3390/s24113630 - 4 Jun 2024
Cited by 3 | Viewed by 2357
Abstract
Steel structures are susceptible to corrosion due to their exposure to the environment. Currently used non-destructive techniques require inspector involvement. Inaccessibility of the defective part may lead to unnoticed corrosion, allowing the corrosion to propagate and cause catastrophic structural failure over time. Autonomous corrosion detection is essential for mitigating these problems. This study investigated which type of encoder–decoder neural network and which training strategy work best to automate the segmentation of corroded pixels in visual images. Models using pre-trained DenseNet121 and EfficientNetB7 backbones yielded 96.78% and 98.5% average pixel-level accuracy, respectively. The deeper EfficientNetB7 performed the worst, with only 33% true-positive values, which was 58% less than ResNet34 and the original UNet. ResNet34 successfully classified the corroded pixels, with 2.98% false positives, whereas the original UNet predicted 8.24% of the non-corroded pixels as corroded when tested on a specific set of images exclusive to the investigated training dataset. Deep networks were found to be better for transfer learning than full training, and a smaller dataset could be one of the reasons for performance degradation. Both fully trained conventional UNet and ResNet34 models were tested on some external images of different steel structures with different colors and types of corrosion, with the ResNet34 backbone outperforming conventional UNet. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)

31 pages, 15223 KiB  
Article
Lightweight Underwater Object Detection Algorithm for Embedded Deployment Using Higher-Order Information and Image Enhancement
by Changhong Liu, Jiawen Wen, Jinshan Huang, Weiren Lin, Bochun Wu, Ning Xie and Tao Zou
J. Mar. Sci. Eng. 2024, 12(3), 506; https://doi.org/10.3390/jmse12030506 - 19 Mar 2024
Cited by 7 | Viewed by 3404
Abstract
Underwater object detection is crucial in marine exploration, presenting a challenging problem in computer vision due to factors like light attenuation, scattering, and background interference. Existing underwater object detection models face challenges such as low robustness, extensive computation of model parameters, and a high false detection rate. To address these challenges, this paper proposes a lightweight underwater object detection method integrating deep learning and image enhancement. Firstly, FUnIE-GAN is employed to perform data enhancement to restore the authentic colors of underwater images, and subsequently, the restored images are fed into an enhanced object detection network named YOLOv7-GN proposed in this paper. Secondly, a lightweight higher-order attention layer aggregation network (ACC3-ELAN) is designed to improve the fusion perception of higher-order features in the backbone network. Moreover, the head network is enhanced by leveraging the interaction of multi-scale higher-order information, additionally fusing higher-order semantic information from features at different scales. To further streamline the entire network, we also introduce the AC-ELAN-t module, which is derived from pruning based on ACC3-ELAN. Finally, the algorithm undergoes practical testing on a biomimetic sea flatworm underwater robot. The experimental results on the DUO dataset show that our proposed method improves the performance of object detection in underwater environments. It provides a valuable reference for realizing object detection in underwater embedded devices with great practical potential. Full article
(This article belongs to the Special Issue Underwater Engineering and Image Processing)

16 pages, 2735 KiB  
Article
Detection of Ulcerative Colitis Lesions from Weakly Annotated Colonoscopy Videos Using Bounding Boxes
by Safaa Al-Ali, John Chaussard, Sébastien Li-Thiao-Té, Éric Ogier-Denis, Alice Percy-du-Sert, Xavier Treton and Hatem Zaag
Gastrointest. Disord. 2024, 6(1), 292-307; https://doi.org/10.3390/gidisord6010020 - 7 Mar 2024
Viewed by 1917
Abstract
Ulcerative colitis is a chronic disease characterized by bleeding and ulcers in the colon. Disease severity assessment via colonoscopy videos is time-consuming and only focuses on the most severe lesions. Automated detection methods enable fine-grained assessment but depend on the training set quality. To suit the local clinical setup, an internal training dataset containing only rough bounding box annotations around lesions was utilized. Following previous works, we propose to use linear models in suitable color spaces to detect lesions. We introduce an efficient sampling scheme for exploring the set of linear classifiers and removing trivial models, i.e., those showing zero false negative or false positive ratios. Bounding boxes lead to exaggerated false detection ratios due to mislabeled pixels, especially in the corners, resulting in decreased model accuracy. Therefore, we propose to evaluate the model sensitivity on the annotation level instead of the pixel level. Our sampling strategy can eliminate up to 25% of trivial models. Despite the limited quality of annotations, the detectors achieved better performance in comparison with the state-of-the-art methods. When tested on a small subset of endoscopic images, the best models exhibit low variability. However, the inter-patient model performance was variable, suggesting that appearance normalization is critical in this context. Full article
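A small sketch of the idea of sampling linear pixel classifiers in a color space and discarding trivial ones (those with a zero false negative or false positive ratio); the random sampling scheme and the color features used here are assumptions, not the paper's method.

```python
# Sketch of sampling linear pixel classifiers in a color space and discarding
# trivial ones (zero false-negative or false-positive ratio). Random direction
# sampling and the toy color features are assumptions, not the paper's scheme.
import numpy as np

def sample_linear_models(pixels, labels, n_models=1000, rng=None):
    """pixels: (N, 3) color features; labels: (N,) booleans (lesion / not lesion)."""
    rng = np.random.default_rng(rng)
    kept = []
    for _ in range(n_models):
        w = rng.normal(size=3)                    # random direction in color space
        proj = pixels @ w
        thr = rng.uniform(proj.min(), proj.max())
        pred = proj > thr
        fnr = np.mean(~pred[labels]) if labels.any() else 0.0
        fpr = np.mean(pred[~labels]) if (~labels).any() else 0.0
        if fnr > 0 and fpr > 0:                   # drop trivial models
            kept.append((w, thr, fnr, fpr))
    return kept

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pix = rng.random((5000, 3))
    lab = pix[:, 0] > 0.7                         # toy "redness" ground truth
    models = sample_linear_models(pix, lab, rng=0)
    print(f"kept {len(models)} of 1000 sampled models")
```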

19 pages, 5744 KiB  
Article
Single-Temporal Sentinel-2 for Analyzing Burned Area Detection Methods: A Study of 14 Cases in Republic of Korea Considering Land Cover
by Doi Lee, Sanghun Son, Jaegu Bae, Soryeon Park, Jeongmin Seo, Dongju Seo, Yangwon Lee and Jinsoo Kim
Remote Sens. 2024, 16(5), 884; https://doi.org/10.3390/rs16050884 - 2 Mar 2024
Cited by 7 | Viewed by 2705
Abstract
Forest fires are caused by various climatic and anthropogenic factors. In Republic of Korea, forest fires occur frequently during spring when the humidity is low. During the past decade, the number of forest fire incidents and the extent of the damaged area have increased. Satellite imagery can be applied to assess damage from these unpredictable forest fires. Despite the increasing threat, there is a lack of comprehensive analysis and effective strategies for addressing these forest fires, particularly considering the diverse topography of Republic of Korea. Herein, we present an approach for the automated detection of forest fire damage using Sentinel-2 images of 14 areas affected by forest fires in Republic of Korea during 2019–2023. The detection performance of deep learning (DL), machine learning, and spectral index methods was analyzed, and the optimal model for detecting forest fire damage was derived. To evaluate the independent performance of the models, two different burned areas exhibiting distinct characteristics were selected as test subjects. To increase the classification accuracy, tests were conducted on various combinations of input channels in DL. The combination of false-color RNG (B4, B8, and B3) images was optimal for detecting forest fire damage. Consequently, among the DL models, the HRNet model achieved excellent results for both test regions with intersection over union scores of 89.40 and 82.49, confirming that the proposed method is applicable for detecting forest fires in diverse Korean landscapes. Thus, suitable mitigation measures can be promptly designed based on the rapid analysis of damaged areas. Full article
(This article belongs to the Special Issue AI-Driven Satellite Data for Global Environment Monitoring)
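The intersection over union scores reported above follow the standard definition for a predicted burned-area mask against a reference mask; the sketch below shows that definition only for orientation, with the percentage scaling assumed. The false-color RNG input itself is simply a band stack (B4, B8, B3), analogous to the composite sketch shown earlier in this listing.

```python
# Standard intersection over union (IoU) between a predicted burned-area mask and
# a reference mask, as used to score the detection models above.
import numpy as np

def iou(pred, ref):
    """pred, ref: boolean masks of the same shape."""
    inter = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    return 100.0 * inter / union if union else 100.0   # reported as a percentage

if __name__ == "__main__":
    ref = np.zeros((100, 100), bool); ref[20:60, 20:60] = True
    pred = np.zeros((100, 100), bool); pred[25:65, 25:65] = True
    print(f"IoU = {iou(pred, ref):.2f}")   # ~62.03
```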
