Mathematics

Research

Jump to: Review

22 pages, 26214 KiB

Open AccessArticle

SwinInsSeg: An Improved SOLOv2 Model Using the Swin Transformer and a Multi-Kernel Attention Module for Ship Instance Segmentation

by Rabi Sharma, Muhammad Saqib, Chin-Teng Lin and Michael Blumenstein

Mathematics 2025, 13(1), 165; https://doi.org/10.3390/math13010165 - 5 Jan 2025

Viewed by 1104

Abstract

Maritime surveillance is essential for ensuring security in the complex marine environment. The study presents SwinInsSeg, an instance segmentation model that combines the Swin transformer and a lightweight MKA module to segment ships accurately and efficiently in maritime surveillance. Current models have limitations [...] Read more.

Maritime surveillance is essential for ensuring security in the complex marine environment. The study presents SwinInsSeg, an instance segmentation model that combines the Swin transformer and a lightweight MKA module to segment ships accurately and efficiently in maritime surveillance. Current models have limitations in segmenting multiscale ships and achieving accurate segmentation boundaries. SwinInsSeg addresses these limitations by identifying ships of various sizes and capturing finer details, including both small and large ships, through the MKA module, which emphasizes important information at different processing stages. Performance evaluations on the MariBoats and ShipInsSeg datasets show that SwinInsSeg outperforms YOLACT, SOLO, and SOLOv2, achieving mask average precision scores of 50.6% and 52.0%, respectively. These results demonstrate SwinInsSeg’s superior capability in segmenting ship instances with improved accuracy. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

27 pages, 12355 KiB

Open AccessArticle

Low-Light Image Enhancement Using CycleGAN-Based Near-Infrared Image Generation and Fusion

by Min-Han Lee, Young-Ho Go, Seung-Hwan Lee and Sung-Hak Lee

Mathematics 2024, 12(24), 4028; https://doi.org/10.3390/math12244028 - 22 Dec 2024

Viewed by 1256

Abstract

Image visibility is often degraded under challenging conditions such as low light, backlighting, and inadequate contrast. To mitigate these issues, techniques like histogram equalization, high dynamic range (HDR) tone mapping and near-infrared (NIR)–visible image fusion are widely employed. However, these methods have inherent [...] Read more.

Image visibility is often degraded under challenging conditions such as low light, backlighting, and inadequate contrast. To mitigate these issues, techniques like histogram equalization, high dynamic range (HDR) tone mapping and near-infrared (NIR)–visible image fusion are widely employed. However, these methods have inherent drawbacks: histogram equalization frequently causes oversaturation and detail loss, while visible–NIR fusion requires complex and error-prone images. The proposed algorithm of a complementary cycle-consistent generative adversarial network (CycleGAN)-based training with visible and NIR images, leverages CycleGAN to generate fake NIR images by blending the characteristics of visible and NIR images. This approach presents tone compression and preserves fine details, effectively addressing the limitations of traditional methods. Experimental results demonstrate that the proposed method outperforms conventional algorithms, delivering superior quality and detail retention. This advancement holds substantial promise for applications where dependable image visibility is critical, such as autonomous driving and CCTV (Closed-Circuit Television) surveillance systems. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

17 pages, 396 KiB

Open AccessArticle

Robust Classification via Finite Mixtures of Matrix Variate Skew-t Distributions

by Abbas Mahdavi, Narayanaswamy Balakrishnan and Ahad Jamalizadeh

Mathematics 2024, 12(20), 3260; https://doi.org/10.3390/math12203260 - 17 Oct 2024

Viewed by 995

Abstract

Analysis of matrix variate data is becoming increasingly common in the literature, particularly in the field of clustering and classification. It is well known that real data, including real matrix variate data, often exhibit high levels of asymmetry. To address this issue, one [...] Read more.

Analysis of matrix variate data is becoming increasingly common in the literature, particularly in the field of clustering and classification. It is well known that real data, including real matrix variate data, often exhibit high levels of asymmetry. To address this issue, one common approach is to introduce a tail or skewness parameter to a symmetric distribution. In this regard, we introduce here a new distribution called the matrix variate skew-t distribution (MVST), which provides flexibility, in terms of heavy tail and skewness. We then conduct a thorough investigation of various characterizations and probabilistic properties of the MVST distribution. We also explore extensions of this distribution to a finite mixture model. To estimate the parameters of the MVST distribution, we develop an EM-type algorithm that computes maximum likelihood (ML) estimates of the model parameters. To validate the effectiveness and usefulness of the developed models and associated methods, we performed empirical experiments, using simulated data as well as three real data examples, including an application in skin cancer detection. Our results demonstrate the efficacy of the developed approach in handling asymmetric matrix variate data. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

24 pages, 6380 KiB

Open AccessArticle

Multi-Type Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec

by Woowoen Gwun, Kiho Choi and Gwang Hoon Park

Mathematics 2024, 12(18), 2874; https://doi.org/10.3390/math12182874 - 15 Sep 2024

Viewed by 1297

Abstract

Over the past few years, there has been substantial interest and research activity surrounding the application of Convolutional Neural Networks (CNNs) for post-filtering in video coding. Most current research efforts have focused on using CNNs with various kernel sizes for post-filtering, primarily concentrating [...] Read more.

Over the past few years, there has been substantial interest and research activity surrounding the application of Convolutional Neural Networks (CNNs) for post-filtering in video coding. Most current research efforts have focused on using CNNs with various kernel sizes for post-filtering, primarily concentrating on High-Efficiency Video Coding/H.265 (HEVC) and Versatile Video Coding/H.266 (VVC). This narrow focus has limited the exploration and application of these techniques to other video coding standards such as AV1, developed by the Alliance for Open Media, which offers excellent compression efficiency, reducing bandwidth usage and improving video quality, making it highly attractive for modern streaming and media applications. This paper introduces a novel approach that extends beyond traditional CNN methods by integrating three different self-attention layers into the CNN framework. Applied to the AV1 codec, the proposed method significantly improves video quality by incorporating these distinct self-attention layers. This enhancement demonstrates the potential of self-attention mechanisms to revolutionize post-filtering techniques in video coding beyond the limitations of convolution-based methods. The experimental results show that the proposed network achieves an average BD-rate reduction of 10.40% for the Luma component and 19.22% and 16.52% for the Chroma components compared to the AV1 anchor. Visual quality assessments further validated the effectiveness of our approach, showcasing substantial artifact reduction and detail enhancement in videos. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

12 pages, 4607 KiB

Open AccessArticle

Depth Prior-Guided 3D Voxel Feature Fusion for 3D Semantic Estimation from Monocular Videos

by Mingyun Wen and Kyungeun Cho

Mathematics 2024, 12(13), 2114; https://doi.org/10.3390/math12132114 - 5 Jul 2024

Viewed by 949

Abstract

Existing 3D semantic scene reconstruction methods utilize the same set of features extracted from deep learning networks for both 3D semantic estimation and geometry reconstruction, ignoring the differing requirements of semantic segmentation and geometry construction tasks. Additionally, current methods allocate 2D image features [...] Read more.

Existing 3D semantic scene reconstruction methods utilize the same set of features extracted from deep learning networks for both 3D semantic estimation and geometry reconstruction, ignoring the differing requirements of semantic segmentation and geometry construction tasks. Additionally, current methods allocate 2D image features to all voxels along camera rays during the back-projection process, without accounting for empty or occluded voxels. To address these issues, we propose separating the features for 3D semantic estimation from those for 3D mesh reconstruction. We use a pretrained vision transformer network for image feature extraction and depth priors estimated by a pretrained multi-view stereo-network to guide the allocation of image features within 3D voxels during the back-projection process. The back-projected image features are aggregated within each 3D voxel via averaging, creating coherent voxel features. The resulting 3D feature volume, composed of unified voxel feature vectors, is fed into a 3D CNN with a semantic classification head to produce a 3D semantic volume. This volume can be combined with existing 3D mesh reconstruction networks to produce a 3D semantic mesh. Experimental results on real-world datasets demonstrate that the proposed method significantly increases 3D semantic estimation accuracy. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

23 pages, 23372 KiB

Open AccessArticle

Retinex Jointed Multiscale CLAHE Model for HDR Image Tone Compression

by Yu-Joong Kim, Dong-Min Son and Sung-Hak Lee

Mathematics 2024, 12(10), 1541; https://doi.org/10.3390/math12101541 - 15 May 2024

Cited by 3 | Viewed by 1972

Abstract

Tone-mapping algorithms aim to compress a wide dynamic range image into a narrower dynamic range image suitable for display on imaging devices. A representative tone-mapping algorithm, Retinex theory, reflects color constancy based on the human visual system and performs dynamic range compression. However, [...] Read more.

Tone-mapping algorithms aim to compress a wide dynamic range image into a narrower dynamic range image suitable for display on imaging devices. A representative tone-mapping algorithm, Retinex theory, reflects color constancy based on the human visual system and performs dynamic range compression. However, it may induce halo artifacts in some areas or degrade chroma and detail. Thus, this paper proposes a Retinex jointed multiscale contrast limited adaptive histogram equalization method. The proposed algorithm reduces localized halo artifacts and detail loss while maintaining the tone-compression effect via high-scale Retinex processing. A performance comparison of the experimental results between the proposed and existing methods confirms that the proposed method effectively reduces the existing problems and displays better image quality. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

29 pages, 21511 KiB

Open AccessArticle

Enhancing Surveillance Vision with Multi-Layer Deep Learning Representation

by Dong-Min Son and Sung-Hak Lee

Mathematics 2024, 12(9), 1313; https://doi.org/10.3390/math12091313 - 25 Apr 2024

Viewed by 984

Abstract

This paper aimed to develop a method for generating sand–dust removal and dehazed images utilizing CycleGAN, facilitating object identification on roads under adverse weather conditions such as heavy dust or haze, which severely impair visibility. Initially, the study addressed the scarcity of paired [...] Read more.

This paper aimed to develop a method for generating sand–dust removal and dehazed images utilizing CycleGAN, facilitating object identification on roads under adverse weather conditions such as heavy dust or haze, which severely impair visibility. Initially, the study addressed the scarcity of paired image sets for training by employing unpaired CycleGAN training. The CycleGAN training module incorporates hierarchical single-scale Retinex (SSR) images with varying sigma sizes, facilitating multiple-scaled trainings. Refining the training data into detailed hierarchical layers for virtual paired training enhances the performance of CycleGAN training. Conventional sand–dust removal or dehazing algorithms, alongside deep learning methods, encounter challenges in simultaneously addressing sand–dust removal and dehazing with a singular algorithm. Such algorithms often necessitate resetting hyperparameters to process images from both scenarios. To overcome this limitation, we proposed a unified approach for removing sand–dust and haze phenomena using a single model, leveraging images processed hierarchically with SSR. The image quality and image sharpness metrics of the proposed method were BRIQUE, PIQE, CEIQ, MCMA, LPC-SI, and S3. In sand–dust environments, the proposed method achieved the highest scores, with an average of 21.52 in BRISQUE, 0.724 in MCMA, and 0.968 in LPC-SI compared to conventional methods. For haze images, the proposed method outperformed conventional methods with an average of 3.458 in CEIQ, 0.967 in LPC-SI, and 0.243 in S3. The images generated via this proposed method demonstrated superior performance in image quality and sharpness evaluation compared to conventional algorithms. The outcomes of this study hold particular relevance for camera images utilized in automobiles, especially in the context of self-driving cars or CCTV surveillance systems. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

22 pages, 3459 KiB

Open AccessArticle

MDER-Net: A Multi-Scale Detail-Enhanced Reverse Attention Network for Semantic Segmentation of Bladder Tumors in Cystoscopy Images

by Chao Nie, Chao Xu and Zhengping Li

Mathematics 2024, 12(9), 1281; https://doi.org/10.3390/math12091281 - 24 Apr 2024

Cited by 2 | Viewed by 1330

Abstract

White light cystoscopy is the gold standard for the diagnosis of bladder cancer. Automatic and accurate tumor detection is essential to improve the surgical resection of bladder cancer and reduce tumor recurrence. At present, Transformer-based medical image segmentation algorithms face challenges in restoring [...] Read more.

White light cystoscopy is the gold standard for the diagnosis of bladder cancer. Automatic and accurate tumor detection is essential to improve the surgical resection of bladder cancer and reduce tumor recurrence. At present, Transformer-based medical image segmentation algorithms face challenges in restoring fine-grained detail information and local boundary information of features and have limited adaptability to multi-scale features of lesions. To address these issues, we propose a new multi-scale detail-enhanced reverse attention network, MDER-Net, for accurate and robust bladder tumor segmentation. Firstly, we propose a new multi-scale efficient channel attention module (MECA) to process four different levels of features extracted by the PVT v2 encoder to adapt to the multi-scale changes in bladder tumors; secondly, we use the dense aggregation module (DA) to aggregate multi-scale advanced semantic feature information; then, the similarity aggregation module (SAM) is used to fuse multi-scale high-level and low-level features, complementing each other in position and detail information; finally, we propose a new detail-enhanced reverse attention module (DERA) to capture non-salient boundary features and gradually explore supplementing tumor boundary feature information and fine-grained detail information; in addition, we propose a new efficient channel space attention module (ECSA) that enhances local context and improves segmentation performance by suppressing redundant information in low-level features. Extensive experiments on the bladder tumor dataset BtAMU, established in this article, and five publicly available polyp datasets show that MDER-Net outperforms eight state-of-the-art (SOTA) methods in terms of effectiveness, robustness, and generalization ability. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

10 pages, 318 KiB

Open AccessArticle

Tiling Rectangles and the Plane Using Squares of Integral Sides

by Bahram Sadeghi Bigham, Mansoor Davoodi Monfared, Samaneh Mazaheri and Jalal Kheyrabadi

Mathematics 2024, 12(7), 1027; https://doi.org/10.3390/math12071027 - 29 Mar 2024

Viewed by 1158

Abstract

We study the problem of perfect tiling in the plane and explore the possibility of tiling a rectangle using integral distinct squares. Assume a set of distinguishable squares (or equivalently a set of distinct natural numbers) is given, and one has to decide [...] Read more.

We study the problem of perfect tiling in the plane and explore the possibility of tiling a rectangle using integral distinct squares. Assume a set of distinguishable squares (or equivalently a set of distinct natural numbers) is given, and one has to decide whether it can tile the plane or a rectangle or not. Previously, it has been proved that tiling the plane is not feasible using a set of odd numbers or an infinite sequence of natural numbers including exactly two odd numbers. The problem is open for different situations in which the number of odd numbers is arbitrary. In addition to providing a solution to this special case, we discuss some open problems to tile the plane and rectangles in this paper. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

27 pages, 5593 KiB

Open AccessArticle

Ear-Touch-Based Mobile User Authentication

by Jalil Nourmohammadi Khiarak, Samaneh Mazaheri and Rohollah Moosavi Tayebi

Mathematics 2024, 12(5), 752; https://doi.org/10.3390/math12050752 - 2 Mar 2024

Viewed by 1690

Abstract

Mobile devices have become integral to daily life, necessitating robust user authentication methods to safeguard personal information. In this study, we present a new approach to mobile user authentication utilizing ear-touch interactions. Our novel system employs an analytical algorithm to authenticate users based [...] Read more.

Mobile devices have become integral to daily life, necessitating robust user authentication methods to safeguard personal information. In this study, we present a new approach to mobile user authentication utilizing ear-touch interactions. Our novel system employs an analytical algorithm to authenticate users based on features extracted from ear-touch images. We conducted extensive evaluations on a dataset comprising ear-touch images from 92 subjects, achieving an average equal error rate of 0.04, indicative of high accuracy and reliability. Our results suggest that ear-touch-based authentication is a feasible and effective method for securing mobile devices. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

19 pages, 9187 KiB

Open AccessArticle

Light “You Only Look Once”: An Improved Lightweight Vehicle-Detection Model for Intelligent Vehicles under Dark Conditions

by Tianrui Yin, Wei Chen, Bo Liu, Changzhen Li and Luyao Du

Mathematics 2024, 12(1), 124; https://doi.org/10.3390/math12010124 - 29 Dec 2023

Cited by 4 | Viewed by 1955

Abstract

Vehicle detection is crucial for traffic surveillance and assisted driving. To overcome the loss of efficiency, accuracy, and stability in low-light conditions, we propose a lightweight “You Only Look Once” (YOLO) detection model. A polarized self-attention-enhanced aggregation feature pyramid network is used to [...] Read more.

Vehicle detection is crucial for traffic surveillance and assisted driving. To overcome the loss of efficiency, accuracy, and stability in low-light conditions, we propose a lightweight “You Only Look Once” (YOLO) detection model. A polarized self-attention-enhanced aggregation feature pyramid network is used to improve feature extraction and fusion in low-light scenarios, and enhanced “Swift” spatial pyramid pooling is used to reduce model parameters and enhance real-time nighttime detection. To address imbalanced low-light samples, we integrate an anchor mechanism with a focal loss to improve network stability and accuracy. Ablation experiments show the superior accuracy and real-time performance of our Light-YOLO model. Compared with EfficientNetv2-YOLOv5, Light-YOLO boosts mAP@0.5 and mAP@0.5:0.95 by 4.03 and 2.36%, respectively, cuts parameters by 44.37%, and increases recognition speed by 20.42%. Light-YOLO competes effectively with advanced lightweight networks and offers a solution for efficient nighttime vehicle-detection. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

16 pages, 7068 KiB

Open AccessArticle

Enhancing Focus Volume through Perceptual Focus Factor in Shape-from-Focus

by Khurram Ashfaq and Muhammad Tariq Mahmood

Mathematics 2024, 12(1), 102; https://doi.org/10.3390/math12010102 - 27 Dec 2023

Cited by 1 | Viewed by 1351

Abstract

Shape From Focus (SFF) reconstructs a scene’s shape using a series of images with varied focus settings. However, the effectiveness of SFF largely depends on the Focus Measure (FM) used, which is prone to noise-induced inaccuracies in focus values. To address these issues, [...] Read more.

Shape From Focus (SFF) reconstructs a scene’s shape using a series of images with varied focus settings. However, the effectiveness of SFF largely depends on the Focus Measure (FM) used, which is prone to noise-induced inaccuracies in focus values. To address these issues, we introduce a perception-influenced factor to refine the traditional Focus Volume (FV) derived from a traditional FM. Owing to the strong relationship between the Difference of Gaussians (DoG) and how the visual system perceives edges in a scene, we apply it to local areas of the image sequence by segmenting the image sequence into non-overlapping blocks. This process yields a new metric, the Perceptual Focus Factor (PFF), which we combine with the traditional FV to obtain an enhanced FV and, ultimately, an enhanced depth map. Intensive experiments are conducted by using fourteen synthetic and six real-world data sets. The performance of the proposed method is evaluated using quantitative measures, such as Root Mean Square Error (RMSE) and correlation. For fourteen synthetic data sets, the average RMSE measure of 6.88 and correction measure of 0.65 are obtained, which are improved through PFF from an RMSE of 7.44 and correlation of 0.56, respectively. Experimental results and comparative analysis demonstrate that the proposed approach outperforms the traditional state-of-the-art FMs in extracting depth maps. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

34 pages, 9382 KiB

Open AccessArticle

LRFID-Net: A Local-Region-Based Fake-Iris Detection Network for Fake Iris Images Synthesized by a Generative Adversarial Network

by Jung Soo Kim, Young Won Lee, Jin Seong Hong, Seung Gu Kim, Ganbayar Batchuluun and Kang Ryoung Park

Mathematics 2023, 11(19), 4160; https://doi.org/10.3390/math11194160 - 3 Oct 2023

Cited by 1 | Viewed by 2125

Abstract

Iris recognition is a biometric method using the pattern of the iris seated between the pupil and the sclera for recognizing people. It is widely applied in various fields owing to its high accuracy in recognition and high security. A spoof detection method [...] Read more.

Iris recognition is a biometric method using the pattern of the iris seated between the pupil and the sclera for recognizing people. It is widely applied in various fields owing to its high accuracy in recognition and high security. A spoof detection method for discriminating a spoof attack is essential in biometric recognition systems that include iris recognition. However, previous studies have mainly investigated spoofing attack detection methods based on printed or photographed images, video replaying, artificial eyes, and patterned contact lenses fabricated using iris images from information leakage. On the other hand, there have only been a few studies on spoof attack detection using iris images generated through a generative adversarial network (GAN), which is a method that has drawn considerable research interest with the recent development of deep learning, and the enhancement of spoof detection accuracy by the methods proposed in previous research is limited. To address this problem, the possibility of an attack on a conventional iris recognition system with spoofed iris images generated using cycle-consistent adversarial networks (CycleGAN), which was the motivation of this study, was investigated. In addition, a local region-based fake-iris detection network (LRFID-Net) was developed. It provides a novel method for discriminating fake iris images by segmenting the iris image into three regions based on the iris region. Experimental results using two open databases, the Warsaw LiveDet-Iris-2017 and the Notre Dame Contact Lens Detection LiveDet-Iris-2017 datasets, showed that the average classification error rate of spoof detection by the proposed method was 0.03% for the Warsaw dataset and 0.11% for the Notre Dame Contact Lens Detection dataset. The results confirmed that the proposed method outperformed the state-of-the-art methods. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

21 pages, 6458 KiB

Open AccessArticle

CAGNet: A Multi-Scale Convolutional Attention Method for Glass Detection Based on Transformer

by Xiaohang Hu, Rui Gao, Seungjun Yang and Kyungeun Cho

Mathematics 2023, 11(19), 4084; https://doi.org/10.3390/math11194084 - 26 Sep 2023

Cited by 5 | Viewed by 1429

Abstract

Glass plays a vital role in several fields, making its accurate detection crucial. Proper detection prevents misjudgments, reduces noise from reflections, and ensures optimal performance in other computer vision tasks. However, the prevalent usage of glass in daily applications poses unique challenges for [...] Read more.

Glass plays a vital role in several fields, making its accurate detection crucial. Proper detection prevents misjudgments, reduces noise from reflections, and ensures optimal performance in other computer vision tasks. However, the prevalent usage of glass in daily applications poses unique challenges for computer vision. This study introduces a novel convolutional attention glass segmentation network (CAGNet) predicated on a transformer architecture customized for image glass detection. Based on the foundation of our prior study, CAGNet minimizes the number of training cycles and iterations, resulting in enhanced performance and efficiency. CAGNet is built upon the strategic design and integration of two types of convolutional attention mechanisms coupled with a transformer head applied for comprehensive feature analysis and fusion. To further augment segmentation precision, the network incorporates a custom edge-weighting scheme to optimize glass detection within images. Comparative studies and rigorous testing demonstrate that CAGNet outperforms several leading methodologies in glass detection, exhibiting robustness across a diverse range of conditions. Specifically, the IOU metric improves by 0.26% compared to that in our previous study and presents a 0.92% enhancement over those of other state-of-the-art methods. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

21 pages, 44259 KiB

Open AccessArticle

Cervical Precancerous Lesion Image Enhancement Based on Retinex and Histogram Equalization

by Yuan Ren, Zhengping Li and Chao Xu

Mathematics 2023, 11(17), 3689; https://doi.org/10.3390/math11173689 - 28 Aug 2023

Cited by 1 | Viewed by 2858

Abstract

Cervical cancer is a prevalent chronic malignant tumor in gynecology, necessitating high-quality images of cervical precancerous lesions to enhance detection rates. Addressing the challenges of low contrast, uneven illumination, and indistinct lesion details in such images, this paper proposes an enhancement algorithm based [...] Read more.

Cervical cancer is a prevalent chronic malignant tumor in gynecology, necessitating high-quality images of cervical precancerous lesions to enhance detection rates. Addressing the challenges of low contrast, uneven illumination, and indistinct lesion details in such images, this paper proposes an enhancement algorithm based on retinex and histogram equalization. First, the algorithm solves the color deviation problem by modifying the quantization formula of retinex theory. Then, the contrast-limited adaptive histogram equalization algorithm is selectively conducted on blue and green channels to avoid the problem of image visual quality reduction caused by drastic darkening of local dark areas. Next, a multi-scale detail enhancement algorithm is used to further sharpen the details. Finally, the problem of noise amplification and image distortion in the process of enhancement is alleviated by dynamic weighted fusion. The experimental results confirm the effectiveness of the proposed algorithm in optimizing brightness, enhancing contrast, sharpening details, and suppressing noise in cervical precancerous lesion images. The proposed algorithm has shown superior performance compared to other traditional methods based on objective indicators such as peak signal-to-noise ratio, detail-variance–background-variance, gray square mean deviation, contrast improvement index, and enhancement quality index. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

12 pages, 20257 KiB

Open AccessArticle

Directional Ring Difference Filter for Robust Shape-from-Focus

by Khurram Ashfaq and Muhammad Tariq Mahmood

Mathematics 2023, 11(14), 3056; https://doi.org/10.3390/math11143056 - 11 Jul 2023

Cited by 2 | Viewed by 1609

Abstract

In the shape-from-focus (SFF) method, the quality of the 3D shape generated relies heavily on the focus measure operator (FM) used. Unfortunately, most FMs are sensitive to noise and provide inaccurate depth maps. Among recent FMs, the ring difference filter (RDF) has demonstrated [...] Read more.

In the shape-from-focus (SFF) method, the quality of the 3D shape generated relies heavily on the focus measure operator (FM) used. Unfortunately, most FMs are sensitive to noise and provide inaccurate depth maps. Among recent FMs, the ring difference filter (RDF) has demonstrated excellent robustness against noise and reasonable performance in computing accurate depth maps. However, it also suffers from the response cancellation problem (RCP) encountered in multidimensional kernel-based FMs. To address this issue, we propose an effective and robust FM called the directional ring difference filter (DRDF). In DRDF, the focus quality is computed by aggregating responses of RDF from multiple kernels in different directions. We conducted experiments using synthetic and real image datasets and found that the proposed DRDF method outperforms traditional FMs in terms of noise handling and producing a higher quality 3D shape estimate of the object. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

15 pages, 1329 KiB

Open AccessArticle

Optimal Robot Pose Estimation Using Scan Matching by Turning Function

by Bahram Sadeghi Bigham, Omid Abbaszadeh, Mazyar Zahedi-Seresht, Shahrzad Khosravi and Elham Zarezadeh

Mathematics 2023, 11(6), 1449; https://doi.org/10.3390/math11061449 - 16 Mar 2023

Viewed by 2011

Abstract

The turning function is a tool in image processing that measures the difference between two polygonal shapes. We propose a localization algorithm for the optimal pose estimation of autonomous mobile robots using the scan-matching method based on the turning function algorithm. There are [...] Read more.

The turning function is a tool in image processing that measures the difference between two polygonal shapes. We propose a localization algorithm for the optimal pose estimation of autonomous mobile robots using the scan-matching method based on the turning function algorithm. There are several methodologies aimed at moving the robots in the right way and carrying out their missions well, which involves the integration of localization and control. In the proposed method, the localization problem is implemented in the form of an optimization problem. Afterwards, the turning function algorithm and the simplex method are applied to estimate the localization and orientation of the robot. The proposed algorithm first receives the polygons extracted from two sensors’ data and then allocates a histogram to each sensor scan. This algorithm attempts to maximize the similarity of the two histograms by converting them to a unified coordinate system. In this way, the estimate of the difference between the two situations is calculated. In more detail, the main objective of this study is to provide an algorithm aimed at reducing errors in the localization and orientation of mobile robots. The simulation results indicate the great performance of this algorithm. Experimental results on simulated and real datasets show that the proposed algorithms achieve better results in terms of both position and orientation metrics. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

Review

Jump to: Research

26 pages, 5886 KiB

Open AccessReview

Progress in Blind Image Quality Assessment: A Brief Review

by Pei Yang, Jordan Sturtz and Letu Qingge

Mathematics 2023, 11(12), 2766; https://doi.org/10.3390/math11122766 - 19 Jun 2023

Cited by 8 | Viewed by 2884

Abstract

As a fundamental research problem, blind image quality assessment (BIQA) has attracted increasing interest in recent years. Although great progress has been made, BIQA still remains a challenge. To better understand the research progress and challenges in this field, we review BIQA methods [...] Read more.

As a fundamental research problem, blind image quality assessment (BIQA) has attracted increasing interest in recent years. Although great progress has been made, BIQA still remains a challenge. To better understand the research progress and challenges in this field, we review BIQA methods in this paper. First, we introduce the BIQA problem definition and related methods. Second, we provide a detailed review of the existing BIQA methods in terms of representative hand-crafted features, learning-based features and quality regressors for two-stage methods, as well as one-stage DNN models with various architectures. Moreover, we also present and analyze the performance of competing BIQA methods on six public IQA datasets. Finally, we conclude our paper with possible future research directions based on a performance analysis of the BIQA methods. This review will provide valuable references for researchers interested in the BIQA problem. Full article

(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

► Show Figures

Figure 1

Journal Menu

Journal Browser

New Advances and Applications in Image Processing and Computer Vision

Share This Special Issue

Special Issue Editor

Special Issue Information

Keywords

Benefits of Publishing in a Special Issue

Published Papers (18 papers)

Research

Review

Further Information

Guidelines

MDPI Initiatives

Follow MDPI