Search Results (7)

Search Parameters:
Keywords = multi-view inpainting

22 pages, 8021 KB  
Article
Multi-Task Semi-Supervised Approach for Counting Cones in Adaptive Optics Images
by Vidya Bommanapally, Amir Akhavanrezayat, Parvathi Chundi, Quan Dong Nguyen and Mahadevan Subramaniam
Algorithms 2025, 18(9), 552; https://doi.org/10.3390/a18090552 - 2 Sep 2025
Viewed by 478
Abstract
Counting and density estimation of cone cells using adaptive optics (AO) imaging play an important role in the clinical management of retinal diseases. This paper describes a novel deep learning approach to cone counting that requires minimal manual labeling of cone cells in AO images. We propose a hybrid multi-task semi-supervised learning (MTSSL) framework that trains simultaneously on unlabeled and labeled data. On the unlabeled images, the model learns structural and relational features through two self-supervised pretext tasks: image inpainting (IP) and learning-to-rank (L2R). At the same time, it leverages a small set of labeled examples to supervise a density estimation head for cone counting. By jointly minimizing the image reconstruction loss, the ranking loss, and the supervised density-map loss, the approach harnesses the rich information in unlabeled data to learn feature representations while ground-truth annotations directly guide accurate density prediction and counting. Experiments were conducted on a dataset of wide-field AO images of 120 subjects captured with an rtx1 retinal camera. MTSSL draws strength from combining generative and predictive pretext tasks, which together capture the global and local context required for counting cones. The results show that the proposed MTSSL approach significantly outperforms the individual self-supervised pipelines, improving the RMSE for cone counting by a factor of two.
(This article belongs to the Special Issue Advanced Machine Learning Algorithms for Image Processing)
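
A minimal sketch of how the three-term objective described above might be combined, assuming an L1 loss for the inpainting pretext, a margin ranking loss for L2R (a common counting pretext: a crop must contain at least as many cones as its own sub-crop), and MSE on density maps. The function name, weights, and inputs are hypothetical, not the authors' code:

```python
import torch
import torch.nn.functional as F

def mtssl_loss(recon, target_img, count_big, count_sub,
               density_pred, density_gt, w_ip=1.0, w_l2r=1.0, w_sup=1.0):
    """Hypothetical joint MTSSL objective: inpainting + L2R + supervision."""
    # Generative pretext: reconstruct masked regions of unlabeled images.
    loss_ip = F.l1_loss(recon, target_img)
    # Predictive pretext: a crop should never count fewer cones than
    # its own sub-crop, so the ranking signal needs no manual labels.
    loss_l2r = F.margin_ranking_loss(
        count_big, count_sub, target=torch.ones_like(count_big))
    # Supervised head: density-map regression on the few labeled images.
    loss_sup = F.mse_loss(density_pred, density_gt)
    return w_ip * loss_ip + w_l2r * loss_l2r + w_sup * loss_sup
```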

16 pages, 6068 KB  
Article
MD-GAN: Multi-Scale Diversity GAN for Large Masks Inpainting
by Shibin Wang, Xuening Guo and Wenjie Guo
Electronics 2025, 14(11), 2218; https://doi.org/10.3390/electronics14112218 - 29 May 2025
Viewed by 483
Abstract
Image inpainting has recently made considerable progress with the assistance of generative adversarial networks (GANs). However, current inpainting methods handle large masks poorly and often produce implausible structure. We find that the main reason is the lack of an effective receptive field in the inpainting network. To alleviate this issue, we propose MD-GAN, a new two-stage, multi-scale diverse GAN. We inject dense combinations of dilated convolutions at multiple scales of the inpainting network to enlarge the effective receptive field. Because the completion of a large mask is generally not uniquely determined, we also propose the multi-scale probabilistic diverse module, which generates diverse content through spatially-adaptive normalization. In addition, the convolutional block attention module is introduced to improve the extraction of complex features, and a perceptual diversity loss further encourages varied outputs. Extensive experiments on the CelebA-HQ, Places2, and Paris Street View benchmarks demonstrate that our approach effectively inpaints diverse and structurally plausible images.
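
The core idea, enlarging the effective receptive field with dense combinations of dilated convolutions, can be sketched as follows; channel widths, dilation rates, and the fusion layer are assumptions rather than the paper's exact design:

```python
import torch
import torch.nn as nn

class DenseDilatedBlock(nn.Module):
    """Dense combination of dilated convolutions: each branch sees the block
    input plus all previous branch outputs, so receptive fields compound
    without losing local detail. A sketch, not the published architecture."""
    def __init__(self, channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList()
        in_ch = channels
        for d in dilations:
            self.branches.append(nn.Sequential(
                nn.Conv2d(in_ch, channels, 3, padding=d, dilation=d),
                nn.ReLU(inplace=True)))
            in_ch += channels  # dense connectivity widens each branch input
        self.fuse = nn.Conv2d(in_ch, channels, 1)  # project back down

    def forward(self, x):
        feats = [x]
        for branch in self.branches:
            feats.append(branch(torch.cat(feats, dim=1)))
        return self.fuse(torch.cat(feats, dim=1))
```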

17 pages, 4291 KB  
Article
Structure-Guided Image Inpainting Based on Multi-Scale Attention Pyramid Network
by Jun Gong, Senlin Luo, Wenxin Yu and Liang Nie
Appl. Sci. 2024, 14(18), 8325; https://doi.org/10.3390/app14188325 - 15 Sep 2024
Viewed by 1737
Abstract
Current single-view image inpainting methods often suffer from low image information utilization and suboptimal repair outcomes. To address these challenges, this paper introduces a novel image inpainting framework that leverages a structure-guided multi-scale attention pyramid network. This network consists of a structural repair network and a multi-scale attention pyramid semantic repair network. The structural repair component utilizes a dual-branch U-Net for robust structure prediction under strong constraints, and the predicted structural view then serves as auxiliary information for the semantic repair network. The latter exploits the pyramid structure to extract multi-scale features of the image, which are further refined through an attention feature fusion module. Additionally, a separable gated convolution strategy is employed during feature extraction to minimize the impact of invalid information from missing areas, thereby enhancing restoration quality. Experiments on standard datasets such as Paris Street View and CelebA demonstrate the superiority of our approach over existing methods through quantitative and qualitative comparisons. Further ablation studies, which incrementally integrate the proposed mechanisms into a baseline model, substantiate the effectiveness of our multi-view restoration strategy, separable gated convolution, and multi-scale attention feature fusion.
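
A gated convolution computes a soft validity gate alongside the features, so activations originating in missing regions are down-weighted. A minimal sketch of a separable gated convolution, under the assumption that "separable" refers to depthwise separability; layer names are hypothetical:

```python
import torch
import torch.nn as nn

class SeparableGatedConv(nn.Module):
    """Depthwise-separable gated convolution (a sketch; the paper's exact
    layer layout is not given in the abstract)."""
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Depthwise + pointwise pair for the feature path...
        self.feature = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, kernel_size, padding=pad, groups=in_ch),
            nn.Conv2d(in_ch, out_ch, 1))
        # ...and an identical pair predicting a per-pixel soft gate.
        self.gate = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, kernel_size, padding=pad, groups=in_ch),
            nn.Conv2d(in_ch, out_ch, 1))

    def forward(self, x):
        # Gate near 0 suppresses features computed over missing pixels.
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))
```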

22 pages, 27609 KB  
Article
Three-Dimensional-Consistent Scene Inpainting via Uncertainty-Aware Neural Radiance Field
by Meng Wang, Qinkang Yu and Haipeng Liu 
Electronics 2024, 13(2), 448; https://doi.org/10.3390/electronics13020448 - 22 Jan 2024
Cited by 2 | Viewed by 2563
Abstract
Three-dimensional (3D) scene inpainting aims to remove objects from a scene and fill the resulting holes with visually plausible content. Building on NeRF (Neural Radiance Field), considerable advances have been made in 3D scene inpainting, yet two issues persist: inconsistent 3D details across viewpoints, and loss of real background detail occluded by the inpainted regions. This paper presents a NeRF-based inpainting approach that uses uncertainty estimation, formulating mask and uncertainty branches for consistency enhancement. During initial training, the mask branch learns a 3D-consistent representation from inaccurate input masks so that, after background rendering, the occluded background regions are fully exposed to the views. The uncertainty branch learns the visibility of spatial points by modeling them as Gaussian distributions, producing variances that identify the regions to be inpainted. During the inpainting training phase, the uncertainty branch measures 3D consistency across the inpainted views and converts the variance into confidence values that serve as dynamic weights, balancing the color and adversarial losses to achieve inpainting that is 3D-consistent in both structure and texture. The approach was evaluated on the Spin-NeRF and NeRF-Object-Removal datasets, where it outperformed the baselines on the LPIPS and FID inpainting metrics and preserved more spatial detail from real backgrounds in multi-scene settings, achieving 3D-consistent restoration.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images)
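
The weighting scheme, converting per-point variance into confidence that balances the color and adversarial terms, might look roughly like the following; the exact variance-to-confidence mapping is an assumption, and all names are hypothetical:

```python
import torch

def uncertainty_weighted_loss(rgb_pred, rgb_ref, variance, adv_loss):
    """Hypothetical confidence weighting: trust the reference colors where
    the uncertainty branch is confident, and let the adversarial term
    dominate where it is not."""
    confidence = 1.0 / (1.0 + variance)   # high when the variance is small
    # Confident regions are pulled toward the rendered reference colors...
    color_term = (confidence * (rgb_pred - rgb_ref) ** 2).mean()
    # ...while uncertain (to-be-inpainted) regions lean on the GAN prior.
    adv_weight = (1.0 - confidence).mean()
    return color_term + adv_weight * adv_loss
```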

17 pages, 4498 KB  
Article
Semantic Image Inpainting with Multi-Stage Feature Reasoning Generative Adversarial Network
by Guangyao Li, Liangfu Li, Yingdan Pu, Nan Wang and Xi Zhang
Sensors 2022, 22(8), 2854; https://doi.org/10.3390/s22082854 - 8 Apr 2022
Cited by 7 | Viewed by 3341
Abstract
Most existing image inpainting methods have achieved remarkable progress on small image defects, but repairing large missing regions with insufficient context remains an intractable problem. This paper proposes a Multi-stage Feature Reasoning Generative Adversarial Network that restores irregular holes gradually. Specifically, dynamic partial convolution adaptively adjusts the restoration proportion as inpainting progresses, strengthening the correlation between valid and invalid pixels. In the decoding phase, the feature statistics of masked areas differ from those of unmasked areas; a novel decoder is therefore designed that both dynamically assigns a scaling factor and bias on a per-feature-point basis via point-wise normalization and utilizes skip connections to counter the information loss between encoder and decoder layers. Moreover, to mitigate vanishing gradients and increase the number of reasoning steps, a hybrid weighted merging method consisting of a hard weight map and a soft weight map is proposed to ensemble the feature maps generated throughout the reconstruction process. Experiments on CelebA, Places2, and Paris StreetView show that the proposed model improves PSNR by 0.3 dB to 1.2 dB over competing methods.
(This article belongs to the Special Issue Image Processing and Pattern Recognition Based on Deep Learning)
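
A rough sketch of what ensembling stage-wise feature maps with a hard and a soft weight map could look like; the shapes and the combination rule are assumptions, since the abstract does not specify them:

```python
import torch

def hybrid_weighted_merge(feats, mask, soft_logits):
    """Hypothetical ensembling of stage-wise feature maps.

    feats:       list of N tensors, each (B, C, H, W), one per stage
    mask:        (B, 1, H, W) hard weight map, 1 = originally valid pixels
    soft_logits: (B, N, H, W) learned soft weights over the N stages
    """
    stacked = torch.stack(feats, dim=1)                      # (B, N, C, H, W)
    soft_w = torch.softmax(soft_logits, dim=1).unsqueeze(2)  # (B, N, 1, H, W)
    merged = (soft_w * stacked).sum(dim=1)                   # soft ensemble
    # Hard map: keep the first stage's features at known pixels and use
    # the learned soft ensemble inside the holes.
    return mask * feats[0] + (1.0 - mask) * merged
```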

22 pages, 29391 KB  
Article
Progressively Inpainting Images Based on a Forked-Then-Fused Decoder Network
by Shuai Yang, Rong Huang and Fang Han
Sensors 2021, 21(19), 6336; https://doi.org/10.3390/s21196336 - 22 Sep 2021
Cited by 2 | Viewed by 2764
Abstract
Image inpainting aims to fill corrupted regions with visually realistic and semantically plausible content. In this paper, we propose a progressive image inpainting method based on a forked-then-fused decoder network. A unit called PC-RN, the combination of partial convolution and region normalization, serves as the basic building block of the inpainting network: it extracts useful features from the valid surroundings while suppressing interference caused by incompleteness. The forked-then-fused decoder consists of a local reception branch, a long-range attention branch, and a squeeze-and-excitation-based fusing module. Two multi-scale contextual attention modules are deployed in the long-range attention branch to adaptively borrow features from distant spatial positions, and the progressive inpainting strategy lets these attention modules exploit the previously filled region, reducing the risk of misallocated attention. We conduct extensive experiments on three benchmark databases: Places2, Paris StreetView, and CelebA. Qualitative and quantitative results show that the proposed inpainting model is superior to state-of-the-art methods, and ablation studies reveal the contribution of each module to the inpainting task.
(This article belongs to the Section Intelligent Sensors)
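
The PC-RN unit pairs partial convolution (convolve only valid pixels and renormalize) with region normalization (separate statistics for masked and unmasked regions). A minimal PyTorch sketch of the two halves, with details assumed where the abstract is silent:

```python
import torch
import torch.nn.functional as F

def partial_conv(x, mask, conv):
    """Partial convolution: convolve only valid pixels, renormalize by the
    valid count in each window, and update the mask. `conv` should be an
    nn.Conv2d built with bias=False so the renormalization stays exact."""
    ones = torch.ones(1, x.size(1), *conv.kernel_size, device=x.device)
    valid = F.conv2d(mask.expand_as(x), ones,
                     padding=conv.padding, stride=conv.stride)
    out = conv(x * mask) * (ones.sum() / valid.clamp(min=1e-8))
    new_mask = (valid > 0).float()  # 1 wherever a window saw valid input
    return out * new_mask, new_mask

def region_norm(x, mask, eps=1e-5):
    """Region normalization: per-channel statistics computed separately for
    masked and unmasked regions, so hole statistics do not pollute the
    valid ones."""
    out = torch.zeros_like(x)
    for region in (mask, 1.0 - mask):
        n = region.sum(dim=(2, 3), keepdim=True).clamp(min=1.0)
        mean = (x * region).sum(dim=(2, 3), keepdim=True) / n
        var = ((x - mean) ** 2 * region).sum(dim=(2, 3), keepdim=True) / n
        out = out + region * (x - mean) / (var + eps).sqrt()
    return out

# One PC-RN step: y, m = partial_conv(x, m, conv); y = F.relu(region_norm(y, m))
```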

20 pages, 3432 KB  
Article
Multiple-Layer Visibility Propagation-Based Synthetic Aperture Imaging through Occlusion
by Tao Yang, Jing Li, Jingyi Yu, Yanning Zhang, Wenguang Ma, Xiaomin Tong, Rui Yu and Lingyan Ran
Sensors 2015, 15(8), 18965-18984; https://doi.org/10.3390/s150818965 - 4 Aug 2015
Cited by 5 | Viewed by 6087
Abstract
Heavy occlusion in cluttered scenes poses significant challenges to many computer vision applications. Recent light field imaging systems provide see-through capabilities via synthetic aperture imaging (SAI) to overcome the occlusion problem. Existing SAI methods, however, emulate focusing at a specific depth layer and cannot produce an all-in-focus see-through image, while alternative inpainting algorithms can generate visually plausible results but cannot guarantee their correctness. In this paper, we present a novel depth-free all-in-focus SAI technique based on light field visibility analysis. Specifically, we partition the scene into multiple visibility layers to deal directly with layer-wise occlusion, and apply an optimization framework that propagates visibility information between layers. On each layer, visibility and optimal focus depth estimation is formulated as a multi-label energy minimization problem whose layer-wise energy integrates the visibility masks from all previous layers, multi-view intensity consistency, and a depth smoothness constraint. We compare our method with state-of-the-art solutions, and extensive experimental results demonstrate its effectiveness and superiority.
(This article belongs to the Section Physical Sensors)
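
Classical synthetic aperture imaging, the baseline this paper extends, refocuses at a chosen depth by shifting each camera-array view according to its baseline and averaging; occluders off the focal plane land at different shifts in each view and blur out, which is the see-through effect. A minimal NumPy sketch under an assumed pinhole-array geometry (parameter names are hypothetical):

```python
import numpy as np

def synthetic_aperture_refocus(views, positions, depth, focal=1.0):
    """Shift-and-average refocusing at a single depth layer.

    views:     list of (H, W) grayscale images from a camera array
    positions: list of (dx, dy) camera offsets relative to the reference
    depth:     scene depth to focus at; disparity scales as focal / depth
    """
    acc = np.zeros_like(views[0], dtype=np.float64)
    for img, (dx, dy) in zip(views, positions):
        # Align this view to the reference at the chosen focal plane.
        shift_x = int(round(focal * dx / depth))
        shift_y = int(round(focal * dy / depth))
        acc += np.roll(img, (shift_y, shift_x), axis=(0, 1))
    return acc / len(views)
```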
