Application of Information Theory to Computer Vision and Image Processing II

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (25 February 2025) | Viewed by 20249

Special Issue Editors

Dr. Wendy Flores-Fuentes

Guest Editor

Facultad de Ingeniería, Universidad Autónoma de Baja California, Mexicali B.C. 21280, Mexico
Interests: fourth industrial revolution; artificial intelligence; cybersystems

Dr. Oleg Sergiyenko

Guest Editor

Instituto de Ingeniería, Universidad Autónoma de Baja California, Mexicali B.C. 21100, Mexico
Interests: automated metrology; 3D coordinate measurement; robotic navigation; machine vision; simulation of robotic swarm behavior

Prof. Dr. Julio Cesar Rodríguez-Quiñonez

Guest Editor

Facultad de Ingeniería, Universidad Autónoma de Baja California, Mexicali B.C. 21280, Mexico
Interests: machine vision; stereo vision; systems laser; scanner control; digital image processing

Dr. Jesús Elías Miranda-Vega

Guest Editor

Tecnológico Nacional de México, IT de Mexicali, Mexicali 21376, Mexico
Interests: machine vision; stereo vision; systems laser; scanner control; analogic and digital processing

Special Issue Information

Dear Colleagues,

Following the success of “Application of Information Theory to Computer Vision and Image Processing”, we are pleased to announce a new Special Issue, “Application of Information Theory to Computer Vision and Image Processing II”, which continues to collect relevant papers on related topics.

The application of information theory to computer vision and image processing has contributed significantly to the understanding and capabilities of computer science. Mathematical methods are applied to signal and image processing to quantify and extract accurate information with increasing efficiency. Information theory provides valuable tools and techniques for developing intelligent and adaptive machine vision systems, and for measuring and analyzing the amount of information contained in a signal or an image. Among these tools is entropy, which estimates the average amount of uncertainty or randomness in a dataset: high entropy indicates a less predictable dataset, while low entropy suggests a more predictable and structured one.
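As a minimal illustration of the entropy statement above, the following Python sketch computes the Shannon entropy of a list of gray levels. The function name `shannon_entropy` and the two toy images are assumptions made for this example; they do not come from any paper in this Special Issue.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits per pixel, of a sequence of gray levels."""
    counts = Counter(pixels)
    total = len(pixels)
    # H = sum over gray levels of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical toy images, flattened to one dimension.
flat_image = [128] * 16          # a single gray level: fully predictable
noisy_image = list(range(16))    # sixteen equally likely gray levels

print(shannon_entropy(flat_image))
print(shannon_entropy(noisy_image))
```

The constant image carries 0 bits per pixel, while the image with sixteen equally likely gray levels carries 4 bits per pixel, which matches the low-entropy and high-entropy cases described above.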

This Special Issue aims to publish work on information theory, measurement methods, data processing, tools, and techniques for the design and instrumentation of machine vision systems that apply computer vision and image processing to analyze, process, and understand visual data based on the principles of information content, redundancy, and statistical properties.

Dr. Wendy Flores-Fuentes
Dr. Oleg Sergiyenko
Prof. Dr. Julio Cesar Rodríguez-Quiñonez
Dr. Jesús Elías Miranda-Vega
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

information theory
entropy and coding theory (data compression, watermark, minimizing data loss, visual information in a more compact form, transmission, storage)
computer vision (identify relevant features and patterns)
machine vision (data analysis and understanding, segmentation, registration, denoising and restoration, object recognition, classification and tracking)
cyber-physical systems
instrumentation
signal and image processing
measurements (3D spatial coordinates, redundancy, statistical properties)
artificial intelligence
applications (navigation, surveillance, facial recognition, medicine, robotics, entertainment, and more)

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Related Special Issue

Application of Information Theory to Computer Vision and Image Processing in Entropy (13 articles)

Published Papers (14 papers)

Research

19 pages, 2806 KiB

Open Access Article

SP-IGAN: An Improved GAN Framework for Effective Utilization of Semantic Priors in Real-World Image Super-Resolution

by Meng Wang, Zhengnan Li, Haipeng Liu, Zhaoyu Chen and Kewei Cai

Entropy 2025, 27(4), 414; https://doi.org/10.3390/e27040414 - 11 Apr 2025

Viewed by 197

Abstract

Single-image super-resolution (SISR) based on GANs has achieved significant progress. However, these methods still face challenges when reconstructing locally consistent textures due to a lack of semantic understanding of image categories. This highlights the necessity of focusing on contextual information comprehension and the acquisition of high-frequency details in model design. To address this issue, we propose the Semantic Prior-Improved GAN (SP-IGAN) framework, which incorporates additional contextual semantic information into the Real-ESRGAN model. The framework consists of two branches. The main branch introduces a Graph Convolutional Channel Attention (GCCA) module to transform channel dependencies into adjacency relationships between feature vertices, thereby enhancing pixel associations. The auxiliary branch strengthens the correlation between semantic category information and regional textures in the Residual-in-Residual Dense Block (RRDB) module. The auxiliary branch employs a pretrained segmentation model to accurately extract regional semantic information from the input low-resolution image. This information is injected into the RRDB module through Spatial Feature Transform (SFT) layers, generating more accurate and semantically consistent texture details. Additionally, a wavelet loss is incorporated into the loss function to capture high-frequency details that are often overlooked. The experimental results demonstrate that the proposed SP-IGAN outperforms state-of-the-art (SOTA) super-resolution models across multiple public datasets. For the X4 super-resolution task, SP-IGAN achieves a 0.55 dB improvement in Peak Signal-to-Noise Ratio (PSNR) and a 0.0363 increase in Structural Similarity Index (SSIM) compared to the baseline model Real-ESRGAN. Full article
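For readers unfamiliar with the PSNR metric quoted in this abstract, the sketch below shows the standard definition for 8-bit images. The pixel values and the helper `mse_reduction_for_gain` are hypothetical and only illustrate how a dB gain relates to mean squared error; they are not taken from the SP-IGAN paper.

```python
import math

def mse(reference, restored):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value is the peak pixel value."""
    error = mse(reference, restored)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / error)

def mse_reduction_for_gain(gain_db):
    """Fractional MSE reduction implied by a PSNR gain of gain_db decibels."""
    return 1.0 - 10.0 ** (-gain_db / 10.0)

reference = [52, 55, 61, 59]   # hypothetical ground-truth pixels
restored = [54, 55, 60, 58]    # hypothetical super-resolved pixels
print(round(psnr(reference, restored), 2))
print(round(mse_reduction_for_gain(0.55), 3))
```

Under this definition, a 0.55 dB PSNR gain corresponds to roughly a 12% reduction in mean squared error.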

(This article belongs to the Special Issue Application of Information Theory to Computer Vision and Image Processing II)

19 pages, 6626 KiB

Open Access Article

Action Recognition with 3D Residual Attention and Cross Entropy

by Yuhao Ouyang and Xiangqian Li

Entropy 2025, 27(4), 368; https://doi.org/10.3390/e27040368 - 31 Mar 2025

Viewed by 268

Abstract

This study proposes a three-dimensional (3D) residual attention network (3DRFNet) for human activity recognition by learning spatiotemporal representations from motion pictures. The core innovation integrates the attention mechanism into the 3D ResNet framework to emphasize key features and suppress irrelevant ones. In each 3D ResNet block, channel and spatial attention mechanisms generate attention maps for tensor segments, which are then multiplied by the input feature mapping to emphasize key features. Additionally, the integration of Fast Fourier Convolution (FFC) enhances the network’s capability to effectively capture temporal and spatial features. We also use the cross-entropy loss function to measure the difference between the predicted value and the ground truth (GT) and to guide the model’s backpropagation. Experimental results demonstrate that 3DRFNet achieves state-of-the-art (SOTA) performance in human action recognition, with accuracies of 91.7% and 98.7% on the HMDB-51 and UCF-101 datasets, respectively. These results highlight 3DRFNet’s advantages in recognition accuracy and robustness, particularly in capturing key behavioral features in videos using both attention mechanisms. Full article
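The cross-entropy loss mentioned in this abstract can be sketched as follows for a single sample. This is the generic softmax cross-entropy, written under the assumption that the network outputs one raw score (logit) per action class; the function names and logits are illustrative and not taken from 3DRFNet.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the ground-truth class."""
    return -math.log(softmax(logits)[target_index])

logits = [5.0, 0.0, 0.0]             # hypothetical scores for three action classes
print(cross_entropy(logits, 0))      # small loss: the model favors class 0
print(cross_entropy(logits, 1))      # large loss: the ground truth is class 1
```

The loss is small when the predicted distribution concentrates on the ground-truth class and grows as probability mass moves elsewhere, which is the signal that backpropagation uses.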

18 pages, 1746 KiB

Open Access Article

CrackCLIP: Adapting Vision-Language Models for Weakly Supervised Crack Segmentation

by Fengjiao Liang, Qingyong Li, Haomin Yu and Wen Wang

Entropy 2025, 27(2), 127; https://doi.org/10.3390/e27020127 - 25 Jan 2025

Viewed by 813

Abstract

Weakly supervised crack segmentation aims to create pixel-level crack masks with minimal human annotation, which often only differentiate between crack and normal no-crack patches. This task is crucial for assessing structural integrity and safety in real-world industrial applications, where manually labeling the location of cracks at the pixel level is both labor-intensive and impractical. Addressing the challenges of labeling uncertainty, this paper presents CrackCLIP, a novel approach that leverages language prompts to augment the semantic context and employs the Contrastive Language–Image Pre-Training (CLIP) model to enhance weakly supervised crack segmentation. Initially, a gradient-based class activation map is used to generate pixel-level coarse pseudo-labels from a trained crack patch classifier. The estimated coarse pseudo-labels are utilized to fine-tune additional linear adapters, which are integrated into the frozen image encoders of CLIP to adapt the CLIP model to the specialized task of crack segmentation. Moreover, specific textual prompts are crafted for crack characteristics, which are input into the frozen text encoder of CLIP to extract features encapsulating the semantic essence of the cracks. The final crack segmentation is determined by comparing the similarity between text prompt features and visual patch token features. Comparative experiments on the Crack500, CFD, and DeepCrack datasets demonstrate that the proposed framework outperforms existing weakly supervised crack segmentation methods, and the pre-trained vision-language model exhibits strong potential for crack feature learning, thereby enhancing the overall performance and generalization capabilities of the proposed framework. Full article
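The final step described in this abstract, comparing text prompt features with visual patch token features, can be illustrated with a toy cosine-similarity rule. The two-dimensional vectors below are hypothetical stand-ins for CLIP embeddings, and the decision rule (label a patch as crack when it is closer to the crack prompt than to the normal prompt) is an assumption made for this sketch, not the paper's exact procedure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equally sized feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def label_patches(patch_features, crack_text, normal_text):
    """Return 1 for patches closer to the crack prompt, otherwise 0."""
    return [
        1 if cosine(p, crack_text) > cosine(p, normal_text) else 0
        for p in patch_features
    ]

# Hypothetical 2-D features; real embeddings have hundreds of dimensions.
crack_text = [1.0, 0.0]
normal_text = [0.0, 1.0]
patches = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
print(label_patches(patches, crack_text, normal_text))
```

Reshaping the resulting per-patch labels back to the patch grid gives a coarse segmentation mask.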

21 pages, 5554 KiB

Open Access Article

A Novel Quadrilateral Contour Disentangled Algorithm for Industrial Instrument Reading Detection

by Xiang Li, Changchang Zeng, Yong Yao, Jide Qian, Haiding Zhang, Sen Zhang and Suixian Yang

Entropy 2025, 27(2), 122; https://doi.org/10.3390/e27020122 - 24 Jan 2025

Viewed by 549

Abstract

Instrument reading detection in industrial scenarios poses significant challenges due to reading contour distortion caused by perspective transformation in the instrument images. However, existing methods fail to accurately read the display automatically due to incorrect labeling of the target box vertices, which arises from the vertex entanglement problem. To address these challenges, a novel Quadrilateral Contour Disentangled Detection Network (QCDNet) is proposed in this paper, which utilizes the quadrilateral disentanglement idea. First, a Multi-scale Feature Pyramid Network (MsFPN) is proposed for effective feature extraction to improve model accuracy. Second, we propose a Polar Coordinate Decoupling Representation (PCDR), which models each side of the instrument contour using polar coordinates. Additionally, a loss function for the polar coordinate parameters is designed to aid the PCDR in more effectively decoupling the instrument reading contour. Finally, the experimental results on the instrument dataset demonstrate that QCDNet outperforms existing quadrilateral detection algorithms, with improvements of 4.07%, 1.8%, and 2.89% in Precision, Recall, and F-measure, respectively. These results confirm the effectiveness of QCDNet for instrument reading detection tasks. Full article
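To make the idea of describing each side of a quadrilateral in polar coordinates concrete, the sketch below uses the standard normal form of a line, x·cos(θ) + y·sin(θ) = ρ, and recovers the four vertices by intersecting adjacent sides. The abstract does not give the exact parameterization used by PCDR, so this form and the helper names are assumptions for illustration only.

```python
import math

def intersect(line_a, line_b):
    """Intersect two lines given as (rho, theta), where x*cos(theta) + y*sin(theta) = rho."""
    (r1, t1), (r2, t2) = line_a, line_b
    a1, b1 = math.cos(t1), math.sin(t1)
    a2, b2 = math.cos(t2), math.sin(t2)
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel")
    # Cramer's rule for the 2x2 linear system.
    x = (r1 * b2 - r2 * b1) / det
    y = (a1 * r2 - a2 * r1) / det
    return x, y

def quad_vertices(sides):
    """Vertices of a quadrilateral whose four sides are listed in cyclic order."""
    return [intersect(sides[i], sides[(i + 1) % 4]) for i in range(4)]

# Unit square: right (x = 1), top (y = 1), left (x = 0), bottom (y = 0).
square = [(1.0, 0.0), (1.0, math.pi / 2), (0.0, math.pi), (0.0, 3 * math.pi / 2)]
for x, y in quad_vertices(square):
    print(round(x, 6), round(y, 6))
```

Because each side has its own (ρ, θ) pair, a vertex is never predicted directly; it is always derived from two sides, which is one way to avoid ambiguity in vertex ordering.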

16 pages, 1633 KiB

Open Access Article

Advancing Rice Grain Impurity Segmentation with an Enhanced SegFormer and Multi-Scale Feature Integration

by Xiulin Qiu, Hongzhi Yao, Qinghua Liu, Hongrui Liu, Haozhi Zhang and Mengdi Zhao

Entropy 2025, 27(1), 70; https://doi.org/10.3390/e27010070 - 15 Jan 2025

Viewed by 753

Abstract

During the rice harvesting process, severe occlusion and adhesion exist among multiple targets, such as rice, straw, and leaves, making it difficult to accurately distinguish between rice grains and impurities. To address the current challenges, a lightweight semantic segmentation algorithm for impurities based on an improved SegFormer network is proposed. To make full use of the extracted features, the decoder was redesigned. First, the Feature Pyramid Network (FPN) was introduced to optimize the structure, selectively fusing the high-level semantic features and low-level texture features generated by the encoder. Secondly, a Part Large Kernel Attention (Part-LKA) module was designed and introduced after feature fusion to help the model focus on key regions, simplifying the model and accelerating computation. Finally, to compensate for the lack of spatial interaction capabilities, Bottleneck Recursive Gated Convolution (B-g^{n}Conv) was introduced to achieve effective segmentation of rice grains and impurities. Compared with the original model, the improved model’s pixel accuracy (PA) and F1 score increased by 1.6% and 3.1%, respectively. This provides a valuable algorithmic reference for designing a real-time impurity rate monitoring system for rice combine harvesters. Full article

18 pages, 41079 KiB

Open Access Article

Research on Target Image Classification in Low-Light Night Vision

by Yanfeng Li, Yongbiao Luo, Yingjian Zheng, Guiqian Liu and Jiekai Gong

Entropy 2024, 26(10), 882; https://doi.org/10.3390/e26100882 - 21 Oct 2024

Cited by 2 | Viewed by 1451

Abstract

In extremely dark conditions, low-light imaging may offer spectators a rich visual experience, which is important for both military and civic applications. However, the images taken in ultra-micro light environments usually have inherent defects such as extremely low brightness and contrast, a high noise level, and serious loss of scene details and colors, which leads to great challenges in the research of low-light image and object detection and classification. The low-light night vision image used as the study object in this work has an excessively dim overall picture and very little information about the screen’s features. Three algorithms, HE, AHE, and CLAHE, were used to enhance and highlight the image. The effectiveness of these image enhancement methods is evaluated using metrics such as the peak signal-to-noise ratio and mean square error, and CLAHE was selected after comparison. The target image includes vehicles, people, license plates, and objects. The gray-level co-occurrence matrix (GLCM) was used to extract the texture features of the enhanced images, and the extracted image texture features were used as input to construct a backpropagation (BP) neural network classification model. Then, low-light image classification models were developed based on VGG16 and ResNet50 convolutional neural networks combined with low-light image enhancement algorithms. The experimental results show that the overall classification accuracy of the VGG16 convolutional neural network model is 92.1%. Compared with the BP and ResNet50 neural network models, the classification accuracy was increased by 4.5% and 2.3%, respectively, demonstrating its effectiveness in classifying low-light night vision targets. Full article
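Of the three enhancement algorithms named in this abstract, plain histogram equalization (HE) is the simplest, and a minimal global version is sketched below. AHE and CLAHE additionally work on local tiles (CLAHE also clips the histogram), which this sketch does not attempt. The function name and the toy pixel values are assumptions for illustration.

```python
def equalize(pixels, levels=256):
    """Global histogram equalization of a flat list of integer gray levels."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        return list(pixels)  # constant image: nothing to stretch
    return [
        round((cdf[p] - cdf_min) / (total - cdf_min) * (levels - 1))
        for p in pixels
    ]

dark = [10, 10, 11, 12, 12, 13]   # hypothetical low-light pixels
print(equalize(dark))             # [0, 0, 64, 191, 191, 255]
```

The narrow range 10 to 13 is stretched over the full 0 to 255 range, which is the contrast gain that makes subsequent texture feature extraction more informative.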

19 pages, 9600 KiB

Open Access Article

A Hierarchical Neural Network for Point Cloud Segmentation and Geometric Primitive Fitting

by Honghui Wan and Feiyu Zhao

Entropy 2024, 26(9), 717; https://doi.org/10.3390/e26090717 - 23 Aug 2024

Cited by 1 | Viewed by 1426

Abstract

Automated generation of geometric models from point cloud data holds significant importance in the field of computer vision and has expansive applications, such as shape modeling and object recognition. However, prevalent methods exhibit accuracy issues. In this study, we introduce a novel hierarchical neural network that utilizes recursive PointConv operations on nested subdivisions of point sets. This network effectively extracts features, segments point clouds, and accurately identifies and computes parameters of regular geometric primitives with notable resilience to noise. On fine-grained primitive detection, our approach outperforms Supervised Primitive Fitting Network (SPFN) by 18.5% and Cascaded Primitive Fitting Network (CPFN) by 11.2%. Additionally, our approach consistently maintains low absolute errors in parameter prediction across varying noise levels in the point cloud data. Our experiments validate the robustness of our proposed method and establish its superiority relative to other methodologies in the extant literature. Full article

19 pages, 43879 KiB

Open Access Article

3D Data Processing and Entropy Reduction for Reconstruction from Low-Resolution Spatial Coordinate Clouds in a Technical Vision System

by Ivan Y. Alba Corpus, Wendy Flores-Fuentes, Oleg Sergiyenko, Julio C. Rodríguez-Quiñonez, Jesús E. Miranda-Vega, Wendy Garcia-González and José A. Núñez-López

Entropy 2024, 26(8), 646; https://doi.org/10.3390/e26080646 - 30 Jul 2024

Viewed by 1329

Abstract

This paper proposes an advancement in the application of a Technical Vision System (TVS), which integrates a laser scanning mechanism with a single light sensor to measure 3D spatial coordinates. In this application, the system is used to scan and digitalize objects using a rotating table to explore the potential of the system for 3D scanning at reduced resolutions. The experiments undertaken searched for optimal scanning windows and used statistical data filtering techniques and regression models to find a method to generate a 3D scan that was still recognizable with the least amount of 3D points, balancing the number of points scanned and time, while at the same time reducing effects caused by the particularities of the TVS, such as noise and entropy in the form of natural distortion in the resulting scans. The evaluation of the experimentation results uses 3D point registration methods, joining multiple faces from the original volume scanned by the TVS and aligning it to the ground truth model point clouds, which are based on a commercial 3D camera to verify that the reconstructed 3D model retains substantial detail from the original object. This research finds it is possible to reconstruct sufficiently detailed 3D models obtained from the TVS, which contain coarsely scanned data or scans that initially lack high definition or are too noisy. Full article

18 pages, 5022 KiB

Open Access Article

(HTBNet) Arbitrary Shape Scene Text Detection with Binarization of Hyperbolic Tangent and Cross-Entropy

by Zhao Chen

Entropy 2024, 26(7), 560; https://doi.org/10.3390/e26070560 - 29 Jun 2024

Cited by 4 | Viewed by 1132

Abstract

The existing segmentation-based scene text detection methods mostly need complicated post-processing, and the post-processing operation is separated from the training process, which greatly reduces the detection performance. The previous method, DBNet, successfully simplified post-processing and integrated post-processing into a segmentation network. However, the training process of the model took a long time for 1200 epochs and the sensitivity to texts of various scales was lacking, leading to some text instances being missed. Considering the above two problems, we design the text detection Network with Binarization of Hyperbolic Tangent (HTBNet). First of all, we propose the Binarization of Hyperbolic Tangent (HTB), optimized along with which the segmentation network can expedite the initial convergent speed by reducing the number of epochs from 1200 to 600. Because features of different channels in the same scale feature map focus on the information of different regions in the image, to better represent the important features of all objects in the image, we devise the Multi-Scale Channel Attention (MSCA). Meanwhile, considering that multi-scale objects in the image cannot be simultaneously detected, we propose a novel module named Fused Module with Channel and Spatial (FMCS), which can fuse the multi-scale feature maps from channel and spatial dimensions. Finally, we adopt cross-entropy as the loss function, which measures the difference between predicted values and ground truths. The experimental results show that HTBNet, compared with lightweight models, has achieved competitive performance and speed on Total-Text (F-measure:86.0%, FPS:30) and MSRA-TD500 (F-measure:87.5%, FPS:30). Full article

35 pages, 6001 KiB

Open Access Article

Lossless and Near-Lossless Compression Algorithms for Remotely Sensed Hyperspectral Images

by Amal Altamimi and Belgacem Ben Youssef

Entropy 2024, 26(4), 316; https://doi.org/10.3390/e26040316 - 5 Apr 2024

Cited by 7 | Viewed by 2961

Abstract

Rapid and continuous advancements in remote sensing technology have resulted in finer resolutions and higher acquisition rates of hyperspectral images (HSIs). These developments have triggered a need for new processing techniques brought about by the confined power and constrained hardware resources aboard satellites. This article proposes two novel lossless and near-lossless compression methods, employing our recent seed generation and quadrature-based square rooting algorithms, respectively. The main advantage of the former method lies in its acceptable complexity utilizing simple arithmetic operations, making it suitable for real-time onboard compression. In addition, this near-lossless compressor could be incorporated for hard-to-compress images offering a stabilized reduction at nearly 40% with a maximum relative error of 0.33 and a maximum absolute error of 30. Our results also show that a lossless compression performance, in terms of compression ratio, of up to 2.6 is achieved when testing with hyperspectral images from the Corpus dataset. Further, an improvement in the compression rate over the state-of-the-art k^{2}-raster technique is realized for most of these HSIs by all four variations of our proposed lossless compression method. In particular, a data reduction enhancement of up to 29.89% is realized when comparing their respective geometric mean values. Full article
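The abstract reports compression ratios and compares methods through geometric means. The sketch below shows how those two quantities are commonly computed; the file sizes are hypothetical and the function names are assumptions for this example.

```python
import math

def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of original size to compressed size (higher is better)."""
    return original_bytes / compressed_bytes

def geometric_mean(values):
    """Geometric mean, computed in log space to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical (original, compressed) sizes for three images.
sizes = [(1000, 500), (1000, 400), (800, 500)]
ratios = [compression_ratio(o, c) for o, c in sizes]
print(ratios)                            # [2.0, 2.5, 1.6]
print(round(geometric_mean(ratios), 6))  # 2.0
```

The geometric mean is the usual choice for averaging ratios because it treats a doubling and a halving symmetrically. For reference, a compression ratio of 2.6 means the compressed data occupies about 38.5% of the original size.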

18 pages, 7216 KiB

Open Access Article

Style-Enhanced Transformer for Image Captioning in Construction Scenes

by Kani Song, Linlin Chen and Hengyou Wang

Entropy 2024, 26(3), 224; https://doi.org/10.3390/e26030224 - 1 Mar 2024

Cited by 4 | Viewed by 1947

Abstract

Image captioning is important for improving the intelligence of construction projects and assisting managers in mastering construction site activities. However, there are few image-captioning models for construction scenes at present, and the existing methods do not perform well in complex construction scenes. According to the characteristics of construction scenes, we label a text description dataset based on the MOCS dataset and propose a style-enhanced Transformer for image captioning in construction scenes, simply called SETCAP. Specifically, we extract the grid features using the Swin Transformer. Then, to enhance the style information, we not only use the grid features as the initial detail semantic features but also extract style information by style encoder. In addition, in the decoder, we integrate the style information into the text features. The interaction between the image semantic information and the text features is carried out to generate content-appropriate sentences word by word. Finally, we add the sentence style loss into the total loss function to make the style of generated sentences closer to the training set. The experimental results show that the proposed method achieves encouraging results on both the MSCOCO and the MOCS datasets. In particular, SETCAP outperforms state-of-the-art methods by 4.2% CIDEr scores on the MOCS dataset and 3.9% CIDEr scores on the MSCOCO dataset, respectively. Full article

16 pages, 27918 KiB

Open Access Article

Adaptive Dual Aggregation Network with Normalizing Flows for Low-Light Image Enhancement

by Hua Wang, Jianzhong Cao and Jijiang Huang

Entropy 2024, 26(3), 184; https://doi.org/10.3390/e26030184 - 22 Feb 2024

Cited by 1 | Viewed by 1688

Abstract

Low-light image enhancement (LLIE) aims to improve the visual quality of images taken under complex low-light conditions. Recent works focus on carefully designing Retinex-based methods or end-to-end networks based on deep learning for LLIE. However, these works usually utilize pixel-level error functions to optimize models and have difficulty effectively modeling the real visual errors between the enhanced images and the normally exposed images. In this paper, we propose an adaptive dual aggregation network with normalizing flows (ADANF) for LLIE. First, an adaptive dual aggregation encoder is built to fully explore the global properties and local details of the low-light images for extracting illumination-robust features. Next, a reversible normalizing flow decoder is utilized to model real visual errors between enhanced and normally exposed images by mapping images into underlying data distributions. Finally, to further improve the quality of the enhanced images, a gated multi-scale information transmitting module is leveraged to introduce the multi-scale information from the adaptive dual aggregation encoder into the normalizing flow decoder. Extensive experiments on paired and unpaired datasets have verified the effectiveness of the proposed ADANF. Full article

21 pages, 5149 KiB

Open Access Article

A Real-Time and Robust Neural Network Model for Low-Measurement-Rate Compressed-Sensing Image Reconstruction

by Pengchao Chen, Huadong Song, Yanli Zeng, Xiaoting Guo and Chaoqing Tang

Entropy 2023, 25(12), 1648; https://doi.org/10.3390/e25121648 - 12 Dec 2023

Viewed by 1538

Abstract

Compressed sensing (CS) is a popular data compression theory for many computer vision tasks, but the high reconstruction complexity for images prevents it from being used in many real-world applications. Existing end-to-end learning methods achieve real-time sensing but lack theoretical guarantees for robust reconstruction results. This paper proposes a neural network called RootsNet, which integrates the CS mechanism into the network to prevent error propagation, so that RootsNet knows what will happen if some modules in the network go wrong. It also runs in real time and successfully reconstructs images at extremely low measurement rates that are impossible for traditional optimization-theory-based methods. For qualitative validation, RootsNet is implemented in two real-world measurement applications, i.e., a near-field microwave imaging system and a pipeline inspection system, where RootsNet easily saves 60% more measurement time and 95% more data compared with the state-of-the-art optimization-theory-based reconstruction methods. Without loss of generality, comprehensive experiments are performed on general datasets, including evaluating the key components in RootsNet, the reconstruction uncertainty, quality, and efficiency. RootsNet has the best uncertainty performance and efficiency, and achieves the best reconstruction quality under super low-measurement rates. Full article
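The term "measurement rate" in this abstract refers to the ratio m/n between the number of measurements and the signal length. The sketch below shows only the sensing step, y = Φx, with a random Gaussian matrix Φ; the matrix choice, sizes, and names are assumptions for illustration, and the reconstruction step (by RootsNet or by an optimization method) is deliberately not shown.

```python
import random

def measurement_matrix(m, n, seed=0):
    """Random Gaussian sensing matrix of shape m x n, scaled by 1/sqrt(m)."""
    rng = random.Random(seed)
    scale = m ** -0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]

def measure(phi, x):
    """Compute the measurement vector y = phi @ x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

n = 64                      # signal length (hypothetical)
rate = 0.125                # measurement rate m / n
m = int(n * rate)           # number of measurements: 8

signal = [0.0] * n          # a sparse signal with two nonzero entries
signal[5] = 1.0
signal[40] = -2.0

phi = measurement_matrix(m, n)
y = measure(phi, signal)
print(len(signal), "samples compressed to", len(y), "measurements")
```

At a rate of 0.125, only 8 numbers are acquired for a 64-sample signal, which is why lower measurement rates translate directly into shorter measurement time and less data.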

17 pages, 5045 KiB

Open Access Article

Part-Aware Point Cloud Completion through Multi-Modal Part Segmentation

by Fuyang Yu, Runze Tian, Xuanjun Wang and Xiaohui Liang

Entropy 2023, 25(12), 1588; https://doi.org/10.3390/e25121588 - 27 Nov 2023

Viewed by 1947

Abstract

Point cloud completion aims to generate high-resolution point clouds using incomplete point clouds as input and is the foundational task for many 3D visual applications. However, most existing methods suffer from issues related to rough localized structures. In this paper, we attribute these problems to the lack of attention to local details in the global optimization methods used for the task. Thus, we propose a new model, called PA-NET, to guide the network to pay more attention to local structures. Specifically, we first use textual embedding to assist in training a robust point assignment network, enabling the transformation of global optimization into the co-optimization of local and global aspects. Then, we design a novel plug-in module using the assignment network and introduce a new loss function to guide the network’s attention towards local structures. Numerous experiments were conducted, and the quantitative results demonstrate that our method achieves novel performance on different datasets. Additionally, the visualization results show that our method efficiently resolves the issue of poor local structures in the generated point cloud. Full article
