New Insights into Computer Vision and Graphics

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 April 2025 | Viewed by 7248

Special Issue Editor


Dr. Yuanyuan Liu
Guest Editor
Department of Information Engineering, China University of Geosciences, Wuhan 430075, China
Interests: computer vision; deep learning; image and video understanding

Special Issue Information

Dear Colleagues,

Application trends, device technologies, and the blurring of boundaries between disciplines are propelling information technology forward. This poses new challenges for the study of visual computing-based interactive graphics processing technology. This Special Issue therefore intends to present new ideas and experimental findings in the field of computer vision and graphics, from design, services, and theory to applications.

Computer vision and graphics focus on the computational processing and applications of visual data. Areas relevant to computer vision and graphics include, but are not limited to, robotics, medical imaging, security and surveillance, gaming and entertainment, education and training, art and design, and environmental monitoring. Topics of interest include high-speed processing techniques and real-time performance, the development and refinement of deep learning techniques for computer vision and graphics applications, and explainable AI techniques that improve the transparency and interpretability of AI models.

This Special Issue will publish high-quality, original research papers in overlapping fields, including the following:

  • Image processing/analysis;
  • Computer vision theory and application;
  • Video and audio encoding;
  • Motion detection and tracking;
  • Reconstruction and representation;
  • Facial and hand gesture recognition;
  • Rendering techniques;
  • Matching, inference, and recognition;
  • Geometric modeling;
  • 3D vision;
  • Graph-based learning and applications.

Dr. Yuanyuan Liu
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing/analysis
  • computer vision theory and application
  • video and audio encoding
  • motion detection and tracking
  • reconstruction and representation
  • facial and hand gesture recognition
  • rendering techniques
  • matching, inference, and recognition
  • geometric modeling
  • 3D vision
  • graph-based learning and applications

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)


Research

21 pages, 5326 KiB  
Article
6-DoF Pose Estimation from Single RGB Image and CAD Model Retrieval Using Feature Similarity Measurement
by Sieun Park, Won-Je Jeong, Mayura Manawadu and Soon-Yong Park
Appl. Sci. 2025, 15(3), 1501; https://doi.org/10.3390/app15031501 - 1 Feb 2025
Viewed by 613
Abstract
This study presents six degrees of freedom (6-DoF) pose estimation of an object from a single RGB image and retrieval of the matching CAD model by measuring the similarity between the RGB image and CAD rendering images. The 6-DoF pose estimation of an RGB object is one of the key techniques in 3D computer vision. However, in addition to 6-DoF pose estimation, retrieval and alignment of the matching CAD model with the RGB object must be performed for various industrial applications such as eXtended Reality (XR), Augmented Reality (AR), and robotic pick-and-place. This paper addresses the 6-DoF pose estimation and CAD model retrieval problems simultaneously and quantitatively analyzes how much the 6-DoF pose estimation affects CAD model retrieval performance. The study consists of two main steps. The first step is 6-DoF pose estimation based on the PoseContrast network. We enhance the structure of PoseContrast by adding variance uncertainty weight and feature attention modules. The second step is retrieval of the matching CAD model by an image similarity measurement between the CAD rendering and the RGB object. In our experiments, we used 2000 RGB images collected from the Google and Bing search engines and 100 CAD models from ShapeNetCore. The Pascal3D+ dataset is used to train the pose estimation network, and DELF features are used for the similarity measurement. Comprehensive ablation studies of the proposed network quantify its performance with respect to the baseline model. Experimental results show that pose estimation performance has a positive correlation with CAD retrieval performance.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
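To make the retrieval step concrete, the following is a minimal illustrative sketch (not the authors' implementation) of selecting the best-matching CAD model by comparing a feature descriptor of the RGB object with descriptors of CAD renderings; render_cad and extract_features are hypothetical helpers standing in for the rendering pipeline and a DELF-style feature extractor.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieve_best_cad(rgb_features, cad_models, estimated_pose, render_cad, extract_features):
    """Return the CAD model whose rendering at the estimated 6-DoF pose is most
    similar to the query RGB object's features (illustrative only)."""
    best_model, best_score = None, -np.inf
    for model in cad_models:
        rendering = render_cad(model, estimated_pose)   # hypothetical renderer
        cad_features = extract_features(rendering)      # hypothetical DELF-style descriptor
        score = cosine_similarity(np.ravel(rgb_features), np.ravel(cad_features))
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score
```

In this setting, a more accurate pose estimate yields renderings that better match the query view, which is consistent with the reported correlation between pose accuracy and retrieval performance.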

21 pages, 7424 KiB  
Article
Neural Network Ensemble to Detect Dicentric Chromosomes in Metaphase Images
by Ignacio Atencia-Jiménez, Adayabalam S. Balajee, Miguel J. Ruiz-Gómez, Francisco Sendra-Portero, Alegría Montoro and Miguel A. Molina-Cabello
Appl. Sci. 2024, 14(22), 10440; https://doi.org/10.3390/app142210440 - 13 Nov 2024
Viewed by 1197
Abstract
The Dicentric Chromosome Assay (DCA) is widely used in biological dosimetry, where the number of dicentric chromosomes induced by ionizing radiation (IR) exposure is quantified to estimate the absorbed radiation dose an individual has received. Dicentric chromosome scoring is a laborious and time-consuming process that is performed manually in most cytogenetic biodosimetry laboratories. Further, dicentric chromosome scoring constitutes a bottleneck when several hundred samples need to be analyzed for dose estimation in the aftermath of large-scale radiological/nuclear incidents. Recently, much interest has focused on automating dicentric chromosome scoring using Artificial Intelligence (AI) tools to reduce analysis time and improve the accuracy of dicentric chromosome detection. Our study aims to detect dicentric chromosomes in metaphase plate images using an ensemble of artificial neural network detectors suitable for datasets with a small number of samples (in this work, only 50 images). In our approach, the input image is first processed by several operators, each producing a transformed image. Each transformed image is then transferred to a specific detector trained on a training set processed by the same operator that transformed the image. The detectors then provide their predictions about the detected chromosomes. Finally, all predictions are combined using a consensus function. Regarding the operators used, images were binarized separately using Otsu and spline techniques, while morphological opening and closing filters of different sizes were used to eliminate noise, isolate specific components, and enhance the structures of interest (chromosomes) within the image. Consensus-based decisions are typically more precise than those made by individual networks, as the consensus method can rectify certain misclassifications provided that most of the individual networks are correct. The results indicate that our methodology worked satisfactorily in detecting the majority of chromosomes, with remarkable classification performance even with the low number of training samples utilized. AI-based dicentric chromosome detection will be beneficial for rapid triage by improving the detection of dicentric chromosomes and thereby the accuracy of dose prediction.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
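The operator/detector ensemble with a consensus function can be sketched as below. This is a simplified illustration, not the authors' code: it assumes each operator is a callable image transform (e.g., Otsu binarization, morphological opening/closing), each detector returns candidate chromosome centroids as (x, y) points, and the consensus is a plain majority vote over spatially matching candidates.

```python
import numpy as np

def ensemble_detect(image, operators, detectors, tol=10.0, min_votes=None):
    """Run each operator/detector pair on the metaphase image and keep only
    chromosome candidates supported by a consensus of detectors.

    operators : list of callables, each producing a transformed image.
    detectors : list of callables; detectors[i] was trained on images
                transformed by operators[i] and returns (x, y) centroids.
    """
    if min_votes is None:
        min_votes = len(detectors) // 2 + 1  # simple majority

    all_preds = [np.asarray(det(op(image)), dtype=float)
                 for op, det in zip(operators, detectors)]

    candidates = [p for p in all_preds if len(p)] or [np.empty((0, 2))]
    consensus = []
    for cand in np.concatenate(candidates):
        # count how many detectors produced a candidate within `tol` pixels
        votes = sum(
            1 for preds in all_preds
            if len(preds) and np.min(np.linalg.norm(preds - cand, axis=1)) < tol
        )
        already_kept = any(
            np.linalg.norm(np.asarray(kept) - cand) < tol for kept in consensus
        )
        if votes >= min_votes and not already_kept:
            consensus.append(tuple(cand))
    return consensus
```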

18 pages, 1493 KiB  
Article
Hypergraph Position Attention Convolution Networks for 3D Point Cloud Segmentation
by Yanpeng Rong, Liping Nong, Zichen Liang, Zhuocheng Huang, Jie Peng and Yiping Huang
Appl. Sci. 2024, 14(8), 3526; https://doi.org/10.3390/app14083526 - 22 Apr 2024
Viewed by 1769
Abstract
Point cloud segmentation, as the basis for 3D scene understanding and analysis, has made significant progress in recent years. Graph-based modeling and learning methods have played an important role in point cloud segmentation. However, due to the inherent complexity of point cloud data, it is difficult to capture higher-order and complex features of 3D data using graph learning methods. In addition, how to quickly and efficiently extract important features from point clouds also poses a great challenge to current research. To address these challenges, we propose a new framework, called hypergraph position attention convolution networks (HGPAT), for point cloud segmentation. Firstly, we use a hypergraph to model the higher-order relationships among points in the point cloud. Secondly, in order to effectively learn the feature information of point cloud data, a hyperedge position attention convolution module is proposed, which utilizes the hyperedge–hyperedge propagation pattern to extract and aggregate the more important features. Finally, we design a ResNet-like module to reduce the computational complexity of the network and improve its efficiency. We have conducted point cloud segmentation experiments on the ShapeNet Part and S3DIS datasets, and the experimental results demonstrate the effectiveness of the proposed method compared with state-of-the-art ones.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
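As a rough illustration of the hypergraph modeling step, the sketch below builds one hyperedge per point from its k nearest neighbours and aggregates vertex features per hyperedge by mean pooling. The actual HGPAT module uses position attention and hyperedge–hyperedge propagation, which this simplification omits; it is only meant to show how a point cloud can be lifted to hyperedges.

```python
import numpy as np

def build_knn_hyperedges(points: np.ndarray, k: int = 16) -> np.ndarray:
    """Build one hyperedge per point: the point plus its k nearest neighbours.
    Returns an (N, k+1) index array; row i lists the members of hyperedge i.
    Brute-force distances, adequate for small clouds in a sketch."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    return np.argsort(d2, axis=1)[:, : k + 1]

def hyperedge_aggregate(features: np.ndarray, hyperedges: np.ndarray) -> np.ndarray:
    """Aggregate vertex features (N, C) into hyperedge features (N, C) by mean
    pooling, a stand-in for the attention-weighted aggregation in HGPAT."""
    return features[hyperedges].mean(axis=1)
```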

13 pages, 12039 KiB  
Article
Camera Path Generation for Triangular Mesh Using Toroidal Patches
by Jinyoung Choi, Kangmin Kim, Seongil Kim, Minseok Kim, Taekgwan Nam and Youngjin Park
Appl. Sci. 2024, 14(2), 490; https://doi.org/10.3390/app14020490 - 5 Jan 2024
Viewed by 1291
Abstract
Triangular mesh data structures are fundamental in computer graphics, serving as the foundation for many 3D models. To effectively utilize these 3D models across diverse industries, it is important to understand a model’s overall shape and geometric features thoroughly. In this work, we introduce a novel method for generating camera paths that emphasize a model’s local geometric characteristics. The method uses a toroidal patch-based spatial data structure, approximating the mesh’s faces within a predetermined tolerance ϵ and encapsulating their geometric intricacies. This facilitates the determination of the camera position and gaze path, ensuring the mesh’s key characteristics are captured. During path construction, we create a bounding cylinder for the mesh, project the mesh’s faces and associated toroidal patches onto the cylinder’s lateral surface, and sequentially select the grid cells of the cylinder containing the highest number of toroidal patches as we traverse the lateral surface. The centers of the selected grid cells are used as control points for a periodic B-spline curve, which serves as our foundational path. After initial curve generation, we derive the camera position and gaze paths from the curve by applying scaling factors to ensure a uniform camera amplitude. We applied our method to ten triangular mesh models, demonstrating its effectiveness and adaptability across various mesh configurations.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
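A schematic version of the grid-selection step is sketched below. It assumes patch centres are given as 3D points with the cylinder axis along z, bins them over the cylinder's lateral surface, picks the densest cell per angular column, and fits a periodic cubic B-spline through the cell centres using SciPy's splprep/splev. This is an illustrative approximation, not the authors' pipeline.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def cylinder_grid_control_points(patch_points, radius, n_theta=36, n_z=12):
    """Bin toroidal-patch centres over a bounding cylinder's lateral surface
    (angle x height grid) and, for each angular column, take the centre of the
    cell containing the most patches as a control point."""
    pts = np.asarray(patch_points, dtype=float)
    theta = np.arctan2(pts[:, 1], pts[:, 0])                # angle around the z axis
    z = pts[:, 2]
    t_bins = np.linspace(-np.pi, np.pi, n_theta + 1)
    z_bins = np.linspace(z.min(), z.max(), n_z + 1)
    counts, _, _ = np.histogram2d(theta, z, bins=[t_bins, z_bins])

    controls = []
    for i in range(n_theta):                                # traverse the lateral surface
        j = int(np.argmax(counts[i]))                       # densest cell in this column
        tc = 0.5 * (t_bins[i] + t_bins[i + 1])
        zc = 0.5 * (z_bins[j] + z_bins[j + 1])
        controls.append((radius * np.cos(tc), radius * np.sin(tc), zc))
    return np.asarray(controls)

def periodic_path(controls, n_samples=200):
    """Fit a periodic cubic B-spline through the control points and sample it."""
    x, y, z = controls.T
    tck, _ = splprep([x, y, z], s=0, per=1)
    u = np.linspace(0, 1, n_samples)
    return np.stack(splev(u, tck), axis=1)
```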

28 pages, 4448 KiB  
Article
ED2IF2-Net: Learning Disentangled Deformed Implicit Fields and Enhanced Displacement Fields from Single Images Using Pyramid Vision Transformer
by Xiaoqiang Zhu, Xinsheng Yao, Junjie Zhang, Mengyao Zhu, Lihua You, Xiaosong Yang, Jianjun Zhang, He Zhao and Dan Zeng
Appl. Sci. 2023, 13(13), 7577; https://doi.org/10.3390/app13137577 - 27 Jun 2023
Cited by 1 | Viewed by 1471
Abstract
Substantial research has emerged on single-view 3D reconstruction, and the majority of state-of-the-art implicit methods employ CNNs as the backbone network. Transformers, on the other hand, have shown remarkable performance in many vision tasks; however, it is still unknown whether transformers are suitable for single-view implicit 3D reconstruction. In this paper, we propose the first end-to-end single-view 3D reconstruction network based on the Pyramid Vision Transformer (PVT), called ED2IF2-Net, which disentangles the reconstruction of an implicit field into the reconstruction of topological structures and the recovery of surface details to achieve high-fidelity shape reconstruction. ED2IF2-Net uses a Pyramid Vision Transformer encoder to extract multi-scale hierarchical local features and a global vector from the input single image, which are fed into three separate decoders. A coarse shape decoder reconstructs a coarse implicit field based on the global vector, a deformation decoder iteratively refines the coarse implicit field using the pixel-aligned local features to obtain a deformed implicit field through multiple implicit field deformation blocks (IFDBs), and a surface detail decoder predicts an enhanced displacement field using the local features with hybrid attention modules (HAMs). The final output is a fusion of the deformed implicit field and the enhanced displacement field, with four loss terms applied to reconstruct the coarse implicit field, structure details through a novel deformation loss, the overall shape after fusion, and surface details via a Laplacian loss. Quantitative results on the ShapeNet dataset validate the performance of ED2IF2-Net. Notably, ED2IF2-Net-L is the top-performing variant, with mean IoU, CD, EMD, ECD-3D, and ECD-2D scores of 61.1, 7.26, 2.51, 6.08, and 1.84, respectively. Extensive experimental evaluations consistently demonstrate the state-of-the-art capability of ED2IF2-Net in reconstructing topological structures and recovering surface details, all while maintaining competitive inference time.
(This article belongs to the Special Issue New Insights into Computer Vision and Graphics)
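The decoder/fusion structure described in the abstract can be summarized with a toy PyTorch skeleton. The real network uses a Pyramid Vision Transformer encoder, pixel-aligned local features, IFDBs, and HAMs, all of which are collapsed here into simple placeholder layers, so this is only a structural sketch under those assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class SingleViewImplicitNet(nn.Module):
    """Schematic skeleton: an encoder yields a global vector, three decoders
    produce a coarse implicit field, a deformation refinement, and a
    displacement field, and the outputs are fused."""

    def __init__(self, feat_dim=256):
        super().__init__()
        # placeholder encoder; the paper uses a Pyramid Vision Transformer
        self.encoder = nn.Sequential(nn.Conv2d(3, feat_dim, 3, 2, 1), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1))
        self.coarse_decoder = nn.Linear(feat_dim + 3, 1)       # global vector + query point
        self.deform_decoder = nn.Linear(feat_dim + 3 + 1, 1)   # refines the coarse value
        self.detail_decoder = nn.Linear(feat_dim + 3, 1)       # displacement field

    def forward(self, image, query_points):
        # image: (B, 3, H, W); query_points: (B, N, 3)
        g = self.encoder(image).flatten(1)                     # (B, feat_dim) global vector
        g = g.unsqueeze(1).expand(-1, query_points.shape[1], -1)
        x = torch.cat([g, query_points], dim=-1)
        coarse = self.coarse_decoder(x)
        deformed = coarse + self.deform_decoder(torch.cat([x, coarse], dim=-1))
        displacement = self.detail_decoder(x)
        return deformed + displacement                         # fused implicit value
```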
