Search Results (184)

Search Parameters:
Keywords = image style transfer

17 pages, 8440 KB  
Article
Three-Dimensional Gaussian Style Transfer Method Based on Two-Dimensional Priors and Iterative Optimization
by Weijing Zhang, Xinyu Wang, Haolin Yin, Wei Xing, Huaizhong Lin, Lixia Chen and Lei Zhao
Appl. Sci. 2025, 15(17), 9678; https://doi.org/10.3390/app15179678 - 3 Sep 2025
Abstract
To address the limitations of existing optimization-based 3D style transfer methods in terms of visual quality, 3D consistency, and real-time rendering performance, we propose a novel 3D Gaussian scene style transfer method based on 2D priors and iterative optimization. Our approach introduces a progressive training pipeline that alternates between fine-tuning the 3D Gaussian field and updating a set of supervised stylized images. By gradually injecting style information into the 3D scene through iterative refinement, the method effectively preserves the geometric structure and spatial coherence across viewpoints. Furthermore, we incorporate a pre-trained Stable Diffusion model as a 2D prior to guide the style adaptation of the 3D Gaussian representation. The combination of diffusion priors and differentiable 3D Gaussian rendering enables high-fidelity style transfer while maintaining real-time rendering capability. Extensive experiments demonstrate that our method significantly improves the visual quality and multi-view consistency of 3D stylized scenes, offering an effective and efficient solution for real-time 3D scene stylization. Full article
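As a rough illustration of this alternating scheme (a sketch, not the authors' implementation), the PyTorch snippet below alternates between refreshing stylized target views with a 2D stylizer and fine-tuning scene parameters against them; render_views, stylize_2d, and the toy image-like scene tensor are hypothetical placeholders for a differentiable 3D Gaussian renderer and a diffusion-based stylizer.

    import torch

    def progressive_style_transfer(scene_params, render_views, stylize_2d,
                                   outer_steps=10, inner_steps=200, lr=1e-2):
        # Alternate between (a) refreshing stylized supervision with a 2D prior
        # and (b) fine-tuning the 3D representation to match that supervision.
        opt = torch.optim.Adam([scene_params], lr=lr)
        for _ in range(outer_steps):
            with torch.no_grad():
                targets = [stylize_2d(v) for v in render_views(scene_params)]
            for _ in range(inner_steps):
                renders = render_views(scene_params)
                loss = sum(torch.nn.functional.mse_loss(r, t)
                           for r, t in zip(renders, targets))
                opt.zero_grad(); loss.backward(); opt.step()
        return scene_params

    # Toy demo: the "scene" is a learnable image, the renderer is the identity,
    # and a grayscale operator stands in for the 2D stylizer.
    scene = torch.rand(3, 64, 64, requires_grad=True)
    progressive_style_transfer(scene,
                               render_views=lambda p: [p],
                               stylize_2d=lambda im: im.mean(0, keepdim=True).expand_as(im),
                               outer_steps=2, inner_steps=20)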

23 pages, 10211 KB  
Article
Potential of Remote Sensing for the Analysis of Mineralization in Geological Studies
by Ilyass-Essaid Lerhris, Hassan Admou, Hassan Ibouh and Noureddine El Binna
Geomatics 2025, 5(3), 40; https://doi.org/10.3390/geomatics5030040 - 1 Sep 2025
Abstract
Multispectral remote sensing offers powerful capabilities for mineral exploration, particularly in regions with complex geological settings. This study investigates the mineralization potential of the Tidili region in Morocco, located between the South Atlasic and Anti-Atlas Major Faults, using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) imagery to extract hydrothermal alteration zones. Key techniques include band ratio analysis and Principal Components Analysis (PCA), supported by the Crósta method, to identify spectral anomalies associated with alteration minerals such as Alunite, Kaolinite, and Illite. To validate the remote sensing results, field-based geological mapping and mineralogical analysis using X-ray diffraction (XRD) were conducted. The integration of satellite data with ground-truth and laboratory results confirmed the presence of argillic and phyllic alteration patterns consistent with porphyry-style mineralization. This integrated approach reveals spatial correlations between alteration zones and structural features linked to Pan-African and Hercynian deformation events. The findings demonstrate the effectiveness of combining multispectral remote sensing image analysis with field validation to improve mineral targeting, and the proposed methodology provides a transferable framework for exploration in similar tectonic environments. Full article
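For readers unfamiliar with these techniques, the numpy sketch below shows a band ratio and a Crósta-style PCA over a band stack; the band indices, cube size, and random data are illustrative stand-ins rather than the study's actual ASTER processing.

    import numpy as np

    def band_ratio(numerator, denominator, eps=1e-6):
        # Simple per-pixel ratio used to highlight diagnostic absorption features.
        return numerator.astype(float) / (denominator.astype(float) + eps)

    def crosta_pca(bands):
        # PCA over an (H, W, N) band stack; returns N principal-component images.
        h, w, n = bands.shape
        X = bands.reshape(-1, n).astype(float)
        X -= X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
        order = np.argsort(eigvals)[::-1]            # strongest components first
        return (X @ eigvecs[:, order]).reshape(h, w, n)

    # Toy example with a random 4-band cube; in practice the bands chosen for
    # ratios and PCA depend on the target alteration minerals.
    cube = np.random.rand(100, 100, 4)
    clay_index = band_ratio(cube[..., 1], cube[..., 2])
    components = crosta_pca(cube)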

20 pages, 17453 KB  
Article
Generative Denoising Method for Geological Images with Pseudo-Labeled Non-Matching Datasets
by Huan Zhang, Chunlei Wu, Jing Lu and Wenqi Zhao
Appl. Sci. 2025, 15(17), 9620; https://doi.org/10.3390/app15179620 - 1 Sep 2025
Viewed by 58
Abstract
Accurate prediction of oil and gas reservoirs requires precise river morphology. However, geological sedimentary images are often degraded by scattered non-structural noise from data errors or printing, which distorts river structures and complicates reservoir interpretation. To address this challenge, we propose GD-PND, a generative framework that leverages pseudo-labeled non-matching datasets to enable geological denoising via information transfer. We first construct a non-matching dataset by deriving pseudo-noiseless images via automated contour delineation and region filling on geological images of varying morphologies, thereby reducing reliance on manual annotation. The proposed style transfer-based generative model for noiseless images employs cyclic training with dual generators and discriminators to transform geological images into outputs with well-preserved river structures. Within the generator, global-feature excitation networks integrated with multi-attention mechanisms enhance the representation of overall river morphology, enabling preliminary denoising. Furthermore, we develop an iterative denoising enhancement module that performs comprehensive refinement through recursive multi-step pixel transformations and associated post-processing, operating independently of the model. Extensive visualizations confirm intact river courses, while quantitative evaluations show that GD-PND achieves modest gains in the chi-squared mean (up to 466.0, approximately 1.93%) while significantly enhancing computational efficiency, demonstrating its superiority. Full article

19 pages, 5315 KB  
Article
Style-Aware and Uncertainty-Guided Approach to Semi-Supervised Domain Generalization in Medical Imaging
by Zineb Tissir, Yunyoung Chang and Sang-Woong Lee
Mathematics 2025, 13(17), 2763; https://doi.org/10.3390/math13172763 - 28 Aug 2025
Viewed by 290
Abstract
Deep learning has significantly advanced medical image analysis by enabling accurate, automated diagnosis across diverse clinical tasks such as lesion classification and disease detection. However, the practical deployment of these systems is still hindered by two major challenges: the limited availability of expert-annotated data and substantial domain shifts caused by variations in imaging devices, acquisition protocols, and patient populations. Although recent semi-supervised domain generalization (SSDG) approaches attempt to address these challenges, they often suffer from two key limitations: (i) reliance on computationally expensive uncertainty modeling techniques such as Monte Carlo dropout, and (ii) inflexible shared-head classifiers that fail to capture domain-specific variability across heterogeneous imaging styles. To overcome these limitations, we propose MultiStyle-SSDG, a unified semi-supervised domain generalization framework designed to improve model generalization in low-label scenarios. Our method introduces a multi-style ensemble pseudo-labeling strategy guided by entropy-based filtering, incorporates prototype-based conformity and semantic alignment to regularize the feature space, and employs a domain-specific multi-head classifier fused through attention-weighted prediction. Additionally, we introduce a dual-level neural-style transfer pipeline that simulates realistic domain shifts while preserving diagnostic semantics. We validated our framework on the ISIC2019 skin lesion classification benchmark using 5% and 10% labeled data. MultiStyle-SSDG consistently outperformed recent state-of-the-art methods such as FixMatch, StyleMatch, and UPLM, achieving statistically significant improvements in classification accuracy under simulated domain shifts including style, background, and corruption. Specifically, our method achieved 78.6% accuracy with 5% labeled data and 80.3% with 10% labeled data on ISIC2019, surpassing FixMatch by 4.9–5.3 percentage points and UPLM by 2.1–2.4 points. Ablation studies further confirmed the individual contributions of each component, and t-SNE visualizations illustrate enhanced intra-class compactness and cross-domain feature consistency. These results demonstrate that our style-aware, modular framework offers a robust and scalable solution for generalizable computer-aided diagnosis in real-world medical imaging settings. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
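A compact sketch of entropy-based filtering over a multi-style pseudo-label ensemble, in the spirit of the strategy described above; the entropy threshold, tensor shapes, and random logits are assumptions for illustration, not the paper's configuration.

    import torch
    import torch.nn.functional as F

    def entropy_filtered_pseudo_labels(logits_per_style, max_entropy=0.5):
        # logits_per_style: (S, B, C) predictions for S styled views of B images.
        # Average the softmax over styles, then keep only samples whose
        # predictive entropy falls below the threshold.
        probs = F.softmax(logits_per_style, dim=-1).mean(dim=0)        # (B, C)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)  # (B,)
        keep = entropy < max_entropy
        return probs.argmax(dim=-1), keep

    # Toy usage: 3 styled views of 8 unlabeled images, 5 classes.
    pseudo_labels, confident_mask = entropy_filtered_pseudo_labels(torch.randn(3, 8, 5))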

22 pages, 1906 KB  
Article
A Style Transfer-Based Fast Image Quality Assessment Method for Image Sensors
by Weizhi Xian, Bin Chen, Jielu Yan, Xuekai Wei, Kunyin Guo, Bin Fang and Mingliang Zhou
Sensors 2025, 25(16), 5121; https://doi.org/10.3390/s25165121 - 18 Aug 2025
Viewed by 516
Abstract
Accurate image quality evaluation is essential for optimizing sensor performance and enhancing the fidelity of visual data. The concept of “image style” encompasses the overall visual characteristics of an image, including elements such as colors, textures, shapes, lines, strokes, and other visual components. In this paper, we propose a novel full-reference image quality assessment (FR-IQA) method that leverages the principles of style transfer, which we call style- and content-based IQA (SCIQA). Our approach consists of three main steps. First, we employ a deep convolutional neural network (CNN) to decompose and represent images in the deep domain, capturing both low-level and high-level features. Second, we define a comprehensive deep perceptual distance metric between two images, taking into account both image content and style. This metric combines traditional content-based measures with style-based measures inspired by recent advances in neural style transfer. Finally, we formulate a perceptual optimization problem to determine the optimal parameters for the SCIQA model, which we solve via a convex optimization approach. Experimental results across multiple benchmark datasets (LIVE, CSIQ, TID2013, KADID-10k, and PIPAL) demonstrate that SCIQA outperforms state-of-the-art FR-IQA methods. Specifically, SCIQA achieves Pearson linear correlation coefficients (PLCC) of 0.956, 0.941, and 0.895 on the LIVE, CSIQ, and TID2013 datasets, respectively, outperforming traditional methods such as SSIM (PLCC: 0.847, 0.852, 0.665) and deep learning-based methods such as DISTS (PLCC: 0.924, 0.919, 0.855). The proposed method also demonstrates robust generalizability on the large-scale PIPAL dataset, achieving an SROCC of 0.702. Furthermore, SCIQA exhibits strong interpretability, exceptional prediction accuracy, and low computational complexity, making it a practical tool for real-world applications. Full article
(This article belongs to the Special Issue Deep Learning Technology and Image Sensing: 2nd Edition)
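Style terms of this kind are commonly built from Gram matrices of CNN feature maps; the sketch below shows one plausible content-plus-style distance over precomputed features. The fixed alpha/beta weights and the random feature tensors are assumptions; the paper instead determines its parameters through convex optimization.

    import torch

    def gram(feat):
        # Gram matrix of a (B, C, H, W) feature map, normalized by C*H*W.
        b, c, h, w = feat.shape
        f = feat.reshape(b, c, h * w)
        return f @ f.transpose(1, 2) / (c * h * w)

    def content_style_distance(feats_x, feats_y, alpha=1.0, beta=1.0):
        # Content term: feature MSE; style term: Gram-matrix MSE, summed over layers.
        mse = torch.nn.functional.mse_loss
        content = sum(mse(fx, fy) for fx, fy in zip(feats_x, feats_y))
        style = sum(mse(gram(fx), gram(fy)) for fx, fy in zip(feats_x, feats_y))
        return alpha * content + beta * style

    # Toy usage with random "features" from two images at two layers.
    fx = [torch.rand(1, 64, 32, 32), torch.rand(1, 128, 16, 16)]
    fy = [torch.rand(1, 64, 32, 32), torch.rand(1, 128, 16, 16)]
    distance = content_style_distance(fx, fy)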

22 pages, 5125 KB  
Article
A Steganographic Message Transmission Method Based on Style Transfer and Denoising Diffusion Probabilistic Model
by Yen-Hui Lin, Chin-Pan Huang and Ping-Sheng Huang
Electronics 2025, 14(16), 3258; https://doi.org/10.3390/electronics14163258 - 16 Aug 2025
Viewed by 420
Abstract
This study presents a new steganography method for message transmission based on style transfer and denoising diffusion probabilistic model (DDPM) techniques. Different types of object images are used to represent the messages and are arranged in order from left to right and top to bottom to generate a secret image. Then, the style transfer technique is employed to embed the secret image (content image) into the cover image (style image) to create a stego image. To reveal the messages, the DDPM technique is first used to inpaint the secret image from the stego image. Then, the YOLO (You Only Look Once) technique is utilized to detect objects in the secret image for message decoding. Two security mechanisms are included: one uses object images for message encoding, and the other hides them in a customizable public image. To obtain the messages, both mechanisms must be cracked at the same time; therefore, this method provides highly secure information protection. Experimental results show that our method achieves good performance for confidential message transmission. Full article
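To make the left-to-right, top-to-bottom layout concrete, here is a hypothetical numpy sketch that places one object tile per message symbol on a grid; the symbol-to-tile mapping and tile sizes are invented for illustration and do not reproduce the authors' encoding.

    import numpy as np

    def compose_secret_image(message, tiles, cols=8):
        # Lay out one object tile per symbol, left to right and top to bottom.
        h, w, _ = next(iter(tiles.values())).shape
        rows = -(-len(message) // cols)                # ceiling division
        canvas = np.zeros((rows * h, cols * w, 3), dtype=np.uint8)
        for i, symbol in enumerate(message):
            r, c = divmod(i, cols)
            canvas[r * h:(r + 1) * h, c * w:(c + 1) * w] = tiles[symbol]
        return canvas

    # Toy alphabet of two object tiles (black and white squares).
    tiles = {"0": np.zeros((16, 16, 3), np.uint8),
             "1": np.full((16, 16, 3), 255, np.uint8)}
    secret = compose_secret_image("0110", tiles)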

27 pages, 9711 KB  
Article
Multi-Scale Cross-Domain Augmentation of Tea Datasets via Enhanced Cycle Adversarial Networks
by Taojie Yu, Jianneng Chen, Zhiyong Gui, Jiangming Jia, Yatao Li, Chennan Yu and Chuanyu Wu
Agriculture 2025, 15(16), 1739; https://doi.org/10.3390/agriculture15161739 - 13 Aug 2025
Viewed by 367
Abstract
To tackle phenotypic variability and detection accuracy issues of tea shoots in open-air gardens caused by lighting and varietal differences, this study proposes Tea CycleGAN and an associated data augmentation method that combines multi-scale image style transfer with spatially consistent dataset generation. Using Longjing 43 and Zhongcha 108 as cross-domain objects, the generator integrates SKConv and a dynamic multi-branch residual structure for multi-scale feature fusion, optimized by an attention mechanism. A deeper discriminator with additional convolutional layers and batch normalization enhances detail discrimination. A global–local framework trains on 600 × 600 background regions and 64 × 64 tea shoot regions, with a restoration-paste strategy to preserve spatial consistency. Experiments show that Tea CycleGAN achieves FID scores of 42.26 and 26.75, outperforming CycleGAN. With YOLOv7, detection mAP rises from 73.94% to 83.54%, surpassing Mosaic and Mixup augmentation. The method effectively mitigates the impact of lighting and scale variations, offering a reliable data augmentation solution for tea picking. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
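Since the comparison above is reported in FID, a compact version of the Fréchet distance between two sets of activations may be useful; real evaluations feed InceptionV3 pool features, whereas the random arrays below are placeholders.

    import numpy as np
    from scipy import linalg

    def fid(feats_real, feats_fake):
        # Fréchet distance between two Gaussians fitted to (N, D) activations.
        mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
        s1 = np.cov(feats_real, rowvar=False)
        s2 = np.cov(feats_fake, rowvar=False)
        covmean = linalg.sqrtm(s1 @ s2)
        if np.iscomplexobj(covmean):
            covmean = covmean.real
        diff = mu1 - mu2
        return float(diff @ diff + np.trace(s1 + s2 - 2 * covmean))

    score = fid(np.random.randn(256, 64), np.random.randn(256, 64))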

23 pages, 15241 KB  
Article
Diffusion Model-Based Cartoon Style Transfer for Real-World 3D Scenes
by Yuhang Chen, Haoran Zhou, Jing Chen, Nai Yang, Jing Zhao and Yi Chao
ISPRS Int. J. Geo-Inf. 2025, 14(8), 303; https://doi.org/10.3390/ijgi14080303 - 4 Aug 2025
Viewed by 664
Abstract
Traditional map style transfer methods are mostly based on GAN, which are either overly artistic at the expense of conveying information, or insufficiently aesthetic by simply changing the color scheme of the map image. These methods often struggle to balance style transfer with semantic preservation and lack consistency in their transfer effects. In recent years, diffusion models have made significant progress in the field of image processing and have shown great potential in image-style transfer tasks. Inspired by these advances, this paper presents a method for transferring real-world 3D scenes to a cartoon style without the need for additional input condition guidance. The method combines pre-trained LDM with LoRA models to achieve stable and high-quality style infusion. By integrating DDIM Inversion, ControlNet, and MultiDiffusion strategies, it achieves the cartoon style transfer of real-world 3D scenes through initial noise control, detail redrawing, and global coordination. Qualitative and quantitative analyses, as well as user studies, indicate that our method effectively injects a cartoon style while preserving the semantic content of the real-world 3D scene, maintaining a high degree of consistency in style transfer. This paper offers a new perspective for map style transfer. Full article

13 pages, 7106 KB  
Article
Multi-Scale Universal Style-Transfer Network Based on Diffusion Model
by Na Su, Jingtao Wang and Yun Pan
Algorithms 2025, 18(8), 481; https://doi.org/10.3390/a18080481 - 4 Aug 2025
Viewed by 421
Abstract
Artistic style transfer aims to transfer the style of an artwork to a photograph while maintaining its original overall content. Although current style-transfer methods have achieved promising results when processing photorealistic images, they often struggle with brushstroke preservation in artworks, especially in styles such as oil painting and pointillism. In such cases, the extracted style and content features tend to include redundant information, leading to issues such as blurred edges and a loss of fine details in the transferred images. To address this problem, this paper proposes a multi-scale general style-transfer network based on diffusion models. The proposed network consists of a coarse style-transfer module and a refined style-transfer module. First, the coarse style-transfer module is designed to perform mainstream style-transfer tasks more efficiently by operating on downsampled images, enabling faster processing with satisfactory results. Next, to further enhance edge fidelity, a refined style-transfer module is introduced. This module utilizes a segmentation component to generate a mask of the main subject in the image and performs edge-aware refinement. This enhances the fusion between the subject’s edges and the target style while preserving more detailed features. To improve overall image quality and better integrate the style along the content boundaries, the output from the coarse module is upsampled by a factor of two and combined with the subject mask. With the assistance of ControlNet and Stable Diffusion, the model performs content-aware edge redrawing to enhance the overall visual quality of the stylized image. Compared with state-of-the-art style-transfer methods, the proposed model preserves more edge details and achieves more natural fusion between style and content. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
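A minimal sketch of the upsample-and-blend step outlined above, assuming the coarse stylization is at half resolution and a soft subject mask is available at the refined resolution; the tensor shapes and bilinear upsampling choice are assumptions, not the paper's exact pipeline.

    import torch
    import torch.nn.functional as F

    def refine_with_mask(coarse, refined, mask):
        # Upsample the coarse stylization by 2x, then keep the refined result
        # inside the subject mask and the coarse result elsewhere.
        up = F.interpolate(coarse, scale_factor=2, mode="bilinear", align_corners=False)
        return mask * refined + (1 - mask) * up

    # Toy shapes: coarse at 32x32, refined output and mask at 64x64.
    out = refine_with_mask(torch.rand(1, 3, 32, 32),
                           torch.rand(1, 3, 64, 64),
                           torch.rand(1, 1, 64, 64))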

31 pages, 11068 KB  
Article
Airport-FOD3S: A Three-Stage Detection-Driven Framework for Realistic Foreign Object Debris Synthesis
by Hanglin Cheng, Yihao Li, Ruiheng Zhang and Weiguang Zhang
Sensors 2025, 25(15), 4565; https://doi.org/10.3390/s25154565 - 23 Jul 2025
Cited by 1 | Viewed by 418
Abstract
Traditional Foreign Object Debris (FOD) detection methods face challenges such as difficulties in large-size data acquisition and the ineffective application of detection algorithms with high accuracy. In this paper, image data augmentation was performed using generative adversarial networks and diffusion models, generating images of monitoring areas under different environmental conditions and FOD images of varied types. Additionally, a three-stage image blending method considering size transformation, a seamless process, and style transfer was proposed. The image quality of different blending methods was quantitatively evaluated using metrics such as structural similarity index and peak signal-to-noise ratio, as well as Depthanything. Finally, object detection models with a similarity distance strategy (SimD), including Faster R-CNN, YOLOv8, and YOLOv11, were tested on the dataset. The experimental results demonstrated that realistic FOD data were effectively generated. The Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR) of the synthesized image by the proposed three-stage image blending method outperformed the other methods, reaching 0.99 and 45 dB. YOLOv11 with SimD trained on the augmented dataset achieved the mAP of 86.95%. Based on the results, it could be concluded that both data augmentation and SimD significantly improved the accuracy of FOD detection. Full article
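The SSIM and PSNR figures quoted above can be reproduced with scikit-image, assuming a recent version that accepts the channel_axis argument; the random reference/blend pair below is only a placeholder for the synthesized composites.

    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def blend_quality(reference, blended):
        # PSNR and SSIM for float images in [0, 1] with shape (H, W, 3).
        psnr = peak_signal_noise_ratio(reference, blended, data_range=1.0)
        ssim = structural_similarity(reference, blended, channel_axis=-1, data_range=1.0)
        return psnr, ssim

    reference = np.random.rand(64, 64, 3)
    blended = np.clip(reference + 0.01 * np.random.randn(64, 64, 3), 0, 1)
    print(blend_quality(reference, blended))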

21 pages, 7084 KB  
Article
Chinese Paper-Cutting Style Transfer via Vision Transformer
by Chao Wu, Yao Ren, Yuying Zhou, Ming Lou and Qing Zhang
Entropy 2025, 27(7), 754; https://doi.org/10.3390/e27070754 - 15 Jul 2025
Viewed by 465
Abstract
Style transfer technology has attracted substantial attention in image synthesis, notably in applications like oil painting, digital printing, and Chinese landscape painting. However, when applying the unique style of Chinese paper-cutting to style transfer, it is often difficult to generate transferred images that retain the essence of paper-cutting art while remaining visually appealing. Therefore, this paper proposes a new Transformer-based method for Chinese paper-cutting style transfer, aiming to realize efficient transformation into the Chinese paper-cutting art style. Specifically, the network consists of a frequency-domain mixture block and a multi-level feature contrastive learning module. The frequency-domain mixture block explores spatial and frequency-domain interaction information, integrates multiple attention windows along with frequency-domain features, preserves critical details, and enhances the effectiveness of style conversion. To further embody the symmetrical structures and hollowed hierarchical patterns intrinsic to Chinese paper-cutting, the multi-level feature contrastive learning module is designed based on a contrastive learning strategy. This module maximizes mutual information between multi-level transferred features and content features, improves the consistency of representations across different layers, and thus accentuates the unique symmetrical aesthetics and artistic expression of paper-cutting. Extensive experimental results demonstrate that the proposed method outperforms existing state-of-the-art approaches in both qualitative and quantitative evaluations. Additionally, we created a Chinese paper-cutting dataset that, although modest in size, represents an important initial step towards enriching existing resources. This dataset provides valuable training data and a reference benchmark for future research in this field. Full article
(This article belongs to the Section Multidisciplinary Applications)
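Mutual-information maximization between transferred and content features is often implemented with an InfoNCE-style objective; the generic sketch below (with assumed batch and feature sizes) conveys the idea but is not the paper's exact multi-level loss.

    import torch
    import torch.nn.functional as F

    def info_nce(transfer_feats, content_feats, temperature=0.1):
        # Each transferred feature should match the content feature of the same
        # sample (positive) against all other samples in the batch (negatives).
        z1 = F.normalize(transfer_feats, dim=-1)   # (B, D)
        z2 = F.normalize(content_feats, dim=-1)    # (B, D)
        logits = z1 @ z2.t() / temperature         # (B, B) similarity matrix
        labels = torch.arange(z1.size(0))
        return F.cross_entropy(logits, labels)

    loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))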

20 pages, 4254 KB  
Article
Positional Component-Guided Hangul Font Image Generation via Deep Semantic Segmentation and Adversarial Style Transfer
by Avinash Kumar, Irfanullah Memon, Abdul Sami, Youngwon Jo and Jaeyoung Choi
Electronics 2025, 14(13), 2699; https://doi.org/10.3390/electronics14132699 - 4 Jul 2025
Viewed by 589
Abstract
Automated font generation for complex, compositional scripts like Korean Hangul presents a significant challenge due to the 11,172 characters and their complicated component-based structure. While existing component-based methods for font image generation acknowledge the compositional nature of Hangul, they often fail to explicitly leverage the crucial positional semantics of its basic elements as initial, middle, and final components, known as Jamo. This oversight can lead to structural inconsistencies and artifacts in the generated glyphs. This paper introduces a novel two-stage framework that directly addresses this gap by imposing a strong, linguistically informed structural principle on the font image generation process. In the first stage, we employ a You Only Look Once version 8 for Segmentation (YOLOv8-Seg) model, a state-of-the-art instance segmentation network, to decompose Hangul characters into their basic components. Notably, this process generates a dataset of position-aware semantic components, categorizing each jamo according to its structural role within the syllabic block. In the second stage, a conditional Generative Adversarial Network (cGAN) is explicitly conditioned on these extracted positional components to perform style transfer with high structural information. The generator learns to synthesize a character’s appearance by referencing the style of the target components while preserving the content structure of a source character. Our model achieves state-of-the-art performance, reducing L1 loss to 0.2991 and improving the Structural Similarity Index (SSIM) to 0.9798, quantitatively outperforming existing methods like MX-Font and CKFont. This position-guided approach demonstrates significant quantitative and qualitative improvements over existing methods in structured script generation, offering enhanced control over glyph structure and a promising approach for generating font images for other complex, structured scripts. Full article
(This article belongs to the Special Issue Applications of Computer Vision, 3rd Edition)

14 pages, 4561 KB  
Article
DBDST-Net: Dual-Branch Decoupled Image Style Transfer Network
by Na Su, Jingtao Wang, Jingjing Zhang, Ying Li and Yun Pan
Information 2025, 16(7), 561; https://doi.org/10.3390/info16070561 - 30 Jun 2025
Viewed by 304
Abstract
The image style transfer task aims to apply the style characteristics of a reference image to a content image, generating a new stylized result. While many existing methods focus on designing feature transfer modules and have achieved promising results, they often overlook the entanglement between content and style features after transfer, making effective separation challenging. To address this issue, we propose a Dual-Branch Decoupled Image Style Transfer Network (DBDST-Net) to better disentangle content and style representations. The network consists of two branches: a Content Feature Decoupling Branch, which captures fine-grained content structures for more precise content separation, and a Style Feature Decoupling Branch, which enhances sensitivity to style-specific attributes. To further improve the decoupling performance, we introduce a dense-regressive loss that minimizes the discrepancy between the original content image and the content reconstructed from the stylized output, thereby promoting the independence of content and style features while enhancing image quality. Additionally, to mitigate the limited availability of style data, we employ the Stable Diffusion model to generate stylized samples for data augmentation. Extensive experiments demonstrate that our method achieves a better balance between content preservation and style rendering compared to existing approaches. Full article
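One plausible reading of the dense-regressive loss, sketched below, is to re-extract content from the stylized output and penalize its distance to the original content image; the content_decoder argument and the identity stand-in in the toy call are assumptions, not the network described in the paper.

    import torch

    def dense_regressive_loss(content_img, stylized_img, content_decoder):
        # Reconstruct content from the stylized output, then compare it with the
        # original content image to keep content and style features disentangled.
        reconstructed = content_decoder(stylized_img)
        return torch.nn.functional.l1_loss(reconstructed, content_img)

    # Toy usage with an identity module standing in for the content decoder.
    content = torch.rand(1, 3, 64, 64)
    stylized = torch.rand(1, 3, 64, 64)
    loss = dense_regressive_loss(content, stylized, torch.nn.Identity())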

26 pages, 10083 KB  
Article
Product Image Generation Method Based on Morphological Optimization and Image Style Transfer
by Aimin Zhou, Xinle Wang, Yujin Huang, Weitang Wang, Shutao Zhang and Jinyan Ouyang
Appl. Sci. 2025, 15(13), 7330; https://doi.org/10.3390/app15137330 - 30 Jun 2025
Viewed by 345
Abstract
To improve the controllability and esthetics of product image generation from a design perspective, this study proposes a product image generation method based on morphological optimization, esthetic evaluation, and style transfer. Firstly, based on computational esthetics and principles of visual perception, a comprehensive esthetic evaluation model is constructed and used as the fitness function. A genetic algorithm is employed to build a product morphological optimization design system, obtaining product form schemes with higher esthetic quality. Then, an automobile front-end image dataset is constructed, and a generative adversarial network model is trained. Using the aforementioned product form scheme as the content image and selecting automobile front-end images from the market as the target style image, the content features and style features are extracted by the encoder and input into the generator to generate style-transferred images. The discriminator is used for evaluation, and through iterative optimization, product image schemes that meet the target style are obtained. Experimental results demonstrate that the model generates product images effectively, demonstrating the feasibility of the method and providing robust technical support for intelligent product image design. Full article
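To make the optimization loop concrete, here is a generic genetic-algorithm sketch in which real-valued genomes stand in for product form parameters; the placeholder fitness (preferring values near 0.618) is purely illustrative and is not the paper's esthetic evaluation model.

    import random

    def genetic_optimize(fitness, dims=8, pop_size=30, generations=50, mutation_rate=0.1):
        # Minimal GA: selection of the fittest half, single-point crossover,
        # Gaussian mutation clipped to [0, 1].
        population = [[random.random() for _ in range(dims)] for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            parents = population[: pop_size // 2]
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, dims)
                child = a[:cut] + b[cut:]
                child = [min(1.0, max(0.0, g + random.gauss(0, 0.1)))
                         if random.random() < mutation_rate else g for g in child]
                children.append(child)
            population = parents + children
        return max(population, key=fitness)

    # Placeholder fitness: prefer genomes whose values are close to 0.618.
    best_form = genetic_optimize(lambda g: -sum(abs(x - 0.618) for x in g))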

28 pages, 11793 KB  
Article
Unsupervised Multimodal UAV Image Registration via Style Transfer and Cascade Network
by Xiaoye Bi, Rongkai Qie, Chengyang Tao, Zhaoxiang Zhang and Yuelei Xu
Remote Sens. 2025, 17(13), 2160; https://doi.org/10.3390/rs17132160 - 24 Jun 2025
Cited by 1 | Viewed by 533
Abstract
Cross-modal image registration for unmanned aerial vehicle (UAV) platforms presents significant challenges due to large-scale deformations, distinct imaging mechanisms, and pronounced modality discrepancies. This paper proposes a novel multi-scale cascaded registration network based on style transfer that achieves superior performance: up to 67% reduction in mean squared error (from 0.0106 to 0.0068), 9.27% enhancement in normalized cross-correlation, 26% improvement in local normalized cross-correlation, and 8% increase in mutual information compared to state-of-the-art methods. The architecture integrates a cross-modal style transfer network (CSTNet) that transforms visible images into pseudo-infrared representations to unify modality characteristics, and a multi-scale cascaded registration network (MCRNet) that performs progressive spatial alignment across multiple resolution scales using diffeomorphic deformation modeling to ensure smooth and invertible transformations. A self-supervised learning paradigm based on image reconstruction eliminates reliance on manually annotated data while maintaining registration accuracy through synthetic deformation generation. Extensive experiments on the LLVIP dataset demonstrate the method’s robustness under challenging conditions involving large-scale transformations, with ablation studies confirming that style transfer contributes 28% MSE improvement and diffeomorphic registration prevents 10.6% performance degradation. The proposed approach provides a robust solution for cross-modal image registration in dynamic UAV environments, offering significant implications for downstream applications such as target detection, tracking, and surveillance. Full article
(This article belongs to the Special Issue Advances in Deep Learning Approaches: UAV Data Analysis)
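Of the similarity measures reported above, global normalized cross-correlation is the simplest to state; the short numpy sketch below computes it, while local NCC and mutual information would additionally require windowing and joint histograms.

    import numpy as np

    def normalized_cross_correlation(a, b):
        # Zero-mean global NCC between two same-shape images; 1.0 means identical
        # up to brightness and contrast.
        a = a - a.mean()
        b = b - b.mean()
        denom = np.sqrt((a ** 2).sum() * (b ** 2).sum()) + 1e-12
        return float((a * b).sum() / denom)

    moving = np.random.rand(128, 128)
    print(normalized_cross_correlation(moving, 0.9 * moving + 0.05))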
