Search Results (8,669)

Search Parameters:
Keywords = image comparison

18 pages, 6837 KB  
Article
Experimental Analysis of the Effects of Image Lightness and Chroma Modulation on the Reproduction of Glossiness, Transparency and Roughness
by Hideyuki Ajiki and Midori Tanaka
J. Imaging 2026, 12(4), 159; https://doi.org/10.3390/jimaging12040159 - 8 Apr 2026
Abstract
Even when an object’s color is accurately reproduced in a colorimetrically reproduced image (CRI), the perceived material appearance does not necessarily match that of the original object. This mismatch remains a challenge for faithfully reproducing real-world appearance in digital media. In this study, we investigated how lightness and chroma modulation affect the perception of glossiness, transparency, and roughness. These three attributes were quantitatively correlated with physical surface properties and image features through a direct comparison between objects and images. Observers selected the images that best matched the material appearance of the physical samples for each attribute. Image features derived from the gray-level co-occurrence matrix (GLCM) and surface roughness parameters were analyzed to compare the selected images with the CRI. In the lightness experiment, observers consistently selected images with higher lightness than the CRI, which was accompanied by increased complexity in the luminance distribution. In the chroma experiment, images with higher chroma were preferred; however, changes in GLCM features were negligible. Notably, stimuli with small local luminance differences at the CRI required larger shifts in image features to achieve perceptual matching. These findings indicate that modulating the luminance distribution is crucial for aligning the perceived appearance between physical objects and their digital representations. Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)
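As a quick illustration of the kind of texture statistics used above, the sketch below computes standard gray-level co-occurrence matrix (GLCM) features with scikit-image. It is a generic example assuming an 8-bit grayscale input, not the authors' exact feature set.

```python
# Hedged sketch: generic GLCM texture features (not the paper's exact pipeline).
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_u8, distances=(1,), angles=(0, np.pi / 2)):
    """gray_u8: 8-bit grayscale image. Returns a few common GLCM statistics."""
    glcm = graycomatrix(gray_u8, distances=distances, angles=angles,
                        levels=256, symmetric=True, normed=True)
    return {prop: graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
```

Comparing such features between the CRI and an observer-selected image is one way to quantify the luminance-distribution differences the study reports.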

14 pages, 1403 KB  
Article
Sex Estimation from CT-Derived Craniofacial Measurements in Thai Adults: Comparative Performance of Discriminant Function Analysis, Support Vector Machine, and Random Forest with Forensic Case Application Examples
by Suthat Duangchit, Woranan Kirisattayakul, Prin Twinprai, Naraporn Maikong, Nattaphon Twinprai, Jiratcha Witchathrontrakul, Thongjit Mahajanthavong, Chalermphon Pitirith, Kanokwan Lamai, Phatthiraporn Aorachon, Sararat Innoi, Nareelak Tangsrisakda, Sitthichai Iamsaard and Chanasorn Poodendaen
Forensic Sci. 2026, 6(2), 35; https://doi.org/10.3390/forensicsci6020035 - 8 Apr 2026
Abstract
Background/Objectives: Sex estimation from craniofacial morphology is a fundamental component of biological profile construction in forensic anthropology. Population-specific reference data for Thai individuals derived from computed tomography (CT) remain limited, and direct comparisons between discriminant function analysis (DFA) and machine learning classifiers are frequently complicated by inconsistent validation protocols. This study aimed to characterize sexual dimorphism in CT-derived craniofacial measurements, compare the classification performance of DFA, support vector machine (SVM), and random forest (RF) under a unified validation protocol, and demonstrate their practical application in a forensic context. Methods: CT images from 300 Thai adults (150 males, 150 females; age range 20–90 years) were obtained from Srinagarind Hospital, Khon Kaen University. Eight linear craniofacial measurements spanning the cranial vault, facial skeleton, nasal aperture, and orbital region were obtained from each case. DFA, SVM, and RF were developed and compared under a unified leave-one-out cross-validation protocol. Classification performance was assessed using accuracy, AUC, and Matthews correlation coefficient (MCC). Results: Seven of eight measurements exhibited statistically significant sexual dimorphism, with facial breadth and nasal height demonstrating the greatest dimorphism. DFA achieved the highest classification accuracy of 85.7%, AUC of 0.924, and MCC of 0.713, incorporating five measurements into the canonical function. SVM and RF achieved comparable accuracy of 84.7% and 84.0%, respectively. All three classifiers correctly classified both forensic application cases with high confidence. Conclusions: CT-derived craniofacial measurements provide a reliable basis for sex estimation in Thai adults. The convergence of performance across all three classifiers under a unified internal validation protocol strengthens confidence in the internally validated performance estimates. The derived discriminant function equation and saved machine learning models constitute a complementary and immediately applicable toolkit for CT-based forensic sex estimation in the Thai population. Full article
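The evaluation protocol described above maps naturally onto a short scikit-learn script. The sketch below is an assumption-laden outline (LDA stands in for classical discriminant function analysis; data loading and preprocessing are omitted), not the authors' code.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import accuracy_score, roc_auc_score, matthews_corrcoef

def compare_classifiers(X, y):
    """X: (n_cases, 8) craniofacial measurements; y: 0 = female, 1 = male."""
    models = {"DFA (LDA)": LinearDiscriminantAnalysis(),
              "SVM": SVC(kernel="rbf", probability=True),
              "RF": RandomForestClassifier(n_estimators=500, random_state=0)}
    for name, model in models.items():
        # unified leave-one-out protocol for all three classifiers
        proba = cross_val_predict(model, X, y, cv=LeaveOneOut(),
                                  method="predict_proba")[:, 1]
        pred = (proba >= 0.5).astype(int)
        print(f"{name}: acc={accuracy_score(y, pred):.3f}  "
              f"AUC={roc_auc_score(y, proba):.3f}  MCC={matthews_corrcoef(y, pred):.3f}")
```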

25 pages, 6398 KB  
Article
StageAttn-VTON: Stage-Wise Flow Deformation with Attention for High-Resolution Virtual Try-On
by Li Yao, Wenhui Liang and Yan Wan
Appl. Sci. 2026, 16(7), 3609; https://doi.org/10.3390/app16073609 - 7 Apr 2026
Abstract
Virtual try-on is a key enabling technology for online fashion retail and digital garment visualization. It aims to realistically render a target garment on a person while preserving geometric alignment and fine texture details. Appearance flow-based approaches provide explicit deformation modeling but often suffer from texture squeezing and boundary artifacts in challenging scenarios, such as long sleeves and tucked-in garments, especially under high-resolution settings. In this work, we propose StageAttn-VTON (Stage-wise Attentive Virtual Try-On), an appearance flow-based framework that improves structural coherence and visual fidelity through stage-wise deformation modeling. Specifically, garment warping is decomposed into three stages—coarse alignment, local refinement, and non-target region removal—which mitigates the coupling between competing objectives, such as smooth texture preservation and accurate structural alignment. Furthermore, we introduce a self-attention module in the image synthesis stage to enhance global dependency modeling and capture long-range garment–body interactions. Experiments on VITON-HD and the upper-body subset of DressCode demonstrate that StageAttn-VTON achieves consistently strong performance against representative warping-based and diffusion-based baselines. In addition, qualitative comparisons show that the proposed method better alleviates deformation artifacts in challenging regions such as sleeves and waist areas. Full article
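Appearance-flow warping, the basic operation that StageAttn-VTON's stage-wise deformation builds on, amounts to resampling the garment image along a dense offset field. The PyTorch sketch below illustrates that single step only; it is not the authors' implementation, and the flow itself would come from the learned stages.

```python
import torch
import torch.nn.functional as F

def warp_with_flow(garment, flow):
    """garment: (B, 3, H, W); flow: (B, 2, H, W) pixel offsets. Returns warped garment."""
    b, _, h, w = garment.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow   # absolute sampling coords
    # normalize to [-1, 1] as expected by grid_sample, ordered (x, y)
    grid_x = 2.0 * grid[:, 0] / (w - 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / (h - 1) - 1.0
    norm_grid = torch.stack((grid_x, grid_y), dim=-1)                 # (B, H, W, 2)
    return F.grid_sample(garment, norm_grid, mode="bilinear", align_corners=True)
```
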
12 pages, 1226 KB  
Article
Anatomical Variations in Major Abdominal Aortic Branches and Sex-Related Differences: A Large-Scale Analysis of 1174 Patients
by Oguzhan Tokur and Koray Bingol
Tomography 2026, 12(4), 51; https://doi.org/10.3390/tomography12040051 - 6 Apr 2026
Abstract
Background: This study aims to evaluate the prevalence, spectrum, and coexistence of anatomical variations in the major branches of the abdominal aorta using Multidetector Computed Tomography (MDCT) angiography, with a specific emphasis on analyzing sex-related differences in a large-scale cohort. Methods: A retrospective analysis was conducted on 1174 patients (63.8% male, 36.2% female; mean age 60.54) who underwent abdominal CT angiography between January 2023 and June 2024. Images were acquired using a 128-slice MDCT scanner and reconstructed for detailed vascular assessment. Statistical comparisons between genders were performed using Chi-square and Fisher–Freeman–Halton tests, with p < 0.05 considered significant. Results: The celiac trunk (93.3%), superior mesenteric artery (SMA) (97.1%), and inferior mesenteric artery (IMA) (98.5%) predominantly showed classical patterns. However, significant sex-related differences were identified. Females exhibited significantly higher rates of classical patterns for the celiac trunk (96.2% vs. 91.7%), IMA (99.1% vs. 98.1%), right hepatic artery (RHA) (91.5% vs. 82.6%), and left hepatic artery (LHA) (95.8% vs. 85.4%). Conversely, males showed a higher prevalence of complex variations, including replaced/accessory hepatic arteries and the absence of the common hepatic artery. The number of right and left renal arteries was similar between sexes and did not show a significant difference, while horseshoe kidney was detected only in males. Conclusions: Abdominal vascular structures adhere to classical anatomy more frequently in females, while males exhibit greater morphological variability. These findings emphasize the necessity of gender-specific preoperative vascular mapping to optimize surgical outcomes and reduce morbidity. Full article
(This article belongs to the Section Cardiovascular Imaging)
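A sex comparison of this kind reduces to a test on a contingency table. The SciPy sketch below uses hypothetical counts (not the study's data) for classical vs. variant celiac trunk anatomy by sex; note that SciPy's fisher_exact covers only the 2×2 case, whereas the Fisher–Freeman–Halton test cited above generalizes it to larger tables.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical 2x2 table: rows = male/female, columns = classical/variant celiac trunk.
table = np.array([[687, 62],
                  [409, 16]])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
odds_ratio, p_exact = fisher_exact(table)   # exact alternative for sparse 2x2 tables
```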

17 pages, 27170 KB  
Article
Tests of HgCdTe Photodetectors Performances for Implementation on the MIST-A Instrument
by Chiara Cencia, Eliana La Francesca, Mauro Ciarniello, Andrea Raponi, Fabrizio Capaccioni, Maria Cristina De Sanctis, Simone De Angelis, Michelangelo Formisano, Marco Ferrari, David Biondi, Angelo Boccaccini, Stefania Stefani, Giuseppe Piccioni, Alessandro Mura, Anna Galiano, Leonardo Tommasi, Clorinda Bartolo, Marcella Iuzzolino, Leda Bucciantini, Michele Dami, Giovanni Cossu, Stefano Nencioni, Angelo Olivieri, Eleonora Ammannito, Alessandra Tiberia and Gianrico Filacchione
Sensors 2026, 26(7), 2250; https://doi.org/10.3390/s26072250 - 5 Apr 2026
Viewed by 180
Abstract
The Middle-Wave Infrared Imaging Spectrometer for Target Asteroids (MIST-A) will be launched in 2028 aboard the Emirates Mission to the Asteroid belt (EMA) and will operate in the 2–5 μm spectral range to study the asteroids’ surface composition and thermo-physical properties. MIST-A’s Optical Head (OH) design is inherited from the Jovian IR Auroral Mapper (JIRAM), from which the instrument also received two spare Hybrid-Thinned Mercury-Cadmium-Telluride (MCT) photodetectors: the Engineering Model EM2 and the Flight Spare FS1. These are tested to assess their performance after a long period of storage. The laboratory setup for testing both detectors consists of a blackbody and a cryostat which houses the focal plane, maintained at temperatures of 85 K, its nominal operative temperature, and 90 K. Two sets of measurements are performed: (1) characterization of the dark current at different integration times (0 ms, 224 ms, 448 ms, 672 ms, 869 ms, 1120 ms); (2) verification of the detectors’ response linearity, measuring a blackbody at different temperatures (from 50 °C to 100 °C), including ambient temperature (25 °C, with the blackbody turned off). The results of these tests confirm that both models are fully operational and allow us to evaluate the consequences of the years of inactivity on their performance. Through a detailed analysis of the detectors’ properties and a comparison study with the results of the sensors’ first characterization performed by their producer in 2009, we come to the conclusion that both instruments are able to fulfill MIST-A’s scientific requirements. The FS1 displays a better performance with respect to the EM2 and for this has been selected as MIST-A’s Flight Model. Full article
(This article belongs to the Special Issue Spectroscopic Sensing for Planetary Exploration and Planetary Defense)
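The two checks described above can be summarized with simple straight-line fits. In the sketch below only the integration times are taken from the text; the signal values are placeholders, not measured data.

```python
import numpy as np

# (1) Dark current: slope of mean dark signal vs. integration time.
t_int_ms = np.array([0, 224, 448, 672, 869, 1120])                # integration times used above
dark_dn = np.array([120.0, 410.0, 705.0, 998.0, 1255.0, 1580.0])  # hypothetical mean signal [DN]
slope, offset = np.polyfit(t_int_ms, dark_dn, 1)
print(f"dark current ~ {slope:.3f} DN/ms (offset {offset:.1f} DN)")

# (2) Linearity: coefficient of determination of the straight-line fit.
residuals = dark_dn - (slope * t_int_ms + offset)
print(f"R^2 = {1 - residuals.var() / dark_dn.var():.5f}")
```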

11 pages, 1928 KB  
Article
Characterization of Inferior Rectus Muscle Action in Normal Subjects Using Real-Time Magnetic Resonance Imaging of the Orbit
by Alexander R. Engelmann, Kailash Singh, Jiachen Zhuo, Néha Datta, Alfredo A. Sadun, Michael P. Grant and Shannath L. Merbs
Craniomaxillofac. Trauma Reconstr. 2026, 19(2), 20; https://doi.org/10.3390/cmtr19020020 - 5 Apr 2026
Viewed by 115
Abstract
Orbital floor fractures may cause long-term functional and esthetic impairments. Diplopia due to impaired function of the inferior rectus muscle is frequently an indication for surgical repair, but some cases, such as those where the diagnosis has been delayed or a previous attempt at repair has been made, may not always be amenable to surgical correction. It is advantageous for the surgeon to know whether the proper function of the inferior rectus muscle can be restored for the purposes of surgical planning and prognostication. The authors hypothesized that real-time MRI could be used to characterize the appearance of the inferior rectus muscle in a way that would facilitate future analysis of inferior rectus function in patients with diplopia due to orbital floor fractures. Real-time MRI was performed on 10 volunteer participants with normal ophthalmic function and orbital anatomy to assess inferior rectus appearance during vertical duction testing. ImageJ software was used to measure and record characteristics of the inferior rectus muscle, viewed in a quasi-sagittal plane. The ratios evaluated included inferior rectus muscle length in upgaze versus downgaze (UDR, mean 1.58) as well as inferior rectus muscle length versus distance from inferior rectus origin to inferior rectus inflection point in upgaze (LIR, mean 1.30) and downgaze (mean 1.20). These values were found to be conserved between orbits and individuals. This data offers quantitative insight regarding inferior rectus muscle appearance across the full arc of vertical gaze in healthy individuals. We plan to use this normative baseline dataset as a comparison for future phases of this project, using real-time MRI to evaluate traumatized orbits with diplopia and derangement of the inferior rectus muscle. Full article
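For concreteness, the two ratios defined above reduce to simple divisions of ImageJ length measurements; the values below are hypothetical, chosen only to reproduce the reported means.

```python
# Hypothetical measurements in mm, not study data.
ir_length_upgaze = 41.1
ir_length_downgaze = 26.0
origin_to_inflection_upgaze = 31.6

udr = ir_length_upgaze / ir_length_downgaze                   # upgaze/downgaze length ratio (~1.58)
lir_upgaze = ir_length_upgaze / origin_to_inflection_upgaze   # length vs. origin-to-inflection (~1.30)
print(f"UDR = {udr:.2f}, LIR(upgaze) = {lir_upgaze:.2f}")
```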

17 pages, 500 KB  
Article
Clinical Factors Associated with hrCT-Confirmed Interstitial Lung Disease in Rheumatoid Arthritis: A Retrospective Case–Control Study
by Oana-Georgiana Dinache, Claudiu C. Popescu, Corina D. Mogoșan, Cătălin Codreanu and Luminița Enache
J. Clin. Med. 2026, 15(7), 2735; https://doi.org/10.3390/jcm15072735 - 4 Apr 2026
Viewed by 113
Abstract
Background/Objectives: Rheumatoid arthritis-associated interstitial lung disease (RA-ILD) is a major contributor to morbidity and mortality in RA, yet early recognition remains challenging in routine care. The study aimed to identify clinical factors associated with hrCT-confirmed RA-ILD using a CT-verified case–control design. Methods: A single-center retrospective case–control study was designed to include RA patients who underwent chest hrCT in routine care. Cases were patients with ILD on index hrCT (n = 79) and controls were RA patients with hrCT negative for ILD (n = 59). Data were manually abstracted from clinical interview, laboratory testing, RA activity and structural assessment, respiratory examination, pulmonary function tests (PFT), chest radiography, and hrCT. Predictors were extracted from the 12 months preceding the index scan. Univariate comparisons used nonparametric tests or χ2, as appropriate. Prespecified multivariable logistic regression estimated adjusted odds ratios (aORs). Sensitivity analyses included restriction to patients with available pre-index PFT, addition of respiratory examination variables, and a matched conditional logistic regression analysis. Results: In the primary multivariable model, male sex was independently associated with RA-ILD (aOR 5.31, 95% CI 1.91–14.75), and COPD/asthma was also associated (aOR 2.82, 1.05–7.56). Adding dyspnea and Velcro crackles improved discrimination (AUC 0.797 to 0.850); Velcro crackles were independently associated with RA-ILD (aOR 5.11, 1.32–19.73). Findings were directionally similar in sensitivity analyses, though precision decreased in matched models. Conclusions: In this CT-imaged real-world RA cohort, male sex, COPD/asthma, and Velcro crackles were associated with hrCT-confirmed RA-ILD; these findings should be interpreted as preliminary, as they apply to patients selected for imaging and should not be extrapolated to unselected RA populations without validation in larger, multi-center and/or prospective cohorts with systematic ascertainment. Full article
(This article belongs to the Section Immunology & Rheumatology)
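Adjusted odds ratios of the kind reported above come from exponentiated logistic-regression coefficients. The statsmodels sketch below is a minimal outline; the column names (male_sex, copd_asthma, age, smoking, ild_on_hrct) are placeholders, not the study's variable list.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def adjusted_odds_ratios(df: pd.DataFrame) -> pd.DataFrame:
    X = sm.add_constant(df[["male_sex", "copd_asthma", "age", "smoking"]].astype(float))
    fit = sm.Logit(df["ild_on_hrct"].astype(float), X).fit(disp=False)
    ci = fit.conf_int()                                    # columns 0/1 = lower/upper bound
    table = pd.DataFrame({"aOR": np.exp(fit.params),
                          "2.5%": np.exp(ci[0]),
                          "97.5%": np.exp(ci[1])})
    return table.drop(index="const")
```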

23 pages, 2601 KB  
Article
Can Modern Vision Models Understand the Difference Between an Object and a Look-Alike?
by Itay Cohen, Ethan Fetaya and Amir Rosenfeld
AI 2026, 7(4), 132; https://doi.org/10.3390/ai7040132 - 4 Apr 2026
Viewed by 212
Abstract
Recent advances in computer vision have yielded models with strong performance on recognition benchmarks; however, significant gaps remain in comparison with human perception. One subtle ability is to judge whether an image looks like a given object without being an instance of that object. We study whether vision–language models such as CLIP capture this distinction. We curated a dataset named RoLA (Real or LookAlike) of real and look-alike exemplars (e.g., toys, statues, drawings, pareidolia) across multiple categories, and first evaluate a prompt-based baseline with paired “real”/“look-alike” prompts. We then estimate a direction in CLIP’s embedding space that moves representations between real and look-alike. Applying this direction to image and text embeddings improves discrimination in cross-modal retrieval on Conceptual 12M, and also enhances captions produced by a CLIP prefix captioner. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
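The direction-based idea can be sketched in a few lines once CLIP embeddings are precomputed: estimate a "real minus look-alike" vector from labeled exemplars and shift other embeddings along it before retrieval. The scale alpha and this simple mean-difference estimator are assumptions; the paper's procedure may differ in detail.

```python
import numpy as np

def estimate_direction(real_emb, lookalike_emb):
    """real_emb, lookalike_emb: (n, d) L2-normalized CLIP embeddings."""
    d = real_emb.mean(axis=0) - lookalike_emb.mean(axis=0)
    return d / np.linalg.norm(d)

def shift_toward_real(emb, direction, alpha=1.0):
    shifted = emb + alpha * direction                      # move along the real/look-alike axis
    return shifted / np.linalg.norm(shifted, axis=1, keepdims=True)
```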

41 pages, 35277 KB  
Article
A Multi-Strategy Improved Seagull Optimization Algorithm for Global Optimization and Artistic Image Segmentation
by Yangyang Jiang
Biomimetics 2026, 11(4), 247; https://doi.org/10.3390/biomimetics11040247 - 3 Apr 2026
Viewed by 254
Abstract
Multilevel threshold image segmentation is a key task in image processing, yet it faces challenges such as low search efficiency in high-dimensional spaces, difficulty in balancing segmentation accuracy and stability, and insufficient adaptability to complex scenes. Existing solutions mainly include traditional thresholding methods and metaheuristic optimization-based schemes, but they still face limitations in high-dimensional and complex segmentation tasks. The standard Seagull Optimization Algorithm (SOA) suffers from shortcomings including a single exploration mechanism, weak local exploitation capability, and a tendency for population diversity to deteriorate, making it difficult to meet the demands of high-dimensional optimization. To address these issues, this paper proposes a multi-strategy fused improved Seagull Optimization Algorithm (MFISOA), which integrates three strategies: adaptive cooperative foraging, differential evolution-driven exploitation, and centroid opposition-based boundary control. These strategies jointly construct a collaborative optimization framework with dynamic resource allocation, fine local search, and population diversity maintenance, thereby improving global exploration efficiency, local exploitation accuracy, and population stability. To evaluate the optimization performance of MFISOA, numerical simulation experiments were conducted on the CEC2017 and CEC2022 benchmark test suites, and comparisons were made with nine other mainstream advanced algorithms. The results show that MFISOA outperforms the competing algorithms in terms of optimization accuracy, convergence speed, and operational stability. Its superiority is further verified by the Wilcoxon rank-sum test and the Friedman test, with statistical significance (p < 0.05). In the multilevel threshold image segmentation task, using the Otsu criterion as the objective function, MFISOA was tested on nine benchmark images under 4-, 6-, 8-, and 10-threshold segmentation scenarios. The results indicate that MFISOA achieves better performance on metrics such as Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Feature Similarity Index (FSIM), enabling more accurate characterization of image grayscale distribution features and producing higher-quality segmentation results. This study provides an efficient and reliable approach for numerical optimization and multilevel threshold image segmentation. Full article
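In multilevel Otsu segmentation the optimizer searches for the threshold set that maximizes the between-class variance of the gray-level histogram. The function below is that standard objective; using it as the fitness function for MFISOA (or any metaheuristic) is the setup described above.

```python
import numpy as np

def between_class_variance(hist, thresholds):
    """hist: 256-bin gray-level histogram; thresholds: sorted ints in (0, 255)."""
    p = hist / hist.sum()
    levels = np.arange(len(p))
    mu_total = (p * levels).sum()
    edges = [0, *sorted(int(t) for t in thresholds), len(p)]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()                 # class probability
        if w > 0:
            mu = (p[lo:hi] * levels[lo:hi]).sum() / w
            variance += w * (mu - mu_total) ** 2
    return variance
```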

29 pages, 45971 KB  
Article
Dual-Tracer Imaging and Deep Learning for Real-Time Prediction of Lymph Node Metastasis in cN0 Papillary Thyroid Carcinoma
by Jing Zhou, Yuchen Zhuang, Qian Xiao, Shiying Yang, Zhuolin Dai, Chun Huang, Chang Deng, Lin Chun, Han Gao and Xinliang Su
Cancers 2026, 18(7), 1157; https://doi.org/10.3390/cancers18071157 - 3 Apr 2026
Viewed by 187
Abstract
Background: Occult lymph node metastasis (LNM) occurs in 30–80% of patients with clinically node-negative papillary thyroid carcinoma (cN0-PTC), partly owing to the limited sensitivity of current preoperative nodal assessment, and may contribute to postoperative recurrence. Conventional sentinel lymph node (SLN) biopsy, typically performed with a single tracer, has limited reliability for detecting occult metastatic nodes, which can result in either overtreatment or undertreatment with lymph node dissection. We aimed to develop a highly accurate multimodal prediction framework to accurately identify second-echelon lymph node metastasis (SeLNM) and non-sentinel lymph node metastasis (NsLNM). Methods: We prospectively enrolled 301 patients with cN0-PTC between April and October 2024, of whom 131 met the inclusion criteria. Intraoperatively, a dual-tracer technique combining carbon nanoparticles and indocyanine green was applied, and near-infrared imaging was used to record the entire SLN visualization process in real time. For each case, a 3 min video clip (150 frames) was captured. Two senior surgeons delineated regions of interest to generate 19,650 mask images. A total of 2048 spatial features and 20 temporal features were extracted, combined with 32 clinical variables, including demographics, ultrasound characteristics, and gene mutation status. Nine deep learning models were developed and evaluated using 10-fold cross-validation. Model performance was quantified using receiver operating characteristic curves, decision curve analysis curves, calibration curves, precision–recall curves, learning curves, and 12 metrics. Statistical comparisons were performed using the DeLong test, and models were further evaluated using a probability-based ranking approach. Shapley Additive Explanations (SHAP) analysis was applied to interpret key predictive features. The primary outcomes were SeLNM and NsLNM, defined based on postoperative histopathology. Results: The Long Short-Term Memory (LSTM) + Transformer model showed the best performance for both prediction tasks, with stable AUCs across training and testing (SeLNM: 0.980/0.982; NsLNM: 0.986/0.983). In the testing set, the model reached the same accuracy for both outcomes (94.7%) and showed strong sensitivity/specificity for SeLNM (94.7%/94.6%) and NsLNM (96.4%/91.5%). SHAP analysis indicated that time-series fluorescence flow features were the most influential predictors, followed by spatial structural features and SLN status. Conclusions: Dual-tracer SLN mapping with deep learning demonstrated encouraging intraoperative prediction of lymph node metastasis with interpretable features in this single-center cohort. Independent multicenter validation and prospective outcome studies are needed before considering clinical adoption. Full article
(This article belongs to the Section Cancer Informatics and Big Data)
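As a rough architectural outline of the best-performing model named above, the PyTorch sketch below runs an LSTM over per-frame feature vectors, adds a Transformer encoder for global temporal context, and pools for classification. Layer sizes and the per-frame treatment of the 2048 spatial + 20 temporal features are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class LSTMTransformerClassifier(nn.Module):
    def __init__(self, feat_dim=2068, hidden=256, n_heads=4, n_layers=2, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                  # x: (batch, frames, feat_dim)
        h, _ = self.lstm(x)                # local temporal dynamics
        h = self.encoder(h)                # global temporal context
        return self.head(h.mean(dim=1))    # pool over frames, then classify SeLNM / NsLNM
```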

17 pages, 1372 KB  
Article
GastroMalign: Vision Transformer-Based Framework for Early Detection and Malignancy-Risk Stratification for High-Risk Gastrointestinal Lesions
by Sri Harsha Boppana, Sachin Sravan Kumar Komati, Medha Sharath, Aditya Chandrashekar, Gautam Maddineni, Raja Chandra Chakinala, Pradeep Yarra and C. David Mintz
J. Clin. Med. 2026, 15(7), 2701; https://doi.org/10.3390/jcm15072701 - 2 Apr 2026
Viewed by 241
Abstract
Background: Current artificial intelligence (AI) systems in gastrointestinal (GI) endoscopy primarily emphasize binary detection or static classification, providing limited support for the graded assessment of malignant potential that underpins clinical decision-making. We developed GastroMalign, a transformer-based framework designed to stratify GI lesions according to ordinal disease severity while maintaining clinical interpretability, addressing this unmet need in endoscopic risk assessment. Methods: This retrospective development and validation study used the publicly available GastroVision dataset, comprising 8000 de-identified endoscopic still images from the upper and lower gastrointestinal tract, including the esophagus, stomach, duodenum, colon, rectum, and terminal ileum. GastroMalign integrates a Vision Transformer (ViT) encoder with a Sequential Feature Learner that explicitly models ordinal disease severity along a benign-to-malignant spectrum. The framework produces both categorical risk classification and a continuous malignancy risk score. Images were stratified into training (80%), validation (10%), and test (10%) sets. Performance was compared with convolutional neural network (CNN) baselines and a Swin Transformer. Interpretability was assessed using Score-CAM visualizations reviewed by blinded expert endoscopists. Results: On the held-out test set (n = 800 images), GastroMalign achieved an overall accuracy of 80.06%, precision of 79.65%, recall of 80.06%, and F1-score of 79.17%, with a micro-averaged AUC of 0.98. In comparison, ResNet-50 and DenseNet-121 achieved accuracies of 32.42% and 36.77%, respectively, while the Swin Transformer achieved 60.56% accuracy (AUC = 0.93). Ablation analyses demonstrated a 17% absolute reduction in High-Risk lesion recall when the progression-aware module was removed. Continuous malignancy risk scores increased monotonically across ordinal classes, with mean values < 0.18 for Benign and >0.72 for High-Risk/Malignant lesions. Score-CAM visualizations demonstrated 92% overlap with clinician-annotated lesion regions. Conclusions: GastroMalign delivers an interpretable, progression-aware AI framework for GI lesion risk stratification that outperforms existing CNN- and transformer-based models. Clinically, GastroMalign is intended as an adjunct decision-support tool during endoscopic review to standardize lesion risk stratification (benign to malignant spectrum), support management decisions (biopsy vs. resection vs. surveillance), and reduce operator-dependent variability by pairing ordinal risk outputs with interpretable visual explanations. Full article
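One simple way to obtain a continuous malignancy risk score from ordered class probabilities, in the spirit of the benign-to-malignant spectrum described above, is the expected ordinal level rescaled to [0, 1]. This is an illustrative formulation, not necessarily the one GastroMalign uses.

```python
import numpy as np

def continuous_risk(probs):
    """probs: (n_samples, n_classes) softmax outputs over ordered benign -> malignant classes."""
    levels = np.arange(probs.shape[1])
    return (probs * levels).sum(axis=1) / (probs.shape[1] - 1)

# Example with four ordinal classes: a clearly benign and a clearly high-risk prediction.
print(continuous_risk(np.array([[0.85, 0.10, 0.04, 0.01],
                                [0.02, 0.08, 0.25, 0.65]])))   # ~0.07 and ~0.84
```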

28 pages, 4737 KB  
Article
Comparative Evaluation of Perceptual Hashing and Deep Embedding Methods for Robust and Efficient Image Deduplication
by Md Firoz Mahmud, Zerin Nusrat and W. David Pan
Electronics 2026, 15(7), 1493; https://doi.org/10.3390/electronics15071493 - 2 Apr 2026
Viewed by 296
Abstract
The rapid growth in large-scale image repositories over the past few years has made exact and near-duplicate images increasingly common, creating substantial redundancy that wastes storage resources and reduces retrieval efficiency in practical systems. Even though perceptual hashing and deep learning are promising deduplication strategies, the lack of standardized benchmarks complicates direct comparison. In this study, we conduct a unified, controlled evaluation of five commonly used methods, including four classical perceptual hashes (AHash, DHash, PHash, and WHash) and a CNN-based embedding model. We evaluate all methods on the UKBench and Amazon Berkeley Objects datasets using identical preprocessing, thresholds, and metrics, which include exact duplicates, near-duplicates, and geometrically transformed duplicates. Our experiments highlight a clear trade-off between speed and robustness. Hashing methods are computationally efficient and effective for exact matches, but perform poorly on near-duplicates and under geometric transformations, whereas the CNN model is significantly more robust across all duplicate types, but comes at a high computational cost. Based on these results, we outline practical recommendations for selecting deduplication strategies in large-scale applications. In addition, our evaluation setup serves as a reproducible baseline for future research in image similarity and large-scale deduplication. Full article
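The four classical hashes compared above are all available in the imagehash package, and near-duplicate decisions reduce to a Hamming-distance threshold on the hashes. The sketch below shows that comparison; the distance cutoff is a common convention, not a value taken from the paper.

```python
from PIL import Image
import imagehash

HASHES = {"AHash": imagehash.average_hash, "DHash": imagehash.dhash,
          "PHash": imagehash.phash, "WHash": imagehash.whash}

def hash_distances(path_a, path_b):
    """Hamming distance between each perceptual hash of two images."""
    img_a, img_b = Image.open(path_a), Image.open(path_b)
    return {name: fn(img_a) - fn(img_b) for name, fn in HASHES.items()}

def is_near_duplicate(distances, cutoff=5):    # cutoff in bits, a typical choice
    return any(d <= cutoff for d in distances.values())
```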

12 pages, 1115 KB  
Article
From ABCD to AI: Assessing the Diagnostic Reliability of MLLMs in Cutaneous Melanoma Screening—A Head-to-Head Comparison
by Răzvan Ioan Andrei, Aniela Roxana Nodiți-Cuc, Silviu Cristian Voinea, Cristian Ioan Bordea and Alexandru Blidaru
Diagnostics 2026, 16(7), 1077; https://doi.org/10.3390/diagnostics16071077 - 2 Apr 2026
Viewed by 228
Abstract
Background: Melanoma remains a leading cause of cancer-related mortality, with early detection being the primary determinant of survival. The emergence of MLLMs offers a potential paradigm shift in accessible screening. However, the diagnostic reliability and safety of these general-purpose models in oncology remain insufficiently characterized. Methods: This study performed a head-to-head comparison of GPT-5, Gemini 3, and Grok 4 to evaluate their efficacy as first-level screening tools for cutaneous melanoma. A retrospective analysis was conducted using a balanced dataset of 100 clinical images (50 histopathologically confirmed benign, 50 malignant) from the ISIC archive. Results: Gemini 3 achieved the highest overall accuracy (71%) and specificity (94%), while Grok 4 demonstrated the highest sensitivity (52%). All models exhibited a critical deficit in sensitivity, missing approximately half of the malignant lesions. Statistical testing revealed no significant performance differences between the models (p > 0.05). Notably, Gemini 3 exhibited severe overconfidence, maintaining a high CI (84.62%) even during false-negative predictions, whereas GPT-5 and Grok 4 showed better calibration with a significant drop in confidence upon incorrect diagnosis. Conclusions: While current MLLMs possess a foundational capacity for dermatological analysis, their unacceptably low sensitivity and potential for overconfident misdiagnosis render them unsafe as standalone screening tools. At present, MLLMs should only be utilized as complementary tools under strict clinical supervision. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
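The screening figures quoted above follow directly from a 2×2 confusion matrix on the balanced 50/50 test set. The counts in the sketch are illustrative values chosen only to roughly match the reported accuracy and specificity, not the study's raw results.

```python
def screening_metrics(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)            # malignant lesions correctly flagged
    specificity = tn / (tn + fp)            # benign lesions correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Illustrative counts for a model that misses about half of the melanomas:
print(screening_metrics(tp=24, fn=26, tn=47, fp=3))   # ~0.48 sens, 0.94 spec, 0.71 acc
```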

19 pages, 8523 KB  
Article
DAMFusion: Multi-Spectral Image Segmentation via Competitive Query and Boundary Region Attention
by Miao Yu, Xing Lu, Ziyao Yang, Daoxing Gao and Guoqiang Zhong
Remote Sens. 2026, 18(7), 1064; https://doi.org/10.3390/rs18071064 - 2 Apr 2026
Viewed by 229
Abstract
To address the challenges of modal differences in multimodal farmland images and insufficient segmentation accuracy for small targets, this paper proposes a multi-source image fusion branch (DAMFusion) based on modal competitive selection. The branch dynamically selects infrared and visible light features through the Competitive Query Module (CQM) using Top-K screening, combined with IOU-aware loss optimization to avoid cross-modal interference. The multimodal fusion module (MMFormer) employs cross-modal attention and symmetric mechanisms, enhancing single-modal features through a self-enhancement module and unifying multimodal distributions via linear projection. The Boundary Region Attention Multi-level Fusion Module (BRM) extracts boundary information through feature differencing, strengthens it with spatial attention, and fuses it with shallow features to achieve cross-layer detail recovery. Through the collaborative design of dynamic modal feature selection, cross-modal distribution unification, and boundary region enhancement, DAMFusion effectively solves the problems of multimodal differences and small target segmentation in multispectral images, providing precise feature representation for fine farmland segmentation. Experiments on the OUC-UAV-MSEG dataset show that DAMFusion achieves 93.25% OA, 91.71% F1, and 89.70% mIoU, demonstrating clear advantages over representative comparison methods. In addition, ablation results verify the effectiveness of the proposed modules, where CQM improves OA from 91.00% to 93.25%, confirming the importance of discriminative modality selection before fusion. Full article
(This article belongs to the Section AI Remote Sensing)
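The competitive Top-K screening idea can be illustrated in a few lines of PyTorch: score tokens from both modalities against a shared query and keep only the K highest-scoring ones, whichever modality they come from. This is a loose conceptual sketch; the CQM's actual scoring and IOU-aware loss are not reproduced here.

```python
import torch

def topk_competitive_select(ir_tokens, vis_tokens, query, k=64):
    """ir_tokens, vis_tokens: (B, N, C); query: (B, C). Returns (B, k, C) winning tokens."""
    tokens = torch.cat([ir_tokens, vis_tokens], dim=1)        # pool both modalities
    scores = torch.einsum("bnc,bc->bn", tokens, query)        # similarity of each token to the query
    top_idx = scores.topk(k, dim=1).indices                   # competitive Top-K screening
    batch_idx = torch.arange(tokens.size(0)).unsqueeze(1)
    return tokens[batch_idx, top_idx]                         # gather winners from both modalities
```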

17 pages, 771 KB  
Article
MSA-Net: A Deep Learning Network with Multi-Axial Hadamard Attention and Pyramid Pooling for Stroke Microwave Imaging
by Bo Han, Dongliang Li, Xuhui Zhu, Mingshuai Zhang and Peng Li
Algorithms 2026, 19(4), 276; https://doi.org/10.3390/a19040276 - 2 Apr 2026
Viewed by 183
Abstract
Microwave imaging is emerging as an alternative to conventional medical diagnostic techniques. Traditional analytical and numerical methods fail to adequately address its fundamental challenges: they often rely on strict linear approximations or simplified physical models, leading to low reconstruction accuracy, poor robustness, and limited generalization ability in complex clinical scenarios. As a result, they cannot meet the high-precision requirements of practical stroke microwave imaging. To further improve the accuracy of microwave imaging algorithms in recognizing stroke regions and solving the backscattering problem, this study combines microwave imaging with deep learning and presents the Multi-Scale Attention Network (MSA-Net). The network is based on the EGE-UNet structure with improved multi-axis Hadamard attention, incorporating null-space pyramid pooling and introducing a deep supervision mechanism to further improve performance. First, a large amount of microwave data is simulated with HFSS, using a human brain stroke model built in the HFSS simulation system. Second, the simulated microwave data are converted into tensor format. The tensor data are then input into MSA-Net, which generates a binary mask image that can be used to detect the size and location of the stroke. The model also converges faster when the microwave data are sparsified, which improves training efficiency. The method has been tested on simulation data; in comparison experiments with other networks, MSA-Net locates the stroke and estimates the bleed size more accurately. The proposed model achieves a 1.08 improvement in peak signal-to-noise ratio and a 0.017 reduction in learned perceptual image patch similarity, validating the effectiveness of the structural optimization strategy proposed in this paper. Full article
(This article belongs to the Special Issue Algorithms for Computer Aided Diagnosis: 3rd Edition)
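For reference, the headline metric in the results above is the standard peak signal-to-noise ratio between a reference image and its reconstruction, in decibels; the helper below is the textbook definition, not code from the paper.

```python
import numpy as np

def psnr(reference, reconstruction, max_val=1.0):
    """PSNR in dB; assumes both images share the same scale with peak value max_val."""
    mse = np.mean((reference - reconstruction) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```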