Applied Sciences

Research

20 pages, 16326 KiB

Open Access Article

Multiplatform Computer Vision System to Support Physical Fitness Assessments in Schoolchildren

by José Sulla-Torres, Bruno Santos-Pamo, Fabrizzio Cárdenas-Rodríguez, Javier Angulo-Osorio, Rossana Gómez-Campos and Marco Cossio-Bolaños

Appl. Sci. 2024, 14(16), 7140; https://doi.org/10.3390/app14167140 - 14 Aug 2024

Viewed by 1435

Abstract

Lack of physical activity can lead to health problems, and the rise in obesity among children between 8 and 18 years old is of particular concern because this is a formative stage. Addressing this problem requires a standardized, less subjective, and more efficient method of evaluating physical condition in these children than traditional approaches offer. Objective: To develop a multiplatform system based on computer vision technology that allows the physical fitness of schoolchildren to be evaluated using smartphones. Methodology: A descriptive cross-sectional study was carried out on schoolchildren aged 8 to 18 years of both sexes. The sample was 228 schoolchildren (128 boys and 108 girls). Weight, height, and waist circumference were measured, and body mass index (BMI) was calculated. Four physical tests were evaluated: flexibility (sit and reach), horizontal jump (explosive strength), biceps curl (right arm strength resistance), and sit-ups (abdominal muscle resistance). With the information collected traditionally and by filming the physical tests, a computer vision system was developed to evaluate physical fitness in schoolchildren. Results: The implemented system obtained an acceptable level of precision, reaching 94% precision in field evaluations and more than 95% in laboratory evaluations. The developed mobile application also obtained high accuracy, greater than 95% in two tests and close to 85% in the remaining two. Finally, the Systematic Software Quality Model was used to determine user satisfaction with the presented prototype; a usability satisfaction level of 97% and a reliability level of 100% were obtained. Conclusion: In a comparison of traditional evaluation and computer vision, the proposal was satisfactorily validated. These results were obtained using the Expanded Systematic Software Quality Model, which reached an “advanced” quality level, satisfying functionality, usability, and reliability characteristics. This advance demonstrates that the integration of computer vision is feasible, highly effective in the educational context, and applicable in the evaluations of physical education classes.

(This article belongs to the Special Issue Application of Artificial Intelligence and Computer Vision for Detection and Analysis)

16 pages, 9828 KiB

Open Access Article

An Attention-Based Full-Scale Fusion Network for Segmenting Roof Mask from Satellite Images

by Li Cheng, Zhang Liu, Qian Ma, He Qi, Fumin Qi and Yi Zhang

Appl. Sci. 2024, 14(11), 4371; https://doi.org/10.3390/app14114371 - 22 May 2024

Cited by 2 | Viewed by 1263

Abstract

Accurately segmenting building roofs from satellite images is crucial for evaluating the photovoltaic power generation potential of urban roofs and is a worthwhile research topic. In this study, we propose an attention-based full-scale fusion (AFSF) network to segment a roof mask from the given satellite images. By developing an attention-based residual ublock, the channel relationship of the feature maps can be modeled. By integrating attention mechanisms in multi-scale feature fusion, the model can learn different weights for features of different scales. We also design a ladder-like network to utilize weakly labeled data, so that pixel-level semantic segmentation is assisted by image-level classification. In addition, we contribute a new roof segmentation dataset, which is based on satellite images and uses the roof rather than the entire building as the segmentation target, to further promote research on algorithms for estimating roof area from satellite images. The experimental results on the new roof segmentation dataset, the WHU dataset, and the IAIL dataset demonstrate the effectiveness of the proposed network.

13 pages, 2439 KiB

Open Access Article

Robust Airport Surface Object Detection Based on Graph Neural Network

by Wenyi Tang and Hongjue Li

Appl. Sci. 2024, 14(9), 3555; https://doi.org/10.3390/app14093555 - 23 Apr 2024

Cited by 1 | Viewed by 1283

Abstract

Accurate and robust object detection is of critical importance in airport surface surveillance to ensure the security of air transportation systems. Owing to the constraints imposed by a relatively fixed receptive field, existing airport surface detection methods have not yet achieved substantial advancements in accuracy. Furthermore, these methods are vulnerable to adversarial attacks with carefully crafted adversarial inputs. To address these challenges, we propose the Vision GNN-Edge (ViGE) block, an enhanced block derived from the Vision GNN (ViG). ViGE introduces the receptive field in pixel space and represents the spatial relation between pixels directly. Moreover, we implement an adversarial training strategy with augmented training samples generated by adversarial perturbation. Empirical evaluations on the public remote sensing dataset LEVIR and a manually collected airport surface dataset show that (1) our proposed method surpasses the original model in precision and robustness, and (2) defining the receptive field in pixel space performs better than defining it in representation space.

24 pages, 41135 KiB

Open Access Article

MSIE-Net: Associative Entity-Based Multi-Stage Network for Structured Information Extraction from Reports

by Qiuyue Li, Hao Sheng, Mingxue Sheng and Honglin Wan

Appl. Sci. 2024, 14(4), 1668; https://doi.org/10.3390/app14041668 - 19 Feb 2024

Viewed by 1565

Abstract

Efficient document recognition and sharing remain challenges in the healthcare, insurance, and finance sectors. One solution to this problem has been the use of deep learning techniques to automatically extract structured information from paper documents. Specifically, the structured extraction of a medical examination report (MER) can enhance medical efficiency, data analysis, and scientific research. While current methods focus on reconstructing table bodies, they often overlook table headers, leading to incomplete information extraction. This paper proposes MSIE-Net (multi-stage-structured information extraction network), a novel structured information extraction method, leveraging refined attention transformers and associated entity detection to enhance comprehensive MER information retrieval. MSIE-Net includes three stages. First, the RVI-LayoutXLM (refined visual-feature independent LayoutXLM) targets key information extraction. In this stage, the refined attention accentuates the interaction between different modalities by adjusting the attention score at the current position using previous position information. This design enables the RVI-LayoutXLM to learn more specific contextual information to improve extraction performance. Next, the associated entity detection module, RIFD-Net (relevant intra-layer fine-tuned detection network), identifies each test item’s location within the MER table body. Significantly, the backbone of RIFD-Net incorporates the intra-layer feature adjustment module (IFAM) to extract global features while homing in on local areas, proving especially sensitive for inspection tasks with dense and long bins. Finally, structured post-processing based on coordinate aggregation links the outputs from the prior stages. For the evaluation, we constructed the Chinese medical examination report dataset (CMERD), based on real medical scenarios. MSIE-Net demonstrated competitive performance in tasks involving key information extraction and associated entity detection. Experimental results validate MSIE-Net’s capability to successfully detect key entities in MER and table images with various complex layouts, perform entity relation extraction, and generate structured labels, laying the groundwork for intelligent medical documentation.

27 pages, 7905 KiB

Open Access Article

An Advanced Fitness Function Optimization Algorithm for Anomaly Intrusion Detection Using Feature Selection

by Sung-Sam Hong, Eun-joo Lee and Hwayoung Kim

Appl. Sci. 2023, 13(8), 4958; https://doi.org/10.3390/app13084958 - 14 Apr 2023

Cited by 4 | Viewed by 2800

Abstract

Cyber-security systems collect information from multiple security sensors to detect network intrusions and their models. As attacks become more complex and security systems diversify, the data used by intrusion-detection systems becomes higher-dimensional and larger in scale. Intrusion detection based on intelligent anomaly detection detects attacks using machine-learning classification models, soft computing, and rule sets. Feature-selection methods are used for efficient intrusion detection and for solving high-dimensional problems. Optimized feature selection can maximize the detection model performance; thus, a fitness function design is required. We propose a feature-selection algorithm based on an optimization algorithm to improve anomaly-detection performance. We use a genetic algorithm and propose an advanced fitness function that finds the most relevant feature set, increasing the detection rate, reducing the error rate, and enhancing analysis speed. The improved fitness function for selecting optimized features can address overfitting by solving the problem of anomaly-detection performance on imbalanced security datasets. The proposed algorithm outperformed other feature-selection algorithms. It outperformed the PCA and wrapper-DR methods, with 0.99564 at 10%, 0.996455 at 15%, and 0.996679 at 20%. It performed higher than wrapper-DR by 0.95% and PCA by 3.76%, showing higher differences in performance than in detection rates.
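As a rough illustration of what a feature-selection fitness function can look like, the sketch below scores a candidate feature mask by rewarding detection rate and penalizing false positives and feature count. The abstract does not give the authors' formula, so the weighted form, the weights, and all function names here are assumptions made for illustration only.

```python
from typing import Sequence


def detection_rate(tp: int, fn: int) -> float:
    """Fraction of actual attacks that were detected (recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of normal records wrongly flagged as attacks."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def fitness(
    mask: Sequence[int],
    tp: int,
    fp: int,
    tn: int,
    fn: int,
    w_dr: float = 0.7,    # hypothetical weight on detection rate
    w_fpr: float = 0.2,   # hypothetical penalty on false positives
    w_size: float = 0.1,  # hypothetical penalty on feature count
) -> float:
    """Score one chromosome (a 0/1 feature mask) from its confusion counts.

    Assumed form, not the paper's: higher detection rate raises the score,
    while false positives and larger feature subsets lower it.
    """
    selected_fraction = sum(mask) / len(mask)
    return (
        w_dr * detection_rate(tp, fn)
        - w_fpr * false_positive_rate(fp, tn)
        - w_size * selected_fraction
    )


if __name__ == "__main__":
    # Two of four features selected; confusion counts from a validation run.
    print(fitness([1, 0, 0, 1], tp=90, fp=5, tn=95, fn=10))
```

In a genetic algorithm, a score of this kind would be computed for every mask in the population after training a classifier on the selected features, and the highest-scoring masks would be kept for crossover and mutation.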

14 pages, 5694 KiB

Open Access Article

Weld Cross-Section Profile Fitting and Geometric Dimension Measurement Method Based on Machine Vision

by Weilong He, Aihua Zhang and Ping Wang

Appl. Sci. 2023, 13(7), 4455; https://doi.org/10.3390/app13074455 - 31 Mar 2023

Cited by 7 | Viewed by 2760

Abstract

A visual measurement method based on a key point detection network is proposed to address the difficulty of fitting the cross-sectional profile of ultra-narrow gap welds and the low efficiency and accuracy of manually measuring geometric parameters. First, the HRnet (High-Resolution Net) key point detection algorithm is used to train the feature point detection model, and 18 profile feature points in a “measurement unit” are extracted. Second, the feature point coordinates are transformed from the image coordinate system to the weld coordinate system, and the weld profiles are fitted by the least squares method. Finally, the measurement system is calibrated with a coplanar linear calibration algorithm to convert pixel distances to actual distances for quantitative detection of geometric dimensions. The experimental results show that the accuracy of the proposed method for key point localization is 95.6%, the mean value of the coefficient of determination R-square for curve fitting is greater than 94%, the absolute error of measurement is between 0.06 and 0.15 mm, and the relative error is between 1.27% and 3.12%. The measurement results are more reliable, and the efficiency is significantly improved compared to manual measurement.
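The fitting and conversion steps described in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes a single uniform millimetre-per-pixel scale in place of the coplanar linear calibration, a quadratic profile, and made-up key point coordinates; the function names and the `MM_PER_PIXEL` value are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(xs: Sequence[float], ys: Sequence[float], degree: int = 2) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve_linear_system(ata, aty)


def r_squared(xs: Sequence[float], ys: Sequence[float], coeffs: Sequence[float]) -> float:
    """Coefficient of determination of the fitted curve."""
    mean_y = sum(ys) / len(ys)
    pred = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, pred))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def pixels_to_mm(points: Sequence[Tuple[float, float]], mm_per_pixel: float) -> List[Tuple[float, float]]:
    """Scale pixel coordinates to millimetres (assumes a uniform scale)."""
    return [(x * mm_per_pixel, y * mm_per_pixel) for x, y in points]


if __name__ == "__main__":
    MM_PER_PIXEL = 0.05  # hypothetical calibration result
    # Hypothetical detected key points along one profile segment, in pixels.
    key_points_px = [(0, 40), (20, 12), (40, 0), (60, 11), (80, 41)]
    pts_mm = pixels_to_mm(key_points_px, MM_PER_PIXEL)
    xs, ys = [p[0] for p in pts_mm], [p[1] for p in pts_mm]
    coeffs = fit_polynomial(xs, ys, degree=2)
    print("coefficients:", coeffs)
    print("R^2:", r_squared(xs, ys, coeffs))
```

The normal equations are adequate for a low-degree fit over a handful of points; for higher degrees or many points, a QR-based solver would be the more numerically stable choice.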

Review

20 pages, 925 KiB

Open Access Review

Advancing OCR Accuracy in Image-to-LaTeX Conversion—A Critical and Creative Exploration

by Everistus Zeluwa Orji, Ali Haydar, İbrahim Erşan and Othmar Othmar Mwambe

Appl. Sci. 2023, 13(22), 12503; https://doi.org/10.3390/app132212503 - 20 Nov 2023

Cited by 1 | Viewed by 4141

Abstract

This paper comprehensively assesses the application of active learning strategies to enhance natural language processing-based optical character recognition (OCR) models for image-to-LaTeX conversion. It addresses the existing limitations of OCR models and proposes innovative practices to strengthen their accuracy. Key components of this study include the augmentation of training data with LaTeX syntax constraints, the integration of active learning strategies, and the employment of active learning feedback loops. This paper first examines the current weaknesses of OCR models with a particular focus on symbol recognition, complex equation handling, and noise moderation. These limitations serve as a framework against which the subsequent research methodologies are assessed. Augmenting the training data with LaTeX syntax constraints is a crucial strategy for improving model precision. Incorporating symbol relationships, wherein contextual information is considered during recognition, further enriches the error correction. This paper critically examines the application of active learning strategies. The active learning feedback loop leads to progressive improvements in accuracy. This paper underlines the importance of uncertainty and diversity sampling in sample selection, ensuring that the dynamic learning process remains efficient and effective. Appropriate evaluation metrics and ensemble techniques are used to improve the operational learning effectiveness of the OCR model. These techniques allow the model to adapt and perform more effectively in diverse application domains, further extending its utility.
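Uncertainty sampling, one of the selection strategies named in this abstract, can be illustrated with a short sketch. It ranks unlabeled samples by the entropy of the model's predicted class probabilities and picks the most uncertain ones for annotation. The sample identifiers, probabilities, and function names below are hypothetical, and entropy is only one of several common uncertainty measures; the abstract does not say which one the review favors.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(predictions: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the ids of the k samples whose predictions have the highest entropy."""
    ranked = sorted(predictions, key=lambda sid: entropy(predictions[sid]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical per-symbol class probabilities from an OCR model.
    unlabeled = {
        "img_a": [0.98, 0.01, 0.01],  # confident prediction
        "img_b": [0.40, 0.30, 0.30],  # highly uncertain
        "img_c": [0.70, 0.20, 0.10],  # moderately uncertain
    }
    to_label = select_most_uncertain(unlabeled, k=2)
    print("send to annotator:", to_label)
```

In a feedback loop, the selected samples would be labeled, added to the training set, and the model retrained before the next selection round; diversity sampling would add a second criterion so that the chosen samples are not all near-duplicates.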

14 pages, 675 KiB

Open Access Review

Progress of Machine Vision Technologies in Intelligent Dairy Farming

by Yongan Zhang, Qian Zhang, Lina Zhang, Jia Li, Meian Li, Yanqiu Liu and Yanyu Shi

Appl. Sci. 2023, 13(12), 7052; https://doi.org/10.3390/app13127052 - 12 Jun 2023

Cited by 9 | Viewed by 2921

Abstract

The large-scale and precise intelligent breeding mode for dairy cows is the main direction for the development of the dairy industry. Machine vision has become an important technological means for the intelligent breeding of dairy cows due to its non-invasive, low-cost, and multi-behavior recognition capabilities. This review summarizes the recent application of machine vision technology, machine learning, and deep learning in the main behavior recognition of dairy cows. The authors summarize identity recognition technology based on facial features, muzzle prints, and body features of dairy cows; motion behavior recognition technology for behaviors such as lying, standing, walking, drinking, eating, rumination, and estrus; and the recognition of common diseases such as lameness and mastitis. Based on current research results, machine vision technology will become one of the important technological means for the intelligent breeding of dairy cows. Finally, the authors summarize the advantages of this technology in intelligent dairy farming, as well as the problems and challenges it faces in its further development.
