MDPI - Publisher of Open Access Journals

39 pages, 12608 KB

Open Access Article

An Audio Augmented Reality Navigation System for Blind and Visually Impaired People Integrating BIM and Computer Vision

by Leonardo Messi, Massimo Vaccarini, Alessandra Corneli, Alessandro Carbonari and Leonardo Binni

Buildings 2025, 15(18), 3252; https://doi.org/10.3390/buildings15183252 - 9 Sep 2025

Since statistics show a growing trend in blindness and visual impairment, the development of navigation systems supporting Blind and Visually Impaired People (BVIP) must be urgently addressed. Guiding BVIP to a desired destination across indoor and outdoor settings without relying on a pre-installed infrastructure is an open challenge. While numerous solutions have been proposed by researchers in recent decades, a comprehensive navigation system that can support BVIP mobility in mixed and unprepared environments is still missing. This study proposes a novel navigation system that enables BVIP to request directions and be guided to a desired destination across heterogeneous and unprepared settings. To achieve this, the system applies Computer Vision (CV)—namely an integrated Structure from Motion (SfM) pipeline—for tracking the user and exploits Building Information Modelling (BIM) semantics for planning the reference path to reach the destination. Audio Augmented Reality (AAR) technology is adopted for directional guidance delivery due to its intuitive and non-intrusive nature, which allows seamless integration with traditional mobility aids (e.g., white canes or guide dogs). The developed system was tested on a university campus to assess its performance during both path planning and navigation tasks, the latter involving users in both blindfolded and sighted conditions. Quantitative results indicate that the system computed paths in about 10 milliseconds and effectively guided blindfolded users to their destination, achieving performance comparable to that of sighted users. Remarkably, users in blindfolded conditions completed navigation tests with an average deviation from the reference path within the 0.60-meter shoulder width threshold in 100% of the trials, compared to 75% of the tests conducted by sighted users. These findings demonstrate the system’s accuracy in maintaining navigational alignment within acceptable human spatial tolerances. 
The proposed approach contributes to the advancement of BVIP assistive technologies by enabling scalable, infrastructure-free navigation across heterogeneous environments.

(This article belongs to the Special Issue Selected Papers from the “24th International Conference on Construction Applications of Virtual Reality—CONVR2024”)

24 pages, 2822 KB

Open Access Article

Digitizing the Higaonon Language: A Mobile Application for Indigenous Preservation in the Philippines

by Danilyn Abingosa, Paul Bokingkito, Sittie Noffaisah Pasandalan, Jay Rey Gosnell Alovera and Jed Otano

Informatics 2025, 12(3), 90; https://doi.org/10.3390/informatics12030090 (registering DOI) - 8 Sep 2025

Abstract

This research addresses the critical need for language preservation among the Higaonon indigenous community in Mindanao, Philippines, through the development of a culturally responsive mobile dictionary application. The Higaonon language faces significant endangerment due to generational language shift, limited documentation, and a scarcity of educational materials. Employing user-centered design principles and participatory lexicography, this study involved collaboration with tribal elders, educators, and youth to document and digitize Higaonon vocabulary across ten culturally significant semantic domains. Each Higaonon lexeme was translated into English, Filipino, and Cebuano to enhance comprehension across linguistic groups. The resulting mobile application incorporates multilingual search capabilities, offline access, phonetic transcriptions, example sentences, and culturally relevant design elements. An evaluation conducted with 30 participants (15 Higaonon and 15 non-Higaonon speakers) revealed high satisfaction ratings across functionality (4.81/5.0), usability (4.63/5.0), and performance (4.73/5.0). Offline accessibility emerged as the most valued feature (4.93/5.0), while comparative analysis identified meaningful differences in user experience between native and non-native speakers, with Higaonon users providing more critical assessments particularly regarding font readability and performance optimization. The application demonstrates how community-driven technological interventions can support indigenous language revitalization while respecting cultural integrity, intellectual property rights, and addressing practical community needs. 
This research establishes a framework for ethical indigenous language documentation that prioritizes community self-determination and provides empirical evidence that culturally responsive digital technologies can effectively preserve endangered languages while serving as repositories for cultural knowledge embedded within linguistic systems.

21 pages, 4263 KB

Open Access Article

SA-Encoder: A Learnt Spatial Autocorrelation Representation to Inform 3D Geospatial Object Detection

by Tianyang Chen, Wenwu Tang, Shen-En Chen and Craig Allan

Remote Sens. 2025, 17(17), 3124; https://doi.org/10.3390/rs17173124 - 8 Sep 2025

Abstract

Contextual features play a critical role in geospatial object detection by characterizing the surrounding environment of objects. In existing deep learning-based studies of 3D point cloud classification and segmentation, these features have been represented through geometric descriptors, semantic context (i.e., modeled by an attention-based mechanism), global-level context (i.e., through global aggregation), and textural representation (e.g., RGB, intensity, and other attributes). Even though contextual features have been widely explored, spatial contextual features that explicitly capture spatial autocorrelation and neighborhood dependency have received limited attention in object detection tasks. This gap is particularly relevant in the context of GeoAI, which calls for mutual benefits between artificial intelligence and geographic information science. To bridge this gap, this study presents a spatial autocorrelation encoder, namely SA-Encoder, designed to inform 3D geospatial object detection by capturing spatial autocorrelation representation as types of spatial contextual features. The study investigated the effectiveness of such spatial contextual features by estimating the performance of a model trained on them alone. The results suggested that the derived spatial autocorrelation information can help adequately identify some large objects in an urban-rural scene, such as buildings, terrain, and large trees. We further investigated how the spatial autocorrelation encoder can inform model performance in a geospatial object detection task. The results demonstrated significant improvements in detection accuracy across varied urban and rural environments when we compared the results to models without considering spatial autocorrelation as an ablation experiment. Moreover, the approach also outperformed the models trained by explicitly feeding traditional spatial autocorrelation measures (i.e., Matheron’s semivariance). 
This study showcases the advantage of the adaptiveness of the neural network-based encoder in deriving a spatial autocorrelation representation. This advancement bridges the gap between theoretical geospatial concepts and practical AI applications. Consequently, this study demonstrates the potential of integrating geographic theories with deep learning technologies to address challenges in 3D object detection, paving the way for further innovations in this field.

(This article belongs to the Section AI Remote Sensing)

30 pages, 10155 KB

Open Access Article

Interoperable Semantic Systems in Public Administration: AI-Driven Data Mining from Law-Enforcement Reports

by Alexandros Z. Spyropoulos and Vassilis Tsiantos

Computers 2025, 14(9), 376; https://doi.org/10.3390/computers14090376 - 8 Sep 2025

Abstract

The digitisation of law-enforcement archives is examined with the aim of moving from static analogue records to interoperable semantic information systems. A step-by-step framework for optimal digitisation is proposed, grounded in archival best practice and enriched with artificial-intelligence and semantic-web technologies. Emphasis is placed on semantic data representation, which renders information actionable, searchable, interlinked, and automatically processable. As a proof of concept, a large language model (OpenAI ChatGPT, version o3) was applied to a corpus of narrative police reports, extracting and classifying key entities (metadata, persons, addresses, vehicles, incidents, fingerprints, and inter-entity relationships). The output was converted to Resource Description Framework triples and ingested into a triplestore, demonstrating how unstructured text can be transformed into machine-readable, interoperable data with minimal human intervention. The approach’s challenges (technical complexity, data quality assurance, information-security requirements, and staff training) are analysed alongside the opportunities it affords, such as accelerated access to records, cross-agency interoperability, and advanced analytics for investigative and strategic decision-making. Combining systematic digitisation, AI-driven data extraction, and rigorous semantic modelling ultimately delivers a fully interoperable information environment for law-enforcement agencies, enhancing efficiency, transparency, and evidentiary integrity.

(This article belongs to the Special Issue Advances in Semantic Multimedia and Personalized Digital Content)
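
The abstract above describes converting extracted entities into Resource Description Framework triples. As a minimal sketch of that idea only, the snippet below serialises one entity as N-Triples lines. The namespace `http://example.org/report/`, the predicate names, and the sample vehicle record are hypothetical placeholders and are not taken from the article.

```python
# Illustrative sketch: the base namespace, predicate names, and sample data
# are hypothetical and do not come from the article.
BASE = "http://example.org/report/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def iri(local_name: str) -> str:
    """Wrap a local name in the hypothetical base namespace as an IRI."""
    return f"<{BASE}{local_name}>"


def literal(value: str) -> str:
    """Return a plain string literal with backslashes and quotes escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(entity_id: str, entity_type: str, attributes: dict) -> list:
    """Build one rdf:type triple plus one triple per attribute."""
    subject = iri(entity_id)
    lines = [f"{subject} {RDF_TYPE} {iri(entity_type)} ."]
    for predicate, value in attributes.items():
        lines.append(f"{subject} {iri(predicate)} {literal(value)} .")
    return lines


if __name__ == "__main__":
    for line in to_ntriples("vehicle1", "Vehicle", {"plate": "ABC-123"}):
        print(line)
```

Lines in this form could then be loaded into a triplestore; which store and which vocabulary the authors used is not stated in the abstract.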

17 pages, 2128 KB

Open Access Article

Vision-Based Highway Lane Extraction from UAV Imagery: A Deep Learning and Geometric Constraints Approach

by Jin Wang, Guangjun He, Xiuwang Dai, Feng Wang and Yanxin Zhang

Electronics 2025, 14(17), 3554; https://doi.org/10.3390/electronics14173554 - 6 Sep 2025

Viewed by 210

Abstract

The rapid evolution of unmanned aerial vehicle (UAV) technology and low-altitude economic development have propelled drone applications in critical infrastructure monitoring, particularly in intelligent transportation systems where real-time aerial image processing has emerged as a pressing requirement. We address the pivotal challenge of highway lane extraction from low-altitude UAV perspectives by applying a novel three-stage framework. This framework consists of (1) a deep learning-based semantic segmentation module that employs an enhanced STDC network with boundary-aware loss for precise detection of roads and lane markings; (2) an optimized polynomial fitting algorithm incorporating iterative classification and adaptive error thresholds to achieve robust lane marking consolidation; and (3) a global optimization module designed for context-aware lane generation. Our methodology achieves a 94.11% F1-score and 93.84% IoU, effectively bridging the technical gap in UAV-based lane extraction while establishing a reliable foundation for advanced traffic monitoring applications.

35 pages, 1992 KB

Open Access Article

Integrating Large Language Models into a Novel Intuitionistic Fuzzy PROBID Method for Multi-Criteria Decision-Making Problems

by Ferry Anhao, Amir Karbassi Yazdi, Yong Tan and Lanndon Ocampo

Mathematics 2025, 13(17), 2878; https://doi.org/10.3390/math13172878 - 5 Sep 2025

Viewed by 192

Abstract

As vision and mission statements embody the directions set forth by an organization, their connection to the Sustainable Development Goals (SDGs) must be made explicit to guide overall decision-making in taking strides toward the sustainability agenda. The semantic alignment of these strategic statements with the SDGs is investigated in a previous study, although several limitations need further exploration. Thus, this study aims to advance two contributions: (1) utilizing the capabilities of LLMs (Large Language Models) in text semantic analysis and (2) integrating fuzziness into the problem domain by using a novel intuitionistic fuzzy set extension of the PROBID (Preference Ranking On the Basis of Ideal-average Distance) method. First, a systematic approach evaluates the semantic alignment of organizational strategic statements with the SDGs by leveraging the use of LLMs in semantic similarity and relatedness tasks. Second, viewing it as a multi-criteria decision-making (MCDM) problem and recognizing the limitations of LLMs, the evaluations are represented as intuitionistic fuzzy sets (IFSs), which prompted the development of an IF extension of the PROBID method. The proposed IF-PROBID method was then deployed to evaluate the 47 top Philippine corporations. Utilizing ChatGPT 3.5, 7990 prompts with repetitions generated the membership, non-membership, and hesitance scores for each evaluation. Also, we developed a cohort-dependent SDG–vision–mission matrix that categorizes corporations into four distinct classifications. Findings suggest that “highly-aligned” corporations belong to the private and technology sectors, with some in the industrial and real estate sectors. Meanwhile, “weakly-aligned” corporations come from the manufacturing and private sectors. In addition, case-specific insights are presented in this work. The comparative analysis yields a high agreement between the results and those generated by other IF-MCDM extensions. 
This paper is the first to demonstrate two methodological advances: (1) the integration of LLMs in MCDM problems and (2) the development of the IF-PROBID method that handles the resulting inherently imprecise evaluations.

(This article belongs to the Special Issue Advances in Operations Research: Applications in MCDA, Fuzzy Systems, DEA and Optimization)
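
The abstract above mentions membership, non-membership, and hesitance scores. As a minimal sketch, the snippet below applies the standard intuitionistic fuzzy set definition, in which hesitance is whatever remains after membership and non-membership are subtracted from 1. How the article derives its scores from the LLM prompts is not stated in the abstract, so the function name and sample values here are illustrative assumptions.

```python
def hesitance(membership: float, non_membership: float) -> float:
    """Hesitance degree of an intuitionistic fuzzy value.

    Standard definition: pi = 1 - mu - nu, valid when mu and nu lie in
    [0, 1] and mu + nu <= 1.
    """
    if not (0.0 <= membership <= 1.0 and 0.0 <= non_membership <= 1.0):
        raise ValueError("membership and non-membership must lie in [0, 1]")
    if membership + non_membership > 1.0 + 1e-12:
        raise ValueError("membership + non-membership must not exceed 1")
    # Clamp at zero to absorb tiny floating-point undershoot.
    return max(0.0, 1.0 - membership - non_membership)


if __name__ == "__main__":
    # Hypothetical evaluation: 0.6 support, 0.3 opposition.
    print(round(hesitance(0.6, 0.3), 6))
```

A larger hesitance value signals that the evaluation leaves more room for doubt, which is the kind of imprecision an IF extension of a ranking method is meant to carry through.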

33 pages, 21287 KB

Open Access Article

Interactive, Shallow Machine Learning-Based Semantic Segmentation of 2D and 3D Geophysical Data from Archaeological Sites

by Lieven Verdonck, Michel Dabas and Marc Bui

Remote Sens. 2025, 17(17), 3092; https://doi.org/10.3390/rs17173092 - 4 Sep 2025

Viewed by 490

Abstract

In recent decades, technological developments in archaeological geophysics have led to growing data volumes, so that an important bottleneck is now at the stage of data interpretation. The manual delineation and classification of anomalies are time-consuming, and different methods for (semi-)automatic image segmentation have been proposed, based on explicitly formulated rulesets or deep convolutional neural networks (DCNNs). So far, these have not been used widely in archaeological geophysics because of the complexity of the segmentation task (due to the low contrast between archaeological structures and background and the low predictability of the targets). Techniques based on shallow machine learning (e.g., random forests, RFs) have been explored very little in archaeological geophysics, although they are less case-specific than most rule-based methods, do not require large training sets as is the case for DCNNs, and can easily handle 3D data. In this paper, we show their potential for geophysical data analysis. For the classification on the pixel level, we use ilastik, an open-source segmentation tool developed in medical imaging. Algorithms for object classification, manual reclassification, post-processing, vectorisation, and georeferencing were brought together in a Jupyter Notebook, available on GitHub (version 7.3.2). To assess the accuracy of the RF classification applied to geophysical datasets, we compare it with manual interpretation. A quantitative evaluation using the mean intersection over union metric results in scores of ~60%, which only slightly increases after the manual correction of the RF classification results. Remarkably, a similar score results from the comparison between independent manual interpretations. This observation illustrates that quantitative metrics are not a panacea for evaluating machine-generated geophysical data interpretation in archaeology, which is characterised by a significant degree of uncertainty. 
It also raises the question of how the semantic segmentation of geophysical data (whether carried out manually or with the aid of machine learning) can best be evaluated.

(This article belongs to the Special Issue Multiscale and Multitemporal High Resolution Remote Sensing for Archaeology and Heritage: From Research to Preservation)
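
The abstract above reports scores from the mean intersection over union metric. The sketch below shows one common way to compute it for two label maps flattened into lists: per-class IoU is averaged over the classes that occur in at least one of the maps. The function name and the toy labels are assumptions for illustration; the article's notebook may compute the metric differently.

```python
def mean_iou(predicted: list, reference: list, classes: list) -> float:
    """Mean intersection over union across the given classes.

    Classes absent from both label maps are skipped so they do not
    distort the average.
    """
    if len(predicted) != len(reference):
        raise ValueError("label maps must have the same number of pixels")
    scores = []
    for cls in classes:
        pairs = list(zip(predicted, reference))
        intersection = sum(1 for p, r in pairs if p == cls and r == cls)
        union = sum(1 for p, r in pairs if p == cls or r == cls)
        if union > 0:
            scores.append(intersection / union)
    if not scores:
        raise ValueError("none of the classes occur in either label map")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example: 0 = background, 1 = anomaly.
    machine = [0, 0, 1, 1]
    manual = [0, 1, 1, 1]
    print(mean_iou(machine, manual, classes=[0, 1]))
```

The same function can compare two independent manual interpretations, which is the comparison the abstract uses to argue that a score near 60% partly reflects interpretive uncertainty.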

31 pages, 1545 KB

Open Access Article

The Complexity of eHealth Architecture: Lessons Learned from Application Use Cases

by Annalisa Barsotti, Gerl Armin, Wilhelm Sebastian, Massimiliano Donati, Stefano Dalmiani and Claudio Passino

Computers 2025, 14(9), 371; https://doi.org/10.3390/computers14090371 - 4 Sep 2025

Viewed by 312

Abstract

The rapid evolution of eHealth technologies has revolutionized healthcare, enabling data-driven decision-making and personalized care. Central to this transformation is interoperability, which ensures seamless communication among heterogeneous systems. This paper explores the critical role of interoperability, data management processes, and the use of international standards in enabling integrated healthcare solutions. We present an overview of interoperability dimensions (technical, semantic, and organizational) and align them with data management phases in a concise eHealth architecture. Furthermore, we examine two practical European use cases to demonstrate the extent of the proposed eHealth architecture, involving patients, environments, third parties, and healthcare providers.

(This article belongs to the Special Issue System-Integrated Intelligence and Intelligent Systems 2025)

30 pages, 1729 KB

Open Access Article

FiCT-O: Modelling Fictional Characters in Detective Fiction from the 19th to the 20th Century

by Enrica Bruno, Lorenzo Sabatino and Francesca Tomasi

Humanities 2025, 14(9), 180; https://doi.org/10.3390/h14090180 - 3 Sep 2025

Viewed by 225

Abstract

This paper proposes a formal descriptive model for understanding the evolution of characters in detective fiction from the 19th to the 20th century, using methodologies and technologies from the Semantic Web. The integration of Digital Humanities within the theory of comparative literature opens new paths of study that allow for a digital approach to the understanding of intertextuality through close reading techniques and ontological modelling. In this research area, the variety of possible textual relationships, the levels of analysis required to classify these connections, and the inherently referential nature of certain literary genres demand a structured taxonomy. This taxonomy should account for stylistic elements, narrative structures, and cultural recursiveness that are unique to literary texts. The detective figure, central to modern literature, provides an ideal lens for examining narrative intertextuality across the 19th and 20th centuries. The analysis concentrates on character traits and narrative functions, addressing various methods of rewriting within the evolving cultural and creative context of authorship. Through a comparative examination of a representative sample of detective fiction from the period under scrutiny, the research identifies mechanisms of (meta)narrative recurrence, transformation, and reworking within the canon. The outcome is a formal model for describing narrative structures and techniques, with a specific focus on character development, aimed at uncovering patterns of continuity and variation in diegetic content over time and across different works, adaptable to analogous cases of traditional reworking and narrative fluidity.

(This article belongs to the Special Issue The Interpretation of Fictional Characters in Literary Texts: History of Literary Criticism, Philosophy and Formal Ontologies)

25 pages, 5491 KB

Open Access Article

When BIM Meets MBSE: Building a Semantic Bridge for Infrastructure Data Integration

by Joseph Murphy, Siyuan Ji, Charles Dickerson, Chris Goodier, Sonia Zahiroddiny and Tony Thorpe

Systems 2025, 13(9), 770; https://doi.org/10.3390/systems13090770 - 2 Sep 2025

Viewed by 300

Abstract

The global infrastructure industry is faced with increasing system complexity and requirements driven by the Sustainable Development Goals, technological advancements, and the shift from Industry 4.0 to human-centric 5.0 principles. Coupled with persistent infrastructure investment deficits, these pressures necessitate improved methods for efficient requirements management and validation. While digital twins promise transformative real-time decision-making, reliance on static unstructured data formats inhibits progress. This paper presents a novel framework that integrates Building Information Modelling (BIM) and Model-Based Systems Engineering (MBSE), using Linked Data principles to preserve semantic meaning during information exchange between physical abstractions and requirements. The proposed approach automates a step of compliance validation against regulatory standards and is explored through a case study that uses requirements from a high-speed railway station fire safety system and a modified duplex apartment digital model. The workflow (i) digitises static documents into machine-readable MBSE formats, (ii) integrates structured data into dynamic digital models, and (iii) creates foundations for data exchange to enable compliance validation. These findings highlight the framework’s ability to enhance traceability, bridge static and dynamic data gaps, and provide decision-making support in digital twin environments. This study advances the application of Linked Data in infrastructure, enabling broader integration of ontologies required for dynamic decision-making trade-offs.

(This article belongs to the Special Issue Digital and Data-Driven Systems Engineering: Bridging Theory and Practice)

28 pages, 2320 KB

Open Access Article

Fostering Embodied and Attitudinal Change Through Immersive Storytelling: A Hybrid Evaluation Approach for Sustainability Education

by Stefania Palmieri, Giuseppe Lotti, Mario Bisson, Eleonora D’Ascenzi and Claudia Spinò

Sustainability 2025, 17(17), 7885; https://doi.org/10.3390/su17177885 - 2 Sep 2025

Viewed by 414

Abstract

Immersive technologies are increasingly acknowledged as powerful tools in sustainability education, capable of fostering deeper engagement and emotional resonance. This study investigates the potential of 360° VR storytelling to enhance learning through embodied knowledge, attitudinal change, and emotional awareness. Conducted within the EMOTIONAL project, the research explores a first-person narrative told from the perspective of a ceramic object rooted in Italian cultural heritage, designed to facilitate meaningful, affective learning. The present study addresses the following research questions: RQ1 Can 360° VR story-living narrations effectively promote embodied learning and semantic and attitudinal shifts in the context of sustainability education? RQ2 What added insights can be gained from integrating subjective assessments with physiological measures? To this end, a hybrid assessment framework was developed and validated, combining subjective self-report tools (including attitudinal scales, semantic differential analysis, and engagement metrics) with objective physiological measures, specifically Electrodermal Activity (EDA). Sixty participants, including students and entrepreneurs, experienced the immersive narrative, and a subset underwent physiological tracking to evaluate the effectiveness of the experience. The findings show that immersive storytelling can enhance emotional and cognitive engagement, producing shifts in semantic interpretation, self-perceived knowledge, and attitudes toward material culture. A convergence of high emotional engagement, embodied learning, and technology acceptance was observed, although individual differences emerged based on prior experience and disciplinary background. EDA data offered complementary insights, identifying specific moments of heightened arousal during the narrative. 
The study demonstrates that emotionally driven immersive narratives (supported by integrated assessment methods) can make abstract sustainability values more tangible and personally resonant, thereby fostering more reflective and relational approaches to sustainable consumption and production.

(This article belongs to the Special Issue Fostering Sustainable Education and Enhancing Student Creativity Through Innovative Educational Technologies)

20 pages, 487 KB

Open Access Article

NLP and Text Mining for Enriching IT Professional Skills Frameworks

by Danial Zare, Luis Fernandez-Sanz, Vera Pospelova and Inés López-Baldominos

Appl. Sci. 2025, 15(17), 9634; https://doi.org/10.3390/app15179634 - 1 Sep 2025

Viewed by 390

Abstract

The European e-Competence Framework (e-CF) and the European Skills, Competences, Qualifications and Occupations (ESCO) classification are two key initiatives developed by the European Commission to support skills transparency, mobility, and interoperability across labour and education systems. While e-CF defines essential competences for ICT professionals through a structured framework, it provides only a limited number of illustrative skills and knowledge examples for each competence. In contrast, ESCO offers a rich, multilingual taxonomy of skills and knowledge, each accompanied by a detailed description, alternative labels, and links to relevant occupations. This paper explores the possibility of enriching the e-CF framework by linking it to relevant ESCO ICT skills using text embedding (MPNet) and cosine similarity. This approach extends each e-CF competence to 15–25 semantically aligned skills and knowledge items, all with full descriptions and official translations into all EU languages, instead of the current 4–10 brief examples. This significantly improves the clarity, usability, and interpretability of e-CF competences for the various stakeholders. Furthermore, since ESCO terminology serves as the foundation for labour market analysis across the EU, establishing this linkage provides a valuable bridge between the e-CF competence model and real-time labour market intelligence, a connection that is not currently available. The results of this study offer practical insights into the application of semantic technologies to the enhancement and mutual alignment of European ICT skills frameworks.

(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications—2nd Edition)
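
The abstract above links e-CF competences to ESCO skills through text embeddings and cosine similarity. The sketch below illustrates only the matching step: it ranks candidate skills by cosine similarity to a competence vector. The short vectors and skill labels are made-up stand-ins; in practice the vectors would come from an embedding model such as MPNet, which this example does not call.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k_matches(query: list, candidates: dict, k: int) -> list:
    """Return the k candidate labels most similar to the query vector."""
    scored = [(label, cosine_similarity(query, vec))
              for label, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical 3-dimensional "embeddings" for illustration only.
    competence = [0.9, 0.1, 0.0]
    skills = {
        "manage ICT project": [0.8, 0.2, 0.1],
        "design database scheme": [0.1, 0.9, 0.2],
        "provide user documentation": [0.0, 0.2, 0.9],
    }
    for label, score in top_k_matches(competence, skills, k=2):
        print(f"{label}: {score:.3f}")
```

Choosing k in the range of 15 to 25 per competence would mirror the extension size the abstract reports, although the abstract does not say whether a fixed k or a similarity threshold was used.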

19 pages, 2082 KB

Open Access Article

Multi-Scale Grid-Based Semantic Surface Point Generation for 3D Object Detection

by Xin-Fu Chen, Chun-Chieh Lee, Jung-Hua Lo, Chi-Hung Chuang and Kuo-Chin Fan

Electronics 2025, 14(17), 3492; https://doi.org/10.3390/electronics14173492 - 31 Aug 2025

Viewed by 343

Abstract

3D object detection is a crucial technology in fields such as autonomous driving and robotics. As a direct representation of the 3D world, point cloud data plays a vital role in feature extraction and geometric representation. However, in real-world applications, point cloud data often suffers from occlusion, resulting in incomplete observations and degraded detection performance. Existing methods, such as PG-RCNN, generate semantic surface points within each Region of Interest (RoI) using a single grid size. However, a fixed grid scale cannot adequately capture multi-scale features. A grid that is too small may miss fine structures (especially problematic when dealing with small or sparse objects), while a grid that is too large may introduce excessive background noise, reducing the precision of feature representation. To address this issue, we propose an enhanced PG-RCNN architecture with a Multi-Scale Grid Attention Module as the core contribution. This module improves the expressiveness of point features by aggregating multi-scale information and dynamically weighting features from different grid resolutions. Using a simple linear transformation, we generate attention weights to guide the model to focus on regions that contribute more to object recognition, while effectively filtering out redundant noise. We evaluate our method on the KITTI 3D object detection validation set. Experimental results show that, compared to the original PG-RCNN, our approach improves performance on the Cyclist category by 2.66% and 2.54% in the Moderate and Hard settings, respectively. Additionally, our approach shows more stable performance on small object detection tasks, with an average improvement of 2.57%, validating the positive impact of the Multi-Scale Grid Attention Module on fine-grained geometric modeling, and highlighting the efficiency and generalizability of our model.

(This article belongs to the Special Issue Digital Signal and Image Processing for Multimedia Technology)

24 pages, 477 KB

Open Access Systematic Review

Ontologies for the Reconfiguration of Domestic Living Environments: A Systematic Literature Review

by Daniele Spoladore

Information 2025, 16(9), 752; https://doi.org/10.3390/info16090752 - 29 Aug 2025

Abstract

The aging population in Europe and other developed regions is accelerating the demand for adaptable domestic environments that support independent living and care at home. In this context, ontologies offer a promising approach to represent and manage knowledge about built environments, smart technologies, and user needs, especially within Ambient Assisted Living (AAL) systems. This paper presents a systematic literature review examining the role of ontologies in the reconfiguration of domestic living spaces, with a focus on their application in design processes and decision support systems. Following the PRISMA methodology, 14 relevant works published between 2000 and 2025 were identified and analyzed. The review explores key aspects such as ontology conceptualization, reuse, engineering methodologies, integration with CAD systems, and validation practices. The results show that research on this topic is fragmented yet growing, with the first contribution dated 2005 and peaks in 2016, 2018, and 2024. Most works (11) were conference papers, with Europe leading the contributions, particularly Italy. Half of the reviewed ontologies were developed “from scratch”, while the rest relied on conceptualizations such as BIM. Ontology reuse was inconsistent: only 50% of works reused existing models (e.g., SAREF, SOSA, BOT, ifcOWL), and few adopted Ontology Design Patterns. While 11 works followed ontology engineering methodologies (mostly custom or established methods such as Methontology or NeOn), stakeholder collaboration was reported in less than 36% of cases. Validation practices were weak: only six studies presented use cases or demonstrators. Integration with CAD systems remains at a prototypical stage, primarily through semantic enrichment and SWRL-based reasoning layers. Remaining gaps include poor ontology accessibility (few provide URLs or W3IDs), limited FAIR compliance, and scarce modeling of end-user needs, despite their relevance for AAL solutions. The review highlights opportunities for collaborative, human-centered ontology development aligned with architectural and medical standards to enable scalable, interoperable, and user-driven reconfiguration of domestic environments.

(This article belongs to the Special Issue Knowledge Representation and Ontology-Based Data Management)

21 pages, 2434 KB

Open Access Article

MBFILNet: A Multi-Branch Detection Network for Autonomous Mining Trucks in Dusty Environments

by Fei-Xiang Xu, Di-Long Zhu, Yu-Peng Hu, Rui Zhang and Chen Zhou

Sensors 2025, 25(17), 5324; https://doi.org/10.3390/s25175324 - 27 Aug 2025

Abstract

As a critical technology for autonomous mining trucks, object detection directly determines system safety and operational reliability. However, autonomous mining trucks often work in dusty open-pit environments, where dust interference significantly degrades object detection accuracy. To overcome this problem, a multi-branch feature interaction and location detection network (MBFILNet) is proposed in this study, consisting of multi-branch feature interaction with differential operation (MBFI-DO) and depthwise separable convolution-enhanced non-local attention (DSC-NLA). On the one hand, MBFI-DO not only strengthens the extraction of channel-wise semantic features but also improves the representation of salient features in images affected by dust interference. On the other hand, DSC-NLA captures long-range spatial dependencies to focus on the structural information of target objects. Furthermore, a custom dataset called Dusty Open-pit Mining (DOM) is constructed and augmented using a cycle-consistent generative adversarial network (CycleGAN). Finally, extensive experiments on DOM are conducted to evaluate the performance of MBFILNet in dusty open-pit environments. The results show that MBFILNet achieves a mean Average Precision (mAP) of 72.0% on the DOM dataset, a 1.3% increase over the Featenhancer model. Moreover, MBFILNet achieves a 2% higher mAP than YOLOv8, demonstrating that the proposed method can effectively improve detection accuracy in dusty open-pit environments.

(This article belongs to the Special Issue Application of Advanced Perception Technology in Vehicle Intelligent Control)
