Search Results (67)

Search Parameters:
Keywords = indoor scene recognition

24 pages, 4829 KB  
Article
Home Robot Interaction Based on EEG Motor Imagery and Visual Perception Fusion
by Tie Hua Zhou, Dongsheng Li, Zhiwei Jian, Wei Ding and Ling Wang
Sensors 2025, 25(17), 5568; https://doi.org/10.3390/s25175568 - 6 Sep 2025
Viewed by 991
Abstract
Amid the intensification of demographic aging, home robots based on intelligent technology have shown great application potential in assisting the daily life of the elderly. This paper proposes a multimodal human–robot interaction system that integrates EEG signal analysis and visual perception, aiming to enable home robots to perceive both the intentions of elderly users and their environment. Firstly, a channel selection strategy is employed to identify the most discriminative electrode channels from Motor Imagery (MI) EEG signals; signal representation is then improved by combining Filter Bank Common Spatial Patterns (FBCSP), wavelet packet decomposition, and nonlinear features, and one-versus-rest ("one-to-many") Support Vector Regression (SVR) is used to achieve four-class classification. Secondly, the YOLO v8 model is applied to identify objects within indoor scenes; object confidence and spatial distribution are then extracted, and scene recognition is performed using a machine learning technique. Finally, the EEG classification results are combined with the scene recognition results to establish a scene–intention correspondence, enabling recognition of the intention-driven task types of the elderly across different home scenes. Performance evaluation shows that the proposed method attains a recognition accuracy of 83.4%, indicating good classification accuracy and practical value for multimodal perception and human–robot collaborative interaction, and providing technical support for the development of smarter, more personalized home assistance robots. Full article
(This article belongs to the Section Electronic Sensors)
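The one-versus-rest SVR stage can be sketched briefly — a minimal illustration assuming feature vectors (e.g., FBCSP plus wavelet-packet and nonlinear features) are already extracted; the kernel and hyperparameters are assumptions, not the authors' code.

```python
# Sketch of one-vs-rest SVR for four-class MI classification.
import numpy as np
from sklearn.svm import SVR

class OneVsRestSVR:
    """Four-class MI classification via one SVR per class."""
    def __init__(self, n_classes=4):
        self.models = [SVR(kernel="rbf", C=1.0) for _ in range(n_classes)]

    def fit(self, X, y):
        for k, model in enumerate(self.models):
            # Regress toward +1 for the target class, -1 otherwise.
            model.fit(X, np.where(y == k, 1.0, -1.0))
        return self

    def predict(self, X):
        # The class with the highest regression score wins.
        scores = np.column_stack([m.predict(X) for m in self.models])
        return scores.argmax(axis=1)

# Usage with synthetic features: 200 trials x 32 features, 4 MI classes.
X = np.random.randn(200, 32)
y = np.random.randint(0, 4, size=200)
clf = OneVsRestSVR().fit(X, y)
print(clf.predict(X[:5]))
```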

16 pages, 2233 KB  
Article
Research on Fingerprint Map Construction and Real-Time Update Method Based on Indoor Landmark Points
by Yaning Zhu and Yihua Cheng
Sensors 2025, 25(17), 5473; https://doi.org/10.3390/s25175473 - 3 Sep 2025
Viewed by 525
Abstract
Wi-Fi base stations provide full indoor coverage, while the inertial navigation system (INS) is independent and autonomous with high short-term positioning accuracy; because INS errors accumulate over time, INS/Wi-Fi combinations have become the mainstream research direction in indoor positioning. However, the accuracy of Wi-Fi fingerprint maps deteriorates significantly as the environment changes over time, so automatic real-time updating of fingerprint maps is an urgent problem. This article addresses the limitation that existing technology for acquiring fingerprint-point locations in real time has severely restricted such updating. For the first time, landmark points are introduced into the fingerprint map, and landmark-point fingerprints are defined to construct a new fingerprint map database structure. A method for automatic recognition of landmark points (turning points) based on inertial technology is proposed, which achieves automatic and accurate collection of landmark-point fingerprints and improves the reliability of crowdsourced data. By automatically monitoring fingerprint signal fluctuations at landmark points in real time and constructing error models, the method achieves real-time, accurate updates of fingerprint maps. Real-scene experiments show that the proposed solution significantly improves the long-term stability and reliability of fingerprint maps. Full article
(This article belongs to the Section Navigation and Positioning)
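The landmark-point (turning-point) recognition step lends itself to a short sketch from inertial heading data — a hedged illustration where the window size and the 60-degree threshold are assumptions rather than the paper's parameters.

```python
# Sketch: detect turning points (landmark candidates) from step headings.
import numpy as np

def detect_turning_points(headings_deg, window=5, threshold_deg=60.0):
    """Flag step indices where the heading changes sharply within a window."""
    headings = np.asarray(headings_deg)
    landmarks = []
    for i in range(window, len(headings) - window):
        before = np.mean(headings[i - window:i])
        after = np.mean(headings[i:i + window])
        # Wrap the difference into [-180, 180) before thresholding.
        diff = (after - before + 180.0) % 360.0 - 180.0
        if abs(diff) >= threshold_deg:
            landmarks.append(i)
    return landmarks

# Example: a walk heading east that turns north around step 50.
trace = np.concatenate([np.full(50, 90.0), np.full(50, 0.0)])
print(detect_turning_points(trace))  # indices clustered near the turn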

19 pages, 2591 KB  
Article
A Comprehensive Hybrid Approach for Indoor Scene Recognition Combining CNNs and Text-Based Features
by Taner Uckan, Cengiz Aslan and Cengiz Hark
Sensors 2025, 25(17), 5350; https://doi.org/10.3390/s25175350 - 29 Aug 2025
Viewed by 643
Abstract
Indoor scene recognition is a computer vision task that identifies various indoor environments, such as offices, libraries, kitchens, and restaurants. This research area is particularly significant for applications in robotics, security, and assistance for individuals with disabilities, as it enables the categorization of spaces and the provision of contextual information. Convolutional Neural Networks (CNNs) are commonly employed in this field. While CNNs perform well in outdoor scene recognition by focusing on global features such as mountains and skies, they often struggle with indoor scenes, where local features like furniture and objects are more critical. In this study, the “MIT 67 Indoor Scene” dataset is used to extract and combine features from both a CNN and a text-based model utilizing object recognition outputs, resulting in a two-channel hybrid model. The experimental results demonstrate that this hybrid approach, which integrates natural language processing and image processing techniques, improves the test accuracy of the image processing model by 8.3%, achieving a notable success rate. Furthermore, this study offers contributions to new application areas in remote sensing, particularly in indoor scene understanding and indoor mapping. Full article
(This article belongs to the Section Sensor Networks)
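The two-channel fusion idea can be illustrated with a minimal sketch: CNN image embeddings are concatenated with text features built from object-recognition outputs before classification. The feature dimensions, the TF-IDF text channel, and the logistic-regression head are assumptions standing in for the paper's exact design.

```python
# Sketch: fuse CNN features with text features from detected-object labels.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Channel 1: CNN embeddings (assume precomputed, e.g., 512-D per image).
cnn_features = np.random.randn(4, 512)

# Channel 2: detected-object labels per image, treated as "documents".
object_texts = [
    "desk chair monitor keyboard",   # office
    "shelf book book table",         # library
    "stove sink refrigerator",       # kitchen
    "table chair menu plate",        # restaurant
]
text_features = TfidfVectorizer().fit_transform(object_texts).toarray()

# Fuse the channels and train a simple scene classifier.
X = np.hstack([cnn_features, text_features])
y = np.array([0, 1, 2, 3])  # office, library, kitchen, restaurant
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X))
```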

27 pages, 7905 KB  
Article
SimID: Wi-Fi-Based Few-Shot Cross-Domain User Recognition with Identity Similarity Learning
by Zhijian Wang, Lei Ouyang, Shi Chen, Han Ding, Ge Wang and Fei Wang
Sensors 2025, 25(16), 5151; https://doi.org/10.3390/s25165151 - 19 Aug 2025
Viewed by 529
Abstract
In recent years, indoor user identification via Wi-Fi signals has emerged as a vibrant research area in smart homes and the Internet of Things, thanks to its privacy preservation, immunity to lighting conditions, and ease of large-scale deployment. Conventional deep-learning classifiers, however, suffer from poor generalization and demand extensive pre-collected data for every new scenario. To overcome these limitations, we introduce SimID, a few-shot Wi-Fi user recognition framework based on identity-similarity learning rather than conventional classification. SimID embeds user-specific signal features into a high-dimensional space, encouraging samples from the same individual to exhibit greater pairwise similarity. Once trained, new users can be recognized simply by comparing their Wi-Fi signal “query” against a small set of stored templates—potentially as few as a single sample—without any additional retraining. This design not only supports few-shot identification of unseen users but also adapts seamlessly to novel movement patterns in unfamiliar environments. On the large-scale XRF55 dataset, SimID achieves average accuracies of 97.53%, 93.37%, 92.38%, and 92.10% in cross-action, cross-person, cross-action-and-person, and cross-person-and-scene few-shot scenarios, respectively. These results demonstrate SimID’s promise for robust, data-efficient indoor identity recognition in smart homes, healthcare, security, and beyond. Full article
(This article belongs to the Special Issue Feature Papers in the 'Sensor Networks' Section 2025)
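SimID's template-matching inference reduces to a similarity search, sketched below under the assumption that a trained embedding network already maps Wi-Fi signals to vectors; the 128-dimensional embeddings are illustrative.

```python
# Sketch: identify a user by comparing a query embedding to stored templates.
import numpy as np

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def identify(query_embedding, templates):
    """templates: dict mapping user_id -> list of embedding vectors."""
    best_user, best_score = None, -1.0
    for user_id, vectors in templates.items():
        score = max(cosine_similarity(query_embedding, v) for v in vectors)
        if score > best_score:
            best_user, best_score = user_id, score
    return best_user, best_score

# Example with random 128-D embeddings and one template per user.
rng = np.random.default_rng(0)
templates = {"alice": [rng.normal(size=128)], "bob": [rng.normal(size=128)]}
query = templates["alice"][0] + 0.1 * rng.normal(size=128)
print(identify(query, templates))  # ('alice', ~0.99)
```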

23 pages, 4467 KB  
Article
Research on Indoor Object Detection and Scene Recognition Algorithm Based on Apriori Algorithm and Mobile-EFSSD Model
by Wenda Zheng, Yibo Ai and Weidong Zhang
Mathematics 2025, 13(15), 2408; https://doi.org/10.3390/math13152408 - 26 Jul 2025
Viewed by 448
Abstract
With the advancement of computer vision and image processing technologies, scene recognition has gradually become a research hotspot. However, in practical applications, it is necessary to detect the categories and locations of objects in images while recognizing scenes. To address these issues, this paper proposes an indoor object detection and scene recognition algorithm based on the Apriori algorithm and the Mobile-EFSSD model, which can simultaneously obtain object category and location information while recognizing scenes. The specific research contents are as follows: (1) To address complex indoor scenes and occlusion, this paper proposes an improved Mobile-EFSSD object detection algorithm. An optimized MobileNetV3 with ECA attention is used as the backbone. Multi-scale feature maps are fused via FPN. The localization loss includes a hyperparameter, and focal loss replaces confidence loss. Experiments show that the method achieves stable performance, effectively detects occluded objects, and accurately extracts category and location information. (2) To improve classification stability in indoor scene recognition, this paper proposes a naive Bayes-based method. Object detection results are converted into text features, and the Apriori algorithm extracts object associations. Prior probabilities are calculated and fed into a naive Bayes classifier for scene recognition. Evaluated using the ADE20K dataset, the method outperforms existing approaches by achieving a better accuracy–speed trade-off and enhanced classification stability. The proposed algorithm is applied to indoor scene images, enabling the simultaneous acquisition of object categories and location information while recognizing scenes. Moreover, the algorithm has a simple structure, with an object detection average precision of 82.7% and a scene recognition average accuracy of 95.23%, making it suitable for practical detection requirements. Full article
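The scene-recognition stage — detections converted to text features and classified with naive Bayes — can be sketched as follows. The label set and training documents are invented for illustration, and the Apriori-derived association priors described in the abstract are omitted for brevity.

```python
# Sketch: object-detection outputs -> count features -> naive Bayes scene label.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Each "document" lists the object classes detected in one image.
train_docs = ["bed lamp wardrobe", "sofa tv table", "stove sink fridge"]
train_scenes = ["bedroom", "living_room", "kitchen"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_docs)
clf = MultinomialNB().fit(X, train_scenes)

test = vectorizer.transform(["lamp bed pillow"])  # unseen words are ignored
print(clf.predict(test))  # ['bedroom']
```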

14 pages, 743 KB  
Article
AD-VAE: Adversarial Disentangling Variational Autoencoder
by Adson Silva and Ricardo Farias
Sensors 2025, 25(5), 1574; https://doi.org/10.3390/s25051574 - 4 Mar 2025
Cited by 1 | Viewed by 1280
Abstract
Face recognition (FR) is a less intrusive biometrics technology with various applications, such as security, surveillance, and access control systems. FR remains challenging, especially when there is only a single image per person as a gallery dataset and when dealing with variations like pose, illumination, and occlusion. Deep learning techniques have shown promising results in recent years using VAE and GAN, with approaches such as patch-VAE, VAE-GAN for 3D Indoor Scene Synthesis, and hybrid VAE-GAN models. However, in Single Sample Per Person Face Recognition (SSPP FR), the challenge of learning robust and discriminative features that preserve the subject’s identity persists. To address these issues, we propose a novel framework called AD-VAE, specifically for SSPP FR, using a combination of variational autoencoder (VAE) and Generative Adversarial Network (GAN) techniques. The proposed AD-VAE framework is designed to learn how to build representative identity-preserving prototypes from both controlled and wild datasets, effectively handling variations like pose, illumination, and occlusion. The method uses four networks: an encoder and decoder similar to VAE, a generator that receives the encoder output plus noise to generate an identity-preserving prototype, and a discriminator that operates as a multi-task network. AD-VAE outperforms all tested state-of-the-art face recognition techniques, demonstrating its robustness. The proposed framework achieves superior results on four controlled benchmark datasets—AR, E-YaleB, CAS-PEAL, and FERET—with recognition rates of 84.9%, 94.6%, 94.5%, and 96.0%, respectively, and achieves remarkable performance on the uncontrolled LFW dataset, with a recognition rate of 99.6%. The AD-VAE framework shows promising potential for future research and real-world applications. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
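The four-network structure can be outlined in a short skeleton — purely structural, with layer sizes, the noise dimension, and the identity count as assumptions.

```python
# Structural sketch of the four networks: encoder, decoder, prototype
# generator, and a multi-task discriminator (real/fake + identity).
import torch
import torch.nn as nn

LATENT, NOISE, IMG = 128, 32, 64 * 64

encoder = nn.Sequential(nn.Linear(IMG, 512), nn.ReLU(),
                        nn.Linear(512, 2 * LATENT))   # mean and log-variance
decoder = nn.Sequential(nn.Linear(LATENT, 512), nn.ReLU(),
                        nn.Linear(512, IMG), nn.Sigmoid())
generator = nn.Sequential(nn.Linear(LATENT + NOISE, 512), nn.ReLU(),
                          nn.Linear(512, IMG), nn.Sigmoid())
discriminator = nn.Sequential(nn.Linear(IMG, 512), nn.ReLU(),
                              nn.Linear(512, 1 + 100))  # 100 identities assumed

def reparameterize(stats):
    mu, logvar = stats.chunk(2, dim=-1)
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

x = torch.rand(8, IMG)                        # a batch of flattened faces
z = reparameterize(encoder(x))                # VAE latent code
prototype = generator(torch.cat([z, torch.randn(8, NOISE)], dim=-1))
real_fake, identity_logits = discriminator(prototype).split([1, 100], dim=-1)
print(prototype.shape, real_fake.shape, identity_logits.shape)
```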

17 pages, 4703 KB  
Article
Robotics Classification of Domain Knowledge Based on a Knowledge Graph for Home Service Robot Applications
by Yiqun Wang, Rihui Yao, Keqing Zhao, Peiliang Wu and Wenbai Chen
Appl. Sci. 2024, 14(24), 11553; https://doi.org/10.3390/app142411553 - 11 Dec 2024
Cited by 3 | Viewed by 1502
Abstract
The representation and utilization of environmental information by service robots has become increasingly challenging. To address the problems faced by service robot platforms, such as strict timeliness requirements for indoor environment recognition and the small scale of available indoor scene data, a method and model for rapid classification of household-environment domain knowledge are proposed that achieve high recognition accuracy using a small-scale indoor scene and tool dataset. This paper uses a knowledge graph to associate data for home service robots. The application requirements of knowledge graphs for home service robots are analyzed to establish a rule base for the system. A domain ontology of the home environment is constructed for use in the knowledge graph system, and the interior functional areas and functional tools are classified. The designed knowledge graph contributes to the state of the art by improving the accuracy and efficiency of service decision making. The lightweight network MobileNetV3 is used to pre-train the model, and a lightweight convolution method with good feature-extraction performance is selected. The proposal combines MobileNetV3 with transfer learning, integrating large-scale pre-training with fine-tuning for the home environment to address the challenge of limited data for home robots. The results show that the proposed model achieves higher recognition accuracy and recognition speed than other common methods, meeting the working requirements of service robots. On the Scene15 dataset, the proposed scheme attains the highest recognition accuracy, 0.8815, and the fastest recognition speed, 63.11 microseconds per image. Full article
(This article belongs to the Special Issue Artificial Intelligence in Complex Networks (2nd Edition))
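The MobileNetV3 transfer-learning recipe — large-scale pre-training plus fine-tuning for the target domain — is a standard pattern; a minimal torchvision sketch follows, with the frozen backbone and the 15-class head (matching Scene15) as assumptions about the exact fine-tuning setup.

```python
# Sketch: MobileNetV3 transfer learning for indoor scene classification.
import torch.nn as nn
from torchvision import models

model = models.mobilenet_v3_small(weights="IMAGENET1K_V1")
for param in model.parameters():
    param.requires_grad = False  # keep pretrained features fixed

# Replace the final classifier layer for 15 scene categories.
in_features = model.classifier[-1].in_features
model.classifier[-1] = nn.Linear(in_features, 15)

# Only the new head's parameters are trainable.
trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable parameters")
```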

21 pages, 7746 KB  
Article
Multi-Robot Collaborative Mapping with Integrated Point-Line Features for Visual SLAM
by Yu Xia, Xiao Wu, Tao Ma, Liucun Zhu, Jingdi Cheng and Junwu Zhu
Sensors 2024, 24(17), 5743; https://doi.org/10.3390/s24175743 - 4 Sep 2024
Cited by 2 | Viewed by 2630
Abstract
Simultaneous Localization and Mapping (SLAM) enables mobile robots to autonomously perform localization and mapping tasks in unknown environments. Despite significant progress achieved by visual SLAM systems in ideal conditions, relying solely on a single robot and point features for mapping in large-scale indoor environments with weak-texture structures can affect mapping efficiency and accuracy. Therefore, this paper proposes a multi-robot collaborative mapping method based on point-line fusion to address this issue. This method is designed for indoor environments with weak-texture structures for localization and mapping. The feature-extraction algorithm, which combines point and line features, supplements the existing environment point feature-extraction method by introducing a line feature-extraction step. This integration ensures the accuracy of visual odometry estimation in scenes with pronounced weak-texture structure features. For relatively large indoor scenes, a scene-recognition-based map-fusion method is proposed in this paper to enhance mapping efficiency. This method relies on visual bag of words to determine overlapping areas in the scene, while also proposing a keyframe-extraction method based on photogrammetry to improve the algorithm’s robustness. By combining the Perspective-3-Point (P3P) algorithm and Bundle Adjustment (BA) algorithm, the relative pose-transformation relationships of multi-robots in overlapping scenes are resolved, and map fusion is performed based on these relative pose relationships. We evaluated our algorithm on public datasets and a mobile robot platform. The experimental results demonstrate that the proposed algorithm exhibits higher robustness and mapping accuracy. It shows significant effectiveness in handling mapping in scenarios with weak texture and structure, as well as in small-scale map fusion. Full article
(This article belongs to the Section Navigation and Positioning)
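The pose-resolution step can be approximated in a short sketch: given robot A's 3D map points matched to 2D observations in robot B's keyframe, a PnP solver recovers the relative pose. OpenCV's RANSAC PnP variant is used here for robustness; the paper's specific P3P-plus-bundle-adjustment pipeline is simplified, and the simulated correspondences are illustrative.

```python
# Sketch: relative pose between robots from 3D-2D matches in overlap areas.
import cv2
import numpy as np

# Assumed inputs: matched correspondences and robot B's camera intrinsics.
object_points = np.random.rand(20, 3).astype(np.float32) * 5.0
K = np.array([[525.0, 0, 320.0], [0, 525.0, 240.0], [0, 0, 1.0]])
rvec_true, tvec_true = np.array([0.1, -0.2, 0.05]), np.array([0.5, 0.0, 1.0])
image_points, _ = cv2.projectPoints(object_points, rvec_true, tvec_true, K, None)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, distCoeffs=None)
R, _ = cv2.Rodrigues(rvec)  # rotation of B's camera relative to A's map
print(ok, np.round(tvec.ravel(), 3))  # close to the simulated translation
```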

20 pages, 5333 KB  
Article
Indoor Scene Classification through Dual-Stream Deep Learning: A Framework for Improved Scene Understanding in Robotics
by Sultan Daud Khan and Kamal M. Othman
Computers 2024, 13(5), 121; https://doi.org/10.3390/computers13050121 - 14 May 2024
Cited by 12 | Viewed by 2261
Abstract
Indoor scene classification plays a pivotal role in enabling social robots to seamlessly adapt to their environments, facilitating effective navigation and interaction within diverse indoor scenes. By accurately characterizing indoor scenes, robots can autonomously tailor their behaviors, making informed decisions to accomplish specific tasks. Traditional methods relying on manually crafted features encounter difficulties when characterizing complex indoor scenes. On the other hand, deep learning models address the shortcomings of traditional methods by autonomously learning hierarchical features from raw images. Despite this success, existing deep learning models still struggle to effectively characterize complex indoor scenes because there is a high degree of intra-class variability and inter-class similarity within indoor environments. To address this problem, we propose a dual-stream framework that harnesses both global contextual information and local features for enhanced recognition. The global stream captures high-level features and relationships across the scene. The local stream employs a fully convolutional network to extract fine-grained local information. The proposed dual-stream architecture effectively distinguishes scenes that share similar global contexts but contain different localized objects. We evaluate the performance of the proposed framework on a publicly available benchmark indoor scene dataset, and the experimental results demonstrate its effectiveness. Full article
(This article belongs to the Special Issue Recent Advances in Autonomous Vehicle Solutions)
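The dual-stream design can be outlined structurally: a pretrained backbone supplies the global context stream, a small fully convolutional branch supplies local detail, and the concatenated features feed one classifier. The backbone choice, branch sizes, and the 67-class head (matching MIT Indoor 67) are assumptions, not the authors' exact architecture.

```python
# Structural sketch of a dual-stream indoor scene classifier.
import torch
import torch.nn as nn
from torchvision import models

class DualStreamNet(nn.Module):
    def __init__(self, num_classes=67):
        super().__init__()
        backbone = models.resnet18(weights="IMAGENET1K_V1")
        self.global_stream = nn.Sequential(*list(backbone.children())[:-1])
        self.local_stream = nn.Sequential(      # small FCN for local detail
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(512 + 64, num_classes)

    def forward(self, x):
        g = self.global_stream(x).flatten(1)    # high-level context
        l = self.local_stream(x).flatten(1)     # fine-grained local cues
        return self.classifier(torch.cat([g, l], dim=1))

model = DualStreamNet()
print(model(torch.rand(2, 3, 224, 224)).shape)  # torch.Size([2, 67])
```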

55 pages, 12486 KB  
Review
Methods and Applications of Space Understanding in Indoor Environment—A Decade Survey
by Sebastian Pokuciński and Dariusz Mrozek
Appl. Sci. 2024, 14(10), 3974; https://doi.org/10.3390/app14103974 - 7 May 2024
Cited by 2 | Viewed by 2508
Abstract
The demand for digitizing manufacturing and controlling processes has been steadily increasing in recent years. Digitization relies on different techniques and equipment, which produces various data types and further influences the process of space understanding and area recognition. This paper provides an updated view of these data structures and high-level categories of techniques and methods leading to indoor environment segmentation and the discovery of its semantic meaning. To achieve this, we followed the Systematic Literature Review (SLR) methodology and covered a wide range of solutions, from floor plan understanding through 3D model reconstruction and scene recognition to indoor navigation. Based on the obtained SLR results, we identified three different taxonomies (the taxonomy of underlying data type, of performed analysis process, and of accomplished task), which constitute different perspectives we can adopt to study the existing works in the field of space understanding. Our investigations clearly show that the progress of works in this field is accelerating, leading to more sophisticated techniques that rely on multidimensional structures and complex representations, while the processing itself has become focused on artificial intelligence-based methods. Full article
(This article belongs to the Special Issue IoT in Smart Cities and Homes, 2nd Edition)

15 pages, 3624 KB  
Article
A Multi-Modal Foundation Model to Assist People with Blindness and Low Vision in Environmental Interaction
by Yu Hao, Fan Yang, Hao Huang, Shuaihang Yuan, Sundeep Rangan, John-Ross Rizzo, Yao Wang and Yi Fang
J. Imaging 2024, 10(5), 103; https://doi.org/10.3390/jimaging10050103 - 26 Apr 2024
Cited by 6 | Viewed by 4752
Abstract
People with blindness and low vision (pBLV) encounter substantial challenges when it comes to comprehensive scene recognition and precise object identification in unfamiliar environments. Additionally, due to the vision loss, pBLV have difficulty in accessing and identifying potential tripping hazards independently. Previous assistive technologies for the visually impaired often struggle in real-world scenarios due to the need for constant training and lack of robustness, which limits their effectiveness, especially in dynamic and unfamiliar environments, where accurate and efficient perception is crucial. Therefore, we frame our research question in this paper as: How can we assist pBLV in recognizing scenes, identifying objects, and detecting potential tripping hazards in unfamiliar environments, where existing assistive technologies often falter due to their lack of robustness? We hypothesize that by leveraging large pretrained foundation models and prompt engineering, we can create a system that effectively addresses the challenges faced by pBLV in unfamiliar environments. Motivated by the prevalence of large pretrained foundation models, particularly in assistive robotics applications, due to their accurate perception and robust contextual understanding in real-world scenarios induced by extensive pretraining, we present a pioneering approach that leverages foundation models to enhance visual perception for pBLV, offering detailed and comprehensive descriptions of the surrounding environment and providing warnings about potential risks. Specifically, our method begins by leveraging a large-image tagging model (i.e., Recognize Anything Model (RAM)) to identify all common objects present in the captured images. The recognition results and user query are then integrated into a prompt, tailored specifically for pBLV, using prompt engineering. By combining the prompt and input image, a vision-language foundation model (i.e., InstructBLIP) generates detailed and comprehensive descriptions of the environment and identifies potential risks in the environment by analyzing environmental objects and scenic landmarks, relevant to the prompt. We evaluate our approach through experiments conducted on both indoor and outdoor datasets. Our results demonstrate that our method can recognize objects accurately and provide insightful descriptions and analysis of the environment for pBLV. Full article
(This article belongs to the Special Issue Image and Video Processing for Blind and Visually Impaired)
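The prompt-engineering step — merging image-tagging output with the user query into a pBLV-tailored prompt — can be sketched without the model calls themselves; the template wording below is an assumption, and the actual RAM and InstructBLIP invocations are omitted.

```python
# Sketch: build a pBLV-tailored prompt from tagging output and a user query.
def build_pblv_prompt(tagged_objects, user_query):
    object_list = ", ".join(tagged_objects)
    return (
        "You are assisting a person with blindness or low vision. "
        f"Objects detected in the scene: {object_list}. "
        f"User question: {user_query} "
        "Describe the surroundings in detail and warn about any "
        "tripping hazards or risks relevant to safe movement."
    )

tags = ["staircase", "wet floor sign", "handrail", "backpack on floor"]
print(build_pblv_prompt(tags, "Is it safe to walk forward?"))
```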

16 pages, 5701 KB  
Article
An Indoor 3D Positioning Method Using Terrain Feature Matching for PDR Error Calibration
by Xintong Chen, Yuxin Xie, Zihan Zhou, Yingying He, Qianli Wang and Zhuming Chen
Electronics 2024, 13(8), 1468; https://doi.org/10.3390/electronics13081468 - 12 Apr 2024
Cited by 6 | Viewed by 1584
Abstract
Pedestrian Dead Reckoning (PDR) is a promising algorithm for indoor positioning. However, the accuracy of PDR degrades due to accumulated error, especially in multi-floor buildings. This paper introduces a three-dimensional (3D) positioning method based on terrain feature matching to reduce the influence of accumulated errors in multi-floor scenes. The proposed calibration method involves two steps: motion pattern recognition and position-matching-based calibration. The motion pattern recognition aims to detect different motion patterns, i.e., taking the stairs or horizontal walking, from the streaming data. Then, stair entrances and corridor corners are matched with transition points of motion patterns and pedestrian turning points, respectively. After matching, calibration is performed to eliminate the accumulated errors. Experiments on a two-floor closed-loop path with a walking distance of about 145 m show that this method can effectively reduce the accumulated error of PDR, achieving accurate 3D positioning: the average error is reduced from 6.60 m to 1.37 m. Full article
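The two-step calibration admits a compact sketch: classify stair versus walking segments (here from barometric altitude rate, one plausible signal), then snap the drifting PDR track to a surveyed stair entrance at the pattern transition. Thresholds and the landmark coordinates are illustrative assumptions.

```python
# Sketch: motion pattern recognition plus landmark-based PDR calibration.
import numpy as np

STAIR_RATE = 0.25  # m/s of altitude change that suggests stair motion

def classify_motion(altitude, dt=1.0):
    rate = np.abs(np.diff(altitude)) / dt
    return np.where(rate > STAIR_RATE, "stairs", "walking")

def calibrate_at_transition(track, patterns, stair_entrance_xy):
    """Shift the track so the first walk->stairs transition coincides
    with the surveyed stair-entrance coordinates."""
    for i in range(1, len(patterns)):
        if patterns[i - 1] == "walking" and patterns[i] == "stairs":
            offset = np.asarray(stair_entrance_xy) - track[i]
            return track + offset  # remove accumulated PDR drift
    return track

altitude = np.array([0, 0, 0, 0.4, 0.8, 1.2])       # starts climbing at t=3
track = np.cumsum(np.ones((6, 2)) * 0.7, axis=0)    # drifting PDR positions
patterns = classify_motion(altitude)
print(calibrate_at_transition(track, patterns, stair_entrance_xy=(3.0, 1.0)))
```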

30 pages, 1424 KB  
Review
A Review of Sensing Technologies for Indoor Autonomous Mobile Robots
by Yu Liu, Shuting Wang, Yuanlong Xie, Tifan Xiong and Mingyuan Wu
Sensors 2024, 24(4), 1222; https://doi.org/10.3390/s24041222 - 14 Feb 2024
Cited by 29 | Viewed by 14399
Abstract
As a fundamental issue in robotics academia and industry, indoor autonomous mobile robots (AMRs) have been extensively studied. For AMRs, it is crucial to obtain information about their working environment and themselves, which can be realized through sensors and the extraction of corresponding information from the measurements of these sensors. The application of sensing technologies can enable mobile robots to perform localization, mapping, target or obstacle recognition, and motion tasks, etc. This paper reviews sensing technologies for autonomous mobile robots in indoor scenes. The benefits and potential problems of using a single sensor in application are analyzed and compared, and the basic principles and popular algorithms used in processing these sensor data are introduced. In addition, some mainstream technologies of multi-sensor fusion are introduced. Finally, this paper discusses the future development trends in the sensing technology for autonomous mobile robots in indoor scenes, as well as the challenges in the practical application environments. Full article
(This article belongs to the Special Issue Advanced Sensing and Control Technologies for Autonomous Robots)

22 pages, 7517 KB  
Article
Hybrid 3D Reconstruction of Indoor Scenes Integrating Object Recognition
by Mingfan Li, Minglei Li, Li Xu and Mingqiang Wei
Remote Sens. 2024, 16(4), 638; https://doi.org/10.3390/rs16040638 - 8 Feb 2024
Cited by 1 | Viewed by 3054
Abstract
Indoor 3D reconstruction is particularly challenging due to complex scene structures involving object occlusion and overlap. This paper presents a hybrid indoor reconstruction method that segments the room point cloud into internal and external components and then reconstructs the room shape and the indoor objects in different ways. We segment the room point cloud into internal and external points based on the assumption that room shapes are composed of large external planar structures. For the external points, we seek an appropriate combination of intersecting faces to obtain a lightweight polygonal surface model. For the internal points, we define a set of features extracted from them and train a classification model based on random forests to recognize and separate indoor objects. The corresponding computer-aided design (CAD) models are then placed in the target positions of the indoor objects, converting the reconstruction into a model-fitting problem. Finally, the indoor objects and room shapes are combined to generate a complete 3D indoor model. The effectiveness of this method is evaluated on point clouds from different indoor scenes, with an average fitting error of about 0.11 m, and the performance is validated by extensive comparisons with state-of-the-art methods. Full article
(This article belongs to the Special Issue Point Cloud Processing with Machine Learning)
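The internal-point classification step can be sketched with simple per-point features and a random forest; the feature set below (point height plus local covariance eigenvalues) is an assumed stand-in for the paper's descriptors.

```python
# Sketch: per-point features -> random forest for indoor object separation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors

def point_features(points, k=10):
    nn = NearestNeighbors(n_neighbors=k).fit(points)
    _, idx = nn.kneighbors(points)
    feats = []
    for neighbors in points[idx]:
        cov = np.cov(neighbors.T)
        evals = np.sort(np.linalg.eigvalsh(cov))[::-1]  # local shape descriptors
        feats.append([neighbors[0, 2], evals[0], evals[1], evals[2]])
    return np.array(feats)

points = np.random.rand(300, 3)              # stand-in indoor point cloud
labels = (points[:, 2] > 0.5).astype(int)    # toy labels: high vs. low objects
clf = RandomForestClassifier(n_estimators=50).fit(point_features(points), labels)
print(clf.score(point_features(points), labels))
```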

18 pages, 5430 KB  
Article
Three-Dimensional Indoor Positioning Scheme for Drone with Fingerprint-Based Deep-Learning Classifier
by Shuzhi Liu, Houjin Lu and Seung-Hoon Hwang
Drones 2024, 8(1), 15; https://doi.org/10.3390/drones8010015 - 9 Jan 2024
Cited by 4 | Viewed by 3081
Abstract
Unmanned aerial vehicles (UAVs) hold significant potential for various indoor applications, such as mapping, surveillance, navigation, and search and rescue operations. However, indoor positioning is a significant challenge for UAVs, owing to the lack of GPS signals and the complexity of indoor environments. Therefore, this study was aimed at developing a Wi-Fi-based three-dimensional (3D) indoor positioning scheme tailored to time-varying environments, involving human movement and uncertainties in the states of wireless devices. Specifically, we established an innovative 3D indoor positioning system to meet the localisation demands of UAVs in indoor environments. A 3D indoor positioning database was developed using a deep-learning classifier, enabling 3D indoor positioning through Wi-Fi technology. Additionally, through a pioneering integration of fingerprint recognition into wireless positioning technology, we enhanced the precision and reliability of indoor positioning through a detailed analysis and learning process of Wi-Fi signal features. Two test cases (Cases 1 and 2) were designed with positioning height intervals of 0.5 m and 0.8 m, respectively, corresponding to the height of the test scene for positioning simulation and testing. With an error margin of 4 m, the simulation accuracies for the (X, Y) dimension reached 94.08% (Case 1) and 94.95% (Case 2). When the error margin was 0 m, the highest simulation accuracies for the H dimension were 91.84% (Case 1) and 93.61% (Case 2). Moreover, 40 real-time positioning experiments were conducted in the (X, Y, H) dimension. In Case 1, the average positioning success rates were 50.8% (Margin-0), 72.9% (Margin-1), and 81.4% (Margin-2), and the corresponding values for Case 2 were 52.4%, 74.5%, and 82.8%, respectively. The results demonstrated that the proposed method can facilitate 3D indoor positioning based only on Wi-Fi technologies. Full article
(This article belongs to the Special Issue Drones Navigation and Orientation)
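The fingerprint classifier can be sketched as a small dense network mapping an RSSI vector from N access points to a discrete (X, Y, H) grid cell; the network size, access-point count, and grid layout are assumptions.

```python
# Sketch: Wi-Fi RSSI fingerprint -> 3D grid-cell classification.
import torch
import torch.nn as nn

N_APS, N_CELLS = 20, 48  # e.g., a 4 x 6 horizontal grid x 2 height levels

model = nn.Sequential(
    nn.Linear(N_APS, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, N_CELLS),          # one logit per fingerprint cell
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One toy training step on random fingerprints.
rssi = torch.randn(32, N_APS)               # batch of scaled RSSI vectors
cells = torch.randint(0, N_CELLS, (32,))    # ground-truth grid cells
loss = loss_fn(model(rssi), cells)
optimizer.zero_grad(); loss.backward(); optimizer.step()
print(float(loss))
```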
