Article

Thermal, Multispectral, and RGB Vision Systems Analysis for Victim Detection in SAR Robotics

Christyan Cruz Ulloa, David Orbea, Jaime del Cerro and Antonio Barrientos
Centro de Automática y Robótica (CSIC-UPM), Universidad Politécnica de Madrid—Consejo Superior de Investigaciones Científicas, 28006 Madrid, Spain
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(2), 766; https://doi.org/10.3390/app14020766
Submission received: 30 November 2023 / Revised: 14 January 2024 / Accepted: 15 January 2024 / Published: 16 January 2024
(This article belongs to the Special Issue Recent Progress in Infrared Thermography)

Abstract

Technological advancements have facilitated the development of sophisticated vision systems, integrating optical sensors with artificial vision and machine learning techniques to create applications in different fields of robotics. One such field is Search and Rescue (SAR) robotics, which has historically played a significant role in assisting brigades following post-disaster events, particularly in exploration phases and, crucially, in victim identification. The importance of employing these systems in victim identification lies in their functionality under challenging conditions, enabling the capture of information across different light spectrum ranges (RGB, Thermal, Multispectral). This article proposes an innovative comparative analysis that scrutinizes the advantages and limitations of three sensor types in victim detection. It explores contemporary developments in the state-of-the-art and proposes new metrics addressing critical aspects, such as functionality in specific scenarios and the analysis of environmental disturbances. For the indoor and outdoor testing phase, a quadrupedal robot has been equipped with these cameras. The primary findings highlight the individual contributions of each sensor, particularly emphasizing the efficacy of the infrared spectrum for the thermal camera and the Near Infrared and Red Edge bands for the multispectral camera. Ultimately, following system evaluations, detection precisions exceeding 92% and 86%, respectively, were achieved.

1. Introduction

The past decade has witnessed significant technological advancements in perception systems, with a particular focus on vision-based systems applied in the domain of field robotics. Notably, a considerable impetus has been observed in applications focused on vision sensors, as highlighted in [1,2,3]. These kinds of sensors leverage different light spectrum ranges to generate environmental data, thereby facilitating the extraction of valuable information essential for robotic mission applications. Sensors incorporated in RGB, thermal, and multispectral cameras are the most widely utilized in outdoor robotics, particularly the first two, which are commonly applied in the Search and Rescue (SAR) domain [4,5]. In contrast, the latter (multispectral) exhibits extensive applicability in precision agriculture [6,7], although recent studies have also demonstrated its utility in SAR robotics.
Within the spectrum of applied advancements in SAR, the contemporary state of the art directs attention toward perspectives primarily centered on the general identification of people [8,9,10]. One of the principal goals inherent in Search and Rescue missions is to optimize the preservation of lives within the briefest conceivable timeframe. The initial hours after a disaster are characterized by heightened criticality, with the likelihood of locating survivors being at its peak during this period. Nonetheless, owing to the abrupt occurrence of these events, achieving the full preparedness of rescue teams for prompt deployment within a few hours and at a designated location poses a formidable challenge. Rescue teams have saved victims even ten days after a catastrophe, as exemplified by the earthquake in Turkey in February 2023. Notably, three individuals were rescued alive 248 h after the event, as reported by CNN [11]. Furthermore, the challenging conditions in post-disaster environments impede the visual identification of victims during the initial assessment by a first-responder. These adverse conditions encompass scenarios such as victims being entirely or partially concealed by debris, the presence of immobile or unconscious victims, low or no illumination in the surroundings [12], potential gas leaks, and electrical hazards [13].
In the state of the art, metrics are proposed for validating SAR missions, such as the one described by Jacoff et al. [14] for RoboCupRescue, which involves parameters such as the number of located victims, the number of robots and operators involved, and the accuracy, following Equation (1). The current metrics for assessing the quality of a detection method primarily rely on the number of individuals found [15]. In the works by Katsamenis et al. [16] and De Oliveira et al. [17], the authors present approaches related to person detection but do not establish a comparison or specific metrics for validating the quality and implication of sensors in detections. While current methods score the missions considering victim detections, the current state of the art lacks a methodology to evaluate and compare the factors influencing this detection, especially by analyzing the three types of visual sensors (RGB, Thermal, and Multispectral).
$$ RobotRescueScore = \frac{VictimsPoints}{NumberOfRobots \cdot \frac{(1 + NumberOfOperators)}{3}} \cdot AverageAccuracy \quad (1) $$
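For illustration, a brief worked example of this scoring relation follows. It is a hypothetical Python sketch that mirrors the reconstruction of Equation (1) given above; the victim points, robot and operator counts, and average accuracy are illustrative values, not results from any reported mission.

```python
# Hypothetical worked example of Equation (1) as reconstructed above.
# All input values are illustrative, not taken from a real RoboCupRescue run.
def robot_rescue_score(victim_points, n_robots, n_operators, avg_accuracy):
    return victim_points / (n_robots * (1 + n_operators) / 3.0) * avg_accuracy

# e.g. 50 victim points, 2 robots, 1 operator, 90% average accuracy
print(robot_rescue_score(50, 2, 1, 0.9))  # -> 33.75
```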
Combining different sensory sources has shown high effectiveness in extracting relevant information from analyzed elements, as seen in the work by Corradino et al., where radar and optical satellite imagery are combined to map lava flows [18]. It is noteworthy in the state-of-the-art literature that the use of RGB–thermal cameras is prevalent; yet, the simultaneous incorporation of multispectral bands for this purpose has not been explored. This work builds on the authors' previous victim detection developments in the RGB [19], Thermal [20,21], and Multispectral [22] domains.

2. Related Works

2.1. Context and Historical Evolution

The utilization of visible information captured by cameras is pivotal across various domains of robotics and other scientific disciplines for the analysis of processes, states, and decision-making. Presently, in search and rescue robotics, a substantial portion of the utilized information originates from the visible spectrum of light (RGB images) and a segment from the infrared spectrum (thermal images). However, the multispectral spectrum has been relatively underexplored in this realm of robotics, particularly in identifying victims during search and rescue missions.
Figure 1 shows three distinct spectral ranges of light, specifically RGB [400 nm–700 nm], thermal infrared [8 μm–14 μm], and multispectral (Red Edge [690 nm–730 nm] and Near-Infrared [NIR] [750 nm–2.5 μm]), as described previously.
Notably, the human-perceptible visible light spectrum (RGB range) is considerably narrower than the infrared or ultraviolet spectra, which are perceptible by certain animals such as snakes or insects. The multispectral range, whose use has so far been concentrated in agriculture, is a particularly distinctive choice for victim detection. The selection of these spectral ranges is informed by the anticipated directions of future research, as delineated in contemporary studies, and aligns with the specifications of commercially available portable data acquisition instruments (cameras) compatible with robot payloads of less than 10 kg.
A comprehensive examination of the research works over the past two decades is necessary for scientific research on RGB, Thermal, and Multispectral sensors within search and rescue operations. A literature search was conducted using the online database Google Scholar to generate a chronological graph depicting the evolution of publications (Figure 2). The graph identifies three types of histograms: yellow, grey, and blue. These histograms were generated as follows: the search criteria for each type of sensor, defined by colour, were the respective phrases 'Multispectral sensors in search and rescue', 'Thermal sensors in search and rescue', and 'RGB sensors in search and rescue'. An annual search was performed over one-year intervals; for example, for the year 2000, the interval considered was [1999–2000], yielding 23 data points for each sensor type. These data were entered into the histogram generation function of Microsoft Excel for visual analysis and graphical representation, resulting in Figure 2. Simultaneously, second-degree polynomial trend curves were applied to characterize the increasing trend in the evolution of the analyzed works.
During the early 2000s, the scientific output in these three fields was notably modest. However, a remarkable exponential growth trend has been observed in the last two decades. This surge in research activity can be attributed to a confluence of factors, with a significant emphasis on the scientific community’s growing interest in exploring applications related to search and rescue. Additionally, advancements in technology, particularly in the realm of cost reduction for both sensors and data processing equipment, have played a pivotal role in fostering this upward trajectory.
A particular observation pertains to the distribution of research output among the three sensor types. Thermal-sensor-related works (represented in grey in Figure 2) stand out as the most abundant, surpassing studies in the RGB (blue bar) domain by a substantial ratio of over 2:1. Meanwhile, studies in the RGB field outnumber those related to multispectral images (orange bar) by approximately 20%. This distribution can be rationalized a priori, considering the challenging conditions in SAR environments, especially poor or nonexistent illumination. Consequently, the materials’ thermal-emission-based information emerges as a special resource for rescuers, facilitating initial inferences regarding environmental conditions, leak detection, victim identification, and other critical aspects.

2.2. Vision Sensors in Search and Rescue Robotics

In this section, the most relevant developments in the state of the art related to the three types of sensors applied to identifying victims in SAR robotics are compiled, as detailed in Table 1.

3. Materials and Methods

3.1. Robotic System and Processing

A quadrupedal robot equipped with hardware–software instrumentation was employed to develop the mission indoors and outdoors. This instrumentation enables both data collection and onboard processing using the ROS framework. The utilized instruments are outlined in Table 2, providing details on the robotic system’s characteristics and specifications for the cameras employed in the process. Specifically, the spectral ranges of light acquisition for these cameras are described.
The detection of victims relies on convolutional neural network models (Thermal [20,21], RGB [19], Multi-spectral [22]) from the authors’ previous developments. These models have been integrated into the proposed system to generate a centralized and robust detection by fusing three inferences and a subsequent comparative analysis.
Figure 3 illustrates the developed structure, which includes a Command Station for monitoring the robot’s navigation process. This station sends various user-defined points of interest for navigation, and the robot explores these points reactively. Reactive navigation involves reaching the designated point, capturing three types of data, processing the images through the three convolutional neural network models to generate predictions, establishing a redundant filtering mechanism across the three systems, determining valid detections, and subsequently placing indicators on the map generated by the robot during the mission’s progression.
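The exact rule behind the "redundant filtering mechanism across the three systems" is not spelled out here; the following Python sketch shows one plausible realization, under the assumption that the three detectors' bounding boxes are registered to a common image frame: the union of all per-modality detections is kept, with overlapping boxes (IoU above a threshold) merged so the same victim is not counted twice. Function names and the threshold are illustrative.

```python
# One plausible realization of the redundant filtering across RGB, thermal and
# multispectral detections (assumes boxes are registered to a common frame).
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def fuse_detections(rgb_boxes, thermal_boxes, multi_boxes, iou_thr=0.5):
    """Union of per-modality victim boxes, de-duplicated by IoU overlap."""
    fused = []
    for box in rgb_boxes + thermal_boxes + multi_boxes:
        if all(iou(box, kept) < iou_thr for kept in fused):
            fused.append(box)
    return fused  # each surviving box becomes one candidate victim marker
```

Under this scheme, a candidate found by any single modality still produces a map marker, which is consistent with the union (∪) of per-sensor victims used later in Algorithm 1.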

3.2. Field Tests

Indoor and outdoor environments with conditions akin to a post-disaster setting were explicitly delineated as test scenarios, in which diverse victim identification missions were executed to undertake the proposed comparison. Within these environments, individuals simulating victims were strategically positioned.
Outdoor experiments were conducted in collaboration with the Spanish Military Emergency Unit (UME) during the “XVI Jornadas Internacionales de la Universidad de Málaga sobre Seguridad, Emergencias y Catástrofes”. Together with the organization, and leveraging their expertise, we meticulously recreated environmental conditions, allowing for the extrapolation of results to realistic situations. Additionally, the standards for such experiments were dictated by the National Institute of Standards and Technology (NIST) [41], which was also considered for indoor tests, defining parameters such as the type of debris, ground obstacles, etc. The experiments were conducted repetitively under the same environmental conditions to minimise error variation for each evaluation aspect.
In the first scenario, a series of victims were strategically placed in a tunnel, requiring identification by various participating robotic teams. Subsequently, these identifications were verified through specialized canine units. Figure 4a illustrates a panoramic view of the tunnel exit, featuring emergency response teams, a helicopter, and a vehicle positioned for other simulation exercises. On the other hand, Figure 4b depicts an indoor scenario within the Center for Automation and Robotics facilities, where primary phase system validation tests were conducted before outdoor experiments.

3.3. Algorithms and Evaluation Metrics

A system was devised to evaluate the proposed method; it is based on a series of missions in which the proposed victim detection system is exercised through the three proposed vision modes. The outcome aims to maximize the accurate identification of victims in an explored area, placing potential “refined” locations on an informed environmental map. In this manner, the markings serve as reference points for first responders to prioritize their attention to those areas.

3.3.1. Implemented Algorithm

The implemented algorithm governing the system operates on a sequential modular synergy for decision-making. In the first step, the operator-defined destination points were taken, and collision-free trajectories were calculated using an RRT (Rapidly-exploring Random Tree) planner. Position estimation was acquired through sensor fusion, combining data from the Inertial Measurement Unit (IMU) and the lidar system. At this stage, the robot navigated through the environment to capture images.
The localization was based on SLAM, including an EKF (Extended Kalman Filter) that fuses Lidar and IMU data, since GPS cannot accurately estimate positions in such conditions (indoors). On the other hand, the assignment of inspection points (x, y) was carried out through the interface of the informed map generated (described in Figure 3). The navigation to each point was autonomous, with dynamic obstacle avoidance facilitated by the Lidar system. The path planning, considering known starting and destination points and the environment map, employed an RRT planner and the A* search algorithm.
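As a concrete illustration of how an operator-defined inspection point can be dispatched to the navigation stack, the following ROS 1 Python sketch sends an (x, y) goal to a move_base-compatible action server. The use of move_base here, as well as the node name, is an assumption for illustration; in this work the planner behind the navigation stack combines RRT and A* with SLAM-based localization.

```python
#!/usr/bin/env python
# Hedged sketch: sending an operator-defined inspection point (x, y) as a
# navigation goal. Assumes a move_base-compatible action server; the actual
# planner behind it (RRT + A* in this work) is configured separately.
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

def send_inspection_goal(x, y, frame="map"):
    client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
    client.wait_for_server()

    goal = MoveBaseGoal()
    goal.target_pose.header.frame_id = frame
    goal.target_pose.header.stamp = rospy.Time.now()
    goal.target_pose.pose.position.x = x
    goal.target_pose.pose.position.y = y
    goal.target_pose.pose.orientation.w = 1.0  # heading handled by later in-place rotations

    client.send_goal(goal)
    client.wait_for_result()
    return client.get_state()

if __name__ == "__main__":
    rospy.init_node("goal_inspection_sketch")
    send_inspection_goal(2.5, 1.0)  # hypothetical point of interest on the 2D map
```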
On the other hand, vision systems undergo a series of stages, from data acquisition to the identification of victims. Raw images are captured and preprocessed using computer vision techniques to eliminate environmental noise and enhance their quality before entering the neural network. For this purpose, techniques such as erosion, dilation, and Gaussian filtering were applied to the images. In the case of multispectral images, they are matrix-combined using an operation defined by the Victim Detection Index (VDIN). Once processed by the CNN, detection results, such as bounding boxes, labels, and precision, are incorporated into each resulting image. These processed data are stored through logs to generate the final mission percentages at the end of the operation.
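A minimal OpenCV sketch of the noise-removal steps named above (erosion, dilation, Gaussian filtering) is given below; the kernel sizes and iteration counts are illustrative assumptions rather than the values used on board.

```python
import cv2
import numpy as np

# Minimal sketch of the preprocessing named in the text; kernel sizes and
# iteration counts are illustrative, not the on-board configuration.
def preprocess(image):
    kernel = np.ones((3, 3), np.uint8)
    cleaned = cv2.erode(image, kernel, iterations=1)     # suppress small bright noise
    cleaned = cv2.dilate(cleaned, kernel, iterations=1)  # restore eroded object extent
    return cv2.GaussianBlur(cleaned, (5, 5), 0)          # smooth residual sensor noise
```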
Pseudocode Algorithm 1 delineates the functionality of the implemented system, starting from the exploration phase through zones designated by the operator, the acquisition of visual data from the environment, the processing of three types of images, and the evaluation of system performance.

3.3.2. CNN-Based Algorithm

In this section, the CNN-based Detection Algorithm is introduced. For the evaluation phase of the proposed method, an embedded subroutine was developed within Algorithm 2. This subroutine identified victims using three types of images through convolutional neural networks. Specifically, the architecture of YOLOv8 was employed due to its notable features: its high inference speed and high precision rate in detection. The version of YOLOv8 utilized was ’m’, chosen for its balance between accuracy and inference time compared to the ’n’ and ’s’ versions. While the ’x’ and ’l’ versions marginally increase accuracy, the processing time hinders real-time inference.
As a preliminary step to detection, a preprocessing phase was conducted on the multispectral images. Though the multispectral camera outputs RGB channels as well, for the purposes of this study, the channels relevant to the victim detection index (Red, Green, Near Infrared) from Equation (2), as previously proposed by the authors in [22], were utilized. Both the thermal camera and the multispectral index produce a single-channel output image with intensities normalized from 0 to 255, unlike RGB, which combines three channels; since the network requires three channels for processing, in these two cases the single channel was replicated three times before being fed to the CNN.
$$ I_{nVD} = \frac{\rho_{GRE} - \rho_{RED} - \rho_{NIR}}{\rho_{GRE} + \rho_{RED}} \quad (2) $$
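A short sketch of this multispectral branch follows: the single-channel index image is built from the Green, Red, and NIR bands (following the reconstruction of Equation (2) above, which carries some uncertainty), replicated to three channels, and passed to a YOLOv8 model through the ultralytics API. The model path, band scaling, and normalization are illustrative assumptions.

```python
import numpy as np
from ultralytics import YOLO  # YOLOv8 implementation

# Hedged sketch of the multispectral detection branch. The index follows the
# reconstruction of Equation (2) above; scaling and the model path are
# illustrative assumptions.
def vdin(green, red, nir, eps=1e-6):
    """Single-channel victim detection index rescaled to [0, 255]."""
    index = (green - red - nir) / (green + red + eps)
    index -= index.min()
    return (255 * index / (index.max() + eps)).astype(np.uint8)

def detect_victims(model_path, green, red, nir):
    model = YOLO(model_path)                        # e.g. a trained "victims_mult.pt"
    mono = vdin(green.astype(np.float32), red.astype(np.float32), nir.astype(np.float32))
    image = np.dstack([mono, mono, mono])           # replicate the channel x3 for the CNN
    return model(image)                             # Results with boxes, labels, confidences
```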
Algorithm 1 Victim detection and robotic exploration system.

Data:
    imRGB ← RGB image [640 × 480]
    imTherm ← Thermal image [640 × 480]
    imMult ← Multi-spectral image [1500 × 1500]
    Robot_pose (XYZ, roll-pitch-yaw) ← SLAM-based
    Env_state ← 2D map
Result:
    Eval_performance error [RGB, THERM, MULT]

function Victim_Identification(imRGB, imTherm, imMult)      ▹ CNN vision detection
    imMult ← VDIN(imMult)                                   ▹ Multispectral victim-index conversion
    CNN_based_algorithm(imRGB, imTherm, imMult)
    return [Victims[1..n, bbox, mAP]]
end function

function Goal_Inspection(x, y)                              ▹ Map point inspection
    robot_navigation_stack ← RRT(x, y)
    rotate 90°, 180°, 270°, 360°
    return [images, pose]
end function

while mission_on do                                         ▹ Main loop
    send(Goal_Inspection(x, y)) ← User Interface
    eval(Victim_Identification(imRGB, imTherm, imMult))
    if Victims[] not null then
        for k ← 1 in Victims[] do
            get [α, β, …, τ] based on mAP
            Total_victims ← victimTherm ∪ victimRGB ∪ victimMult in inspected zone
        end for
        eval error(True victims number, Total_victims)
    end if
    score_SAR and score_General ← [α, β, …, τ]
    set pose_mark[1..n] in 2D_Map
end while
The pre-training stage of the three convolutional neural network models involved data preprocessing. In this process, images were labelled according to the classes in the following list, with several labels used to account for scenarios where victims may be partially obscured. In the case of rescuers, an additional label for legs was included, considering the robot's limited visibility field.
  • Victim, victim leg, victim torso, victim arm, victim head;
  • Rescuer, rescuer leg.
Before training, data augmentation was performed to enhance the model's robustness against disturbances commonly encountered in outdoor environments. This involved applying morphological operations to the images, specifically rotation (20%), brightness modification (30%), and contrast adjustment (30%). The datasets were divided into training (70%), validation (20%), and test (10%) sets, available in the repositories: RGB (https://drive.upm.es/s/xPKDp5Xyh1HTHWA, accessed on 14 January 2024) (total = 1454, 2064 × 1544 px), Multispectral (https://drive.upm.es/s/xPKDp5Xyh1HTHWA, accessed on 14 January 2024) (total = 1454, 2064 × 1544 px), and Thermal 1 (https://mega.nz/fm/26RjCCiQ, accessed on 14 January 2024)–Thermal 2 (https://drive.upm.es/s/xPKDp5Xyh1HTHWA, accessed on 14 January 2024) (total = 3750, 1920 × 1080 px). The neural network training parameters were set to 190 epochs, a batch size of 4, and 63,000 iterations. The newly trained models were used to infer new images following the Pseudocode Algorithm 2.
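A hedged sketch of the training configuration described above (YOLOv8-m, 190 epochs, batch size 4) using the ultralytics API is shown below; the dataset YAML name and image size are illustrative assumptions, not the authors' actual configuration files.

```python
from ultralytics import YOLO

# Hedged sketch of the training setup described above; the dataset descriptor
# and image size are illustrative assumptions.
model = YOLO("yolov8m.pt")            # pretrained medium variant as starting point
model.train(
    data="victims_thermal.yaml",      # hypothetical dataset YAML (train/val/test splits)
    epochs=190,
    batch=4,
    imgsz=640,
)
metrics = model.val()                 # per-class metrics and confusion matrix, as in Figure 5
```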
Algorithm 2 CNN-based algorithm.

Input: Trained CNN models M1,2,3
    I_RGB ← RGB image [640 × 480]
    I_thermal ← Thermal image [640 × 480]
    I_multispectral ← Multi-spectral image [1500 × 1500]
    Ground-truth labels GT
Output: Number of victims N_victims
    Detection precision P_detection
    IoU metric
    Best detection metric
    Position information

Initialize N_victims ← 0, N_correct ← 0, IoU_sum ← 0, Best_metric ← 0
Inference on new data:
    for each image i (RGB, thermal, multispectral) in InferenceData do
        Obtain prediction from M1,2,3
        Store prediction, associated probability, and predicted bounding box
        if prediction is “victim” and the corresponding GT_i is also “victim” and not a “rescuer” then
            N_correct ← N_correct + 1
            Calculate IoU between the predicted box and the ground-truth box
            IoU_sum ← IoU_sum + IoU
            if probability > current Best_metric then
                Best_metric ← probability
                Store the position information of the best detection
Categorization of results:
    P_detection ← N_correct / (total number of victims in GT)
    Average IoU ← IoU_sum / N_correct
Return: N_victims, P_detection, average IoU, Best_metric, Position information
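A compact Python sketch of the bookkeeping in Algorithm 2 follows, reusing the iou() helper sketched in Section 3.1; the data structures (dictionaries with label, confidence, and box) and the IoU threshold are illustrative assumptions.

```python
# Sketch of the per-mission evaluation in Algorithm 2; iou() is the box-overlap
# helper sketched in Section 3.1. Data structures and the 0.5 threshold are
# illustrative assumptions.
def evaluate(predictions, ground_truth, iou_thr=0.5):
    n_correct, iou_sum, best_metric, best_position = 0, 0.0, 0.0, None
    for pred in predictions:                     # pred: {"label", "conf", "box"}
        for gt in ground_truth:                  # gt:   {"label", "box"}
            overlap = iou(pred["box"], gt["box"])
            if (pred["label"] == "victim" and gt["label"] == "victim"
                    and overlap >= iou_thr):
                n_correct += 1
                iou_sum += overlap
                if pred["conf"] > best_metric:
                    best_metric = pred["conf"]
                    best_position = pred["box"]
                break
    p_detection = n_correct / max(len(ground_truth), 1)
    mean_iou = iou_sum / max(n_correct, 1)
    return p_detection, mean_iou, best_metric, best_position
```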
Confusion matrices were employed to highlight the precision of class detection in order to evaluate the trained models. Figure 5 shows the three confusion matrices obtained by evaluating the trained models. The key findings reveal that the main diagonals exhibited high values, emphasizing the correct functionality of the models, with Thermal, Multispectral, and RGB models ranked in descending order of effectiveness.

3.3.3. Proposed Metrics for Method Analysis

A set of metrics is proposed, as described in Table 3, to assess the effectiveness of each vision system in victim detection. Following the state of the art and the authors’ framework, these metrics encompass the primary evaluation criteria in the field of search and rescue for each system individually, addressing both functional conditions and the general performance of the proposed detection system.
In a general mode, a direct evaluation of the three types of sensors is proposed for the generic conditions present in diverse environments, incorporating coefficients normalized to [0–100] according to functional efficacy, as delineated in Equation (3) (Thermal–RGB–Multi). Given the generality of the application framework, the time parameter (δ) was not considered in this instance, and the coefficients relating to victims focused generically on objects.
$$ Score_{General}(Thermal{-}RGB{-}Multi) = \alpha + \beta + \gamma + \epsilon + \theta + \phi + \omega + \lambda + \tau \quad (3) $$
On the other hand, in a SAR approach, systems are evaluated concerning their correct functionality in each envisaged scenario. Similarly, specific parameters, such as processing time, are negatively penalized because time is a critical parameter in exploration. In contrast, others, such as robustness in changing light conditions, contribute more significantly to the total score. Likewise, critical situations, such as in identifying a concealed victim, are of particular interest, as is the ability to identify victims in poor lighting conditions, a recurrent situation in post-disaster environments.
The expressions encapsulating the weighting relationship for the coefficients are summarized in Equation (4) for each type of image detection. The modified coefficients were established following the conducted experiments, assigning greater weight to those deemed more pertinent in Search and Rescue operations.
$$ Score_{SAR}(Thermal{-}RGB{-}Mult) = \alpha + 1.75\,\beta + \gamma + (100 - \delta) + 0.8\,\epsilon + 2\,\theta + \phi + \omega + \lambda + 1.5\,\tau \quad (4) $$
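As a worked check of Equations (3) and (4), the following sketch recomputes both scores from the thermal-range coefficients later reported in Table 4; small deviations from the tabulated 755 and 966 arise because the table lists rounded mean values.

```python
# Worked check of Equations (3) and (4) using the thermal-range coefficients of
# Table 4 (rounded means, hence small deviations from the tabulated scores).
SAR_WEIGHTS = {"alpha": 1.0, "beta": 1.75, "gamma": 1.0, "epsilon": 0.8,
               "theta": 2.0, "phi": 1.0, "omega": 1.0, "lambda_": 1.0, "tau": 1.5}

def score_general(c):
    # delta (processing time) is excluded from the general score
    return sum(c[k] for k in SAR_WEIGHTS)

def score_sar(c):
    return sum(w * c[k] for k, w in SAR_WEIGHTS.items()) + (100 - c["delta"])

thermal = {"alpha": 89.3, "beta": 97.2, "gamma": 92.1, "delta": 97.1, "epsilon": 31.7,
           "theta": 95.1, "phi": 96.7, "omega": 93.1, "lambda_": 68.1, "tau": 94.2}

print(round(score_general(thermal)), round(score_sar(thermal)))  # ~758, ~969
```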

4. Results and Discussion

4.1. Mission Execution in Indoor–Outdoor Environments

To evaluate this study, the proposed metrics (α, …, τ), and the overall system presented, a series of missions were conducted indoors and outdoors, as synthesized in Figure 6. This figure illustrates various instances of traversal in different environments over distinct time intervals and perspectives on the progression of data collection phases.
Figure 6b–e depict the traversal sequence for the indoor scenario, spanning from time 0 to a median time of 200 s. A blue circle denotes the robot’s position at each instant. In this scenario configuration, 20 tests were conducted, with five victims distributed throughout, including covered and partially covered cases. On the other hand, Figure 6e–h showcase the tunnel exploration carried out during the “XVI Jornadas Internacionales de la Universidad de Málaga sobre Seguridad, Emergencias y Catástrofes”. This exercise was executed once in a single pass due to the logistical complexity involved. The figures highlight the individual explorations conducted by ARTU-R and Spot robots.
Figure 6i,k provide the perspective captured by a UAV during the mission’s development. In contrast, Figure 6j,l depict the corresponding thermal footprints captured by the aerial vehicle. Finally, Figure 6m,p illustrate the robot capturing images of a person acting as a victim for processing. These images were acquired from various perspectives assigned by the operator.

4.2. System Performance Evaluation

4.2.1. Evaluation of Victim Identification in SAR Missions Performed

Figure 7 illustrates the detections conducted in the missions of the scenarios presented in Figure 6. The figure highlights a notably high precision rate, exceeding an average of 87%, for identifying both victims and first responders.
Figure 7a,d showcase the detection results with a high average precision rate exceeding 91% for indoor scenarios, employing thermal and multispectral images. Notably, the system demonstrates the effective detection of torsos and legs. On the other hand, Figure 7b,c,e pertain to outdoor scenarios, emphasizing bounding boxes that highlight various body parts of identified victims, achieving a precision rate exceeding fifty percent.

4.2.2. Individual Evaluation of Systems Using the Proposed Metrics

The metrics outlined in Table 3 have been evaluated based on victim detection quality data in the conducted missions. The evaluation considers the mean precision values (mAP) for inferences under the ten specified conditions across repetitions of missions indoors and subsequent evaluations outdoors.
Figure 8 compiles the normalized percentage values from 0 to 100 obtained for each of the three types of images, according to the ten indices established in this study. This figure provides an intuitive and straightforward overview of the strengths of each image type relative to its counterparts. Significant differentials were observed, particularly in scenarios involving fully covered victims and poor light conditions, where thermal cameras exhibit a considerable advantage. In contrast, in aspects such as heat sources or summer weather conditions, RGB and multispectral images are better alternatives.
Concerning indoor, outdoor, and changing light conditions, the three cameras’ mean effectiveness values remain within a similar range. However, for scenarios involving partially covered victims or those wearing “camouflage” clothing that may be confused with the surroundings, the ranges differ moderately, with differences of up to 20 per cent.
The rounded mean values corresponding to the radial Figure 8, relative to the coefficients of situational analysis for environmental conditions, are synthesized in Table 4 under the Metrics Analysis section. Here, the results of Equation (3) are also presented, representing the values for each image type in a generic sense and specifically according to Equation (4) for the Search and Rescue case.
The highest to lowest calculated scores for generic detection scenarios in exploration missions are Thermal 755, Multispectral 714, and RGB 662. The variation between extremes is 93 points. A similar situation arises with the SAR Score, where the order remains unchanged. Still, the specific incidence difference for Search and Rescue is much higher (with a difference of up to 222 points), considering the weighting factors for the proposed coefficients.
In all three cases, as well as for experiments conducted indoors and outdoors, the best individual performer is the thermal range for victim detection, followed by multispectral and RGB.

4.2.3. Combined Evaluation of Systems Using the Proposed Metrics

Figure 9 presents a boxplot diagram of the percentage values of victim detection for each configuration of the coefficients (α, …, τ). The most significant differences were observed for the coefficients ϵ and θ, highlighting their pronounced impact on victim identification. In contrast, τ and γ exhibit a minimal influence on their respective parameters for identifying victims with any of the cameras.
These findings underscore the sensitivity of the detection outcomes to variations in specific coefficients, particularly those associated with environmental and contextual considerations. The coefficients β , ϵ , δ , and θ emerge as critical factors influencing the performance of victim detection algorithms, suggesting the need for the careful consideration and optimization of these parameters in the designed system.
On the other hand, the evaluation of processing time, measured in frames per second (fps) for the three cases using the YOLOv8-m version, demonstrates a performance ranging from approximately 26 to 28 fps for inference through the convolutional neural network (CNN) on thermal and RGB images, respectively. Meanwhile, the multispectral range exhibits a rate of 8 fps, attributable to the preprocessing and size of images across different spectral ranges. Although the processing times in the first two cases approach so-called real-time processing, the last case is an exception. To compensate, a pause sequence is executed, during which the robot remains stationary in the sample acquisition zone to gather and process data, effectively absorbing this latency.
Although individual victim detection systems tend to perform well in specific cases, the results demonstrate that a system cannot be generalized for all scenarios. Therefore, once the individual impact of each coefficient is known, it is necessary to establish a robust and redundant system for victim detection. Figure 10 illustrates the maximized outcome of the areas covered by the implemented redundant system, reaching a success rate of up to 93.5% (following Table 4) in potential search and rescue scenarios.

4.2.4. Discussion

The approach to victim detection, and the development of a substantial comparison among the three types of images to achieve this purpose, has allowed the establishment of criteria for examining the strengths and weaknesses for each image type under specific functionality conditions. While, individually, the systems exhibit acceptable functionality in particular conditions, the proposed synergy and selective use of image types based on environmental conditions result in a detection accuracy exceeding 93%, as observed in the conducted experiments.
In this context, the optimal functionality conditions for the different sensors are as follows: thermal sensors perform well under changing light conditions, detect individuals regardless of clothing colour, and effectively identify partially or fully covered victims in low-light conditions. Multispectral sensors demonstrate strength in scenarios involving heat sources in summer and fire conditions, both indoors and outdoors. Finally, RGB sensors excel in processing time.
Regarding the state of the art, no similar work has been found that compares the three image types by evaluating environment metrics and functionality. Most approaches to victim detection in search and rescue environments rely on thermal cameras, owing to the applicability of computer vision techniques for segmenting a victim's thermal information from the environment and the commercial availability of equipment oriented to this task. However, there are significant advances in the use of multispectral cameras to detect biometric characteristics, which may be useful for feature extraction oriented to victim detection in the non-visible spectrum in post-disaster environments.
Likewise, almost all works focus on integrating sensors in UAVs from the top view plane, working on datasets. Along these lines, references [22,38] propose a different approach based on detecting and localizing victims online from UGVs in navigation areas with low-light conditions. Table 1 provides a detailed overview of the most relevant works and presents a comparison concerning the methods and techniques employed in their development.

5. Conclusions

In the context of this study, it is deduced that the spectral ranges of light with the most significant impact on the detection of victims are primarily Green (GRE), Red (RED), Near-Infrared (NIR), and Infrared. Regarding the former two, their significance lies in their substantial contribution of environmental information. On the other hand, the latter two are noteworthy for their ability to capture information that eludes human visual perception.
The comparative analysis undertaken for victim detection across three spectral ranges has facilitated the identification of key parameters with varying degrees of impact on system precision. Notably, θ (indicating a totally covered victim), β (corresponding to poor light conditions), and ϵ (indicating the presence of heat sources) have emerged as influential factors significantly affecting victim identification. In contrast, τ (associated with changing light conditions) and γ (associated with indoor settings) exhibit minimal impact on identifying victims with any of the cameras. The prominent influence of these identified factors accentuates the critical importance of careful consideration and optimization during system design.
Regarding thermal imaging dominance under low-light conditions, these images outperform RGB and multispectral methods in victim detection, leveraging their reliance on thermal footprints. The method detects individuals hidden by slender obstructions, demonstrating superior obstacle penetration compared to RGB-based techniques.
Whereas thermal imaging relies on a single measurement band, the Multispectral method's broader set of band combinations enhances its adaptability and robustness in diverse scenarios. Additionally, the Multispectral method demonstrates robust detection in challenging conditions, such as scenarios involving less distinctive clothing, where RGB methods face limitations. In scenarios with heat sources or incidents like fires, the combined use of RGB and Multispectral methods offers a substantial advantage, surpassing the performance of the thermal method.
As prospective avenues for future research, exploring potential novel indices could be contemplated through integrating the diverse bands captured by the three types of sensors. These indices could be iteratively generated through repetitive loops, leveraging their operational versatility as matrices. Alternatively, optimization and error minimization techniques, facilitated by machine learning, could be applied to extract distinguishing elements from the new images. This approach aims to assess their relevance in the detection of victims.

Author Contributions

Conceptualization, A.B., C.C.U., D.O. and J.d.C.; methodology, A.B. and C.C.U.; software, C.C.U. and D.O.; validation, C.C.U. and D.O.; formal analysis, C.C.U. and A.B.; investigation, C.C.U., A.B., D.O. and J.d.C.; resources, A.B. and J.d.C.; data curation, C.C.U. and D.O.; writing—original draft preparation, C.C.U., D.O. and J.d.C.; writing—review and editing, C.C.U., A.B. and D.O.; visualization, A.B. and J.d.C.; supervision, A.B. and J.d.C.; project administration, A.B. and J.d.C.; funding acquisition, A.B. and J.d.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been possible thanks to the financing of TASAR (Team of Advanced Search And Rescue Robots), funded by "Proyectos de I+D+i del Ministerio de Ciencia, Innovacion y Universidades" (PID2019-105808RB-I00), and "Proyecto CollaborativE Search And Rescue robots (CESAR)" (PID2022-142129OB-I00), funded by MCIN/AEI/10.13039/501100011033 and "ERDF A way of making Europe".

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We extend a special thanks to the colleagues at the LAENTIEC Laboratory at the University of Malaga who helped with some of the DJI Mavic drone shots.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
RGB – Red Green Blue
CNN – Convolutional Neural Network
ROS – Robot Operating System
SAR – Search and Rescue
IR – Infrared
NIR – Near-infrared
NIST – National Institute of Standards and Technology
UAV – Unmanned Aerial Vehicle
UGV – Unmanned Ground Vehicle
TASAR – Team of Advanced Search And Rescue Robots
FPS – Frames Per Second
VDIN – Victim Detection Index

References

  1. Adamkiewicz, M.; Chen, T.; Caccavale, A.; Gardner, R.; Culbertson, P.; Bohg, J.; Schwager, M. Vision-Only Robot Navigation in a Neural Radiance World. IEEE Robot. Autom. Lett. 2022, 7, 4606–4613. [Google Scholar] [CrossRef]
  2. Wilson, A.N.; Gupta, K.A.; Koduru, B.H.; Kumar, A.; Jha, A.; Cenkeramaddi, L.R. Recent Advances in Thermal Imaging and its Applications Using Machine Learning: A Review. IEEE Sens. J. 2023, 23, 3395–3407. [Google Scholar] [CrossRef]
  3. Zhang, H.; Lee, S. Robot Bionic Vision Technologies: A Review. Appl. Sci. 2022, 12, 7970. [Google Scholar] [CrossRef]
  4. Rizk, M.; Bayad, I. Human Detection in Thermal Images Using YOLOv8 for Search and Rescue Missions. In Proceedings of the 2023 Seventh International Conference on Advances in Biomedical Engineering (ICABME), Beirut, Lebanon, 12–13 October 2023; pp. 210–215. [Google Scholar]
  5. Lai, Y.L.; Lai, Y.K.; Yang, K.H.; Huang, J.C.; Zheng, C.Y.; Cheng, Y.C.; Wu, X.Y.; Liang, S.Q.; Chen, S.C.; Chiang, Y.W. An unmanned aerial vehicle for search and rescue applications. J. Phys. Conf. Ser. 2023, 2631, 012007. [Google Scholar] [CrossRef]
  6. Deng, L.; Mao, Z.; Li, X.; Hu, Z.; Duan, F.; Yan, Y. UAV-based multispectral remote sensing for precision agriculture: A comparison between different cameras. ISPRS J. Photogramm. Remote. Sens. 2018, 146, 124–136. [Google Scholar] [CrossRef]
  7. Blekanov, I.; Molin, A.; Zhang, D.; Mitrofanov, E.; Mitrofanova, O.; Li, Y. Monitoring of grain crops nitrogen status from uav multispectral images coupled with deep learning approaches. Comput. Electron. Agric. 2023, 212, 108047. [Google Scholar] [CrossRef]
  8. AlAli, Z.T.; Alabady, S.A. A survey of disaster management and SAR operations using sensors and supporting techniques. Int. J. Disaster Risk Reduct. 2022, 82, 103295. [Google Scholar] [CrossRef]
  9. Karasawa, T.; Watanabe, K.; Ha, Q.; Tejero-De-Pablos, A.; Ushiku, Y.; Harada, T. Multispectral object detection for autonomous vehicles. In Proceedings of the Thematic Workshops of ACM Multimedia 2017, Mountain View, CA, USA, 23–27 October 2017; pp. 35–43. [Google Scholar]
  10. Sharma, K.; Doriya, R.; Pandey, S.K.; Kumar, A.; Sinha, G.R.; Dadheech, P. Real-Time Survivor Detection System in SaR Missions Using Robots. Drones 2022, 6, 219. [Google Scholar] [CrossRef]
  11. Haq, H. Three Survivors Pulled Alive from Earthquake Rubble in Turkey, More Than 248 Hours after Quake. 2023. Available online: https://edition.cnn.com/2023/02/16/europe/turkey-syria-earthquake-rescue-efforts-intl/index.html (accessed on 14 January 2024).
  12. Pal, N.; Sadhu, P.K. Post Disaster Illumination for Underground Mines. TELKOMNIKA Indones. J. Electr. Eng. 2015, 13, 425–430. [Google Scholar]
  13. Safapour, E.; Kermanshachi, S. Investigation of the Challenges and Their Best Practices for Post-Disaster Reconstruction Safety: Educational Approach for Construction Hazards. In Proceedings of the Transportation Research Board 99th Annual Conference, Washington, DC, USA, 12–16 January 2020. [Google Scholar]
  14. Jacoff, A.; Messina, E.; Weiss, B.; Tadokoro, S.; Nakagawa, Y. Test arenas and performance metrics for urban search and rescue robots. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453), Las Vegas, NV, USA, 27–31 October 2003; Volume 4, pp. 3396–3403. [Google Scholar]
  15. Kleiner, A.; Brenner, M.; Bräuer, T.; Dornhege, C.; Göbelbecker, M.; Luber, M.; Prediger, J.; Stückler, J.; Nebel, B. Successful search and rescue in simulated disaster areas. In RoboCup 2005: Robot Soccer World Cup IX 9; Springer: Berlin/Heidelberg, Germany, 2006; pp. 323–334. [Google Scholar]
  16. Katsamenis, I.; Protopapadakis, E.; Voulodimos, A.; Dres, D.; Drakoulis, D. Man Overboard Event Detection from RGB and Thermal Imagery: Possibilities and Limitations. In Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, PETRA ’20, New York, NY, USA, 30 June–3 July 2020. [Google Scholar]
  17. De Oliveira, D.C.; Wehrmeister, M.A. Using Deep Learning and Low-Cost RGB and Thermal Cameras to Detect Pedestrians in Aerial Images Captured by Multirotor UAV. Sensors 2018, 18, 2244. [Google Scholar] [CrossRef]
  18. Corradino, C.; Bilotta, G.; Cappello, A.; Fortuna, L.; Del Negro, C. Combining Radar and Optical Satellite Imagery with Machine Learning to Map Lava Flows at Mount Etna and Fogo Island. Energies 2021, 14, 197. [Google Scholar] [CrossRef]
  19. Cruz Ulloa, C.; Garcia, M.; del Cerro, J.; Barrientos, A. Deep Learning for Victims Detection from Virtual and Real Search and Rescue Environments. In ROBOT2022: Fifth Iberian Robotics Conference; Tardioli, D., Matellán, V., Heredia, G., Silva, M.F., Marques, L., Eds.; Springer: Cham, Switzerland, 2023; pp. 3–13. [Google Scholar]
  20. Cruz Ulloa, C.; Prieto Sánchez, G.; Barrientos, A.; Del Cerro, J. Autonomous Thermal Vision Robotic System for Victims Recognition in Search and Rescue Missions. Sensors 2021, 21, 7346. [Google Scholar] [CrossRef]
  21. Ulloa, C.C.; Llerena, G.T.; Barrientos, A.; del Cerro, J. Autonomous 3D Thermal Mapping of Disaster Environments for Victims Detection. In Robot Operating System (ROS): The Complete Reference; Koubaa, A., Ed.; Springer International Publishing: Cham, Switzerland, 2023; Volume 7, pp. 83–117. [Google Scholar]
  22. Ulloa, C.C.; Garrido, L.; del Cerro, J.; Barrientos, A. Autonomous victim detection system based on deep learning and multispectral imagery. Mach. Learn. Sci. Technol. 2023, 4, 015018. [Google Scholar] [CrossRef]
  23. Sambolek, S.; Ivasic-Kos, M. Automatic person detection in search and rescue operations using deep CNN detectors. IEEE Access 2021, 9, 37905–37922. [Google Scholar] [CrossRef]
  24. Lee, H.W.; Lee, K.O.; Bae, J.H.; Kim, S.Y.; Park, Y.Y. Using Hybrid Algorithms of Human Detection Technique for Detecting Indoor Disaster Victims. Computation 2022, 10, 197. [Google Scholar] [CrossRef]
  25. Lygouras, E.; Santavas, N.; Taitzoglou, A.; Tarchanidis, K.; Mitropoulos, A.; Gasteratos, A. Unsupervised human detection with an embedded vision system on a fully autonomous UAV for search and rescue operations. Sensors 2019, 19, 3542. [Google Scholar] [CrossRef]
  26. Domozi, Z.; Stojcsics, D.; Benhamida, A.; Kozlovszky, M.; Molnar, A. Real time object detection for aerial search and rescue missions for missing persons. In Proceedings of the SOSE 2020—IEEE 15th International Conference of System of Systems Engineering, Budapest, Hungary, 2–4 June 2020; pp. 519–524. [Google Scholar]
  27. Quan, A.; Herrmann, C.; Soliman, H. Project vulture: A prototype for using drones in search and rescue operations. In Proceedings of the 15th Annual International Conference on Distributed Computing in Sensor Systems, DCOSS 2019, Santorini Island, Greece, 29–31 May 2019; pp. 619–624. [Google Scholar]
  28. Perdana, M.I.; Risnumawan, A.; Sulistijono, I.A. Automatic Aerial Victim Detection on Low-Cost Thermal Camera Using Convolutional Neural Network. In Proceedings of the 2020 International Symposium on Community-Centric Systems, CcS 2020, Tokyo, Japan, 23–26 September 2020. [Google Scholar]
  29. Arrazi, M.H.; Priandana, K. Development of landslide victim detection system using thermal imaging and histogram of oriented gradients on E-PUCK2 Robot. In Proceedings of the 2020 International Conference on Computer Science and Its Application in Agriculture, ICOSICA 2020, Bogor, Indonesia, 16–17 September 2020; pp. 2–7. [Google Scholar]
  30. Gupta, M. A Fusion of Visible and Infrared Images for Victim Detection. In High Performance Vision Intelligence: Recent Advances; Nanda, A., Chaurasia, N., Eds.; Springer: Singapore, 2020; pp. 171–183. [Google Scholar]
  31. Seits, F.; Kurmi, I.; Bimber, O. Evaluation of Color Anomaly Detection in Multispectral Images for Synthetic Aperture Sensing. Eng 2022, 3, 541–553. [Google Scholar] [CrossRef]
  32. Dawdi, T.M.; Abdalla, N.; Elkalyoubi, Y.M.; Soudan, B. Locating victims in hot environments using combined thermal and optical imaging. Comput. Electr. Eng. 2020, 85, 106697. [Google Scholar] [CrossRef]
  33. Dong, J.; Ota, K.; Dong, M. UAV-Based Real-Time Survivor Detection System in Post-Disaster Search and Rescue Operations. IEEE J. Miniaturization Air Space Syst. 2021, 2, 209–219. [Google Scholar] [CrossRef]
  34. Zou, X.; Peng, T.; Zhou, Y. UAV-Based Human Detection with Visible-Thermal Fused YOLOv5 Network. IEEE Trans. Ind. Inform. 2023, 1–10. [Google Scholar] [CrossRef]
  35. Wang, X.; Zhao, L.; Wu, W.; Jin, X. Dynamic Neural Network Accelerator for Multispectral detection Based on FPGA. In Proceedings of the International Conference on Advanced Communication Technology, ICACT, Pyeongchang, Republic of Korea, 19–22 February 2023; pp. 345–350. [Google Scholar]
  36. McGee, J.; Mathew, S.J.; Gonzalez, F. Unmanned Aerial Vehicle and Artificial Intelligence for Thermal Target Detection in Search and Rescue Applications. In Proceedings of the 2020 International Conference on Unmanned Aircraft Systems, ICUAS 2020, Athens, Greece, 1–4 September 2020; pp. 883–891. [Google Scholar]
  37. Goian, A.; Ashour, R.; Ahmad, U.; Taha, T.; Almoosa, N.; Seneviratne, L. Victim localization in USAR scenario exploiting multi-layer mapping structure. Remote. Sens. 2019, 11, 2704. [Google Scholar] [CrossRef]
  38. Petříček, T.; Šalanský, V.; Zimmermann, K.; Svoboda, T. Simultaneous exploration and segmentation for search and rescue. J. Field Robot. 2019, 36, 696–709. [Google Scholar] [CrossRef]
  39. Gallego, A.J.; Pertusa, A.; Gil, P.; Fisher, R.B. Detection of bodies in maritime rescue operations using unmanned aerial vehicles with multispectral cameras. J. Field Robot. 2019, 36, 782–796. [Google Scholar] [CrossRef]
  40. Qi, F.; Zhu, M.; Li, Z.; Lei, T.; Xia, J.; Zhang, L.; Yan, Y.; Wang, J.; Lu, G. Automatic Air-to-Ground Recognition of Outdoor Injured Human Targets Based on UAV Bimodal Information: The Explore Study. Appl. Sci. 2022, 12, 3457. [Google Scholar] [CrossRef]
  41. NIST, National Institute of Standards and Technology: Gaithersburg, MD, USA, 2022. Available online: https://www.nist.gov/ (accessed on 14 January 2024).
Figure 1. Ranges of light in the visible and non-visible spectrum. Source: Authors.
Figure 2. Evolution Publications on Knowledge Web over the last two decades in the context of RGB/Thermal/Multi-spectral sensors for search and rescue. Source: authors.
Figure 3. Layout of subsystems connections. Source: authors.
Figure 4. Experimental environments developed indoors and outdoors. Source: authors.
Figure 5. Confusion matrix for CNN-trained models. Source: Authors.
Figure 6. Indoors–outdoors exploration process at the University of Malaga and Center for Automation and Robotics facilities. Source: authors.
Figure 7. Evaluation of three vision methods for victim detection in different scenarios. Source: authors.
Figure 8. Comparative radial graph of the different coefficients that evaluate the indication of mission success for each light spectrum range. Source: authors.
Figure 9. Evaluation of the errors in the mean values of the proposed coefficients in Figure 8. Source: authors.
Figure 10. Approach to the maximized area, according to the best values of the proposed coefficients in Figure 8. Source: authors.
Table 1. Comparison of state-of-the-art methods for victims/people SAR tasks using vision sensors.
Work | Sensor Type | Camera Model | Spectral Resolution | Capture System | Environment | Processing Techniques | Application
[23] | RGB | DJI Phantom 4A Camera | 3 (RGB Bands) | UAV | Outdoors | Transfer Learning of pre-trained CNN | Search and rescue—People detection
[24] | RGB | CCTV RGB Images | 3 (RGB Bands) | - | Indoors | Hybrid Human Detection combining YOLO and RetinaNet | Search and rescue—Indoor disaster victims
[25] | RGB | GoPro | 3 (RGB Bands) | UAV | Maritime | Vision-based neural network controller for autonomous landing on human targets | Search and rescue—Help to human victim
[26] | RGB | Skywalker 1680 FPV camera | 3 (RGB Bands) | UAV | Wilderness | Single-Shot MultiBox Detector (SSD) Network to detect humans | Search and rescue—Human detection
[27] | RGB | DJI Phantom 4 Pro V2.0 camera | 3 (RGB Bands) | UAV | Outdoors | Image processing and Yolo v3 detection | Search and rescue—Body detection
[20] | THERMAL | Optris Pi640 | 1 (IR) | UGV (quadruped robot) | Outdoors | Thermal image processing and deep learning techniques for body detection | Search and rescue—Victim detection
[28] | THERMAL | FLIR Lepton 3 | 1 (IR) | UAV | Outdoors | Single-Shot Multi-Box Detector (SSD) and Mobile Net for feature extraction | Search and rescue—Human detection
[29] | THERMAL | FLIR Camera 2.0 | 1 (IR) | Educational robot | Indoors | SVM classification with linear kernel and HOG features | Body detection
[30] | RGB-THERMAL | FLIR E60 | 4 (red, green, blue, IR) | User carrying camera | Indoors–Outdoors | Skin detection and classification through feature extraction and SVM algorithm | Search and rescue—Victim detection
[31] | RGB-THERMAL | Dataset from () captured with Flir Vue Pro and RGB camera | 4 (red, green, blue, IR) | UAV | Outdoors | Airborne Optical Sectioning | Search and rescue—Body detection
[32] | RGB-THERMAL | FLIR Lepton 3 | 4 (red, green, blue, IR) | UAV | Outdoors | Image blending, matching, and processing of thermal images | Search and rescue—Victims localization
[33] | RGB-THERMAL | Zenmuse XT2 (FLIR Tau 2 and RGB camera) | 4 (red, green, blue, IR) | UAV | Outdoors | Deep Learning Techniques for UAV thermal image pedestrian detection | Search and rescue—Human detection
[34] | RGB-THERMAL | KAIST Multispectral Pedestrian Dataset captured with PointGrey Flea3 and FLIR-A35 | 4 (red, green, blue, IR) | UAV | Outdoors | Visible–thermal fusion strategy processing and detection with Yolov5 | Search and rescue—Body detection
[35] | RGB-THERMAL | LLIV Dataset, captured with HIKVISION DS-2TD8166BJZFY-75H2F/V2 | 4 (red, green, blue, IR) | Security cameras | Outdoors | Dynamic neural network, replacing backbone of Yolov5, adopting Differential Modality-Aware Fusion Module (DMAF) | Search and rescue—Body detection
[36] | RGB-THERMAL | FLIR Tau 2 640 | 4 (red, green, blue, IR) | UAV | Outdoors | Thermal image processing and deep learning for detection (Darknet-53 NN) | Search and rescue—Body detection
[37] | RGB-THERMAL | ROS Simulated Camera | 4 (red, green, blue, IR) | UAV | Outdoors | Single-Shot Multi-Box Detector (SSD) for RGB, blob detector for thermal, and wireless localization of victim phone | Search and rescue—Victim localization
[38] | RGB-THERMAL | Point Gray Ladybug 3 and Micro Epsilon thermoIMAGER TIM 160 thermal camera | 4 (red, green, blue, IR) | UGV | Outdoors | Human/background segmentation of the 3D voxel map and simultaneous control of thermal camera using multimodal CNN models | Search and rescue—Victim localization
[22] | MULTISPECTRAL | MicaSense Altum | 7 (blue, green, red, red edge, near-IR, LWIR thermal infrared) | UGV (quadruped robot) | Outdoors | Convolutional Neural Network applied to multispectral imagery | Search and rescue—Victim detection
[39] | MULTISPECTRAL | Micasense RedEye | 6 (blue, green, red, red edge, near-IR) | UAV | Maritime | Convolutional Neural Network applied to multispectral imagery | Search and rescue—Body detection
[40] | MULTISPECTRAL | Foxtech MS600 | 6 (blue, green, red, red edge, near-IR) | UAV | Outdoors | Multispectral and bio radar bimodal information-based human recognition using respiration rate and image processing with decision trees | Search and rescue—Recognition of human target
[9] | MULTISPECTRAL | Logicool HD webcam, Nippon Avionics InfReC R500, Nippon Avionics InfRecH8000, Xenics Xeva-1.7-320 | 7 (blue, green, red, MIR, near-IR, FIR) | Cart for data acquisition | Outdoors | Multispectral ensemble using multispectral images for object detection | Autonomous vehicles—Body detection
Authors | RGB, Thermal, Multispectral | RealSense D435i, Optris Pi640, MicaSense Altum | 7 (blue, green, red, red edge, near-IR, LWIR thermal infrared) | Quadruped Robot | Indoors–Outdoors | Real-time processing with Convolutional Neural Networks | Search and Rescue—Victim detection
Table 2. Materials for the proposed system implementation.
Item | Component | Description
1 | ARTU-R | (A1 rescue Task UPM Robot) Quadrupedal robot with instrumentation and embedded systems (Nvidia Jetson Xavier).
2 | RealSense D435i | RGB-D camera: B = [450–495 nm], G = [495–570 nm], R = [620–750 nm]
3 | Optris Pi640i | Thermal camera: [8 μm–14 μm]
4 | MicaSense Altum | Multi-spectral camera: B = [440–510 nm], G = [520–590 nm], R = [630–685 nm], Red Edge = [690–730 nm], NIR = [750 nm–2.5 μm]
Table 3. Quantitative metrics proposed for the evaluation of vision systems.
Proposed Individual Evaluation Metrics
α Outdoors
β Poor Light Conditions
γ Indoors
δ Processing Time
ϵ Heat Sources Presence
θ Totally covered victim
ϕ Partially covered victim
ω Clothes colour
λ Summer/Fire conditions
τ Changing Light Conditions
Table 4. Metrics obtained from the coefficients proposed for the type of image in the three ranges of the light spectrum.
Parameter | Thermal Range | RGB Range | Multispectral Range
Metrics Analysis
α | 89.3 | 97.1 | 95.4
β | 97.2 | 54.6 | 67.4
γ | 92.1 | 93.4 | 94.8
δ | 97.1 | 98.2 | 67.5
ϵ | 31.7 | 78.1 | 88.4
θ | 95.1 | 12.7 | 13.8
ϕ | 96.7 | 89.2 | 89.4
ω | 93.1 | 75.5 | 90.7
λ | 68.1 | 84.7 | 85.2
τ | 94.2 | 86.5 | 93.4
General Score | 755 | 662 | 714
SAR Score | 966 | 744 | 839
Indoors Experiments: Victims detection success rate % | 92.4 | 84.7 | 86.5
Outdoors Experiments: Victims detection success rate % | 91.1 | 79.4 | 81.2
Time Evaluation: Inference time (f.p.s.) | 26 | 28 | 8
Individual area covered % | 85.2 | 76.0 | 78.1
Total Covered Area of Analysis % | 93.5
