Review

Polarimetric Imaging for Robot Perception: A Review

1 Vibot, ImViA UR 7535, Université de Bourgogne, 12 Rue de la Fonderie, 71200 Le Creusot, France
2 ICB UMR CNRS 6303, Université de Bourgogne, 9 Avenue Alain Savary, 21078 Dijon, France
* Authors to whom correspondence should be addressed.
Sensors 2024, 24(14), 4440; https://doi.org/10.3390/s24144440
Submission received: 11 May 2024 / Revised: 22 June 2024 / Accepted: 26 June 2024 / Published: 9 July 2024
(This article belongs to the Section Sensors and Robotics)

Abstract: In recent years, the integration of polarimetric imaging into robotic perception systems has increased significantly, driven by the accessibility of affordable polarimetric sensors. This technology complements traditional color imaging by capturing and analyzing the polarization characteristics of light. This additional information provides robots with valuable insights into object shape, material composition, and other properties, ultimately enabling more robust manipulation tasks. This review aims to provide a comprehensive analysis of the principles behind polarimetric imaging and its diverse applications within the field of robotic perception. By exploiting the polarization state of light, polarimetric imaging offers promising solutions to three key challenges in robot vision: surface segmentation, depth estimation through polarization patterns, and 3D reconstruction using polarimetric data. This review emphasizes the practical value of polarimetric imaging in robotics by demonstrating its effectiveness in addressing real-world challenges. We then explore potential applications of this technology not only within the core robotics field but also in related areas. Through a comparative analysis, our goal is to elucidate the strengths and limitations of polarimetric imaging techniques. This analysis will contribute to a deeper understanding of its broad applicability across various domains within and beyond robotics.

1. Introduction

The polarization of light, while not immediately apparent, is an important natural phenomenon. It manifests in different scenarios, such as the use of polarized sunglasses, LCD screens, medical imaging, and stress detection in materials. Analyzing this phenomenon often involves using Fresnel’s theory to understand how natural light reflects off surfaces [1,2,3]. Building upon this theory, numerous approaches have been developed, including the Jones calculus, Stokes parameters [4], Mueller calculus [5], Fourier analysis [6], least squares fitting [7], and more recently, Monte Carlo simulation [8] and machine learning [9]. These methodologies aim to extract diverse information from polarized light, facilitating its application across various domains. In robotic perception, polarimetric imaging is becoming increasingly important and continues to evolve. By leveraging the complex polarization state of light, polarimetric imaging enhances robots’ perceptual abilities, allowing them to detect subtle details that may be missed by traditional imaging methods. Polarization is gaining popularity in robotic perception, thanks to the proliferation of division-of-focal-plane (DoFP) sensors and advancements in deep learning techniques [10,11,12]. Unlike traditional cameras, polarimetric imaging offers a more profound understanding of the environment by analyzing the polarization properties of light. This reveals detailed information about surfaces, materials, and object features, enhancing robotic perception capabilities. Consequently, the polarimetric imaging community has explored the relationship between surface normals and light reflection to estimate shape distributions. Moreover, research in surface feature extraction [13,14,15] has further enriched the understanding of polarimetric imaging. A key focus of this community lies in 3D reconstruction, leveraging polarimetry to estimate 3D shapes by decoding polarization-encoded information in light [13,16,17,18,19,20,21]. By revealing subtleties such as surface texture, material composition, and object contours, polarimetric imaging empowers robots to attain a comprehensive understanding of their surroundings.
Polarimetric imaging holds potential for material discrimination. Research on estimating illumination and indices of refraction for different materials has gained interest [7]. Robots leverage the unique optical characteristics of materials to navigate environments more precisely. Furthermore, polarimetric imaging acts as a shield against visual distortion caused by reflections and glare. This technology helps robots equipped with polarimetric sensors maintain clear visibility even in difficult lighting situations, allowing them to make better-informed decisions and perform tasks more efficiently. There is a concerted effort within the polarimetric imaging community to improve image conditions, exemplified by work on enhancing image contrast [22] for better object visibility and recognition. Some methods [23] focus on suppressing reflections in highly specular scenes. Furthermore, the utility of polarimetric imaging extends to image segmentation [24,25,26,27], an important aspect of robotics, where precise delineation of objects is essential for tasks such as autonomous navigation and object manipulation. Polarimetric imaging also substantially improves object recognition and tracking: by discerning the unique polarization characteristics associated with objects, robots can distinguish nuanced differences between similar surfaces, refining the precision of their detection and tracking algorithms. Notably, by capturing and analyzing the polarization state of light interacting with transparent or semi-transparent objects [13,24,28], polarimetric imaging provides detailed insights into their internal structure and composition. Unlike conventional imaging methodologies, which may struggle in this regard, polarimetric imaging exploits phenomena such as birefringence and dichroism.
Incorporating polarimetric imaging directly into robotic systems promises to enhance the accuracy and reliability of current technological methods. This advancement drives robots toward higher levels of autonomy and adaptability across many applications. In the study of linear deformable object (LDO) perception, Zhaole et al. [29], by integrating polarimetry, improve 3D reconstruction beyond RGB-D images, enhancing reliability and detail for better manipulation by robotic systems. Mora et al. [30] introduced a laser scanning method for identifying reflective surfaces; integrating polarimetric imaging could improve its accuracy, helping in reflection data analysis, indoor environment modeling, and optimizing robot movements for safety and efficiency. Huy et al. [31] highlighted the importance of imaging and acoustics in underwater perception, where integrating polarimetry could improve the signal-to-noise ratio (SNR) and object identification in low-visibility conditions. Wang et al. [32] could improve safety monitoring in low-light conditions by integrating polarimetric imaging. Although incorporating polarimetric imaging has great potential, today's technology still faces challenging situations. For example, applications like [33] suffer from the low reflectance of surfaces typically present in construction. But such applications can still benefit from polarimetry through 3D reconstruction of the walls to help detect cracks.
Polarimetric imaging is also important for applications across diverse domains such as remote sensing and underwater exploration. By mitigating surface reflections, polarimetric imaging broadens the horizons of underwater navigation and marine ecosystem exploration, facilitating tasks ranging from the detection of submerged objects to ecological monitoring. Moreover, the impact of polarimetric imaging extends to the domain of bio-medicine, playing an important role in dealing with the intricate nature of biological complexity. By increasing contrast in tissues and structures, polarimetric techniques facilitate precise visualization in medical imaging applications. These advancements are driving progress in tissue analysis and diagnostic imaging. Instances of image segmentation utilizing polarimetry are common in medical imaging [5,34,35], facilitating the identification and classification of various tissue conditions, including healthy and cancerous tissues [35,36]. In nature, bees orient themselves with sunlight polarization, and fish detect prey or avoid predators in low-visibility environments. Robotics research aims to replicate these abilities, with robots using polarization sensors for navigation [37,38]. In aviation, polarimetry could enhance guidance systems, especially in low-visibility conditions [39]. In addition, polarimetric imaging is a powerful tool for extracting structural and content information from biological tissues, complementing the extraction of material properties [35,40,41]. A more extended review of different state-of-the-art applications is shown in Table 1.
In the field of robotic perception, polarimetric imaging is currently rated at level 6 on the TRL (Technology Readiness Level) scale, or level 5 depending on the application. This means that the technology has been demonstrated in relevant environments, close to real operating conditions, or is at the prototype stage but has not yet reached sufficient maturity for widespread commercial use. At this stage, the community continues to refine the technology, resolve technical challenges, and validate its performance through practical field trials.
The rest of the paper is structured as follows. First, we discuss the fundamentals of polarization in Section 2. Then, in Section 3, Section 4 and Section 5, we review the state of the art on three concrete robotic perception problems improved with polarimetry: shape characterization for object recognition and manipulation; semantic segmentation for scene understanding and environment mapping; and pose estimation for object localization and manipulation planning in robotic applications. In addition, Section 6 summarizes the state-of-the-art applications in the robotics field using polarimetry. Next, we examine the performance of different state-of-the-art methods on segmentation and shape characterization in Section 7, and summarize current challenges in Section 8. Finally, we discuss the current solutions (Section 9) and conclude (Section 10) this review.

2. Principles of Polarization

Polarization imaging offers a rich source of information for computer vision applications, enabling advancements in scene understanding, material discrimination, and imaging under challenging conditions. Integrating polarization data with traditional imaging modalities leads to more robust and accurate computer vision systems across various domains. Light, as an electromagnetic wave, can be represented by two perpendicular fields, electric and magnetic. Polarimetry is the study of the oscillation of the orientation of these electromagnetic fields. Thus, by exploiting these wave properties, we can obtain information such as material properties or shape distribution. By definition, polarization describes the path traced by a light beam (random, linear, circular, or elliptical), reflecting how the orientation and intensity of the electric field change over time. Using the Fresnel equations, we can link the properties of a material to the light reflected from it [2].
Fresnel theory was the first approach proposed to describe polarized light, building on Malus's law of 1808. This theory provided the first understanding of the polarization phenomenon by studying the difference between light before and after reflection on a surface. Reflections can be scattered in all directions (diffuse) or concentrated in a single direction (specular). The two components have different polarimetric properties; in general terms, the specular component is more polarized than the diffuse one.

2.1. Stokes Parameters

Stokes vectors are the most common approach to representing the polarization state of light. Unlike the Jones vector, this mathematical representation can describe both partially and completely polarized light. Physically [60], a specific light beam can be described using the four-parameter vector $S = (S_0, S_1, S_2, S_3)^T$. According to Stokes [4], a light beam can be modelled by the expression
$$I(\alpha, \phi) = \frac{1}{2}\left(S_0 + S_1 \cos 2\alpha + S_2 \sin 2\alpha \cos \phi - S_3 \sin 2\alpha \sin \phi\right),$$
which is written in terms of the four-parameter vector $S$. This equation describes the intensity of light observed through a linear polarizer rotated by an angle $\alpha$ with respect to a reference axis and a retarder with a phase shift $\phi$. Thus, according to this model, the total intensity of light is encoded in the parameter $S_0$; the quantity of light polarized either horizontally or vertically is encoded in $S_1$; $S_2$ describes the light that is linearly polarized at $45^\circ$ or $135^\circ$; and $S_3$ describes the light that is circularly polarized. As Garcia et al. [61] concluded, if the retardance $\phi$ is set to either $0^\circ$ or $90^\circ$ and the angle of linear polarization $\alpha$ is set to $0^\circ$, $45^\circ$, $90^\circ$, or $135^\circ$, the Stokes vector can be uniquely computed as
$$S = \begin{cases} S_0 = \frac{1}{2}\left[I(0^\circ,\phi) + I(45^\circ,\phi) + I(90^\circ,\phi) + I(135^\circ,\phi)\right], & \phi \in [0^\circ, 180^\circ]\\ S_1 = I(0^\circ, 0^\circ) - I(90^\circ, 0^\circ)\\ S_2 = I(45^\circ, 0^\circ) - I(135^\circ, 0^\circ)\\ S_3 = I(135^\circ, 90^\circ) - I(45^\circ, 90^\circ), \end{cases}$$
where $S_0$ and $S_1$ represent, respectively, the total light intensity and the difference in intensity between the horizontal and vertical components. $S_2$ and $S_3$ relate to the phase difference between the horizontal and vertical components, providing information about the linear and elliptical polarization of the light.
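As a concrete illustration of these formulas for the common case of division-of-focal-plane cameras, which have no retarder and therefore measure only the linear components, the Stokes images can be computed directly from the four polarizer-angle intensity images. A minimal NumPy sketch (function and array names are ours, for illustration):

```python
import numpy as np

def linear_stokes(i0, i45, i90, i135):
    """Linear Stokes components from four intensity images I(alpha, phi=0).
    S3 is omitted: measuring circular polarization requires a retarder."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical
    s2 = i45 - i135                     # 45 deg vs. 135 deg
    return s0, s1, s2
```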

2.2. Light Characterization Using Stokes Parameters

Three quantities are determined from the Stokes parameter representation: the intensity, the degree of polarization (DoP), and the angle of polarization (AoP). While the intensity is encoded directly by the first parameter $S_0$, DoP and AoP are computed as follows:
$$\mathrm{DoP} = \frac{\sqrt{S_1^2 + S_2^2 + S_3^2}}{S_0}, \qquad \mathrm{AoP} = \frac{1}{2}\arctan\left(\frac{S_2}{S_1}\right).$$
DoP represents the proportion of the light that is polarized, where 0 indicates unpolarized and 1 fully polarized light. The degree of polarization can be used to obtain information about the material from which the light is reflected. In fact, if we compute a per-pixel map of the degree of polarization over the image, we can study the deformation of surfaces in the captured scene. AoP is the projection onto the image plane of the angle between the direction of the electric field and the axis of the polarizer (cf. Section 2.3).
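Restricting to the linear components delivered by DoFP sensors, these quantities (then called DoLP and AoLP) follow directly from the Stokes images. A minimal sketch continuing the previous one, where the small `eps` guard against division by zero is our addition:

```python
import numpy as np

def dolp_aolp(s0, s1, s2, eps=1e-8):
    """Per-pixel degree and angle of linear polarization from the
    linear Stokes components (S3 omitted)."""
    dolp = np.sqrt(s1**2 + s2**2) / (s0 + eps)  # 0 = unpolarized, 1 = fully polarized
    aolp = 0.5 * np.arctan2(s2, s1)             # in [-pi/2, pi/2]
    return dolp, aolp
```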

2.3. Polarimetric Image Acquisition

The architecture of a polarimetric imaging system consists of two key components: a light source emitting polarized light, and a polarization system with a linear polarizer that filters the light's polarization. The interaction with the scene alters the light's polarization according to surface properties. Polarimetric sensors capture less light intensity than classical cameras because of the filtering applied during acquisition. However, they provide complementary information on material properties. To leverage the strengths of different modalities, various approaches integrate polarimetric imaging with color imaging [26,51,62,63], shading [64,65], multispectral imaging [66,67], and depth sensing for object shape refinement [68,69]. Rotating a polarizer in front of a color camera was previously used to capture polarization information [70], but this setup faced challenges such as time delays and complex acquisition procedures, limiting real-time applications.
In recent years, polarization sensors have gained popularity due to advancements in division-of-focal-plane (DoFP) sensors [10,11,12]. These sensors, comprising micro-polarizers arranged in a grid, capture multiple images of a scene with different polarizer angles in a single shot, without the need for internal mechanical rotation. This design allows for compactness and versatility in capturing images across various polarizations. The sensor architecture typically combines Bayer filters with four polarizer angles, facilitating image acquisition. Depending on the combination of Bayer filters, the camera can be classified as standard color, gray polarimetric, or color polarimetric. However, utilizing these sensors necessitates a pre-processing phase known as demosaicking. For a color polarimetric camera, this involves separating the raw image into twelve sparse channels, one for each combination of polarizer angle and color. Demosaicking then interpolates these sparse images, filling the gaps and producing full-resolution channels. Various techniques are used for this interpolation, from linear interpolation to machine learning, including deep learning and GANs; some methods additionally leverage polarization properties or use adaptive filters for more accurate results. Such sensors also improve on traditional rotating-polarizer setups because they can be manufactured with different sets of polarization angles, including the commonly used {0°, 45°, 90°, 135°} as well as less common ones such as {0°, 30°, 60°, 90°} [15]. These diverse images enable the extraction of the Stokes parameters (Section 2.2).
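To make the demosaicking step concrete, the sketch below splits a monochrome DoFP raw frame into its four sparse polarizer-angle channels and fills the gaps with a simple normalized bilinear interpolation. The 2×2 micro-polarizer layout passed as `pattern` is an assumption that varies between sensors, and production pipelines use the more advanced interpolators mentioned above:

```python
import numpy as np
from scipy.signal import convolve2d

def demosaic_dofp(raw, pattern=((90, 45), (135, 0))):
    """Naive demosaicking of a monochrome DoFP raw frame: one sparse
    channel per micro-polarizer angle, filled by bilinear interpolation."""
    kernel = np.array([[0.25, 0.5, 0.25],
                       [0.50, 1.0, 0.50],
                       [0.25, 0.5, 0.25]])
    channels = {}
    for dy in range(2):
        for dx in range(2):
            sparse = np.zeros(raw.shape, dtype=float)
            mask = np.zeros(raw.shape, dtype=float)
            sparse[dy::2, dx::2] = raw[dy::2, dx::2]
            mask[dy::2, dx::2] = 1.0
            # Normalized convolution: interpolate only from known pixels.
            num = convolve2d(sparse, kernel, mode="same")
            den = convolve2d(mask, kernel, mode="same")
            channels[pattern[dy][dx]] = num / np.maximum(den, 1e-8)
    return channels  # dict: polarizer angle (deg) -> full-resolution image
```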

2.4. Weaknesses and Drawbacks

Polarimetry, like any measurement technique, faces challenges and uncertainties. These are caused by instrumental effects, noise, an incomplete description of polarization, and external factors like ambient light or interference [71]. Imperfect components and misalignment can contribute to ambiguity, as can depolarization effects. Additionally, complex materials or samples with multiple scattering events or anisotropic properties can further complicate measurements. Understanding the sample's nature is crucial for accurate interpretation in polarimetry.
A relevant effect is the so-called "pi ambiguity", which stems from the cyclic nature of polarization measurements, which exhibit periodic behavior. This ambiguity poses problems for accurately determining the true polarization state of light, as several solutions may exist for a given set of measurements [72]. For example, surface normals can be computed through the Snell–Descartes laws [73] of reflection and refraction. These laws link the angle of incidence $\theta_i$, the angle of reflection of light on a surface $\theta_{rc}$, the angle of refraction through the surface $\theta_{rf}$, and the normal to the surface (as reference frame):
$$|\theta_i| = |\theta_{rc}|, \qquad n_i \sin\theta_i = n_{rf} \sin\theta_{rf},$$
where $n_i$ is the index of refraction of the first medium and $n_{rf}$ is the index of refraction of the second medium. But the method [15] used to obtain the normal vector depends on the calculation of zenith and azimuth angles, which involves a non-bijective function. This results in an ambiguous solution with two possible values for the azimuth angle. Several methods have been developed to resolve this azimuth ambiguity. Atkinson et al. [74] used shadow as an additional modality together with diffuse reflections; Morel et al. [28] studied metal surfaces with active lighting, constraining the zenith angle to resolve the ambiguity in the azimuth angle. Stolz et al. [75] used multispectral imaging with the degree of polarization to resolve the azimuth ambiguity. Zhao et al. [76] developed a quadruple-polarized hybrid SAR system to resolve the azimuth ambiguity. Garcia et al. [61] studied the circular component of polarization to disambiguate the zenith angle, while Hwang et al. [16] used ellipsometry to resolve the azimuth ambiguity.
When calculating polarization angles and degrees from the Stokes vector, another ambiguity comes into play. The diffuse/specular ambiguity arises because the direction of polarization aligns either parallel or perpendicular to the plane of incidence, depending on the dominant reflection (diffuse or specular). In diffuse cases, the degree of polarization (DoP) increases with the zenith angle, and the angle of polarization (AoP) aligns with the azimuth angle. In specular cases, the DoP increases up to the Brewster angle, after which it decreases. Diffuse reflectance involves scattering inside the material and transmission out of the surface, while the specular component is a direct reflection from the surface's micro-facets. The polarization directions of the two components are mutually perpendicular, and specular areas generally have a higher DoP than diffuse areas.
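The practical consequence of these two ambiguities can be summarized in a few lines: a single measured AoLP value is consistent with two azimuth candidates under the diffuse model and two more under the specular model. A small sketch (angles in radians; the sign conventions are illustrative):

```python
import numpy as np

def candidate_azimuths(aolp):
    """Surface-normal azimuth candidates for one measured AoLP value.
    Diffuse reflection: azimuth aligned with the AoLP, up to the pi ambiguity.
    Specular reflection: azimuth rotated by 90 degrees instead."""
    diffuse = np.mod([aolp, aolp + np.pi], 2 * np.pi)
    specular = np.mod([aolp + np.pi / 2, aolp + 3 * np.pi / 2], 2 * np.pi)
    return diffuse, specular
```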

3. Image Segmentation Using Polarimetry

Segmentation is an important task in robot perception, particularly in domains like autonomous driving and robotic manipulation, where understanding the environment is essential for safe and effective operation. In autonomous driving, segmentation is integral to the perception system, enabling vehicles to interpret their surroundings in real time. Accurate identification of obstacles allows autonomous vehicles to navigate complex urban environments autonomously. Similarly, in robotic manipulation, segmentation is fundamental for robots to perceive and interact with objects efficiently. Whether in industrial automation or household robotics, robots must recognize and manipulate objects precisely. Segmentation aids in isolating objects from their backgrounds, facilitating tasks such as grasping, manipulation, sorting, and packing.
Segmentation often suffers when dealing with transparent objects, reflective surfaces (water, glass, metal), and varying lighting conditions. Polarimetric imaging offers a promising solution by capturing both the intensity and polarization properties of light, providing valuable insights into object properties. By capturing multiple polarization states and exploiting them to extract detailed scene information, polarimetric imaging provides cues for segmenting these kinds of complex scenes.
Traditional methods, such as thresholding, edge detection, region growing, active contours, Markov random fields, blob detection, region-based segmentation, and machine learning with hand-crafted features, use hand-crafted rules and algorithms to identify and separate objects or regions of interest in images. Although these techniques are still used in scenarios with limited data or specific constraints, deep learning has become the predominant approach for image segmentation. Deep learning methods, with their ability to automatically learn features from data, have generally outperformed traditional techniques, particularly on large, complex datasets.
Most studies on segmentation tasks utilize the U-Net architecture [77], initially popularized in medical image segmentation. It comprises an encoder and a decoder, enabling accurate segmentation. The encoder reduces spatial dimensions via convolutional and pooling layers, while the decoder increases them through upsampling and concatenation, refining the results. Notably, the architecture incorporates skip connections linking the contracting and expansive paths, facilitating precise localization and enhancing model accuracy.
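The structure is easy to see in code. Below is a minimal two-level U-Net sketch in PyTorch, showing the encoder, the decoder, and a skip connection; the channel sizes and the four-channel input (e.g., a stack of four polarizer-angle images) are illustrative choices of ours, not the configuration of any reviewed method:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    """Two 3x3 convolutions with ReLU: the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Two-level U-Net: the encoder downsamples, the decoder upsamples,
    and a skip connection concatenates encoder features into the decoder."""
    def __init__(self, in_ch=4, n_classes=2):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)  # 64 = 32 (skip) + 32 (upsampled)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                  # full-resolution features
        e2 = self.enc2(self.pool(e1))      # half-resolution features
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d1)               # per-pixel class logits
```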
The U-Net architecture [25,27], renowned for its segmentation capabilities, is used to process complex scenes with high accuracy and is compatible with data augmentation. Built upon earlier fully convolutional networks (FCNs), U-Net overcomes their limitations on small objects while retaining spatial information, enabling multi-scale feature learning and facilitating real-time applications. Although other convolutional neural network (CNN) methods [24,49] have been used for segmentation tasks, U-Net remains the preferred choice due to its superior efficiency. Recent advances, such as attention mechanisms [24,51], have been incorporated into segmentation networks to improve performance by focusing on specific parts of the image. These mechanisms, originally developed for recurrent sequence models, have been integrated into segmentation networks to achieve better results [26].
Segment Anything [78] represents a major advancement in the field of image segmentation, allowing for precise identification of objects with minimal user intervention. A foundational deep learning model, SAM (Segment Anything Model) can interpret various types of prompts, including points, bounding boxes, and text descriptions, making segmentation accessible, intuitive, and adaptable. This progress in segmentation technology holds promise for many applications, driving forward the capabilities of computer vision systems toward greater versatility and efficiency. This foundation model can benefit from polarimetry, as can be seen in Figure 1, reducing segmentation errors by ignoring reflections and ensuring more precise results.

3.1. State-of-the-Art Methods in Polarimetric Segmentation

Here, we focus on research into real-world applications that utilize polarimetric imaging for segmentation: segmentation of outdoor scenes for autonomous vehicles and segmentation of transparent objects for robotic manipulation tasks. Thus, we review works that exploit the potential of polarimetry for segmentation tasks.
Blanchon et al. [79] propose a new semantic segmentation approach based on a two-axis neural network architecture exploiting polarization cues. By exploiting these cues, the CNN is able to discern surface and material information, which are the two main objectives of their framework. The authors propose the regularization of surface normals as a means to delineate non-Lambertian regions, achieved via the incorporation of polarimetric features. To implement this concept, the authors utilize the projection of Stokes parameters onto the Poincaré sphere [80,81], enabling the extraction of key metrics such as intensity, AoP, and DoP. In addition, Blanchon et al. adopt Fresnel [3] equations to delineate the complex interaction between light and material properties, while modeling the polarization state of light in the scene based on the hue–saturation–luminance (HSL) framework.
Zhang et al. [26] proposed a cross-modal fusion method (Figure 2) for semantic segmentation, leveraging transformers. They employ parallel backbones for RGB and supplementary X-modal inputs, integrating cross-modal feature rectification modules and feature fusion modules. These rectify noisy data and uncertainties, utilizing attention-driven channel-wise feature modules and convolution for spatial weight maps. The method facilitates global information flow through a symmetric dual-path structure and cross-attention technique, ending in a final output feature map combining both modal routes. In both cases [26,79], the degree of linear polarization (DoLP) and the angle of linear polarization (AoLP) were used as representations of polarization data. The polarimetric datasets that were used by the authors were collected using a multi-modal vision sensor.
Kalra et al. [24] tackle the challenge of object instance segmentation for transparent objects, a task fraught with difficulties such as artefacts arising from anisotropic properties, light reflection, scattering, and the absence of texture. Leveraging polarimetric imaging, they introduce additional information to navigate these challenges effectively. Their methodology focuses on merging color and polarimetric data optimally to achieve accurate segmentation, utilizing the Mask R-CNN architecture widely employed for segmentation tasks. Their approach involves defining three input channels: gray scale images representing intensity; polarization angle; and degree of polarization. By integrating these modalities efficiently, Kalra et al. aim to enhance segmentation performance, particularly in challenging scenarios involving transparent objects.
Yu et al. [82] push the boundaries of polarimetric image analysis by introducing a novel approach aimed at simplifying the architecture and eliminating the need for spatial attention patterns. Departing from recent methodologies [26,79], they adopt a polarimetric monochromatic imaging strategy, which allows for the separation of reflection and refraction within the image. Their primary contribution lies in the development of a transparent object segmentation network, which leverages extended interiors and enhanced edges while exploring various polarimetric representations and assessing their efficacy as network inputs. They notably emphasize the limitations of traditional representations like AoP and DoP, which are prone to noise and ambiguity (see Section 2.4).

3.2. Datasets for Polarimetric Image Segmentation

In contrast to more conventional modalities, polarimetric-based robot perception suffers from a lack of datasets in the literature. This is mainly due to two reasons: polarimetric imaging is a relatively recent modality in robotics, and it presents particular challenges in unstructured environments. Nevertheless, some initial efforts toward polarimetric imaging datasets already exist, such as the two datasets presented in [25,51].
Refs. [25,27] introduced the Polabot dataset, tailored for polarimetric imaging of outdoor scenes, primarily intended for specular area segmentation. The dataset comprises 177 color images and their corresponding polarimetric equivalents, all captured using a multi-modal system featuring four cameras, including two color cameras with varying angles for stereo vision, a near-infrared camera, and a polarization camera. Despite the limited number of images, the authors proposed a data augmentation technique [83] to enhance the dataset’s utility for deep neural network applications.
Another popular dataset was developed by Xiang et al. [51]; it focuses on outdoor scenes for autonomous driving, utilizing an integrated multi-modal sensor combining polarization, RGB, infrared, and depth sensors. Their system captures color, infrared, polarization, and monocular depth images, with software providing direct output of stereo depth and surface normal information. In contrast to Polabot, their dataset offers higher-resolution images (1024 × 1224) and consists of 394 images, expandable via augmentation techniques.
Creating a polarimetric dataset for autonomous driving involves several key steps. It begins with selecting suitable polarimetric cameras mounted on a vehicle to provide a 360-degree view. Planning diverse routes covering various driving scenarios and synchronizing cameras with LiDAR, radar, and RGB sensors are crucial. Post-collection data calibration and annotation are needed.
In this study, we opted to utilize the Polabot dataset, specifically focusing on the smaller-resolution variant. This decision was made to decrease the time and resources required for the evaluation process. Moreover, the Polabot dataset’s smaller resolution is highly compatible with the chosen augmentation method, which was developed by the same authors of the dataset. This compatibility ensures a better integration and potentially more accurate evaluation results, leveraging the augmentation techniques specifically tailored for this dataset.

4. Shape Characterization

Shape characterization is both a valuable tool and a challenge, encompassing tasks such as depth estimation, surface normal estimation, and 3D reconstruction. In robot perception, accurate shape characterization enables robots to comprehend and interact with their environment effectively, facilitating tasks such as navigation, object recognition, and manipulation.

4.1. Depth from Polarimetry

Traditionally, there have been many studies on recovering depth images from the geometry of a stereo camera pair or by exploiting the geometry of structured light [84]. But these methods often lack robustness when dealing with poor lighting conditions, textureless scenes, the presence of transparencies, reflections, etc. To address this problem, studies have recently focused on the potential of learning approaches [85]. Although these offer benefits in situations like poor lighting or textureless surfaces, they still struggle with other light interactions such as reflections or transparencies. Another way to overcome the limitations of traditional methods is to exploit other sensing technologies, such as LiDAR [86]. But, although LiDAR gives more accurate measurements, it is costly and weather-sensitive. In recent years, polarimetric imaging has emerged as a promising alternative for depth estimation in robotics [87], because analyzing the polarization state of light reflected or scattered by objects in the scene helps deduce distances. By comparing these variations with known polarization patterns, accurate depth measurements can be obtained. Polarimetry is also being studied as a potential complementary modality [88], offering enhanced accuracy and robustness in certain scenarios.
A first approach to distance (depth) estimation using polarimetry was presented in [89], which focused on two parametric models involving electromagnetic plane waves. But these initial methods were impractical because they required control over the scene and prior knowledge of the material's polarization characteristics. More recently, approaches like [68] utilize DoP and AoP, with calibration establishing a depth-to-polarization mapping. Still, linearly polarized light sources tend to be necessary to measure light depolarization after interaction with the scene, as polarization changes with surface interactions.
To avoid the need for structured environments, recent approaches propose deep learning architectures for obtaining depth through polarization. These include recurrent neural networks (RNNs), which excel at processing sequential data but struggle with occlusion [90,91]. Fully convolutional networks (FCNs) process inputs of any size and produce outputs of the same size [92,93]. Encoder–decoder networks [94,95] are also employed for this task. Residual networks (ResNets) utilize residual connections for improved information flow between layers [96,97]. Siamese networks compare two inputs and generate a similarity score [98,99]. Additionally, radial basis function networks (RBFNs) [100,101] and attention modules have been explored for depth estimation through polarization [91,95].

4.2. Normal from Polarimetry

Normal estimation is a component of 3D shape characterization, fundamental in the field of robotics for tasks such as object manipulation, navigation, or scene understanding. Traditional methods for normal estimation, analogous to depth estimation, often rely on depth sensors or stereo vision, which may struggle in challenging lighting conditions or with transparent or specular surfaces. Normal-from-polarization is a method in computer vision and 3D imaging where normal information is extracted from polarized light [102] by analyzing measurements such as DoP and AoP, similar to depth estimation.
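To make the physics concrete: under a purely diffuse Fresnel model with known refractive index, the DoP is a monotonic function of the zenith angle, so the zenith can be recovered by inverting that relation (the azimuth remains ambiguous, as discussed in Section 2.4). A sketch under these assumptions, using a lookup table for the inversion:

```python
import numpy as np

def diffuse_dop(theta, n=1.5):
    """Degree of polarization of diffusely reflected light as a function
    of the zenith angle theta, for refractive index n (Fresnel theory)."""
    s2 = np.sin(theta) ** 2
    return ((n - 1/n)**2 * s2) / (
        2 + 2*n**2 - (n + 1/n)**2 * s2
        + 4 * np.cos(theta) * np.sqrt(n**2 - s2))

def zenith_from_dop(dop, n=1.5):
    """Invert the monotonic DoP-zenith relation with a lookup table."""
    thetas = np.linspace(0.0, np.pi / 2 - 1e-3, 2048)
    return np.interp(dop, diffuse_dop(thetas, n), thetas)
```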
Today, deep learning schemes have emerged as the dominant approach. These include CNNs, which have demonstrated [103,104] robustness against variations such as rotation, occlusion, and different lighting conditions. However, CNNs may encounter limitations when dealing with time-series data.
On the other hand, radial basis function networks (RBFNs) have been employed to learn the non-linear relationship between polarization data and surface normals [105,106]. RBFNs utilize functions like Gaussian, multi-quadric, and inverse quadric as activation functions. While they offer fast training and perform well with non-linear problems, they are sensitive to input scaling, which can lead to issues if the input is not properly normalized.
Kirchengast et al. [107] proposed a LiDAR-based surface normal estimation method that can be enhanced by integrating polarimetric imagery. This improvement enhances point cloud accuracy, material differentiation, and overall LiDAR modeling robustness in diverse environments.

4.3. Three-Dimensional Reconstruction by Polarization

In line with depth and normal estimation, another shape characterization problem addressed in the literature is 3D reconstruction. This involves recovering the spatial structure of 3D objects from multiple 2D images or sensor measurements. Such 3D reconstruction is usually represented using voxel-based spaces [108,109], point cloud-based approaches [110], or multi-view-based approaches [17,18,63]. But unlike depth and normal estimation, the problem of 3D reconstruction was rarely treated in earlier polarimetric works.
With the advancements in polarimetric imaging, studies on 3D reconstruction have started to appear, making it a promising tool for various robotic applications, including autonomous navigation, object recognition, and scene understanding [14,15,21,28,63,75,111]. Polarimetry offers a solution for capturing depth information in challenging lighting conditions like low light or glare [22,112,113]. Polarimetric imaging captures light polarization, revealing object orientation and surface characteristics, as discussed in the subsections above. Integrated with traditional methods, it enhances 3D model creation by providing more surface property information. This improves surface normal estimation, helping quality control and detailed 3D modeling, and enhances material classification. Polarimetry also offers robustness in challenging environments, vital for autonomous navigation in adverse weather. Integrating polarimetric imaging with traditional methods can enhance depth estimation accuracy and performance in textureless regions (e.g., transparent surfaces), and combining LiDAR with polarimetric data can create more detailed and reliable environmental maps.
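Whatever the source of the normals, turning a normal map into a surface requires an integration step. As a hedged baseline, and not the method of any particular paper reviewed here, the classic Frankot–Chellappa Fourier integration is sketched below; sign and axis conventions are assumptions:

```python
import numpy as np

def integrate_normals(normals):
    """Frankot-Chellappa integration: recover a depth map (up to scale
    and offset) from a unit normal map of shape (H, W, 3)."""
    nz = np.maximum(normals[..., 2], 1e-8)
    p = -normals[..., 0] / nz               # dz/dx
    q = -normals[..., 1] / nz               # dz/dy
    h, w = p.shape
    u, v = np.meshgrid(np.fft.fftfreq(w) * 2 * np.pi,
                       np.fft.fftfreq(h) * 2 * np.pi)
    denom = u**2 + v**2
    denom[0, 0] = 1.0                       # avoid division by zero at DC
    z_hat = (-1j * u * np.fft.fft2(p) - 1j * v * np.fft.fft2(q)) / denom
    z_hat[0, 0] = 0.0                       # absolute offset is unobservable
    return np.real(np.fft.ifft2(z_hat))
```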
Recent advancements in 3D reconstruction leverage the power of neural networks to exploit information encoded in polarized light. This field investigates the application of CNNs [114,115] for this purpose. While CNNs excel at non-linear mapping, their generalization capabilities can be limited [114]. As an alternative, generative adversarial networks (GANs) have demonstrated exceptional performance in generating 3D models directly from polarization images [19,116]. Notably, GANs exhibit robustness to noise and incomplete data [19,116] but can be difficult to train to convergence. For dynamic scenes, recurrent neural networks (RNNs) offer a compelling solution due to their effectiveness in temporal modeling [117,118]. Finally, a less standard architecture, deep belief networks (DBNs), holds promise for 3D reconstruction tasks, particularly those utilizing unlabeled data, by virtue of their ability to learn hierarchical representations from the input [119].

4.4. State-of-the-Art Methods for Shape Characterization

The discussed methods for normal estimation vary, with some employing non-learning techniques like non-linear algorithms [120,121], while others opt for learning-based approaches, utilizing various architectures [14,63,122].
Lei et al. [122] propose a method that enables scene-level normal estimation (Figure 3) from a single polarization image. Their approach combines data-driven learning with physics-based priors, allowing for robust normal estimation across various lighting conditions and scene types, both indoors and outdoors (Figure 4). By incorporating a multi-head self-attention module and viewing encoding (close to NeRF [123,124,125]), the method addresses challenges such as handling errors caused by scene complexity. They leverage the intrinsic parameters of polarization cameras to determine viewing directions, which aids in resolving local ambiguities in polarization cues. Their network architecture, based on U-Net, processes raw polarization images to extract intensity, DoP, and AoP information.
Dave et al. [63] introduced Polarization-Aided Neural Decomposition Of Radiance Architecture (PANDORA), a method for reconstructing (Figure 5) object geometry and appearance from multiple images. This polarimetric inverse rendering approach utilizes implicit neural representations, offering a compact representation of signals. Polarimetric inverse rendering is a computational technique used to recover the physical properties of a scene, such as surface material properties and lighting conditions, from polarimetric observations collected by imaging sensors. Object shape and reflectance characteristics are modeled using coordinate-based neural networks, while incident lighting is represented through an implicit network with directional embedding. An important disadvantage of this method is that it requires specific conditions during acquisition, such as non-polarized incident lighting and opaque objects composed of dielectric materials.
Deschaintre et al. [14] developed a method (Figure 6) which estimates the 3D shape of objects along with spatially varying reflectance properties such as diffuse and specular albedo maps and specular roughness maps. They utilized polarimetric information extracted from the scene using a standard method (Section 2.3) and acquired images under frontal flash illumination from a single view direction. Their method employed an improved U-Net architecture to extract information from polarization images and two explicit cues: a reflectance cue representing normalized diffuse color and a shape cue in the form of a normalized Stokes map. The Stokes map encodes normalized changes in reflectance when viewed through different polarization filter orientations. Their network architecture employs an encoder–decoder structure with three specialized branches focusing on different aspects of shape and appearance: diffuse and specular albedo, specular roughness, and surface normal and depth.
Yu et al. [121] propose a technique for extracting surface height from polarimetric data. Unlike traditional methods, their approach focuses on non-linear optimization through direct minimization of disparities between predicted and observed intensities across all pixels and polarizer angles. It accommodates various illumination, reflectance, and polarization models, including a variant that eliminates the need for illumination and albedo information by utilizing image ratios. Surface height serves as the unknown variable in a non-linear least squares optimization approach. By measuring and optimizing the difference between actual data and projections based on predicted surface height, the authors achieve direct surface height estimation. They utilize intensity, AoP, and DoP as representations of polarized light and employ vector ratios between observed intensities for direct surface height estimation. The authors utilize a probabilistic polarization model and employ the best linear unbiased estimator to minimize error between observed ratios and model predictions.
Smith et al. [120] proposed a method to estimate surface height from a single polarization image. Their approach involves solving linear equations to estimate height, addressing local ambiguity in surface normal azimuth angles. This method allows for the estimation of spatially variable albedo or illumination solely from the polarization image, reducing the need for known lighting and albedo information. By converting polarization and shading constraints into linear equations, they provide globally optimal height estimation. The authors investigated the use of ambiguous surface normals and unpolarized intensity data for calculating illumination and per-pixel albedo, leveraging standard representations of polarized light [26,50,79] such as intensity, AoP, and DoP. They utilized a sparse QR solver algorithm to determine the optimal height map.

4.5. Datasets for Polarimetric Shape Estimation

As mentioned above, datasets associated with shape characterization methods generally impose more strict acquisition requirements than those used for segmentation with polarimetric imaging. Control over various conditions, like incident lighting, background settings, and diverse viewing angles, becomes imperative in order to ensure the accuracy and reliability of the method using the data. This controlled environment allows researchers to manipulate key factors influencing the rendering process, facilitating a nuanced understanding of the interplay between incident light, object geometry, and material properties.
Boss et al., in their work [126], present a synthetic dataset made for capturing the shape of an object and its appearance (SVBRDF). They use 1125 high-resolution SVBRDF maps. The authors randomly resize and take 768 × 768 crops of these material maps, and apply random overlays together with simple contrast, hue, and brightness changes, obtaining 11,250 material maps in the end. They use domain-randomized object shapes, applying a random material to nine different shape primitives (spheres, cones, cylinders), and also apply six to seven random materials to the scene. For environment illumination, they collect 285 high-dynamic-range (HDR) maps. To render images, they use the Mitsuba renderer to create two-shot flash and no-flash images in order to obtain the maximum information for the appearance estimation [127]. In total, the dataset contains 100 K generated scenes.
PolarNet is another synthetic dataset, developed by Deschaintre et al. as detailed in their work [14]. This dataset has garnered significant attention in the domain of inverse rendering applications due to its comprehensive nature and high-quality data. It comprises a rich collection of 100,000 synthetic images, each captured from four distinct polarizer angles. The key highlight of the PolarNet dataset lies in its provision of ground truth data for the spatially varying bidirectional reflectance distribution function (SVBRDF) attributes, including normal, diffuse, roughness, and specular components, alongside the depth map ground truth. To ensure realistic conditions, a random texture and object are selected for each image. Additionally, PolarNet offers supplementary cues such as the normalized Stokes map and a color diffuse map, enhancing the richness and diversity of the dataset. Deschaintre et al. employed their model previously published in [128], utilizing the full Fresnel equations for the creation of the materials [73].
Creating a polarimetric dataset for 3D reconstruction involves steps like selecting suitable imaging equipment, typically a polarimetric camera with polarizers. Scenes are prepared with consistent lighting, and images are taken from various angles with different polarizer orientations. The collected data are processed to extract polarization information, including degree and angle of polarization. Advanced algorithms combine this with traditional methods for enhanced reconstruction detail.
In the context of shape characterization, the PolarNet dataset emerges as an ideal choice for comparing various methods. This preference is attributed to the exhaustive information provided by the authors, particularly with regard to the availability of ground truth data, which significantly aids in comprehensive evaluations and analyses. As a result, PolarNet continues to serve as a benchmark in the field, facilitating the assessment and advancement of various techniques and algorithms.

5. Pose Estimation Using Polarimetry

Another task studied in the literature on robotic perception is 6D pose estimation, which involves determining the position and orientation of objects relative to a reference frame. Pose estimation comprises several different types: model matching [129], learning-based approaches [130], pose refinement [131], and multi-view systems [132], each adapted to robotics applications. Traditionally, pose estimation [130,133] relies on visual cues extracted from color, texture, or geometric features (RGB-D), but these methods can produce errors with transparent or reflective surfaces, as shown in Figure 7, and can be limited in challenging environments with low lighting, occlusions, or textureless surfaces. Polarimetry offers advantages by capturing not only the intensity but also the polarization properties of light reflected from objects. This additional information can enhance the discriminative power of pose estimation algorithms, particularly in scenarios where traditional visual cues may be insufficient. Polarimetric data provide additional cues about object materials, surface properties, and geometric features, which can enhance the precision of pose estimation. Some methods, like that of Gao et al. [45,46], have studied 6D object pose estimation from a monocular point of view. This hybrid model exploits physical priors such as the angle of linear polarization (AoLP), degree of linear polarization (DoLP), and normal maps to learn object transformations between the object frame and the camera frame. Polarimetric cameras offer additional information, enhancing surface and reflection analysis. The network architecture comprises two encoders processing polarimetric images and physical normals separately, followed by merging and decoding to generate object masks, normal maps, and dense correspondence maps (NOCS). In particular, the model enables accurate pose estimation of highly reflective and transparent objects by integrating polarization properties with conventional intensity-based methods.
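Once dense correspondence maps such as NOCS are predicted and back-projected to 3D, the pose itself is commonly recovered as a least-squares rigid alignment between corresponding point sets. A minimal sketch of the standard Kabsch solution (our own illustration, not the exact solver of [45,46]):

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) aligning src to dst, two
    (N, 3) arrays of corresponding 3D points: dst ~ R @ src + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)       # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```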
By accurately estimating the 6D pose of objects, robotics systems can effectively interact with the physical world, enabling tasks such as object grasping, object tracking, and scene reconstruction with higher precision and efficiency.

6. Experimental Robotic Applications Using Polarimetry

Polarimetric segmentation is gaining popularity in robotic perception due to its capacity to provide nuanced insights surpassing those of conventional RGB imagery, offering enhanced scene understanding and object delineation across various applications. In material characterization, it aids non-destructive testing for quality control across plastics, ceramics, and composites [134]. Three-dimensional reconstruction by polarimetry is also valuable for non-destructive testing [135,136], defect identification [134,137], stress analysis [138,139], and quality control in various industries, including aerospace and civil engineering. By analyzing the polarization state of reflected light, these methods can determine surface characteristics such as roughness, texture, and anisotropy [140,141], aiding in material characterization. Surface normals derived from polarimetry can assist in classifying different materials [134,142], crucial for industries like manufacturing and quality control. Additionally, in tracking and object recognition, depth polarimetric sensing and surface normal estimation provide cues about surface depth and orientation, enhancing computer vision systems’ ability to identify and track objects in challenging conditions [22,112,113] or with transparent or specular objects within the scene [13,118,143].
Polarimetric-based methods improve performance in object detection under challenging lighting [22,112,113] and adverse weather conditions [52,56], as well as in medical imaging for tissue classification [144,145]. Integration of polarimetric data with other modalities, such as RGB or multispectral imagery [146,147], further improves segmentation models' effectiveness across various applications. Notably, polarization-based depth estimation is particularly beneficial for underwater imaging [148,149,150] due to its ability to withstand light scattering and absorption, unlike traditional methods. This resilience enables precise depth information acquisition, crucial for underwater tasks such as archaeology, marine biology, and robotics. In medical imaging, polarization-based surface normals contribute to techniques like endoscopy [151,152] and microscopy [40,145,153], assisting in detecting tissue abnormalities and aiding in diagnosis and visualization of pathological conditions for dermatologists and ophthalmologists. This advancement in imaging technology allows for non-invasive visualization of tissue structures and cellular details, benefiting medical diagnostics and endoscopy by improving organ visualization and aiding in disease detection, especially cancer [36,151,154]. Finally, we highlight that polarimetric imaging is a valuable technique for analyzing transparent objects [24,155,156] across various fields, from scientific research to industrial applications. By assessing the polarization state of light, it allows for detailed visualization of transparent materials, which is difficult with traditional imaging methods. Applications range from microscopy in biological research to quality assessment of pharmaceuticals and detection of defects in materials like glass or plastics. Its ability to reveal subtle surface characteristics and internal structures makes it suitable for diverse applications where transparency poses challenges.

7. Experimentation

This section evaluates the different methods previously outlined, examining their usefulness in the tasks reviewed above. Concretely, we chose the CMX [26] and Vibotorch [79] methods on the Polabot database [25,27] for the segmentation evaluation, while the methods proposed by Lei et al. [122], Smith et al. [120], Yu et al. [121], Dave et al. [63], and Deschaintre et al. [14] were used for the evaluation of shape characterization on the PolarNet dataset from [14].

7.1. Performance Evaluation of Polarimetry Segmentation

The first experiment was conducted on the CMX method [26] using two different architectures, SwinTransformers (SwinT) and Segformer (Segf), for image segmentation tasks. In addition, for the Segf architecture, we evaluated three different input modalities (Figure 8): color RGB, AoLP, and DoLP. To understand the potential and limitations of each method, we use the accuracy, recall, F1 score, precision, and intersection over union (IoU) metrics [157]. Accuracy shows the proportion of correctly classified instances among all instances. Recall presents the ability of the model to capture all positive instances. Precision discriminates the proportion of true positive predictions among all positive predictions. The F1 score balances precision and recall using their harmonic mean. Finally, IoU measures the overlap between predicted and ground truth regions in tasks like object detection and segmentation.
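For reference, all of these scores can be derived from a single confusion matrix; a minimal NumPy sketch of the per-class computation (the function name and the guards against empty classes are ours):

```python
import numpy as np

def segmentation_metrics(pred, gt, n_classes):
    """Per-class IoU, precision, recall, and F1 from integer label maps."""
    cm = np.bincount(n_classes * gt.ravel() + pred.ravel(),
                     minlength=n_classes**2).reshape(n_classes, n_classes)
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp                # predicted as class c, but not c
    fn = cm.sum(axis=1) - tp                # class c pixels missed
    iou = tp / np.maximum(tp + fp + fn, 1)
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-8)
    return iou, precision, recall, f1
```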
The results presented in Table 2 indicate that the Segformer architecture outperforms SwinTransformers when only color images are used as input. However, significant improvements are observed when the model uses the Segformer architecture with either polarization angle or polarization degree images. The effectiveness of this approach can be attributed to the cross-modal feature rectification (CM-FR) and feature fusion (FF) modules, which facilitate feature rectification and fusion between parallel streams at each stage of feature extraction. The CM-FR module employs a dual-rectification approach, incorporating both channel-wise and spatial-wise feature rectification, while the FF module further enhances information interaction and combination through a two-stage process involving information exchange and fusion.
In contrast, the results presented in Table 2, Table 3, and Table 4, relative to the experiment using the Vibotorch method [79], show consistent improvements when the model is trained using data augmentation. Augmented data not only aid in mitigating potential overfitting but also enhance the model's ability to generalize across diverse instances within the dataset. Consequently, the findings underscore the efficacy of data augmentation strategies in bolstering the robustness and overall performance of the Vibotorch method, as indicated by the superior results attained under augmented conditions.
We also studied the intersection over union on common segmentation classes such as glass, cars, buildings, sky, and roads (Table 4). Here, it is observed that the CMX method of [26] performs better than the Vibotorch method of [79].
A more exhaustive evaluation can be found in [26]. That study explores the impact of different data modalities on segmentation performance, including colorimetric data alone and combinations with polarimetric information. Comparing their CMX (Segformer) to SwiftNet [158] on colorimetric data, they find that integrating polarimetry enhances segmentation accuracy, particularly for outdoor scene analysis in autonomous vehicles.

7.2. Performance Evaluation on Shape Characterization

To assess and understand the advantages and limitations of each evaluated method (Lei et al. [122], Smith et al. [120], Yu et al. [121], Dave et al. [63], and Deschaintre et al. [14]), we use the mean angular error (MAE), median angular error (MedianAE), and root mean squared error (RMSE) metrics.
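Concretely, the per-pixel angular error is the angle between the predicted and ground-truth unit normals, and the three statistics summarize the resulting error map. A minimal sketch (our own helper functions, assuming (H, W, 3) arrays of unit vectors):

```python
import numpy as np

def angular_errors_deg(n_pred, n_gt):
    """Per-pixel angle, in degrees, between predicted and ground-truth
    surface normals given as (H, W, 3) arrays of unit vectors."""
    dot = np.clip(np.sum(n_pred * n_gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(dot))

def summarize(errors):
    """MAE, MedianAE and RMSE over the angular error map."""
    return {
        "MAE": errors.mean(),
        "MedianAE": np.median(errors),
        "RMSE": np.sqrt((errors ** 2).mean()),
    }
```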
With regard to the studied shape estimation task (normal vector estimation; results in Table 5), each method has its own particularities and they are not all used in the same context. While the methods of Lei [122], Smith [120], and Yu [121] focus solely on estimating the surface normals of the scene, Dave's PANDORA method [63] also estimates the diffuse and specular components for 3D rendering in addition to the object's surface normals, and Deschaintre's method [14] estimates the diffuse, specular, normal, depth, and roughness components of the object (Figure 9). Each of these methods was designed for a specific database: Lei's work targets indoor images with no control over scene illumination, whereas Deschaintre's work required complex control over both the scene background and the illumination conditions. Similarly, certain works, such as Dave's, require more varied views of the object. For our analysis, we studied these methods on a common dataset, the one proposed by Deschaintre, because it is the most complete of those available, as mentioned in Section 3.1: it provides polarimetric images and ground truths for many objects with different textures. We also note that Smith's and Yu's methods are not based on a deep learning approach.
Non-learning-based methods like those of Smith et al. [120] and Yu et al. [121] achieve good accuracy. Learning-based approaches, such as PANDORA [63] and Deep SVBRDF [14], offer advantages in terms of stability and adaptability; the latter provide the best compromise between accuracy, stability, and generalizability in normal vector estimation tasks.

8. Open Challenges

Traditional polarimetric cameras used a mechanically rotating polarizer in front of the sensor, so images at different polarizer angles were captured at different times, making simultaneous acquisition impossible. DoFP sensors simplify the acquisition setup by removing the need for an internal mechanical rotation component. Careful illumination control remains crucial for data quality, but achieving it may require a more complex setup, for example an opaque enclosure with a controllable, polarizable light source that is accurately positioned and oriented to capture detailed incident light data. Such a setup yields optimal polarimetric information in the captured images.
Polarimetric imaging offers rich insights into scene properties through the analysis of reflected light, but challenges remain for practical implementation. Despite the progress made in sensors, issues persist with noise and artefact management. Pre-processing steps, such as demosaicking (Section 2.3), are often necessary to refine raw polarimetric data before integrating it into robotic perception applications. Many methods in the field rely on polarizer-angle images, which are typically generated early in the pipeline, right after demosaicking.
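To illustrate this early stage of the pipeline, the sketch below extracts the four polarizer-angle images from a raw DoFP mosaic by simple sub-sampling and derives the linear Stokes parameters, AoLP, and DoLP. This is a minimal nearest-pixel example, not one of the demosaicking methods surveyed in [163,164], and the assumed 2 × 2 super-pixel layout (90°, 45°; 135°, 0°) should be checked against the sensor's datasheet:

```python
import numpy as np

def dofp_to_angles(raw):
    """Split a raw DoFP mosaic into four half-resolution polarizer-angle
    images, assuming a [[90, 45], [135, 0]] super-pixel layout."""
    i90  = raw[0::2, 0::2].astype(float)
    i45  = raw[0::2, 1::2].astype(float)
    i135 = raw[1::2, 0::2].astype(float)
    i0   = raw[1::2, 1::2].astype(float)
    return i0, i45, i90, i135

def linear_stokes(i0, i45, i90, i135):
    """Linear Stokes parameters and the derived AoLP/DoLP images."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)        # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-12)
    aolp = 0.5 * np.arctan2(s2, s1)           # angle in [-pi/2, pi/2]
    return s0, s1, s2, aolp, dolp
```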
As seen in the literature (e.g., [83]), one of the main problems when using polarimetry with neural networks is the lack of data. This scarcity cannot be solved by standard data augmentation methods, because geometric transforms change the reference frame in which the polarization angle is defined; this is why some research has focused on creating polarimetry-specific data augmentation methods, as illustrated below. It should also be noted that for the segmentation methods studied [26,79], the data acquisition setup relies on multiple cameras, and its installation on a vehicle can be costly compared with shape estimation methods, some of which require only a single camera. The approach of Deschaintre [14] needs heavy control of the lighting conditions (flash illumination), and outdoor illumination remains an open challenge.
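As a concrete example of why standard augmentation fails (our own minimal sketch, in the spirit of [83] but not its implementation): when an AoLP image is rotated, the pixel values themselves must be re-referenced, because the angle of polarization is defined with respect to the image frame.

```python
import numpy as np

def rotate90_aolp(aolp):
    """Rotate an AoLP map (radians, in [-pi/2, pi/2)) by 90 degrees.
    Unlike RGB augmentation, the angle values must also be shifted
    by the rotation angle, modulo pi."""
    rotated = np.rot90(aolp) + np.pi / 2               # move pixels, re-reference angles
    return (rotated + np.pi / 2) % np.pi - np.pi / 2   # wrap back to [-pi/2, pi/2)
```

A horizontal or vertical flip similarly requires negating the AoLP values, not just mirroring the pixels.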
Each polarimetric formalism (Section 2) presents unique advantages and is chosen according to the particular needs of the research or application. However, real-time applications still often require reduced image resolution because of the challenge of processing polarimetric data, which involves multiple images and channels. Unlike colorimetric imaging, which typically involves a single image with three channels, polarimetric imaging demands significant computing resources for effective processing.
Finally, challenges arise in contexts like autonomous driving, where controlling scene lighting is impractical, making it difficult to maintain consistent and controlled illumination conditions.

9. Discussion

To address the above weaknesses, the literature now proposes integrating polarimetric imaging with various conventional modalities, including colorimetric [26], depth [159,160], multispectral [146,161], and thermal imaging [159,162]. Combining these modalities expands the range of surface characteristics that can be observed and the opportunities for material composition analysis in a scene. When coupled with depth modalities like LiDAR or structured-light imaging, polarimetric imaging enhances 3D reconstruction and scene comprehension, particularly in challenging environments. Integrating polarimetry with thermal imaging offers insights into material properties, heat dissipation, and thermal signatures, which are valuable for remote sensing and environmental monitoring. The fusion of polarimetric imaging with spectral techniques enhances material classification and identification, benefiting applications such as remote sensing, vegetation analysis [140], and biomedical imaging. The future of polarimetric imaging in robotic perception thus shows promise for innovative developments.
Deep learning techniques tailored for polarimetric data processing are set to improve efficiency and robustness. Real-time integration of polarimetric imaging into autonomous vehicles and robotics could revolutionize navigation, mapping, and environmental perception for enhanced safety and decision-making. Advancements in polarimetric data demosaicking present exciting opportunities to improve image resolution and extract richer information from polarimetric imaging systems. As polarimetric sensors evolve, there is a growing need for robust demosaicking algorithms tailored to polarimetric data’s unique characteristics. Key research areas include noise reduction, interpolation accuracy, artefact suppression, and fusion. Exploring machine learning techniques like deep learning and neural networks holds promise for accurately reconstructing complete polarimetric images from incomplete data. Additionally, integrating adaptive filtering methods, optimized color calibration, and sophisticated optimization techniques can further enhance demosaicking algorithms for polarimetric data [163,164].
The method presented by Yu et al. [121] discusses the potential of integrating a hybrid diffuse/specular polarization model. Such a model captures both diffuse and specular reflectance, allowing the estimation of albedo and lighting parameters. The authors also explore the feasibility of incorporating this approach into a CNN for depth estimation from polarization data.
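For reference, the diffuse half of such a model typically relies on the Fresnel-based relation between the degree of linear polarization, the zenith angle of the surface normal, and the refractive index, as described in [2]; the specular case has an analogous expression. The standard diffuse relation is

```latex
\rho_d(\theta, n) =
\frac{\left(n - \frac{1}{n}\right)^{2} \sin^{2}\theta}
     {2 + 2n^{2} - \left(n + \frac{1}{n}\right)^{2} \sin^{2}\theta
      + 4\cos\theta \sqrt{n^{2} - \sin^{2}\theta}}
```

where θ is the zenith angle and n the refractive index; inverting this relation yields θ from the measured degree of polarization.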
The results presented in Section 7 show that CMX [26] generally outperforms Vibotorch [79], owing to its attention modules and cross-modal strategy. However, this performance improvement requires additional computing and energy resources. While these methods offer advantages in terms of accuracy, their effectiveness also stems from increased stability and robustness across varied data domains. This capability, in turn, requires significantly larger data volumes and computing resources, which represents a trade-off in resource allocation compared with traditional linear and non-linear optimization techniques.

10. Conclusions

Polarimetric imaging addresses challenges like specular reflections encountered in robot perception, where traditional sensors struggle, by selectively capturing polarized light to enhance segmentation and depth estimation accuracy. It also improves transparent object detection by leveraging light polarization properties, aiding in perceiving and interacting with objects like glass panels or liquids. Additionally, in low-texture environments, polarimetric imaging provides meaningful depth and surface normal information, enhancing perception and navigation capabilities.
Integration of polarimetric imaging with conventional modalities enhances robotic perception by expanding surface characterization and material analysis within scenes. This fusion improves 3D reconstruction and scene comprehension and provides insights into material properties, thermal signatures, and spectral characteristics. Advancements in polarimetric data processing, especially through deep learning, promise efficiency and robustness in applications like autonomous navigation and environmental monitoring. Recent advances in shape characterization, particularly those leveraging deep learning, improve accuracy and robustness, although they require substantial computational resources and datasets. These approaches extend beyond shape estimation, with potential applications in robotic manipulation and scene understanding, showcasing the versatility of deep learning-based methods.
The continuous evolution of polarimetric imaging and deep learning methodologies holds promise for advancing robotic perception. By addressing challenges like specular reflections, transparent object detection, and low-texture environments, polarimetric imaging enables more accurate perception and interaction with the surroundings. Integration of these technologies into robotic perception systems is expected to drive advancements in autonomous navigation, object manipulation, and scene understanding.

Author Contributions

Conceptualization, C.T., C.M. and C.S.; investigation, C.T.; writing—original draft preparation, C.T.; writing—review and editing, C.T. and C.M.; supervision, C.M. and C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The segmentation data used in this study are available in the public domain at the following resource: [https://search-data.ubfc.fr/FR-18008901306731-2022-05-13-02_PolaBot.html#data_deposit] accessed on 25 June 2024. The reconstruction data used in this study are available in the public domain at the following resource: [https://wp.doc.ic.ac.uk/rgi/project/deep-polarization-3d-imaging/] accessed on 25 June 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Azzam, R.M.A. The intertwined history of polarimetry and ellipsometry. Thin Solid Films 2011, 519, 2584–2588. [Google Scholar] [CrossRef]
  2. Atkinson, G.A.; Hancock, E.R. Recovery of surface orientation from diffuse polarization. IEEE Trans. Image Process. 2006, 15, 1653–1664. [Google Scholar] [CrossRef]
  3. Vašiček, A. Polarimetric Methods for the Determination of the Refractive Index and the Thickness of Thin Films on Glass. J. Opt. Soc. Am. 1947, 37, 145–153. [Google Scholar] [CrossRef]
  4. Stokes, G.G. On the numerical calculation of a class of definite integrals and infinite series. Trans. Camb. Philos. Soc. 1851, 9, 166. [Google Scholar]
  5. Jütte, L.; Sharma, G.; Patel, H.; Roth, B. Registration of polarimetric images for in vivo skin diagnostics. J. Biomed. Opt. 2022, 27, 96001. [Google Scholar] [CrossRef]
  6. Picart, P.; Leval, J. General theoretical formulation of image formation in digital Fresnel holography. J. Opt. Soc. Am. A 2008, 25, 1744–1761. [Google Scholar] [CrossRef]
  7. Huynh, C.P.; Robles-Kelly, A.; Hancock, E. Shape and refractive index recovery from single-view polarisation images. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 1229–1236. [Google Scholar] [CrossRef]
  8. Kurt, M.; Szirmay-Kalos, L.; Křivánek, J. An anisotropic BRDF model for fitting and Monte Carlo rendering. ACM SIGGRAPH Comput. Graph. 2010, 44, 1–15. [Google Scholar] [CrossRef]
  9. Deschaintre, V.; Drettakis, G.; Bousseau, A. Guided Fine-Tuning for Large-Scale Material Transfer. arXiv 2020, arXiv:2007.03059. [Google Scholar] [CrossRef]
  10. Lapray, P.J.; Bigué, L. Performance comparison of division of time and division of focal plan polarimeters. In Proceedings of the Sixteenth International Conference on Quality Control by Artificial Vision, Albi, France, 28 July 2023; Orteu, J.J., Jovančević, I., Eds.; SPIE: Paris, France, 2023; p. 6. [Google Scholar] [CrossRef]
  11. Liu, J.; Duan, J.; Hao, Y.; Chen, G.; Zhang, H.; Zheng, Y. Polarization image demosaicing and RGB image enhancement for a color polarization sparse focal plane array. Opt. Express 2023, 31, 23475–23490. [Google Scholar] [CrossRef]
  12. Zhang, J.; Luo, H.; Hui, B.; Chang, Z. Image interpolation for division of focal plane polarimeters with intensity correlation. Opt. Express 2016, 24, 20799–20807. [Google Scholar] [CrossRef]
  13. Drouet, F.; Stolz, C.; Laligant, O.; Aubreton, O. 3D reconstruction of external and internal surfaces of transparent objects from polarization state of highlights. Opt. Lett. 2014, 39, 2955–2958. [Google Scholar] [CrossRef]
  14. Deschaintre, V.; Lin, Y.; Ghosh, A. Deep Polarization Imaging for 3D shape and SVBRDF Acquisition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 15567–15576. [Google Scholar]
  15. Baek, S.H.; Jeon, D.S.; Tong, X.; Kim, M.H. Simultaneous acquisition of polarimetric SVBRDF and normals. In Proceedings of the SIGGRAPH Asia 2018 Technical Papers, SIGGRAPH Asia 2018, Tokyo, Japan, 4–7 December 2018; Association for Computing Machinery: New York, NY, USA, 2018. [Google Scholar] [CrossRef]
  16. Hwang, I.; Jeon, D.S.; Muñoz, A.; Gutierrez, D.; Tong, X.; Kim, M.H. Sparse ellipsometry: Portable Acquisition of Polarimetric SVBRDF and Shape with Unstructured Flash Photography. ACM Trans. Graph. 2022, 41, 133. [Google Scholar] [CrossRef]
  17. Cui, Z.; Gu, J.; Shi, B.; Tan, P.; Kautz, J. Polarimetric Multi-View Stereo. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar] [CrossRef]
  18. Zhao, J.; Monno, Y.; Okutomi, M. Polarimetric Multi-View Inverse Rendering. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 8798–8812. [Google Scholar] [CrossRef]
  19. Yang, X.; Cheng, C.; Duan, J.; Hao, Y.F.; Zhu, Y.; Zhang, H. Polarized Object Surface Reconstruction Algorithm Based on RU-GAN Network. Sensors 2023, 23, 3638. [Google Scholar] [CrossRef]
  20. Chen, G.; He, L.; Guan, Y.; Zhang, H. Perspective Phase Angle Model for Polarimetric 3D Reconstruction. In Proceedings of the Computer Vision—ECCV 2022, Tel Aviv, Israel, 23–27 October 2022; Avidan, S., Gabriel, B., Moustapha, C., Farinella, G.M., Hassner, T., Eds.; Springer Nature: Cham, Switzerland, 2022; pp. 398–414. [Google Scholar]
  21. Rantoson, R.; Stolz, C.; Fofi, D.; Mériaudeau, F. 3D reconstruction by polarimetric imaging method based on perspective model. In Proceedings of the Optical Measurement Systems for Industrial Inspection VI, Munich, Germany, 17 June 2009; Lehmann, P.H., Ed.; SPIE: Paris, France, 2009; Volume 7389, p. 73890C. [Google Scholar] [CrossRef]
  22. Shen, X.; Carnicer, A.; Javidi, B. Three-dimensional polarimetric integral imaging under low illumination conditions. Opt. Lett. 2019, 44, 3230–3233. [Google Scholar] [CrossRef]
  23. Lei, C.; Huang, X.; Zhang, M.; Yan, Q.; Sun, W.; Chen, Q. Polarized Reflection Removal with Perfect Alignment in the Wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 1750–1758. [Google Scholar]
  24. Kalra, A.; Taamazyan, V.; Rao, S.K.; Venkataraman, K.; Raskar, R.; Kadambi, A. Deep Polarization Cues for Transparent Object Segmentation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020. [Google Scholar] [CrossRef]
  25. Zhang, Y.; Morel, O.; Blanchon, M.; Seulin, R.; Rastgoo, M.; Sidibé, D. Exploration of Deep Learning-based Multimodal Fusion for Semantic Road Scene Segmentation. In Proceedings of the VISAPP 2019 14th International Conference on Computer Vision Theory and Applications, Prague, Czech Republic, 25–27 February 2019. [Google Scholar] [CrossRef]
  26. Zhang, J.; Liu, H.; Yang, K.; Hu, X.; Liu, R.; Stiefelhagen, R. CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers. arXiv 2023, arXiv:2203.04838. [Google Scholar] [CrossRef]
  27. Blanchon, M.; Morel, O.; Zhang, Y.; Seulin, R.; Crombez, N.; Sidibé, D. Outdoor Scenes Pixel-Wise Semantic Segmentation using Polarimetry and Fully Convolutional Network. In Proceedings of the 4th International Conference on Computer Vision Theory and Applications (VISAPP 2019), Prague, Czech Republic, 25–27 February 2019. [Google Scholar] [CrossRef]
  28. Morel, O.; Stolz, C.; Meriaudeau, F.; Gorria, P. Active lighting applied to three-dimensional reconstruction of specular metallic surfaces by polarization imaging. Appl. Opt. 2006, 45, 4062–4068. [Google Scholar] [CrossRef]
  29. Zhaole, S.; Zhou, H.; Nanbo, L.; Chen, L.; Zhu, J.; Fisher, R.B. A Robust Deformable Linear Object Perception Pipeline in 3D: From Segmentation to Reconstruction. IEEE Robot. Autom. Lett. 2024, 9, 843–850. [Google Scholar] [CrossRef]
  30. Mora, A.; Prados, A.; González, P.; Moreno, L.; Barber, R. Intensity-Based Identification of Reflective Surfaces for Occupancy Grid Map Modification. IEEE Access 2023, 11, 23517–23530. [Google Scholar] [CrossRef]
  31. Huy, D.Q.; Sadjoli, N.; Azam, A.B.; Elhadidi, B.; Cai, Y.; Seet, G. Object perception in underwater environments: A survey on sensors and sensing methodologies. Ocean Eng. 2023, 267, 113202. [Google Scholar] [CrossRef]
  32. Wang, J.; Liang, W.; Yang, J.; Wang, S.; Yang, Z.X. An adaptive image enhancement approach for safety monitoring robot under insufficient illumination condition. Comput. Ind. 2023, 147, 103862. [Google Scholar] [CrossRef]
  33. Hu, K.; Chen, Z.; Kang, H.; Tang, Y. 3D vision technologies for a self-developed structural external crack damage recognition robot. Autom. Constr. 2024, 159, 105262. [Google Scholar] [CrossRef]
  34. He, C.; He, H.; Chang, J.; Chen, B.; Ma, H.; Booth, M.J. Polarisation optics for biomedical and clinical applications: A review. Nature 2021, 10, 194. [Google Scholar] [CrossRef]
  35. Louie, D.C.; Tchvialeva, L.; Kalia, S.; Lui, H.; Lee, T.K. Constructing a portable optical polarimetry probe for in-vivo skin cancer detection. J. Biomed. Opt. 2021, 26, 35001. [Google Scholar] [CrossRef]
  36. Badieyan, S.; Abedini, M.; Razzaghi, M.; Moradi, A.; Masjedi, M. Polarimetric imaging-based cancer bladder tissue’s detection: A comparative study of bulk and formalin-fixed paraffin-embedded samples. Photodiagn. Photodyn. Ther. 2023, 44, 103698. [Google Scholar] [CrossRef]
  37. Shabayek, A.E.R.; Morel, O.; Fofi, D. Visual Behavior Based Bio-Inspired Polarization Techniques in Computer Vision and Robotics; IGI Global: Hershey, PA, USA, 2013; pp. 243–272. [Google Scholar] [CrossRef]
  38. Kong, F.; Guo, Y.; Zhang, J.; Fan, X.; Guo, X. Review on bio-inspired polarized skylight navigation. Chin. J. Aeronaut. 2023, 36, 14–37. [Google Scholar] [CrossRef]
  39. Ahsan, M.; Cai, Y.; Zhang, W. Information Extraction of Bionic Camera-Based Polarization Navigation Patterns Under Noisy Weather Conditions. J. Shanghai Jiaotong Univ. Sci. 2020, 25, 18–26. [Google Scholar] [CrossRef]
  40. Gratiet, A.L.; Dubreuil, M.; Rivet, S.; Grand, Y.L. Scanning Mueller polarimetric microscopy. Opt. Lett. 2016, 41, 4336–4339. [Google Scholar] [CrossRef]
  41. Kontenis, L.; Samim, M.; Karunendiran, A.; Krouglov, S.; Stewart, B.; Barzda, V. Second harmonic generation double stokes Mueller polarimetric microscopy of myofilaments. Biomed. Opt. Express 2016, 7, 559–569. [Google Scholar] [CrossRef]
  42. Miyazaki, D.; Saito, M.; Sato, Y.; Ikeuchi, K. Determining surface orientations of transparent objects based on polarization degrees in visible and infrared wavelengths. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 2002, 19, 687–694. [Google Scholar] [CrossRef]
  43. Mei, H.; Dong, B.; Dong, W.; Yang, J.; Baek, S.H.; Heide, F.; Peers, P.; Wei, X.; Yang, X. Glass Segmentation using Intensity and Spectral Polarization Cues. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
  44. Kondo, Y.; Ono, T.; Sun, L.; Hirasawa, Y.; Murayama, J. Accurate Polarimetric BRDF for Real Polarization Scene Rendering. In Proceedings of the Computer Vision—ECCV 2020 16th European Conference, Glasgow, UK, 23–28 August 2020. [Google Scholar] [CrossRef]
  45. Gao, D.; Li, Y.; Ruhkamp, P.; Skobleva, I.; Wysocki, M.; Jung, H.; Wang, P.; Guridi, A.; Navab, N.; Busam, B. Polarimetric Pose Prediction. arXiv 2021, arXiv:2112.03810. [Google Scholar]
  46. Dennis, M.; Dayton, S. Polarimetric Imagery for Object Pose Estimation; University of Dayton: Dayton, OH, USA, 2023. [Google Scholar]
  47. Khlynov, R.D.; Ryzhova, V.A.; Konyakhin, I.A.; Korotaev, V.V. Robotic Polarimetry System Based on Image Sensors for Monitoring the Rheological Properties of Blood in Emergency Situations. In Smart Electromechanical Systems: Recognition, Identification, Modeling, Measurement Systems, Sensors; Springer International Publishing: Cham, Switzerland, 2022; pp. 201–218. [Google Scholar] [CrossRef]
  48. Roa, C.; Le, V.N.D.; Mahendroo, M.; Saytashev, I.; Ramella-Roman, J.C. Auto-detection of cervical collagen and elastin in Mueller matrix polarimetry microscopic images using K-NN and semantic segmentation classification. Biomed. Opt. Express 2021, 12, 2236–2249. [Google Scholar] [CrossRef] [PubMed]
  49. Wang, F.; Ainouz, S.; Lian, C.; Bensrhair, A. Multimodality Semantic Segmentation based on Polarization and color Images. Neurocomputing 2017, 253, 193–200. [Google Scholar] [CrossRef]
  50. Blanchon, M. Polarization Based Urban Scenes Understanding. Ph.D. Thesis, Université Bourgogne Franche-Comté, Dijon, France, 2021. [Google Scholar]
  51. Xiang, K.; Yang, K.; Wang, K. Polarization-driven Semantic Segmentation via Efficient Attention-bridged Fusion. Opt. Express 2021, 29, 4802–4820. [Google Scholar] [CrossRef] [PubMed]
  52. Blin, R.; Ainouz, S.; Canu, S.; Meriaudeau, F. Road scenes analysis in adverse weather conditions by polarization-encoded images and adapted deep learning. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019. [Google Scholar]
  53. Liang, J.; Ren, L.; Qu, E.; Hu, B.; Wang, Y. Method for enhancing visibility of hazy images based on polarimetric imaging. Photon. Res. 2014, 2, 38–44. [Google Scholar] [CrossRef]
  54. Zhang, W.; Liang, J.; Ren, L.; Ju, H.; Qu, E.; Bai, Z.; Tang, Y.; Wu, Z. Real-time image haze removal using an aperture-division polarimetric camera. Appl. Opt. 2017, 56, 942–947. [Google Scholar] [CrossRef] [PubMed]
  55. Zhang, W.; Liang, J.; Ju, H.; Ren, L.; Qu, E.; Wu, Z. A robust haze-removal scheme in polarimetric dehazing imaging based on automatic identification of sky region. Opt. Laser Technol. 2016, 86, 145–151. [Google Scholar] [CrossRef]
  56. Zhang, W.; Liang, J.; Ren, L. Haze-removal polarimetric imaging schemes with the consideration of airlight’s circular polarization effect. Optik 2019, 182, 1099–1105. [Google Scholar] [CrossRef]
  57. Shi, Y.; Guo, E.; Bai, L.; Han, J. Polarization-based haze removal using self-supervised network. Front. Phys. 2022, 9, 746. [Google Scholar] [CrossRef]
  58. Meriaudeau, F.; Ferraton, M.; Stolz, C.; Morel, O.; Bigué, L. Polarization imaging for industrial inspection. In Proceedings of the Image Processing: Machine Vision Applications, San Jose, CA, USA, 3 March 2008; p. 681308. [Google Scholar] [CrossRef]
  59. Zhou, C.; Teng, M.; Lyu, Y.; Li, S.; Xu, C.; Shi, B. Polarization-Aware Low-Light Image Enhancement. Proc. AAAI Conf. Artif. Intell. 2023, 37, 3742–3750. [Google Scholar] [CrossRef]
  60. Trippe, S. Polarization and Polarimetry: A Review. arXiv 2014, arXiv:1401.1911. [Google Scholar] [CrossRef]
  61. Garcia, N.M.; de Erausquin, I.; Edmiston, C.; Gruev, V. Surface normal reconstruction using circularly polarized light. Opt. Express 2015, 23, 14391–14406. [Google Scholar] [CrossRef] [PubMed]
  62. Zhu, D.; Smith, W.A.P. Depth from a polarisation + RGB stereo pair. arXiv 2019, arXiv:1903.12061. [Google Scholar]
  63. Dave, A.; Zhao, Y.; Veeraraghavan, A. PANDORA: Polarization-Aided Neural Decomposition Of Radiance. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 538–556. [Google Scholar]
  64. Tozza, S.; Smith, W.A.P.; Zhu, D.; Ramamoorthi, R.; Hancock, E.R. Linear Differential Constraints for Photo-polarimetric Height Estimation. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [Google Scholar]
  65. Ngo, T.T.; Nagahara, H.; Taniguchi, R.I. Shape and Light Directions from Shading and Polarization. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015. [Google Scholar] [CrossRef]
  66. Yang, F.; Wei, H. Fusion of infrared polarization and intensity images using support value transform and fuzzy combination rules. Infrared Phys. Technol. 2013, 60, 235–243. [Google Scholar] [CrossRef]
  67. Sattar, S.; Lapray, P.J.; Foulonneau, A.; Bigué, L. Review of Spectral and Polarization Imaging Systems; SPIE: Paris, France, 2020; Volume 11351, p. 113511Q. [Google Scholar] [CrossRef]
  68. Kadambi, A.; Taamazyan, V.; Shi, B.; Raskar, R. Polarized 3D: High-Quality Depth Sensing with Polarization Cues. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015. [Google Scholar] [CrossRef]
  69. Kadambi, A.; Taamazyan, V.; Shi, B.; Raskar, R. Depth Sensing Using Geometrically Constrained Polarization Normals. Int. J. Comput. Vis. 2017, 125, 34–51. [Google Scholar] [CrossRef]
  70. Farlow, C.A.; Chenault, D.B.; Pezzaniti, J.L.; Spradley, K.D.; Gulley, M.G. Imaging Polarimeter Development and Applications; SPIE: Paris, France, 2002; Volume 4481, pp. 118–125. [Google Scholar] [CrossRef]
  71. Lee, J.H.; Choi, H.Y.; Shin, S.K.; Chung, Y.C. A Review of the Polarization-Nulling Technique for Monitoring Optical-Signal-to-Noise Ratio in Dynamic WDM Networks. J. Light. Technol. 2006, 24, 4162–4171. [Google Scholar] [CrossRef]
  72. Baliga, S.; Hanany, E.; Klibanoff, P. Polarization and Ambiguity. Am. Econ. Rev. 2013, 103, 3071–3083. [Google Scholar] [CrossRef]
  73. Dupertuis, M.A.; Proctor, M.; Acklin, B. Generalization of complex Snell–Descartes and Fresnel laws. J. Opt. Soc. Am. A 1994, 11, 1159–1166. [Google Scholar] [CrossRef]
  74. Atkinson, G.A.; Hancock, E.R. Shape Estimation Using Polarization and Shading from Two Views. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 2001–2017. [Google Scholar] [CrossRef]
  75. Stolz, C.; Ferraton, M.; Meriaudeau, F. Shape from polarization: A method for solving zenithal angle ambiguity. Opt. Lett. 2012, 37, 4218–4220. [Google Scholar] [CrossRef]
  76. Zhao, P.; Deng, Y.; Wang, W.; Liu, D.; Wang, R. Azimuth Ambiguity Suppression for Hybrid Polarimetric Synthetic Aperture Radar via Waveform Diversity. Remote Sens. 2020, 12, 1226. [Google Scholar] [CrossRef]
  77. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  78. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. arXiv 2023, arXiv:2304.02643. [Google Scholar]
  79. Blanchon, M.; Sidibé, D.; Morel, O.; Seulin, R.; Meriaudeau, F. Towards urban scenes understanding through polarization cues. arXiv 2021, arXiv:2106.01717. [Google Scholar]
  80. Suárez-Bermejo, J.C.; Gorgas, J.; Pascual, S.; Santarsiero, M.; de Sande, J.C.G.; Piquero, G. Bayesian inference approach for Full Poincaré Mueller polarimetry. Opt. Laser Technol. 2024, 168, 109983. [Google Scholar] [CrossRef]
  81. Bansal, S.; Senthilkumaran, P. Stokes polarimetry with Poincaré–Hopf index beams. Opt. Lasers Eng. 2023, 160, 107295. [Google Scholar] [CrossRef]
  82. Yu, R.; Ren, W.; Zhao, M.; Wang, J.; Wu, D.; Xie, Y. Transparent objects segmentation based on polarization imaging and deep learning. Opt. Commun. 2024, 555, 130246. [Google Scholar] [CrossRef]
  83. Blanchon, M.; Morel, O.; Meriaudeau, F.; Seulin, R.; Sidibé, D. Polarimetric image augmentation. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 7365–7371. [Google Scholar] [CrossRef]
  84. Liu, Y.; Jiang, J.; Sun, J.; Bai, L.; Wang, Q. A survey of depth estimation based on computer vision. In Proceedings of the 2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC), Hong Kong, China, 27–30 July 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 135–141. [Google Scholar]
  85. Laga, H.; Jospin, L.V.; Boussaid, F.; Bennamoun, M. A survey on deep learning techniques for stereo-based depth estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 1738–1764. [Google Scholar] [CrossRef] [PubMed]
  86. Wang, Y.; Chao, W.L.; Garg, D.; Hariharan, B.; Campbell, M.; Weinberger, K.Q. Pseudo-lidar from visual depth estimation: Bridging the gap in 3d object detection for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8445–8453. [Google Scholar]
  87. Yang, L.; Tan, F.; Li, A.; Cui, Z.; Furukawa, Y.; Tan, P. Polarimetric dense monocular slam. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3857–3866. [Google Scholar]
  88. Ikemura, K.; Huang, Y.; Heide, F.; Zhang, Z.; Chen, Q.; Lei, C. Robust Depth Enhancement via Polarization Prompt Fusion Tuning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 20710–20720. [Google Scholar]
  89. Hochwald, B.; Nehorai, A. Polarimetric modeling and parameter estimation with applications to remote sensing. IEEE Trans. Signal Process. 1995, 43, 1923–1935. [Google Scholar] [CrossRef]
  90. Kumar, A.C.S.; Bhandarkar, S.M.; Prasad, M. DepthNet: A Recurrent Neural Network Architecture for Monocular Depth Prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 396–3968. [Google Scholar] [CrossRef]
  91. Makarov, I.; Bakhanova, M.; Nikolenko, S.; Gerasimova, O. Self-supervised recurrent depth estimation with attention mechanisms. PeerJ Comput. Sci. 2022, 8, e865. [Google Scholar] [CrossRef]
  92. Li, B.; Hua, Y.; Liu, Y.; Lu, M. Dilated Fully Convolutional Neural Network for Depth Estimation from a Single Image. arXiv 2021, arXiv:2103.07570. [Google Scholar] [CrossRef]
  93. Shi, C.; Chen, J.; Chen, J.; Zhang, Z. Feature Enhanced Fully Convolutional Networks for Monocular Depth Estimation. In Proceedings of the 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Washington, DC, USA, 5–8 October 2019; pp. 270–276. [Google Scholar] [CrossRef]
  94. Chen, S.; Tang, M.; Kan, J. Encoder–decoder with densely convolutional networks for monocular depth estimation. J. Opt. Soc. Am. A 2019, 36, 1709–1718. [Google Scholar] [CrossRef]
  95. Sheng, H.; Cheng, K.; Jin, X.; Han, T.; Jiang, X.; Dong, C. Attention-based encoder–decoder network for depth estimation from color-coded light fields. AIP Adv. 2023, 13, 035118. [Google Scholar] [CrossRef]
  96. Cao, Y.; Wu, Z.; Shen, C. Estimating Depth From Monocular Images as Classification Using Deep Fully Convolutional Residual Networks. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 3174–3182. [Google Scholar] [CrossRef]
  97. Laina, I.; Rupprecht, C.; Belagiannis, V.; Tombari, F.; Navab, N. Deeper Depth Prediction with Fully Convolutional Residual Networks. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 239–248. [Google Scholar] [CrossRef]
  98. Goldman, M.; Hassner, T.; Avidan, S. Learn Stereo, Infer Mono: Siamese Networks for Self-Supervised, Monocular, Depth Estimation. arXiv 2019, arXiv:1905.00401. [Google Scholar]
  99. Bardozzo, F.; Collins, T.; Forgione, A.; Hostettler, A.; Tagliaferri, R. StaSiS-Net: A stacked and siamese disparity estimation network for depth reconstruction in modern 3D laparoscopy. Med. Image Anal. 2022, 77, 102380. [Google Scholar] [CrossRef]
  100. Prantl, M.; Váša, L. Estimation of differential quantities using Hermite RBF interpolation. Vis. Comput. 2018, 34, 1645–1659. [Google Scholar] [CrossRef]
  101. Muneeswaran, V.; Rajasekaran, M.P. Gallbladder shape estimation using tree-seed optimization tuned radial basis function network for assessment of acute cholecystitis. In Intelligent Engineering Informatics; Advances in Intelligent Systems and Computing; Springer: Singapore, 2018; Volume 695, pp. 229–239. [Google Scholar] [CrossRef]
  102. Reid, R.B.; Oxley, M.E.; Eismann, M.T.; Goda, M.E. Quantifying surface normal estimation. In Proceedings of the Polarization: Measurement, Analysis, and Remote Sensing VII, Orlando, FL, USA, 20–21 April 2006; SPIE: Paris, France, 2006; Volume 6240, p. 624001. [Google Scholar]
  103. Wang, X.; Fouhey, D.F.; Gupta, A. Designing Deep Networks for Surface Normal Estimation. arXiv 2014, arXiv:1411.4958. [Google Scholar]
  104. Zhan, H.; Weerasekera, C.S.; Garg, R.; Reid, I.D. Self-supervised Learning for Single View Depth and Surface Normal Estimation. arXiv 2019, arXiv:1903.00112. [Google Scholar]
  105. Bors, A.; Pitas, I. Median Radial Basis Functions Neural Network. IEEE Trans. Neural Netw. 1996, 7, 1351–1364. [Google Scholar] [CrossRef]
  106. Grabec, I. The Normalized Radial Basis Function Neural Network and its Relation to the Perceptron. arXiv 2007, arXiv:physics/0703229. [Google Scholar]
  107. Kirchengast, M.; Watzenig, D. A Depth-Buffer-Based Lidar Model with Surface Normal Estimation. IEEE Trans. Intell. Transp. Syst. 2024, 1–12. [Google Scholar] [CrossRef]
  108. Han, P.; Li, X.; Liu, F.; Cai, Y.; Yang, K.; Yan, M.; Sun, S.; Liu, Y.; Shao, X. Accurate Passive 3D Polarization Face Reconstruction under Complex Conditions Assisted with Deep Learning. Photonics 2022, 9, 924. [Google Scholar] [CrossRef]
  109. Fangmin, L.; Ke, C.; Xinhua, L. 3D Face Reconstruction Based on Convolutional Neural Network. In Proceedings of the 10th International Conference on Intelligent Computation Technology and Automation, ICICTA 2017, Changsha, China, 9–10 October 2017; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2017; Volume 2017, pp. 71–74. [Google Scholar] [CrossRef]
  110. Fan, H.; Zhao, Y.; Su, G.; Zhao, T.; Jin, S. The Multi-View Deep Visual Adaptive Graph Convolution Network and Its Application in Point Cloud. Trait. Signal 2023, 40, 31–41. [Google Scholar] [CrossRef]
  111. Taamazyan, V.; Kadambi, A.; Raskar, R. Shape from Mixed Polarization. arXiv 2016, arXiv:1605.02066. [Google Scholar] [CrossRef]
  112. Usmani, K.; O’Connor, T.; Javidi, B. Three-dimensional polarimetric image restoration in low light with deep residual learning and integral imaging. Opt. Express 2021, 29, 29505–29517. [Google Scholar] [CrossRef] [PubMed]
  113. Ning, T.; Ma, X.; Li, Y.; Li, Y.; Liu, K. Efficient acquisition of Mueller matrix via spatially modulated polarimetry at low light field. Opt. Express 2023, 31, 14532–14559. [Google Scholar] [CrossRef]
  114. Ba, Y.; Chen, R.; Wang, Y.; Yan, L.; Shi, B.; Kadambi, A. Physics-based Neural Networks for Shape from Polarization. arXiv 2019, arXiv:1903.10210. [Google Scholar]
  115. Mortazavi, F.S.; Dajkhosh, S.; Saadatseresht, M. Surface Normal Reconstruction Using Polarization-UNET. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, X-4/W1-2022, 537–543. [Google Scholar] [CrossRef]
  116. Yaqub, M.; Jinchao, F.; Ahmed, S.; Arshid, K.; Bilal, M.A.; Akhter, M.P.; Zia, M.S. GAN-TL: Generative Adversarial Networks with Transfer Learning for MRI Reconstruction. Appl. Sci. 2022, 12, 8841. [Google Scholar] [CrossRef]
  117. Cardoen, T.; Leroux, S.; Simoens, P. Iterative Online 3D Reconstruction from RGB Images. Sensors 2022, 22, 9782. [Google Scholar] [CrossRef]
  118. Kang, I.; Pang, S.; Zhang, Q.; Fang, N.; Barbastathis, G. Recurrent neural network reveals transparent objects through scattering media. Opt. Express 2021, 29, 5316–5326. [Google Scholar] [CrossRef]
  119. Heydari, M.J.; Ghidary, S.S. 3D Motion Reconstruction From 2D Motion Data Using Multimodal Conditional Deep Belief Network. IEEE Access 2019, 7, 56389–56408. [Google Scholar] [CrossRef]
  120. Smith, W.A.P.; Ramamoorthi, R.; Tozza, S. Height-from-Polarisation with Unknown Lighting or Albedo. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 2875–2888. [Google Scholar] [CrossRef]
  121. Yu, Y.; Zhu, D.; Smith, W.A.P. Shape-from-Polarisation: A Nonlinear Least Squares Approach. In Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), Venice, Italy, 22–29 October 2017; pp. 2969–2976. [Google Scholar] [CrossRef]
  122. Lei, C.; Qi, C.; Xie, J.; Fan, N.; Koltun, V.; Chen, Q. Shape From Polarization for Complex Scenes in the Wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 12632–12641. [Google Scholar]
  123. Mildenhall, B.; Srinivasan, P.P.; Tancik, M.; Barron, J.T.; Ramamoorthi, R.; Ng, R. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Commun. ACM 2021, 65, 99–106. [Google Scholar] [CrossRef]
  124. Kerr, J.; Fu, L.; Huang, H.; Avigal, Y.; Tancik, M.; Ichnowski, J.; Kanazawa, A.; Goldberg, K. Evo-NeRF: Evolving NeRF for Sequential Robot Grasping of Transparent Objects. In Proceedings of the 6th Conference on Robot Learning, PMLR, Auckland, New Zealand, 14–18 December 2023. [Google Scholar]
  125. Zhu, H.; Sun, Y.; Liu, C.; Xia, L.; Luo, J.; Qiao, N.; Nevatia, R.; Kuo, C.H. Multimodal Neural Radiance Field. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023. [Google Scholar] [CrossRef]
  126. Boss, M.; Jampani, V.; Kim, K.; Lensch, H.P.A.; Kautz, J. Two-shot Spatially-varying BRDF and Shape Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3982–3991. [Google Scholar] [CrossRef]
  127. Jakob, W.; Speierer, S.; Roussel, N.; Nimier-David, M.; Vicini, D.; Zeltner, T.; Nicolet, B.; Crespo, M.; Leroy, V.; Zhang, Z. Mitsuba 3 Renderer. 2022. Available online: https://mitsuba-renderer.org (accessed on 20 July 2022).
  128. Deschaintre, V.; Aittala, M.; Durand, F.; Drettakis, G.; Bousseau, A. Single-image SVBRDF capture with a rendering-aware deep network. ACM Trans. Graph. 2018, 37, 1–15. [Google Scholar] [CrossRef]
  129. He, Z.; Feng, W.; Zhao, X.; Lv, Y. 6D Pose Estimation of Objects: Recent Technologies and Challenges. Appl. Sci. 2020, 11, 228. [Google Scholar] [CrossRef]
  130. Wang, C.; Xu, D.; Zhu, Y.; Martín-Martín, R.; Lu, C.; Fei-Fei, L.; Savarese, S. DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion. arXiv 2019, arXiv:1901.04780. [Google Scholar]
  131. Trabelsi, A.; Chaabane, M.; Blanchard, N.; Beveridge, R. A Pose Proposal and Refinement Network for Better 6D Object Pose Estimation. In Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021. [Google Scholar]
  132. Sock, J.; Kasaei, S.H.; Lopes, L.S. Multi-view 6D Object Pose Estimation and Camera Motion Planning using RGBD Images. In Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), Venice, Italy, 22–29 October 2017. [Google Scholar]
  133. Shah, S.H.; Lin, C.Y.; Tran, C.C.; Ahmad, A.R. Robot Pose Estimation and Normal Trajectory Generation on Curved Surface Using an Enhanced Non-Contact Approach. Sensors 2023, 23, 3816. [Google Scholar] [CrossRef]
  134. Martelo, J.B.; Lundgren, J.; Andersson, M. Paperboard Coating Detection Based on Full-Stokes Imaging Polarimetry. Sensors 2020, 21, 208. [Google Scholar] [CrossRef]
  135. Nezadal, M.; Schur, J.; Schmidt, L.P. Non-destructive testing of glass fibre reinforced plastics with a full polarimetric imaging system. In Proceedings of the 2014 39th International Conference on Infrared, Millimeter, and Terahertz waves (IRMMW-THz), Tucson, AZ, USA, 14–19 September 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 1–2. [Google Scholar] [CrossRef]
  136. Zhang, H.; Kidera, S. Polarimetric Signature CNN based Complex Permittivity Estimation for Microwave Non-destructive Testing. In Proceedings of the 2020 International Symposium on Antennas and Propagation (ISAP), Osaka, Japan, 25–28 January 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 781–782. [Google Scholar] [CrossRef]
  137. Ding, Y.; Ye, J.; Barbalata, C.; Oubre, J.; Lemoine, C.; Agostinho, J.; Palardy, G. Next-generation perception system for automated defects detection in composite laminates via polarized computational imaging. arXiv 2021, arXiv:2108.10819. [Google Scholar]
  138. Parkinson, J.C.; Coronato, P.A.; Greivenkamp, J.; Vukobratovich, D.; Kupinski, M. Mueller polarimetry for quantifying the stress optic coefficient in the infrared. In Proceedings of the Polarization Science and Remote Sensing XI, San Diego, CA, USA, 3 October 2023; Snik, F., Kupinski, M.K., Shaw, J.A., Eds.; SPIE: Paris, France, 2023; Volume 12690, pp. 95–107. [Google Scholar] [CrossRef]
  139. Li, H.; Liao, R.; Zhang, H.; Ma, G.; Guo, Z.; Tu, H.; Chen, Y.; Ma, H. Stress Detection of Conical Frustum Windows in Submersibles Based on Polarization Imaging. Sensors 2022, 22, 2282. [Google Scholar] [CrossRef] [PubMed]
  140. Harfenmeister, K.; Itzerott, S.; Weltzien, C.; Spengler, D. Agricultural Monitoring Using Polarimetric Decomposition Parameters of Sentinel-1 Data. Remote Sens. 2021, 13, 575. [Google Scholar] [CrossRef]
  141. Yang, W.; Yang, J.; Zhao, K.; Gao, Q.; Liu, L.; Zhou, Z.; Hou, S.; Wang, X.; Shen, G.; Pang, X.; et al. Low-Noise Dual-Band Polarimetric Image Sensor Based on 1D Bi2S3 Nanowire. Adv. Sci. 2021, 8, 2100075. [Google Scholar] [CrossRef] [PubMed]
  142. Usmani, K.; Krishnan, G.; O’Connor, T.; Javidi, B. Deep learning polarimetric three-dimensional integral imaging object recognition in adverse environmental conditions. Opt. Express 2021, 29, 12215–12228. [Google Scholar] [CrossRef] [PubMed]
  143. Shao, M.; Xia, C.; Duan, D.; Wang, X. Polarimetric Inverse Rendering for Transparent Shapes Reconstruction. IEEE Trans. Multimed. 2024, 26, 7801–7811. [Google Scholar] [CrossRef]
  144. Niemitz, L.; Sorensen, S.T.; Wang, Y.; Messina, W.; Burke, R.; Andersson-Engels, S. Towards a flexible polarimetric camera-on-tip miniature endoscope for 3 × 3 Mueller matrix measurements of biological tissue. In Proceedings of the Translational Biophotonics: Diagnostics and Therapeutics III, Munich, Germany, 25–29 June 2023; Lilge, L.D., Huang, Z., Eds.; SPIE: Paris, France, 2023; p. 110. [Google Scholar] [CrossRef]
  145. Fernández, A.; Demczylo, R. Real-time polarimetric microscopy of biological tissue. In Proceedings of the Biophotonics Congress: Optics in the Life Sciences 2023 (OMA, NTM, BODA, OMP, BRAIN), Vancouver, BC, Canada, 24–27 April 2023; Optica Publishing Group: Washington, DC, USA, 2023; p. 1. [Google Scholar] [CrossRef]
  146. Yang, B.; Yan, C.; Zhang, J.; Zhang, H. Refractive index and surface roughness estimation using passive multispectral and multiangular polarimetric measurements. Opt. Commun. 2016, 381, 336–345. [Google Scholar] [CrossRef]
  147. Huynh, C.P.; Robles-Kelly, A.; Hancock, E.R. Shape and refractive index from single-view spectro-polarimetric images. Int. J. Comput. Vis. 2013, 101, 64–94. [Google Scholar] [CrossRef]
  148. Kawahara, R.; Kuo, M.Y.J.; Okabe, T. Polarimetric Underwater Stereo. In Proceedings of the Scandinavian Conference on Image Analysis, Sirkka, Finland, 18–21 April 2023; Springer: Cham, Switzerland, 2023; pp. 534–550. [Google Scholar] [CrossRef]
  149. Gao, J.; Wang, G.; Chen, Y.; Wang, X.; Li, Y.; Chew, K.H.; Chen, R.P. Mueller transform matrix neural network for underwater polarimetric dehazing imaging. Opt. Express 2023, 31, 27213–27222. [Google Scholar] [CrossRef]
  150. Hu, H.; Zhang, Y.; Li, X.; Lin, Y.; Cheng, Z.; Liu, T. Polarimetric underwater image recovery via deep learning. Opt. Lasers Eng. 2020, 133, 106152. [Google Scholar] [CrossRef]
  151. Qi, J.; Tatla, T.; Nissanka-Jayasuriya, E.; Yuan, A.Y.; Stoyanov, D.; Elson, D.S. Surgical polarimetric endoscopy for the detection of laryngeal cancer. Nat. Biomed. Eng. 2023, 7, 971–985. [Google Scholar] [CrossRef]
  152. Qi, J.; Elson, D.S. Polarimetric endoscopy. In Polarized Light in Biomedical Imaging and Sensing: Clinical and Preclinical Applications; Springer: Cham, Switzerland, 2022; pp. 179–204. [Google Scholar]
  153. Castaño, L.U.; Mirsanaye, K.; Kontenis, L.; Krouglov, S.; Žurauskas, E.; Navab, R.; Yasufuku, K.; Tsao, M.; Akens, M.K.; Wilson, B.C.; et al. Wide-field Stokes polarimetric microscopy for second harmonic generation imaging. J. Biophotonics 2023, 16, e202200284. [Google Scholar] [CrossRef]
  154. Novikova, T.; Pierangelo, A.; Martino, A.D.; Benali, A.; Validire, P. Polarimetric Imaging for Cancer Diagnosis and Staging. Opt. Photon. News 2012, 23, 26–33. [Google Scholar] [CrossRef]
  155. Hachkevych, O.R.; Matyash, I.Y.; Minaylova, I.A.; Mishchuk, O.M.; Serdega, B.K.; Terlets’kyi, R.F.; Brukhal’, M.B. Mathematical Modeling and Polarimetry of the Thermal Stressed State of a Partially Transparent Solid Subjected to the Action of Thermal Radiation. J. Math. Sci. 2023, 273, 982–998. [Google Scholar] [CrossRef]
  156. Miyazaki, D.; Kagesawa, M.; Ikeuchi, K. Determining Shapes of Transparent Objects from Two Polarization Images. In Proceedings of the MVA, Nara, Japan, 11–13 December 2002; pp. 26–31. [Google Scholar]
  157. Carterette, B.; Voorhees, E.M. Overview of information retrieval evaluation. In Current Challenges in Patent Information Retrieval; Springer: Berlin/Heidelberg, Germany, 2011; pp. 69–85. [Google Scholar]
  158. Oršić, M.; Krešo, I.; Bevandić, P.; Šegvić, S. In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 12607–12616. [Google Scholar]
  159. Yan, R.; Yang, K.; Wang, K. NLFNet: Non-Local Fusion Towards Generalized Multimodal Semantic Segmentation across RGB-Depth, Polarization, and Thermal Images. In Proceedings of the 2021 IEEE International Conference on Robotics and Biomimetics (ROBIO), Sanya, China, 27–31 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1129–1135. [Google Scholar] [CrossRef]
  160. Shakeri, M.; Loo, S.Y.; Zhang, H. Polarimetric Monocular Dense Mapping Using Relative Deep Depth Prior. IEEE Robot. Autom. Lett. 2021, 6, 4512–4519. [Google Scholar] [CrossRef]
  161. Alouini, M.; Goudail, F.; Refregier, P.; Grisard, A.; Lallier, E.; Dolfi, D. Multispectral polarimetric imaging with coherent illumination: Towards higher image contrast. In Proceedings of the Polarization: Measurement, Analysis, and Remote Sensing VI, Orlando, FL, USA, 15 July 2004; Goldstein, D.H., Chenault, D.B., Eds.; SPIE: Paris, France, 2004; Volume 5432, pp. 133–144. [Google Scholar] [CrossRef]
  162. Hagen, N. Review of thermal infrared polarimetry, part 2: Experiment. Opt. Eng. 2022, 61, 080901. [Google Scholar] [CrossRef]
  163. Mihoubi, S.; Lapray, P.J.; Bigué, L. Survey of Demosaicking Methods for Polarization Filter Array Images. Sensors 2018, 18, 3688. [Google Scholar] [CrossRef]
  164. Pistellato, M.; Bergamasco, F.; Fatima, T.; Torsello, A. Deep Demosaicing for Polarimetric Filter Array Cameras. IEEE Trans. Image Process. 2022, 31, 2017–2026. [Google Scholar] [CrossRef]
Figure 1. SAM results over-segment all reflection areas differently and also over-segment the shadow of the object. In this context, reflection removal using polarimetry is useful. (a) Segmentation results with RGB only; (b) segmentation results with both modalities.
Figure 2. Architecture framework from CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers (source image [26]).
Figure 3. Architecture from Shape from Polarization for Complex Scenes in the Wild (source image [122]).
Figure 4. Normal estimation with and without polarization (source image [122]). (a) RGB; (b) AoLP; (c) normal from RGB; (d) normal from RGB + AoLP.
Figure 5. Polarization-Aided Neural Decomposition of Radiance (PANDORA) architecture [63].
Figure 6. Deep Polarization Imaging for 3D Shape and SVBRDF Acquisition global architecture (source image [14]).
Figure 7. Depth image estimation from an RGBD sensor dealing with transparent objects. (a) Image acquired from the RGB lens; (b) image retrieved from the depth lens, shown in a jet colormap where blue represents small distances and red larger ones.
Figure 8. Two examples of segmentation for autonomous driving using polarimetric imaging. Red boxes mark the areas of interest, where the improvement when dealing with reflective surfaces can be observed (source image [26]). (a) RGB; (b) AoLP; (c) results from RGB; (d) results from RGB + AoLP; (e) ground truth.
Figure 9. Normal estimation results (source image [14]). (a–d) Polarizer-angle intensities I0, I45, I90, and I135; (e) normal ground truth (GT); (f) estimated normal.
Table 1. Fields of application of polarimetric imaging.

| Application | Condition | Papers |
|---|---|---|
| Transparent object | 3D reconstruction | [13,42] |
| Transparent object | Segmentation | [24,43] |
| Multi-fields | Shape estimation from diffuse polarization | [2] |
| Multi-fields | Surface and shape reconstruction by inverse rendering | [14,15,44] |
| Multi-fields | Reflection removal | [23] |
| Multi-fields | Reconstruction of metallic surfaces | [28] |
| Multi-fields | Pose estimation | [45,46] |
| Multi-fields | Navigation | [37,38,39] |
| Medical | Mueller microscopy | [40,41] |
| Medical | Robotic imaging | [47] |
| Medical | Semantic segmentation | [48] |
| Autonomous driving | Semantic segmentation | [25,26,27,49,50,51] |
| Autonomous driving | Object detection under bad weather | [52] |
| Landscape imaging | Haze removal | [51,53,54,55,56,57] |
| Industry | Piece inspection | [58] |
| Industry | Sorting by semantic segmentation | [24] |
| Low-light applications | Image restoration | [22,59] |
Table 2. Evaluation results of the CMX experiment using three different input modalities (RGB, AoLP, and DoLP) over two different architectures (SwinT and Segf).

| Architecture | Input | mIoU |
|---|---|---|
| CMX SwinT | RGB | 0.496 |
| CMX Segf | RGB | 0.543 |
| CMX Segf | AoLP | 0.732 |
| CMX Segf | DoLP | 0.757 |
Table 3. Evaluation results of the Vibotorch experiment under four different conditions.

| Metric | Pre-Train, Non-Augmented | Pre-Train, Augmented | Non-Pre-Train, Non-Augmented | Non-Pre-Train, Augmented |
|---|---|---|---|---|
| Accuracy | 0.766 | 0.831 | 0.743 | 0.809 |
| Recall | 0.772 | 0.826 | 0.729 | 0.809 |
| F1-Score | 0.769 | 0.817 | 0.736 | 0.802 |
| Precision | 0.770 | 0.821 | 0.732 | 0.806 |
| mIoU | 0.3143 | 0.4426 | 0.2971 | 0.3857 |
Table 4. Evaluation results for CMX vs. Vibotorch using the mIoU metric; the performance of segmenting five different classes is evaluated.

| Method | Conditions | Glass | Car | Sky | Road | Building |
|---|---|---|---|---|---|---|
| CMX Segf | RGB | 0.57 | 0.71 | 0.68 | 0.75 | 0.73 |
| CMX SwinT | RGB | 0.54 | 0.66 | 0.63 | 0.74 | 0.60 |
| CMX Segf | AoLP | 0.73 | 0.78 | 0.76 | 0.81 | 0.76 |
| CMX Segf | DoLP | 0.74 | 0.78 | 0.75 | 0.80 | 0.76 |
| Vibotorch | Non-Augmented / Pre-train | 0.11 | 0.42 | 0.28 | 0.32 | 0.41 |
| Vibotorch | Augmented / Pre-train | 0.30 | 0.44 | 0.39 | 0.41 | 0.48 |
| Vibotorch | Non-Augmented / Non-Pre-train | 0.19 | 0.24 | 0.22 | 0.29 | 0.39 |
| Vibotorch | Augmented / Non-Pre-train | 0.16 | 0.31 | 0.21 | 0.26 | 0.34 |
Table 5. Evaluation results for the experiment on normal vector image estimation. Each metric (MAE, MedianAE, and RMSE, in degrees) is computed with respect to the 3-component normal vector.

| Method | MAE | MedianAE | RMSE |
|---|---|---|---|
| SFP-Wild [122] | 47.3 | 32.47 | 120.4 |
| PANDORA [63] | 37.4 | 25.2 | 93.5 |
| Deep SVBRDF [14] | 36.6 | 9.43 | 119.8 |
| SFP non-linear [121] | 24.2 | 26.67 | 230.9 |
| Height from polarization [120] | 15.68 | 27.2 | 474 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
