Article

Scale-Adaptive High-Resolution Imaging Using a Rotating-Prism-Guided Variable-Boresight Camera

Zhaojun Deng, Anhu Li, Xin Zhao, Yonghao Lai and Jialiang Jin
1 College of Surveying and Geo-Informatics, Tongji University, Shanghai 200092, China
2 School of Mechanical Engineering, Tongji University, Shanghai 201804, China
* Authors to whom correspondence should be addressed.
Sensors 2025, 25(20), 6313; https://doi.org/10.3390/s25206313
Submission received: 2 September 2025 / Revised: 3 October 2025 / Accepted: 11 October 2025 / Published: 12 October 2025
(This article belongs to the Collection 3D Imaging and Sensing System)

Abstract

Large-field-of-view (FOV) and high-resolution imaging have long been goals pursued by imaging technology. A scale-adaptive high-resolution imaging architecture is established using a rotating-prism-embedded variable-boresight camera. By planning the prism motion, multi-view images with rich information are combined to form a large-scale FOV image. The boresight is then guided towards the region of interest (ROI) in the combined FOV to reconstruct super-resolution (SR) images with the desired information. A novel distortion correction method is proposed using virtual symmetrical prisms whose rotation angles are offset by 180° from those of the real prisms. Based on light reverse tracing, the dispersion induced by the wavelength-dependent refractive indices of the red, green, and blue channels can be eliminated by accurate pixel-level position compensation. For resolution enhancement, we provide a new scheme for SR imaging consisting of a residual removal network and an information enhancement network based on multi-view image fusion. The experiments show that the proposed architecture can achieve both large-FOV scene imaging for situational awareness and SR ROI display to acquire details, effectively perform distortion and dispersion correction, and alleviate occlusion to a certain extent. It also provides higher image clarity than traditional SR methods and overcomes the problem of balancing large-scale imaging and high-resolution imaging.

1. Introduction

Field of view (FOV) and resolution are key parameters for evaluating the performance of sensors [1]. High-resolution imaging with a large FOV has a wide range of applications in medical diagnosis, military reconnaissance, and deep space exploration [2,3]. In practice, wide-field high-resolution imaging is a challenge because these parameters are inherently contradictory in theory. Imaging resolution is mainly constrained by two limitations [4]. Firstly, the size of the pixels used to construct the sensor cannot be made infinitely small. In addition, the imaging resolution is limited by the optical diffraction limit. At present, super-resolution (SR) technology is regarded as an effective way to push the imaging resolution close to the diffraction limit [5].
SR imaging technology, which restores high-resolution images from low-resolution images through image processing, has been a research focus and received extensive attention in recent years [6,7]. According to the number of input images, it mainly includes single-frame and multi-frame SR imaging methods. Single-frame SR imaging achieves high-resolution information recovery by establishing, in advance, a mapping relationship between low-resolution and SR images [8]. Zhang proposed an SR imaging method based on a single image, which learns local dictionaries and non-local similar structures from the input image to reconstruct high-resolution details [9]. BSRGAN, a generative adversarial network (GAN) for super-resolution (SR) imaging, was proposed by Zhang [10]; it enables the recovery of realistic textures from downsampled images in public datasets. However, the above algorithms require prior knowledge to build mathematical models and have limited generality. SR technology based on multi-frame images can recover the perceived scene in high resolution by fusing multiple low-resolution images [11]. Compared with single-image super-resolution, it does not require prior information and has better adaptability and generalization ability. For this type of technology, sub-pixel imaging is the key to improving image resolution [12]. Therefore, a series of sub-pixel imaging methods have been proposed, such as multi-time/view imaging [13], optical scanning imaging [14], and camera array imaging [15]. In addition, microscopic imaging techniques are also used to improve image quality, such as compact lens-free microscopic imaging [16]. Although the perceptual resolution has improved, the perceptual range remains limited.
Flexible imaging systems consisting of a camera and mirrors, as a common class of wide-range imaging systems, have received significant attention [17]. Typical systems include flow detection scanners [18], multi-mirror scanning systems [19], and MEMS-driven variable-boresight systems [20]. Regrettably, these large-scale perception systems based on optical reflection have limitations such as large physical size, sensitivity to machining and assembly errors, and large moment of inertia. Notably, Carles et al. proposed array cameras with wedge prisms to expand the imaging FOV [21]; compared with plain camera arrays, this design has certain advantages in integration and cost. In fact, Risley prisms have been widely used in beam steering due to their high pointing accuracy, compact configuration, and good dynamics [22]. In particular, the forward and inverse solutions of rotating Risley prisms have undergone systematic theoretical research to support their practical application [23]. Recently, rotating double prisms have been employed in beam scanners for lidar to achieve wide-range control of multi-beam lasers [24], and rotating Risley prisms have been embedded into an infrared camera to achieve large-scale thermal imaging and monitoring [25]. This means that Risley prisms still have great potential for large-scale and super-resolution imaging.
In this paper, we present a scale-adaptive high-resolution imaging architecture using a rotating-prism-embedded camera. By planning the prism motion, multi-view images are combined to form a large-scale FOV. If a region of interest (ROI) exists in the combined FOV, the boresight is guided to stare at the ROI so that super-resolution (SR) images with the desired information can be reconstructed. Our architecture can effectively address the challenge of balancing a large FOV against high-resolution imaging, perform distortion and dispersion correction, and alleviate occlusion to a certain extent. The rest of this paper is organized as follows. In Section 2, the model of scale-adaptive high-resolution imaging is established based on a rotating-prism-embedded variable-boresight camera. In Section 3, distortion correction, dispersion elimination, and super-resolution imaging are investigated to achieve image enhancement. In Section 4, experiments are performed to verify the feasibility of our method. Conclusions are drawn at the end.

2. The Model of Large-Scale High-Resolution Imaging

Large-scale high-resolution imaging has wide applications in various scenarios, such as military reconnaissance, security and rescue, and scene monitoring. Figure 1a shows the model of large-scale high-resolution imaging. It is composed of a camera to acquire raw images and rotating double prisms for flexible boresight adjustment. By planning the prism motion, the multi-view images are combined to form a large-scale perception FOV for situational awareness. If a region of interest exists in the combined FOV, the boresight is adjusted to stare at the ROI from different viewpoints. As a result, an SR image of the ROI with photo-realistic detail is produced by fusing the multi-view images. Compared with the traditional multi-camera imaging model, the proposed model can emulate the imaging effect of an arbitrarily large number of viewpoints with just a single camera. The viewpoint adjustment is more flexible, the boresight pointing is more accurate, and the structural configuration is more compact. Figure 1b illustrates the process of light propagation. The coordinate system O-XYZ is established with the optical center of the camera as the origin. According to Snell's law, the emergent light can be obtained with A1 = [Hh, Hv, f]T as the incident light:
A_{i+1} = \frac{n_i}{n_{i+1}} A_i + \left[ \sqrt{1 - \left(\frac{n_i}{n_{i+1}}\right)^2 \left(1 - (A_i \cdot N_i)^2\right)} - \frac{n_i}{n_{i+1}} (A_i \cdot N_i) \right] N_i = [x_{i+1}, y_{i+1}, z_{i+1}]^T
where i = 0, 1, 2, 3. n1 and n3 are the refractive indices of the prisms, i.e., n1 = n3 = n. n0, n2, and n4 are the refractive indices of air, i.e., n0 = n2 = n4 = 1. Hh and Hv are the sizes of the sensor chip in the horizontal and vertical directions, and f is the focal length of the camera. N1, N2, N3, and N4 are the normal vectors of the four surfaces of the prisms:
N_1 = [0, 0, 1]^T
N_2 = [\cos\theta_1 \sin\alpha, \; \sin\theta_1 \sin\alpha, \; \cos\alpha]^T
N_3 = [-\cos\theta_2 \sin\alpha, \; -\sin\theta_2 \sin\alpha, \; \cos\alpha]^T
N_4 = [0, 0, 1]^T
where (θ1, θ2) are the rotation angles of the two prisms and α denotes the wedge angle of the prisms. Therefore, the FOV angle φ of the combined FOV can be deduced as:
\varphi = 2\arccos\frac{z_5}{\sqrt{x_5^2 + y_5^2 + z_5^2}}
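To make the model concrete, the following minimal NumPy sketch traces a ray through the four prism surfaces using the vector form of Snell's law in Equation (1) and the surface normals of Equation (2). The function names and the example parameter values (α = 10°, n = 1.517, taken from the later simulation settings) are illustrative and not part of the original implementation.

```python
import numpy as np

def refract(A, N, n_in, n_out):
    """Vector form of Snell's law (Equation (1)): refract unit ray A at a
    surface with unit normal N, going from refractive index n_in to n_out."""
    r = n_in / n_out
    cos_i = np.dot(A, N)
    cos_t_sq = 1.0 - r**2 * (1.0 - cos_i**2)
    if cos_t_sq < 0.0:            # total internal reflection, no transmitted ray
        return None
    return r * A + (np.sqrt(cos_t_sq) - r * cos_i) * N

def trace_double_prism(A_in, theta1, theta2, alpha=np.deg2rad(10.0), n=1.517):
    """Trace a unit ray through the four prism surfaces (normals of Equation (2))
    and return the emergent direction.  Angles are in radians."""
    normals = [
        np.array([0.0, 0.0, 1.0]),
        np.array([np.cos(theta1) * np.sin(alpha), np.sin(theta1) * np.sin(alpha), np.cos(alpha)]),
        np.array([-np.cos(theta2) * np.sin(alpha), -np.sin(theta2) * np.sin(alpha), np.cos(alpha)]),
        np.array([0.0, 0.0, 1.0]),
    ]
    indices = [1.0, n, 1.0, n, 1.0]   # air, prism 1, air gap, prism 2, air
    A = A_in / np.linalg.norm(A_in)
    for i in range(4):
        A = refract(A, normals[i], indices[i], indices[i + 1])
        if A is None:
            return None
    return A

# Example: boresight deflection for theta1 = theta2 = 0 deg
A_out = trace_double_prism(np.array([0.0, 0.0, 1.0]), 0.0, 0.0)
deflection = np.degrees(np.arccos(A_out[2] / np.linalg.norm(A_out)))
print(f"boresight deflection: {deflection:.2f} deg")   # roughly 10.5 deg for these values
```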

3. Super-Resolution Imaging with Prism-Induced Distortion Correction

Figure 2 shows the basic architecture of rotating-prism-based super-resolution imaging, mainly including multi-view image acquisition using a rotating-prism-embedded variable-boresight camera, multi-view image preprocessing consisting of distortion correction and dispersion correction, and super-resolution imaging by multi-view image fusion.
For multi-view imaging based on the variable-boresight camera, the camera boresight is adjusted by rotating the double prisms to capture the ROI from multiple viewpoints. In multi-viewpoint image preprocessing, image quality degradation such as distortion and dispersion is inevitable because the lights propagate through the non-uniform prisms. Based on the principle that the direction of a light passing through a symmetrical pair of prisms remains unchanged, the distortion of the multi-viewpoint images is corrected. In addition, the offsets of the three-channel lights are compensated based on Snell's law to eliminate image dispersion. Finally, a deep learning network is utilized to fuse the multi-view sequence images and output super-resolution images.

3.1. Multi-View Image Preprocessing

3.1.1. Distortion Correction Using Virtual Symmetrical Prisms

The propagation of light in a non-uniform prism does not follow a linear law, which leads to image distortion. Figure 3 shows the image distortion under different prism rotation angles. Each pixel on the camera corresponds to a definite area in the scene, and all pixels form a regular checkerboard image. However, the non-uniform prism refracts the lights in the FOV inconsistently, so the originally rectangular checkerboard image is no longer regular. As the prism rotation angles change, the distortion of the checkerboard also changes continuously, including bending, tensile, and compressive deformation.
The directions of the incident and emergent rays are the same when a pair of prisms is symmetrically arranged, and their rotation angles satisfy θ2 = θ1 + 180°. Based on this fundamental principle, a pair of virtual symmetrical prisms can be constructed to eliminate the imaging distortion caused by the double prisms as shown in Figure 4.
Two virtual prisms are symmetrically arranged with the image plane as the symmetry plane. The parameters of virtual prisms 1 and 2 are the same as those of real prisms 1 and 2. The rotation angle θ1v of virtual prism 1 is θ1 + 180°, and the rotation angle θ2v of virtual prism 2 is θ2 + 180°. Based on Snell's law, the incident ray A2 of prism 1 is in the same direction as the emergent ray A2v of virtual prism 1. Similarly, the incident ray A4 of prism 2 is in the same direction as the emergent ray A4v of virtual prism 2. Furthermore, the distance between virtual prism 2 and the virtual image plane should equal the focal length so that the corrected image has the same scale as the original image. To obtain a complete corrected image, each pixel of the distorted image is corrected in sequence. Suppose the image coordinates of a point on the distorted image are (u, v) and the focal length is f. Then the incident ray A0 of virtual prism 1 can be expressed as:
A_0 = \frac{(u, v, f)^T}{\sqrt{u^2 + v^2 + f^2}}
N1,v, N2,v, N3,v, and N4,v are the normal vectors of the four planes of the virtual prisms:
N_{1,v} = [\sin\theta_{1v} \sin\alpha, \; \cos\theta_{1v} \sin\alpha, \; \cos\alpha]^T
N_{2,v} = [0, 0, 1]^T
N_{3,v} = [0, 0, 1]^T
N_{4,v} = [-\sin\theta_{2v} \sin\alpha, \; \cos\theta_{2v} \sin\alpha, \; \cos\alpha]^T
According to Equations (1), (4) and (5), we can obtain A1v(A1vx, A1vy, A1vz), A2v(A2vx, A2vy, A2vz), A3v(A3vx, A3vy, A3vz), and A4v(A4vx, A4vy, A4vz). Any light passing through the virtual double prisms can be determined by its direction vector and a point on the light:
\frac{x - x_p}{A_{ivx}} = \frac{y - y_p}{A_{ivy}} = \frac{z - z_p}{A_{ivz}}, \quad i = 1, 2, 3, 4
where (xp, yp, zp) is the point on the light. The virtual imaging plane and planes of the virtual prisms can be expressed by:
N_x (x - x_b) + N_y (y - y_b) + N_z (z - z_b) = 0
where [Nx, Ny, Nz] is the normal vector of a plane. (xb, yb, zb) is a point on the plane. Combining Equations (6) and (7), we can obtain the intersection between the virtual imaging plane and the light determined by A4v, namely corrected image coordinates Pv = (xv, yv, f)T. The pixel coordinates corresponding to Pv are as follows:
(u_v, v_v) = \left( \frac{x_v}{d_x} + u_0, \; \frac{y_v}{d_y} + v_0 \right)
where dx and dy are the physical dimensions of pixels in the horizontal and vertical directions. (u0, v0) is the principal point of the camera.
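A simplified per-pixel sketch of this correction is given below. It reuses the refract() helper from the earlier ray-tracing sketch, treats (u, v) as metric image-plane coordinates relative to the principal point as in Equation (4), and, for brevity, intersects the emergent ray through the origin with the virtual image plane instead of tracking the small lateral ray displacement inside the prisms via Equations (6) and (7). All function names here are illustrative.

```python
import numpy as np
# assumes refract() from the earlier sketch; f, dx, dy, u0, v0 are camera intrinsics

def virtual_prism_normals(theta1v, theta2v, alpha):
    """Surface normals of the virtual symmetrical prisms (Equation (5))."""
    return [
        np.array([np.sin(theta1v) * np.sin(alpha), np.cos(theta1v) * np.sin(alpha), np.cos(alpha)]),
        np.array([0.0, 0.0, 1.0]),
        np.array([0.0, 0.0, 1.0]),
        np.array([-np.sin(theta2v) * np.sin(alpha), np.cos(theta2v) * np.sin(alpha), np.cos(alpha)]),
    ]

def correct_pixel(u, v, f, theta1, theta2, alpha, n, dx, dy, u0, v0):
    """Map a distorted image point (u, v) to corrected pixel coordinates (Equations (4)-(8))."""
    # Incident ray of virtual prism 1 (Equation (4))
    A = np.array([u, v, f]) / np.linalg.norm([u, v, f])
    normals = virtual_prism_normals(theta1 + np.pi, theta2 + np.pi, alpha)
    indices = [1.0, n, 1.0, n, 1.0]
    for i in range(4):
        A = refract(A, normals[i], indices[i], indices[i + 1])
    # Intersect the emergent ray with the virtual image plane at distance f
    t = f / A[2]
    xv, yv = t * A[0], t * A[1]
    # Metric coordinates to pixel coordinates (Equation (8))
    return xv / dx + u0, yv / dy + v0
```

In practice, the complete corrected image is obtained by applying this mapping to every pixel and resampling the distorted image accordingly.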

3.1.2. Dispersion Elimination Based on Reverse Tracing

Image dispersion leads to reduced contrast and color distortion, both of which directly degrade the visual quality of images. During the imaging process of an RGB camera, the three primary colors (red, green, and blue) are mixed in certain proportions to form light of any color. Lights of different colors have different wavelengths and therefore different refractive indices. As a result, the camera combined with the double prisms suffers from image dispersion, as shown in Figure 5.
As shown in Figure 5a, pixel-level offsets occur between the three channel layers because the red, green, and blue wavelengths experience different refractive indices in the prisms. Figure 5b shows a 5 × 5-pixel area in the upper right corner of the imaging plane. Because the refractive index at the blue wavelength is the largest, the offset of the blue channel on the image plane is the largest. This means that the three-channel offsets on the image plane can be determined once the parameters of the imaging system are known, and these offsets can then be compensated to eliminate dispersion. According to Section 3.1.1, the pixel coordinates of each channel after distortion correction can be obtained:
p_{vr} = (u_{vr}, v_{vr})^T, \quad p_{vg} = (u_{vg}, v_{vg})^T, \quad p_{vb} = (u_{vb}, v_{vb})^T
Taking the red channel as the reference, the dispersion elimination adjustment amounts for the green and blue channels are:
\Delta p_{vg} = p_{vg} - p_{vr}, \quad \Delta p_{vb} = p_{vb} - p_{vr}
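Assuming channel-specific refractive indices of the prism glass (the values below are illustrative, not from this work), the offsets of Equation (10) can be obtained by running the distortion correction of Section 3.1.1 once per channel, as sketched here using the correct_pixel() helper from the previous sketch.

```python
import numpy as np

# Illustrative refractive indices of the prism glass at the red, green, and blue wavelengths;
# the red channel is the reference, as in Equation (10).
N_RGB = {"r": 1.513, "g": 1.517, "b": 1.522}

def channel_offsets(u, v, f, theta1, theta2, alpha, dx, dy, u0, v0):
    """Per-pixel dispersion offsets of the green and blue channels relative to red
    (Equations (9) and (10)), computed with the channel-specific refractive index."""
    p_r = np.array(correct_pixel(u, v, f, theta1, theta2, alpha, N_RGB["r"], dx, dy, u0, v0))
    p_g = np.array(correct_pixel(u, v, f, theta1, theta2, alpha, N_RGB["g"], dx, dy, u0, v0))
    p_b = np.array(correct_pixel(u, v, f, theta1, theta2, alpha, N_RGB["b"], dx, dy, u0, v0))
    return p_g - p_r, p_b - p_r   # Delta p_vg, Delta p_vb
```

The green and blue channels are then resampled by −Δp_vg and −Δp_vb, respectively, so that all three channels align with the red reference.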

3.2. Super-Resolution Imaging by Multi-Viewpoint Image Fusion

Figure 6 shows the basic schematic diagram of super-resolution imaging by multi-view image fusion. The imaging system adjusts the prism rotation angle to capture multi-frame sequence images from different perspectives. The acquired multi-view images are independently input to the residual removal network for degradation removal. Subsequently, the clean image without artifacts enters the information enhancement network to output the super-resolution images.
Since the images inevitably contain degradations such as noise, artifacts, and residual distortions, these degradations need to be removed before the images are input into the network. Let xi be the captured raw image from the i-th viewpoint, and let R be the residual removal network consisting of 20 residual blocks, that is:
\tilde{x}_i = R(x_i)
where \tilde{x}_i is the image after degradation removal. The network architecture of the residual block is shown in Figure 7.
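A PyTorch sketch of the residual removal network R is given below. The paper specifies a shallow convolution, 20 residual blocks, and an output convolution; the channel width and the exact block internals shown here are assumptions of this sketch.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Conv-ReLU-Conv block with an identity skip connection (cf. Figure 7)."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return x + self.conv2(self.relu(self.conv1(x)))

class ResidualRemovalNet(nn.Module):
    """R: shallow conv -> 20 residual blocks -> output conv producing the clean image."""
    def __init__(self, channels=64, num_blocks=20):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)                      # shallow features
        self.body = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)                      # clean image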
Then, the clean sequence images are passed to the information enhancement network, which consists of backward propagation, forward propagation, and upsampling modules. The network architectures for the backward and forward propagations are shown in Figure 8. Specifically, the neighboring frames of xi are xi−1 and xi+1, and the corresponding propagated features are hf,i−1 and hb,i+1. We can obtain:
h_{f,i} = F_f(x_i, x_{i-1}, h_{f,i-1}), \quad h_{b,i} = F_b(x_i, x_{i+1}, h_{b,i+1})
where Ff is the forward propagation and Fb is the backward propagation. Both propagations include the flow estimation module S [27], the spatial warping module W [26], and the residual blocks R. The propagation process can be expressed as:
s_{b/f,i} = S(x_i, x_{i\pm1})
h_{w,b/f,i} = W(h_{b/f,i\pm1}, s_{b/f,i})
h_{b/f,i} = R(x_i, h_{w,b/f,i})
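The recurrences of Equations (12) and (13) can be read as the following simplified loop, where S, W, and R are placeholders for the flow estimation, spatial warping, and residual-block modules of the propagation branches. The boundary handling and the zero initialization of the hidden features are assumptions of this sketch.

```python
import torch

def bidirectional_propagation(frames, S, W, R, feat_channels=64):
    """Simplified bidirectional propagation (Equations (12)-(13)): for each frame x_i,
    a flow s is estimated against its neighbour (S), the neighbouring hidden feature
    is warped with that flow (W), and the residual blocks R fuse it with x_i.
    frames: list of (C, H, W) tensors."""
    n = len(frames)
    _, H, W_ = frames[0].shape
    zeros = torch.zeros(feat_channels, H, W_)

    h_b = [None] * n
    for i in range(n - 1, -1, -1):                      # backward pass (later -> earlier)
        prev = h_b[i + 1] if i + 1 < n else zeros
        s = S(frames[i], frames[min(i + 1, n - 1)])     # flow to the later neighbour
        h_b[i] = R(frames[i], W(prev, s))

    h_f = [None] * n
    for i in range(n):                                  # forward pass (earlier -> later)
        prev = h_f[i - 1] if i - 1 >= 0 else zeros
        s = S(frames[i], frames[max(i - 1, 0)])         # flow to the earlier neighbour
        h_f[i] = R(frames[i], W(prev, s))

    return h_f, h_b
```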
The upsampling module consists of multiple convolutions and pixel-shuffle [28]. Here, it is named V:
y_i = V(\tilde{x}_i)
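A minimal sketch of such an upsampling module V for a ×4 factor is shown below; the channel width and the two-stage ×2 pixel-shuffle layout are assumptions, the paper specifying only convolutions followed by pixel-shuffle [28].

```python
import torch.nn as nn

class Upsampler(nn.Module):
    """V: convolutions + pixel-shuffle stages for x4 upsampling."""
    def __init__(self, channels=64, scale=4):
        super().__init__()
        layers = []
        for _ in range(int(scale).bit_length() - 1):   # two x2 pixel-shuffle stages for x4
            layers += [nn.Conv2d(channels, channels * 4, 3, padding=1),
                       nn.PixelShuffle(2),
                       nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(channels, 3, 3, padding=1)]
        self.net = nn.Sequential(*layers)

    def forward(self, feats):
        return self.net(feats)
```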
A low-resolution ground truth is used to constrain the output of the residual removal network:
\eta = \sum_{i=1}^{n} \delta\left( \tilde{x}_i - m(b_i) \right)
where bi is the ground-truth high-resolution image, m is the downsampling operator, and δ is the Charbonnier loss function. In many cases, applying the residual removal network only once cannot effectively eliminate severe image degradation, while applying it repeatedly is prone to causing image distortion. For this reason, a dynamic optimization scheme is proposed:
\tilde{x}_i^{j+1} = R(\tilde{x}_i^{j}), \quad \text{if } \frac{1}{n}\sum_{i=1}^{n} \left| \tilde{x}_i^{j} - \tilde{x}_i^{j-1} \right| \ge \sigma; \qquad \tilde{x}_i = \tilde{x}_i^{j}, \quad \text{otherwise}
where σ is a pre-determined stopping threshold. After multiple tests, setting σ to 1.5 was found to be appropriate.
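Putting Equations (15) and (16) together, the cleaning loss and the dynamic stopping rule can be sketched as follows; the Charbonnier ε, the max_passes guard, and the use of the mean absolute change between passes as the stopping statistic are assumptions of this sketch.

```python
import torch

def charbonnier(a, b, eps=1e-3):
    """Charbonnier loss between two images."""
    return torch.sqrt((a - b) ** 2 + eps ** 2).mean()

def cleaning_loss(cleaned, hr_truth, downsample):
    """Low-resolution cleaning loss (Equation (15)): the residual-removal outputs are
    constrained against the downsampled high-resolution ground truth."""
    return sum(charbonnier(x, downsample(b)) for x, b in zip(cleaned, hr_truth))

def dynamic_cleaning(images, R, sigma=1.5, max_passes=5):
    """Dynamic optimization scheme (Equation (16)): apply the residual removal network
    repeatedly and stop once the mean change between successive passes drops below sigma."""
    prev = images
    for _ in range(max_passes):
        curr = [R(x) for x in prev]
        mean_change = sum((c - p).abs().mean() for c, p in zip(curr, prev)) / len(images)
        if mean_change < sigma:
            return curr
        prev = curr
    return prev
```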
Regarding the architecture, a convolution layer is employed to capture the shallow features of the image, the deep features are extracted by 20 residual blocks, and a final convolution layer generates the clean image. In terms of training settings, the REDS dataset [29] is employed to train the network. The Adam optimizer [30] is adopted with constant learning rates. The patch size of the input low-resolution image is set to 64 × 64. During training, pre-training is first performed with the output loss and the cleaning loss, using 300 K iterations, a batch size of 16, and a learning rate of 10−4. The network is then fine-tuned with the perceptual loss [31], using 150 K iterations, a batch size of 8, and a learning rate of 5 × 10−5.

4. Experiment

4.1. Simulation Experiment

To further demonstrate the superiority of the system, simulation experiments were performed in expanding FOV and boresight pointing ranges. The parameters of the simulation system are as follows: α = 10°, n = 1.517, the thin end thickness of the prism is D0 = 3 mm, spacing between the double prisms is D = 3 mm, and the sampling plane distance from the double prisms is 500 mm.
The pitch angle of the boresight is largest when the rotation angles of the two prisms are the same, that is, the deflection of the imaging light is most significant. Therefore, by driving the two rotation angles to be equal and taking 45° as the step size, the imaging range of the system is displayed as shown in Figure 9. The red area in the figure represents the original FOV of the camera. Specifically, the horizontal FOV angle φH0 is 8.84°, while the vertical FOV angle φV0 is 6.64°. The horizontal combined FOV reaches its extreme values when the rotation angles are 90° and 270°, and the vertical combined FOV reaches its extreme values when the rotation angles are 0° and 180°. According to Equation (3), the horizontal and vertical combined FOVs are calculated to be 26° and 23.67°, respectively. The combined FOV is thus improved by approximately three times compared with the camera's original FOV.
To further analyze the effect of the system parameters on the imaging range expansion, the control variable method is adopted for quantitative analysis. The analysis focuses on the variation trends of the horizontal combined FOV φH and the vertical combined FOV φV with α and n. To directly quantify the effect of these two parameters on FOV expansion, the horizontal and vertical FOV magnifications are defined as KH = φH/φH0 and KV = φV/φV0.
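Plugging in the simulated values above gives the corresponding magnifications:

K_H = \varphi_H / \varphi_{H0} = 26^\circ / 8.84^\circ \approx 2.9, \qquad K_V = \varphi_V / \varphi_{V0} = 23.67^\circ / 6.64^\circ \approx 3.6

which is consistent with the roughly threefold FOV improvement reported for Figure 9.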
Figure 10 shows how n and α affect the FOV expansion. It can be seen from Figure 10a that the physical properties of the prism approach those of air when the refractive index approaches 1; that is, the imaging light passing through the prism hardly undergoes refraction. Meanwhile, KH and KV approach 1, so the FOV expansion effect is negligible. However, φH, φV, KH, and KV keep increasing with increasing n, in a roughly linear relationship. Similarly, φH and φV keep expanding as α increases, as shown in Figure 10b, and KH and KV also increase with α; that is, the FOV expansion capability becomes stronger. Compared with n, α has a more significant effect on expanding the FOV, because the refraction of light in a prism depends nonlinearly on α.
Figure 11a shows the boresight pointing area on the sampling plane. There is a maximum pointing boundary when the double prisms with a 0° angle difference rotate synchronously. In addition, the refraction effects of the two prisms on the lights cancel each other out when their angle difference is 180°. As a result, the emergent light vector A4 and the incident light vector A0 are in the same direction and both along the optical axis. However, there is a certain gap D between the two prisms due to the system installation requirements as shown in Figure 11b. Therefore, there must be a certain area on the sampling plane that the boresight cannot reach, namely the pointing blind zone. Specifically, the radius of the pointing blind zone is represented by Rmin. The influences of n, α, and D on the pointing blind zone were analyzed.
Figure 12 shows the blind zone distribution under different α, n, and D. It can be seen from Figure 12a that Rmin gradually increases with n when D is fixed, and that Rmin also increases with D when n is constant. Similarly, Rmin keeps increasing with α when D is fixed, as shown in Figure 12b, and also increases with D when α remains constant. In particular, the blind zone expands more markedly when D is large. Comparing Figure 10a, Figure 10b, Figure 12a and Figure 12b comprehensively, we can conclude that increasing α and n effectively expands the imaging range but also enlarges the blind zone. Therefore, the gap between the double prisms should be minimized to limit the blind zone expansion caused by increasing α and n for FOV expansion.

4.2. Real Experiment

Figure 13a shows the experimental setup, mainly including (1) a camera, (2) rotating double prisms, (3) a control system, (4) an object, and (5) a calibration board. The rotating double prisms are arranged coaxially with the camera and can be rotated to adjust the boresight pointing arbitrarily. The control system enables each prism to rotate freely over a 360° range. The objects to be imaged are placed within the combined FOV of the camera and the double prisms. The camera captures multi-view images of the calibration board to calibrate the camera parameters based on previous work [32]. The parameters of the camera and the double prisms are listed in Table 1.
To verify the large-FOV imaging feasibility of the proposed architecture, large-scale imaging experiments were carried out. The first and second objects were imaged from different perspectives and combined, as shown in Figure 13b. Usually, we want to obtain more feature information to assist object recognition. However, no additional information from the side of the target can be obtained from a fixed viewpoint because of self-occlusion, as shown in the red dashed box in Figure 13c. Notably, the occlusion is overcome by viewpoint adjustment, as shown in the red solid-line box in Figure 13c; that is, rich information from the side of the target can be obtained.
To demonstrate the feasibility of the proposed distortion correction method, a distortion correction experiment was carried out. The calibration board was captured by the camera with the rotating double prisms, as shown in Figure 14a. An obvious distortion is visible in the calibration board image: the regular black and white rectangles have turned into rhombuses, and the horizontal and vertical lines have turned into curves. The distortion correction method proposed in Section 3.1 was adopted to correct the distorted image of the calibration board, and the results are shown in Figure 14b. The black and white rhombuses have been corrected to regular rectangles, and the horizontal and vertical lines have essentially been restored to straight lines. This means that our method can correct the distortion caused by the nonlinear propagation of light in the non-uniform prisms.
Another important experiment is the SR imaging of the object of interest. The object of interest was captured under different rotation angle combinations of prisms, such as (0°, 180°), (0.1°, 180.1°), (0.1°, 179.9°), (180°, 0°), (180.1°, 0.1°), and (179.9°, 0.1°). The collected multi-viewpoint images are shown in Figure 15.
The collected sequence images are preprocessed by distortion correction and dispersion elimination. Subsequently, the corrected images were input into the proposed super-resolution model to output the SR image as shown in Figure 16. The resolution magnification factor K = 4, that is, the image resolution is (320 × K) × (240 × K) = 1280 × 960.
It can be clearly seen in Figure 16a,b that the details and textures in the SR image have been significantly improved compared with the original image. More texture and feature information can be identified in the SR image. Specifically, the aliased green and black backgrounds in the original image are restored and distinguishable in the SR image. In addition, the branches in the purple box become recognizable in the SR image, while the branches are blurred or even unrecognizable in the original image. In particular, the artifacts and dispersion in the SR image are significantly eliminated compared with the original image as shown in the green box in Figure 16c.
To quantitatively analyze the improvement of this method in image clarity and texture features, a comparative verification experiment was carried out. The comparison results are shown in Table 2. Nearest Super-Resolution (Nearest SR) [33], Bilinear Super-Resolution (Bilinear SR) [34], and Bicubic Super-Resolution (Bicubic SR) [35] were adopted to perform super-resolution imaging on the above original images at the same magnification. Quantitative analysis was conducted using three common image clarity indicators: the Brenner gradient, the Tenengrad gradient, and the discrete cosine transform (DCT). Bilinear SR and Bicubic SR achieve similar results in the three indicators but have the poorest imaging quality, followed by Nearest SR. Our method outperforms the other methods in all three indicators and particularly stands out in the Brenner gradient. In addition, the proposed method is compared with two deep-learning-based methods, BSRGAN [10] and Real-ESRGAN [36], using the no-reference quality metrics NIQE [37] and BRISQUE [38]. The comparison results are shown in Table 3. The imaging quality of BSRGAN and Real-ESRGAN is similar in terms of NIQE and BRISQUE, whereas the proposed method outperforms both in these metrics, especially in BRISQUE.

5. Conclusions

In this paper, we present a scale-adaptive high-resolution imaging architecture using a rotating-prism-embedded camera. By planning the prism motion, multi-view images are combined to form a large-scale FOV. If a region of interest (ROI) exists in the combined FOV, the boresight is guided to capture the ROI from different viewpoints to obtain a super-resolution (SR) image with the desired information. A novel distortion correction method is proposed using virtual symmetrical prisms whose rotation angles are offset by 180° from those of the real prisms, which can eliminate the image distortion caused by non-uniform refraction. Based on light reverse tracing, the dispersion induced by lights of different wavelengths having different refractive indices in the prisms can be eliminated by accurate pixel-level position compensation. For resolution enhancement, we provide a new scheme for SR imaging consisting of a residual removal network for artifact removal and an information enhancement network that improves resolution by multi-view image fusion. The influence of the system parameters on the FOV expansion and boresight pointing range was analyzed to guide the system design. The experiments show that the proposed architecture can achieve both large-FOV scene imaging for situational awareness and SR ROI display to acquire details, effectively perform distortion and dispersion correction, and alleviate occlusion to some extent. It also provides higher image clarity than traditional SR methods. However, the proposed method still has certain limitations: the process of multi-viewpoint image acquisition and distortion correction for large-scale imaging still requires a certain amount of time. In the future, we will explore strategies for improving efficiency, such as algorithm lightweighting, increased system rotation speeds, and higher imaging frame rates.

Author Contributions

Conceptualization, Z.D.; methodology, Z.D.; software, Z.D. and J.J.; validation, Z.D.; formal analysis, X.Z. and J.J.; investigation, Z.D.; data curation, Z.D.; writing—original draft preparation, Z.D.; writing—review and editing, Z.D. and A.L.; visualization, X.Z. and Y.L.; supervision, A.L.; project administration, Z.D.; funding acquisition, Z.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant number 62403364), the China Postdoctoral Science Foundation (grant numbers GZC20241214 and 2024M752421), and the Chinese Society of Construction Machinery Young Talent Lifting Project (grant number CCMS-YESS2023001).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Joy, J.; Santhi, N.; Ramar, K.; Bama, S.B. Spatial frequency discrete wavelet transform image fusion technique for remote sensing applications. Eng. Sci. Technol. Int. J. 2019, 22, 715–726. [Google Scholar] [CrossRef]
  2. Brady, D.J.; Gehm, M.E.; Stack, R.A.; Marks, D.L.; Kittle, D.S.; Golish, D.R.; Vera, E.M.; Feller, S.D. Multiscale gigapixel photography. Nature 2012, 486, 386–389. [Google Scholar] [CrossRef]
  3. Yue, L.W.; Shen, H.F.; Li, J.; Yuan, Q.Q.; Zhang, H.Y.; Zhang, L.P. Image super-resolution: The techniques, applications, and future. Signal Process 2016, 128, 389–408. [Google Scholar] [CrossRef]
  4. Aguilar, A.; García-Márquez, J.; Landgrave, J.E.A. Super-resolution with a complex-amplitude pupil mask encoded in the first diffraction order of a phase grating. Opt. Lasers Eng. 2020, 134, 106247. [Google Scholar] [CrossRef]
  5. Carles, G.; Downing, J.; Harvey, A.R. Super-resolution imaging using a camera array. Opt. Lett. 2014, 39, 1889–1892. [Google Scholar] [CrossRef]
  6. Wang, H.; Gao, X.; Zhang, K.; Li, J. Fast single image super-resolution using sparse Gaussian process regression. Signal Process 2017, 134, 52–62. [Google Scholar] [CrossRef]
  7. Zhang, K.; Tao, D.; Gao, X.; Li, X.; Xiong, Z. Learning multiple linear mappings for efficient single image super-resolution. IEEE Trans. Image Process. 2015, 24, 846–861. [Google Scholar] [CrossRef] [PubMed]
  8. Shukla, A.; Merugu, S.; Jain, K. A Technical Review on Image Super-Resolution Techniques. In Advances in Cybernetics, Cognition, and Machine Learning for Communication Technologies; Springer: Berlin/Heidelberg, Germany, 2020; pp. 543–565. [Google Scholar] [CrossRef]
  9. Zhang, K.; Li, J.; Wang, H.; Liu, X.; Gao, X. Learning local dictionaries and similarity structures for single image super-resolution. Signal Process. 2018, 142, 231–243. [Google Scholar] [CrossRef]
  10. Zhang, K.; Liang, J.; Van Gool, L.; Timofte, R. Designing a Practical Degradation Model for Deep Blind Image Super-Resolution. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 4771–4780. [Google Scholar] [CrossRef]
  11. Quevedo, E.; Delory, E.; Callicó, G.M.; Tobajas, F.; Sarmiento, R. Underwater video enhancement using multi-camera super-resolution. Opt. Commun. 2017, 404, 94–102. [Google Scholar] [CrossRef]
  12. Song, Y.; Xie, Y.; Viktor, M.; Xiao, J.; Jung, I.; Ki-Joong, C.; Liu, Z.; Hyunsung, P.; Lu, C.; Rak-Hwan, K.; et al. Digital cameras with designs inspired by the arthropod eye. Nature 2013, 497, 95–99. [Google Scholar] [CrossRef] [PubMed]
  13. Zhang, H.; Yang, Z.; Zhang, L.; Shen, H. Super-resolution reconstruction for multi-angle remote sensing images considering resolution differences. Remote Sens. 2014, 6, 637–657. [Google Scholar] [CrossRef]
  14. Sun, M.; Yu, K. A sur-pixel scan method for super-resolution reconstruction. Optik 2013, 124, 6905–6909. [Google Scholar] [CrossRef]
  15. Pu, X.; Wang, X.; Shi, L.; Ma, Y.; Wei, C.; Gao, X.; Gao, J. Computational imaging and occluded objects perception method based on polarization camera array. Opt. Express 2023, 31, 24633–24651. [Google Scholar] [CrossRef]
  16. Tian, D.; Yu, N.; Yu, J.; Zhang, H.; Sun, J.; Bai, X. Research on dual-line array sub-pixel scanning imaging for IoMT-based blood cell analysis system. IEEE Internet Things J. 2022, 10, 367–377. [Google Scholar] [CrossRef]
  17. Kosuke, T.; Nobuhara, S.; Matsuyama, T. Mirror-based Camera Pose Estimation Using an Orthogonality Constraint. IPSJ Trans. Comput. Vis. Appl. 2016, 8, 11–19. [Google Scholar] [CrossRef]
  18. Duma, V.F.; Lee, K.; Meemon, P.; Rolland, J.P. Experimental investigations of the scanning functions of galvanometer-based scanners with applications in OCT. Appl. Opt. 2011, 50, 5735–5749. [Google Scholar] [CrossRef] [PubMed]
  19. Roessler, F.; Streek, A. Accelerating laser processes with a smart two-dimensional polygon mirror scanner for ultra-fast beam deflection. Adv. Opt. Technol. 2021, 10, 297–304. [Google Scholar] [CrossRef]
  20. Cheng, H.; Liu, S.; Hsu, C.; Lin, H.; Shih, F.; Wu, M.; Liang, K.; Lai, M.; Fang, W. On the design of piezoelectric MEMS scanning mirror for large reflection area and wide scan angle. Sens. Actuators A Phys. 2023, 349, 114010. [Google Scholar] [CrossRef]
  21. Carles, G.; Chen, S.; Bustin, N.; Downing, J.; McCall, D.; Wood, A.; Harvey, A.R. Multi-aperture foveated imaging. Opt. Lett. 2016, 41, 1869–1872. [Google Scholar] [CrossRef]
  22. Garcia-Torales, G. Risley prisms applications: An overview. Adv. 3OM Opto-Mechatron. Opto-Mech. Opt. Metrol. 2022, 12170, 136–146. [Google Scholar] [CrossRef]
  23. Duma, V.F.; Dimb, A.L. Exact scan patterns of rotational Risley prisms obtained with a graphical method: Multi-parameter analysis and design. Appl. Sci. 2021, 11, 8451. [Google Scholar] [CrossRef]
  24. Brazeal, R.G.; Wilkinson, B.E.; Hochmair, H.H. A rigorous observation model for the risley prism-based livox mid-40 lidar sensor. Sensors 2021, 21, 4722. [Google Scholar] [CrossRef]
  25. Lai, S.; Lee, C. Double-wedge prism scanner for application in thermal imaging systems. Appl. Opt. 2018, 57, 6290–6299. [Google Scholar] [CrossRef] [PubMed]
  26. Chan, K.C.K.; Wang, X.; Yu, K.; Dong, C.; Loy, C.C. Basicvsr: The search for essential components in video super-resolution and beyond. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 4947–4956. [Google Scholar] [CrossRef]
  27. Ranjan, A.; Black, M.J. Optical flow estimation using a spatial pyramid network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4161–4170. [Google Scholar] [CrossRef]
  28. Wang, X.; Chan, K.C.K.; Yu, K.; Dong, C.; Loy, C.C. Edvr: Video restoration with enhanced deformable convolutional networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–17 June 2019; pp. 1–10. [Google Scholar] [CrossRef]
  29. Nah, S.; Baik, S.; Hong, S.; Moon, G.; Son, S.; Timofte, R.; Lee, K.M. NTIRE 2019 challenge on video deblurring and super resolution: Dataset and study. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–17 June 2019; pp. 1–10. [Google Scholar] [CrossRef]
  30. Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar] [CrossRef]
  31. Johnson, J.; Alahi, A.; Fei-Fei, L. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. arXiv 2016, arXiv:1603.08155. [Google Scholar] [CrossRef]
  32. Zhang, Z. A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 22, 1330–1334. [Google Scholar] [CrossRef]
  33. Brown, M.; Lowe, D.G. Automatic panoramic image stitching using invariant features. Int. J. Comput. Vis. 2007, 74, 59–73. [Google Scholar] [CrossRef]
  34. Zhang, X. A new kind of super-resolution reconstruction algorithm based on the ICM and the bilinear interpolation. In Proceedings of the International Seminar on Future BioMedical Information Engineering, Shanghai, China, 21–22 December 2008; pp. 183–186. [Google Scholar] [CrossRef]
  35. Hou, H.; Andrews, H. Cubic splines for image interpolation and digital filtering. IEEE Trans. Acoust. Speech Signal Process. 1978, 26, 508–517. [Google Scholar] [CrossRef]
  36. Wang, X.; Xie, L.; Dong, C.; Shan, Y. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada, 11–17 October 2021; pp. 1905–1914. [Google Scholar] [CrossRef]
  37. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “Completely Blind” Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212. [Google Scholar] [CrossRef]
  38. Mittal, A.; Moorthy, A.K.; Bovik, A.C. Blind/Referenceless Image Spatial Quality Evaluator. In Proceedings of the 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), Pacific Grove, CA, USA, 6–9 November 2011; pp. 723–727. [Google Scholar] [CrossRef]
Figure 1. The model of large-scale high-resolution imaging. (a) Imaging system. (b) Light propagation.
Figure 2. The basic architecture of super-resolution imaging.
Figure 3. Image distortion under different prism rotation angles. (a) Imaging without rotating prisms. (b) θ1 = 0°, θ2 = 0°. (c) θ1 = 45°, θ2 = 45°. (d) θ1 = 0°, θ2 = −45°. (e) θ1 = 0°, θ2 = 45°. (f) θ1 = 45°, θ2 = 90°.
Figure 4. Schematic diagram of the distortion correction by virtual symmetrical prisms.
Figure 5. Schematic diagram of image dispersion. (a) Overall dispersion effect. (b) Magnified local view.
Figure 6. The basic schematic diagram of super-resolution imaging by multi-view image fusion.
Figure 7. Network architecture for the residual block.
Figure 8. Network architecture for the forward propagation and backward propagation.
Figure 9. The imaging FOV in different rotation angles.
Figure 10. The variation law of FOV. (a) The variation of FOV with the increase in α. (b) The variation of FOV with the increase in n.
Figure 11. Schematic diagram of the scanning region. (a) Scanning region. (b) Causes of the blind area.
Figure 12. Blind zone distribution. (a) The influence of α and D on the blind zone. (b) The influence of n and D on the blind zone.
Figure 13. Experimental setup and scene perception. (a) Experimental setup. (b) FOV expansion. (c) Viewpoint adjustment.
Figure 14. Correction image comparison. (a) Distortion image. (b) Correction image.
Figure 15. Multi-viewpoint image sequence.
Figure 16. Image comparison. (a) Original image. (b) SR image. (c) Image comparison.
Table 1. System parameters.
Prism diameter (mm): 80
D0 (mm): 5
n: 1.517
α (°): 10
Camera internal parameters: [7321.2, 0, 0; 0, 7427.4, 0; 579.1, 562.5, 1]
Distortion coefficients: (−0.67, 8.26)
Table 2. Comparison of the interpolation-based methods.
Method            | Nearest SR | Bilinear SR | Bicubic SR | Proposed method
Brenner (·10^9)   | 1.03       | 0.30        | 0.45       | 1.31
Tenengrad (·10^7) | 5.16       | 2.03        | 2.72       | 6.20
DCT (·10^6)       | 1.99       | 1.39        | 1.49       | 2.24
Table 3. Comparison of the deep-learning-based methods.
Method  | BSRGAN | Real-ESRGAN | Proposed method
NIQE    | 5.43   | 5.39        | 4.45
BRISQUE | 34.71  | 35.12       | 31.24


